ip phone design considerations

Apparatus · Jun 27, 2004

Hello,

I am planning on designing an IP phone for a student project. I would
like some advice on details that I need to take into consideration.

My basic thought is this: Use an ADC to take 12-bit samples of the
microphone at 8kHz. Next encode the samples into u-law g.711 PCM. Next
transmit over UDP packet. Next decode from u-law. Next send to 12-bit
DAC connected to PGA or op amp, then to speaker. Note that I am
planning on using a DSP on both ends, probably a Microchip dsPIC due
to the affordability of the development tools.
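For reference, the companding step in the plan above can be sketched roughly as follows. This uses the continuous u-law curve with mu = 255 and is not the bit-exact segmented G.711 encoder; the function names and the 12-bit scaling are assumptions made for illustration only.

```python
import math

MU = 255  # u-law companding constant

def mulaw_compress(x):
    """Map x in [-1, 1] onto the u-law curve, result also in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def encode_12bit(sample):
    """Signed 12-bit sample (-2048..2047) -> 8-bit code (0..255)."""
    x = max(-1.0, min(1.0, sample / 2047.0))
    return int(round((mulaw_compress(x) + 1) / 2 * 255))

def decode_to_12bit(code):
    """8-bit code (0..255) -> signed 12-bit sample."""
    y = code / 255 * 2 - 1
    return int(round(mulaw_expand(y) * 2047))

if __name__ == "__main__":
    for s in (0, 10, 100, 1000, 2047, -1000):
        code = encode_12bit(s)
        print(s, "->", code, "->", decode_to_12bit(code))
```

The point of the curve is that small samples get finer steps than large ones, so the quantisation error stays roughly proportional to the signal level.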

The concerns I have are mainly on the analog ends.

1) How much of a problem will noise be with this scheme? Should I add
a BPF before encoding to remove frequencies outside 400Hz to 4kHz?
How else should I remove noise?

2) Is interpolation on the receiving end needed to achieve a toll
quality signal? Can this interpolation be a simple capacitor or coil?
Will the speaker be a good enough interpolator?

3) In general, how is a microphone connected to an ADC? A DAC to a
speaker?

4) Is UDP a reliable enough transmit method? Should I add a 100ms
buffer for frame delays? Should I repeat the last frame if a frame is
omitted?

5) What else should I consider? Any other suggestions?

Cheers.

Ian Stirling · Jun 27, 2004

Apparatus said:
Hello,

I am planning on designing an IP phone for a student project. I would
like some advice on details that I need to take into consideration.

My basic thought is this: Use an ADC to take 12-bit samples of the
microphone at 8kHz. Next encode the samples into u-law g.711 PCM. Next
transmit over UDP packet. Next decode from u-law. Next send to 12-bit
DAC connected to PGA or op amp, then to speaker. Note that I am
planning on using a DSP on both ends, probably a Microchip dsPIC due
to the affordability of the development tools.

The concerns I have are mainly on the analog ends.

1) How much of a problem will noise be with this scheme? Should I add
a BPF before encoding to remove frequencies outside 400Hz to 4kHz?
How else should I remove noise?

For a proof of concept, it'll probably be fine.

2) Is interpolation on the receiving end needed to achieve a toll
quality signal? Can this interpolation be a simple capacitor or coil?
Will the speaker be a good enough interpolator?

Yes, for a proof of concept.
Though ideally you want a filter.
Interpolating digitally to a 16kHz output rate, followed by a very
simple RC filter, will work well.

3) In general, how is a microphone connected to an ADC? A DAC to a
speaker?

Appropriate pre/power amp.
Search on http://www.ti.com/ for
speaker DAC
and similar, to get ideas.

4) Is UDP a reliable enough transmit method? Should I add a 100ms
buffer for frame delays? Should I repeat the last frame if a frame is
omitted?

UDP is perfectly reliable on some networks, and utterly useless on
others, as there is no guarantee.
You need to look at the RFC for TCP/IP, and look at implementing a
simple protocol.

Tim Wescott · Jun 27, 2004

Apparatus said:
Hello,

I am planning on designing an IP phone for a student project. I would
like some advice on details that I need to take into consideration.

My basic thought is this: Use an ADC to take 12-bit samples of the
microphone at 8kHz. Next encode the samples into u-law g.711 PCM. Next
transmit over UDP packet. Next decode from u-law. Next send to 12-bit
DAC connected to PGA or op amp, then to speaker. Note that I am
planning on using a DSP on both ends, probably a Microchip dsPIC due
to the affordability of the development tools.

The concerns I have are mainly on the analog ends.

1) How much of a problem will noise be with this scheme? Should I add
a BPF before encoding to remove frequencies outside 400Hz to 4kHz?
How else should I remove noise?

An anti-aliasing filter would be good. Any frequency components above
4kHz seen at the ADC will be aliased around the 8kHz sample rate, so a
6kHz signal will come out at 2kHz. You can either use a multiple-pole
lowpass (probably with a cutoff around 3 or 3.5kHz), or you can
oversample by a bunch, use a simple anti-aliasing filter in analog, and
use a better one in digital.

You should be able to use a cookbook filter for this.
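To put numbers on the aliasing, here is a small sketch that folds an input tone back into the 0 to fs/2 band. The function name is made up for illustration; the arithmetic is just the standard folding rule for an ideal sampler.

```python
def alias_frequency(f_in_hz, fs_hz):
    """Apparent frequency of a tone at f_in_hz after sampling at fs_hz."""
    f = f_in_hz % fs_hz           # fold into [0, fs)
    return min(f, fs_hz - f)      # reflect into [0, fs/2]

if __name__ == "__main__":
    for f in (1000, 3400, 5000, 6000, 7000):
        print(f, "Hz ->", alias_frequency(f, 8000), "Hz")
```

Anything the analog filter lets through above 4kHz lands somewhere in the voice band, which is why the filter has to come before the ADC rather than after it.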

2) Is interpolation on the receiving end needed to achieve a toll
quality signal? Can this interpolation be a simple capacitor or coil?
Will the speaker be a good enough interpolator?

Yes, interpolation is necessary for a toll quality signal. The
interpolation should be a filter more or less like the one on your
input, except that if you sample out at 8kHz you should also take the
zero-order hold rolloff into account.

A coil won't be simple at 8kHz, because it needs to be on the order of
50-100mH (that's millihenry), and fairly high quality. Stick to active
filters.

Even with the rolloff taken into account this should be cookbook stuff
-- just find the right cookbook!
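As a rough guide to how much the zero-order hold costs, the sketch below evaluates the usual sin(x)/x droop of a DAC that holds each sample for one full period. The function name is an assumption; the formula is the textbook ZOH magnitude response.

```python
import math

def zoh_gain_db(f_hz, fs_hz):
    """Magnitude of the zero-order-hold response at f_hz, in dB."""
    x = math.pi * f_hz / fs_hz
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(x) / x))

if __name__ == "__main__":
    for fs in (8000, 16000):
        for f in (1000, 3400, 4000):
            print(f"fs={fs} Hz, f={f} Hz: {zoh_gain_db(f, fs):.2f} dB")
```

At an 8kHz output rate the droop approaches 4 dB at the top of the band, and it shrinks to under 1 dB if you output at 16kHz, which is one argument for interpolating digitally first.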

3) In general, how is a microphone connected to an ADC? A DAC to a
speaker?

Dig out a schematic for a PA amplifier -- the mic gets preamplified and
applied to the ADC, the DAC gets amplified to the speaker. If you have
time on your project you should apply AGC to the mic signal -- this is
probably more important than anti-aliasing and reconstruction, even.

4) Is UDP a reliable enough transmit method? Should I add a 100ms
buffer for frame delays? Should I repeat the last frame if a frame is
omitted?

I don't know UDP from UPS, but you should remember that sending voice is
a real-time endeavor. A buffer would probably be good, as long as you
have reliable flow control to keep it from overflowing. _If_ you have
time before you need the data you could repeat a frame, but keep in mind
that time waits for no one: a late frame is worse than no frame at all.

5) What else should I consider? Any other suggestions?

Have fun. And get the oldest, funkiest handset you can find for your
demo.

Allan Herriman · Jun 28, 2004

Hello,

I am planning on designing an IP phone for a student project. I would
like some advice on details that I need to take into consideration.
[snip]

4) Is UDP a reliable enough transmit method? Should I add a 100ms
buffer for frame delays? Should I repeat the last frame if a frame is
omitted?

There are already protocols for transmitting Voice Over IP, such as
VoIP. Use it, as the designers of that protocol have already worked
out solutions for problems you don't even know you have yet.

VoIP uses RTP over UDP; TCP is too slow.

Regards,
Allan.

Richard · Jun 28, 2004

Apparatus said:
4) Is UDP a reliable enough transmit method? Should
I add a 100ms buffer for frame delays? Should I
repeat the last frame if a frame is omitted?

Yes, UDP is exactly the right protocol. And even if a packet gets lost
or clobbered, a re-transmission is useless by the time it's detected and
acted upon - in the case of TCP, the transmission stream has probably
paused while this occurs too, causing more problems.

At a minimum, look at RTP (Real-time Transport Protocol), which runs on UDP. It
timestamps the packets at the source so playback is properly paced.
Instead of re-transmitting, figure out how you can recover the missing
packet, or approximate its content.
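To give an idea of what RTP adds on top of UDP, here is a minimal sketch of the fixed 12-byte RTP header (RFC 3550) wrapped around a G.711 u-law payload. The helper names are made up for illustration, and header extensions, CSRC lists and padding are ignored.

```python
import struct

RTP_VERSION = 2
PT_PCMU = 0  # static RTP payload type for G.711 u-law at 8 kHz

def build_rtp_packet(seq, timestamp, ssrc, payload, marker=False):
    """Prepend a fixed 12-byte RTP header to an audio payload."""
    byte0 = RTP_VERSION << 6                  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | PT_PCMU      # M bit + payload type
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload

def parse_rtp_header(packet):
    """Decode the fixed header fields of an RTP packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {"version": byte0 >> 6,
            "payload_type": byte1 & 0x7F,
            "seq": seq,
            "timestamp": timestamp,
            "ssrc": ssrc}

if __name__ == "__main__":
    frame = bytes([0xFF]) * 160               # 20 ms of u-law silence
    pkt = build_rtp_packet(seq=1, timestamp=160, ssrc=0x1234, payload=frame)
    print(len(pkt), parse_rtp_header(pkt))
```

The sequence number lets the receiver spot lost or reordered packets, and the timestamp (counted in samples, so it advances by 160 per 20 ms frame at 8kHz) lets it pace playback.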

100ms is too much buffer. A goal with VoIP is to keep the round-trip
latency below 150ms, including the codec time, otherwise there's an
irritating delay in the speech.

Like you'll see RealPlayer do, try to buffer enough (for better
reliability), but as little as possible (for better performance / less
delay). Adapt to the performance of the data stream.
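One way to "adapt to the performance of the data stream" is to track a running jitter estimate and size the buffer from it. The estimator below is the interarrival jitter formula from RFC 3550; the sizing rule (three times the jitter, clamped between 20 and 60 ms) is purely an assumption for illustration, not something from a standard.

```python
class JitterEstimator:
    """Running interarrival jitter as defined in RFC 3550, in timestamp units."""

    def __init__(self):
        self.jitter = 0.0
        self.prev_transit = None

    def update(self, arrival_ts, rtp_ts):
        """arrival_ts and rtp_ts must use the same units (e.g. 8 kHz ticks)."""
        transit = arrival_ts - rtp_ts
        if self.prev_transit is not None:
            d = abs(transit - self.prev_transit)
            self.jitter += (d - self.jitter) / 16.0
        self.prev_transit = transit
        return self.jitter

def target_delay_ms(jitter_units, fs_hz=8000, factor=3.0,
                    floor_ms=20.0, ceil_ms=60.0):
    """Hypothetical sizing rule: a multiple of the jitter, clamped."""
    wanted = jitter_units / fs_hz * 1000.0 * factor
    return max(floor_ms, min(ceil_ms, wanted))

if __name__ == "__main__":
    est = JitterEstimator()
    # (arrival, rtp timestamp) pairs in 8 kHz ticks; third packet is 10 ms late
    for arrival, ts in ((0, 0), (160, 160), (400, 320), (480, 480)):
        j = est.update(arrival, ts)
        print(f"jitter={j:.2f} ticks, buffer={target_delay_ms(j):.1f} ms")
```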

I'd shy away from VoIP interoperability unless you can source the code
somewhere easily (or use a chip implementation). It's a good model for
reference, but I think you'll find it'd make your project too complex.

5) What else should I consider? Any other suggestions?

* I think you are over-simplifying the non-analog parts, even if you use
an existing design. But for a student project, it needn't be
carrier-grade, and a workable solution could be hacked out.

* When demoing it, have the two subjects separated where they can
neither see nor hear the other party, except through the phone.
Otherwise, transmission delays become very noticeable.

* Other ideas - change the codec in real time, based on the quality of the
connection. G.711 when you can (LAN), compression otherwise (WAN).

* Power over Ethernet is the hot marketing rage among power chip
makers. Pick up a copy of EETimes and look for the ads. Power the phone
across the LAN - not a lot of folks have seen this done yet, so it'll
have a coolness factor.

Good luck!
Richard

Clifford Heath · Jun 28, 2004

Richard said:
A goal with VoIP is to keep the round-trip
latency below 150ms, including the codec time, otherwise there's an
irritating delay in the speech.

With TCP, dropped packets cause a delay, but not loss. TCP is
more firewall-friendly than UDP, so I've been wondering whether a
catchup technique could be used. Something like realising that
you've fallen behind, waiting silently until you receive more
samples, then playing 20ms out of every 25ms, dropping 5 ms and
splicing the missing pieces together (say fading the 15-20ms across
to the 20-25 to avoid pops), until you catch up. The pitch would
stay largely the same, only 5ms is lost from any consonant, and any
residual 40Hz component can easily be filtered out. Dropping
sections like this won't affect the pitch, and though the rhythm
changes might be disconcerting and would be unacceptable for music,
it should work alright in conversation and better than losing every
second syllable as in a bad phone connection.

Might this work ok?
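A rough sketch of the splice I have in mind, assuming 8kHz linear samples held in a plain list. The function name and the linear crossfade are my own choices for illustration; each 25ms block is shortened to 20ms by fading the 15-20ms stretch across into the 20-25ms stretch.

```python
FS = 8000  # sample rate in Hz (assumed)

def ms_to_samples(ms):
    return FS * ms // 1000

def catch_up(samples, keep_ms=20, drop_ms=5):
    """Shorten audio by drop_ms out of every keep_ms + drop_ms."""
    keep = ms_to_samples(keep_ms)      # 160 samples out per block
    fade = ms_to_samples(drop_ms)      # 40 samples crossfaded
    block = keep + fade                # 200 samples in per block
    out = []
    for start in range(0, len(samples) - block + 1, block):
        chunk = samples[start:start + block]
        out.extend(chunk[:keep - fade])            # 0-15 ms unchanged
        for i in range(fade):
            w = (i + 1) / fade                     # ramps up to 1.0
            a = chunk[keep - fade + i]             # from the 15-20 ms stretch
            b = chunk[keep + i]                    # from the 20-25 ms stretch
            out.append(round((1 - w) * a + w * b))
    # pass any incomplete trailing block through untouched
    out.extend(samples[len(samples) - len(samples) % block:])
    return out

if __name__ == "__main__":
    one_second = [1000] * FS
    print(len(one_second), "->", len(catch_up(one_second)))
```

Each block ends exactly on the last input sample of that block, so consecutive blocks join without a step.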

Clifford Heath.

Richard · Jun 28, 2004

Clifford said:
With TCP, dropped packets cause a delay, but not
loss.

True, but this is an application that tolerates lossy transmission. And
the added delay for re-transmission actually lowers the overall quality
while providing an unusable degree of perfection.

With TCP, the delay for detection & recovery will vary based on the
connection's characteristics. On the surface, TCP will seem to work
fine in a controlled environment with no loss, only to fail miserably
(in this application) when an unstable connection is encountered.

Consider a cross-US link with typical 70msec round trip time (RTT). If
a packet is instantly detected as lost, then retransmitted, you would
need to buffer at least 70msec at the receiving end to allow for this
add'l 70msec retransmission (plus ~20msec for jitter, regardless).
Before you begin, you've consumed 160 of your 150msec delay budget (70ms
normal, 70 for re-tx, 20 for jitter), and the coast-to-coast call starts
sounding trans-Pacific.
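The arithmetic, as a throwaway sketch (the function name and the flat 150msec budget are just the figures used above, not anything from a standard library):

```python
BUDGET_MS = 150  # delay budget used in this discussion

def delay_budget(rtt_ms, jitter_ms, retransmit):
    """Return (total delay, remaining budget) in milliseconds."""
    total = rtt_ms + jitter_ms + (rtt_ms if retransmit else 0)
    return total, BUDGET_MS - total

if __name__ == "__main__":
    print("UDP, no re-tx :", delay_budget(70, 20, retransmit=False))
    print("TCP, one re-tx:", delay_budget(70, 20, retransmit=True))
```

Without the retransmission allowance there is still headroom for the codec; with it, the budget is gone before any audio processing is counted.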

Also, depending on the TCP Receive Window Size of the receiver, it's
possible the re-transmissions will cause normal traffic to be
suspended. And unless a newer feature called Selective ACK is enabled
on the endpoints, the loss of one packet can trigger the re-transmission
of many subsequent packets until the TCP acknowledgements catch up
(worse during higher RTTs).

TCP is more firewall-friendly than UDP, so I've
been wondering whether a catchup technique could be
used.

I expect there are a variety of ways to handle this, including
forward-error-correction schemes. Or extending tones by an msec here
and there to invisibly build the buffer back up (similar to your
suggestion).

Of course, these schemes apply equally to either TCP or UDP transport.
But TCP re-transmits the lost frames, adding traffic you won't be using
(a problem on slow links).

Yes, TCP is generally more firewall-friendly. I'd suggest this is more
an issue with corporate firewalls where rules have been explicitly set.
Consumer firewalls are highly adaptable now, and they'll automatically
permit a reciprocal stream for UDP.

Cheers,
Richard

Richard · Jun 28, 2004

Clifford,

Another twist I just recalled - the local TCP stack delivers data to the
application in sequence, of course. If a packet is lost, subsequent
packets are held in the receiver's TCP buffer and are not delivered to
the application until the re-transmission is received. I don't think
there's a way to disable this behavior, because it's pretty core to
TCP's feature set.

So, if a small loss occurs with TCP, you won't have the subsequent data
and can't approximate (or rebuild) the missing piece - you have to wait
for the re-tx. With UDP, you'd get all the bits as they arrive
(whatever sequence) and have full control of the error recovery behavior
in "real time".

TCP's a great protocol, but there are some places it doesn't belong.
This is a great example of one such application.

Tim Auton · Jun 28, 2004

Clifford Heath said:
With TCP, dropped packets cause a delay, but not loss.

Which is exactly what you don't want. Data loss is more tolerable than
delays for this kind of application (unless you have a very fast
network and can re-transmit in an imperceptibly small time frame - but
then if your network performance is that good you're probably not
dropping many packets anyway). You can interpolate lost packets (if
there are only a few) in software and let the human brain do the rest,
it's pretty good at that kind of thing. Re-transmitting also increases
the probability of later packets being lost (as you are using the
bandwidth for old data instead of timely new data).

TCP is
more firewall-friendly than UDP, so I've been wondering whether a
catchup technique could be used. Something like realising that
you've fallen behind, waiting silently until you receive more
samples, then playing 20ms out of every 25ms, dropping 5 ms and
splicing the missing pieces together (say fading the 15-20ms across
to the 20-25 to avoid pops), until you catch up.

You'd have to write your own TCP stack, which is non-trivial. If
you're going to write your own protocol it's easier to layer it on top
of UDP than write a bastardised TCP with all the additional complexity
that brings which you don't need or want.

A decent stateful firewall can deal with reciprocal UDP streams
anyway. Besides, for a student project I don't think compatibility
with the widest range of firewalls is a major concern.

Detecting lost packets and asking the sender to reduce the data rate
(with some minimum of course) is probably desirable, but
re-transmitting is probably not much use. Overlapping data to help the
receiver interpolate for lost packets would probably be more use (each
packet contains its own high-fidelity data and a low-fidelity version
of the next packet, for example).

Tim

Paul Hovnanian P.E. · Jun 28, 2004

Tim said:
Which is exactly what you don't want. Data loss is more tolerable than
delays for this kind of application (unless you have a very fast
network and can re-transmit in an imperceptibly small time frame - but
then if your network performance is that good you're probably not
dropping many packets anyway). You can interpolate lost packets (if
there are only a few) in software and let the human brain do the rest,
it's pretty good at that kind of thing. Re-transmitting also increases
the probability of later packets being lost (as you are using the
bandwidth for old data instead of timely new data).

You'd have to write your own TCP stack, which is non-trivial. If
you're going to write your own protocol it's easier to layer it on top
of UDP than write a bastardised TCP with all the additional complexity
that brings which you don't need or want.

A decent stateful firewall can deal with reciprocal UDP streams
anyway. Besides, for a student project I don't think compatibility
with the widest range of firewalls is a major concern.

I'm not sure if this will work. It's been a while since I got this deep
into protocol stacks:

Negotiate a connection between two peers using TCP. This will allow the
firewall to recognize the connection between the IP:port pairs for that
connection. Then, have each application switch to UDP using the same
IP:port numbers without closing the TCP socket (until the end of the
conversation). Unless the firewall examines the TCP headers, it will
pass both protocols back and forth and remap IPs (for private networks).

Wim Lewis · Jun 29, 2004

I am planning on designing an IP phone for a student project. I would
like some advice on details that I need to take into consideration. [...]
4) Is UDP a reliable enough transmit method? Should I add a 100ms
buffer for frame delays? Should I repeat the last frame if a frame is
omitted?

"Reliable enough" depends on the specific network that's carrying
the IP traffic. As others have pointed out, if you're just going over
one ethernet segment, it's pretty reliable; if you're transmitting to
another continent, expect to have packets dropped regularly.

Rather than repeating the last piece of sound if you have a gap
in the data, it's probably better to fill it with silence. This
is less distracting to a human listener (especially if you have
a whole bunch of gaps in a row, i.e., the transmitter has stopped).

In addition to lost packets, you might also have to deal with:
- packets arriving out of order
- packets getting duplicated and arriving twice (this is rare)
- the transmitter and receiver's sample clocks being of slightly
different frequencies
- the network delay between the transmitter and the receiver
can change over time

The solutions to these are left as an exercise to the student,
but you'll find lots of discussion about it on the net. In general,
you end up having a tradeoff between solving these problems and
introducing more end-to-end delay. The longer the delay is, the
harder it is to hold a conversation over the link. Even fairly short
delays can have a subliminal effect on how the person at one end
perceives the other person's emotions.
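As a starting point, here is a minimal sketch of a receive-side buffer that reorders packets by sequence number, drops duplicates and late arrivals, and fills gaps with silence. The class and its parameters are invented for illustration; it assumes 20 ms frames of G.711 u-law (where 0xFF encodes silence) and ignores sequence-number wraparound and clock drift.

```python
ULAW_SILENCE = 0xFF          # u-law code for zero amplitude
FRAME_BYTES = 160            # 20 ms at 8 kHz, one byte per sample (assumed)
SILENCE = bytes([ULAW_SILENCE]) * FRAME_BYTES

class JitterBuffer:
    def __init__(self, depth_frames=2):
        self.depth = depth_frames   # frames to collect before playout starts
        self.frames = {}            # seq -> payload
        self.next_seq = None        # next sequence number to play
        self.started = False

    def push(self, seq, payload):
        """Store a received frame; returns False if it is late or a duplicate."""
        if not self.started:
            # before playout starts, begin from the lowest sequence seen
            self.next_seq = seq if self.next_seq is None else min(self.next_seq, seq)
        if seq < self.next_seq or seq in self.frames:
            return False
        self.frames[seq] = payload
        return True

    def pop(self):
        """Called once per frame period by the playout clock."""
        if not self.started:
            if len(self.frames) < self.depth:
                return SILENCE
            self.started = True
        frame = self.frames.pop(self.next_seq, None)
        self.next_seq += 1
        return frame if frame is not None else SILENCE

if __name__ == "__main__":
    jb = JitterBuffer(depth_frames=2)
    jb.push(2, b"B" * FRAME_BYTES)      # arrives out of order
    jb.push(1, b"A" * FRAME_BYTES)
    for _ in range(3):
        print(jb.pop()[:1])             # third pop is a gap filled with silence
```

The depth setting is exactly the tradeoff described above: a deeper buffer rides out more reordering and delay variation, at the cost of more end-to-end delay.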

(Re TCP --- the thing to realize about TCP is that a TCP packet
(that is, an IP packet carrying a piece of a TCP session) is no
more reliable than a UDP packet. TCP simply notices when a packet
has been dropped and [oversimplification] requests that it be
retransmitted. In the meantime, the receiver doesn't see any data
--- the connection pauses for a bit. This is the right behavior
for, e.g., a file transfer, but it's not what you want for a real-time
application.)
