Maker Pro

Anyone for Mux?


Paul Burridge

Jan 1, 1970
Hi,

In multiplex comms systems, what's the minimum sampling percentage of
plain speech necessary to make it intelligible at the other end?

Thanks,

p.
 

artie

Jan 1, 1970
Paul Burridge said:
Hi,

In multiplex comms systems, what's the minimum sampling percentage of
plain speech necessary to make it intelligible at the other end?

Thanks,

p.

I'll confess to being confused by the question -- *percentage* of plain
speech.

If you use a 2400 Hz bandwidth (used in ham radio SSB, not hi-fi, but it
works), then Nyquist says at least 4800 samples/second. The sampling
percentage goes to zero as the aperture time of the s/h goes to zero.
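
To put numbers on that, here is a minimal sketch. The function names and the example aperture times are my own assumptions, picked only for illustration; the 2400 Hz bandwidth is the figure above.

```python
def nyquist_rate_hz(bandwidth_hz):
    """Minimum sample rate for a baseband signal of the given bandwidth."""
    return 2.0 * bandwidth_hz


def sampling_fraction(aperture_s, sample_rate_hz):
    """Fraction of the time the sample-and-hold is looking at the signal."""
    return aperture_s * sample_rate_hz


rate = nyquist_rate_hz(2400)  # 2400 Hz speech bandwidth, as above

# Hypothetical aperture times, in microseconds.
for aperture_us in (100.0, 10.0, 1.0):
    frac = sampling_fraction(aperture_us * 1e-6, rate)
    print(f"aperture {aperture_us:6.1f} us -> {100 * frac:.3f}% of the time")
```

The shorter the aperture, the smaller the slice of each sample period one talker occupies, which is what leaves room for other channels in a time-division multiplex.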

So, 4800 samples/second, before any games (coding, compression,
modeling, etc). You can make the most of available bandwidth, or
reduce bits/second using coding techniques (mu-law, adpcm), compression
(all sorts of techniques), modeling (12-point LPC is great for one
voice at a time).
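
As an illustration of the coding idea, here is a rough sketch of mu-law companding using the textbook continuous formula with mu = 255. This is not the segmented bit layout that real G.711 codecs use, and the helper names and the uniform quantizer are assumptions made for the comparison.

```python
import math

MU = 255  # the usual value for telephone mu-law


def mulaw_compress(x, mu=MU):
    """Map a sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mulaw_expand(y, mu=MU):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(y, bits=8):
    """Uniform mid-rise quantizer on [-1, 1] with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    idx = min(levels - 1, int((y + 1.0) / step))
    return -1.0 + (idx + 0.5) * step


quiet = 0.01  # a quiet sample, where uniform quantization does worst
uniform_err = abs(quantize(quiet) - quiet)
mulaw_err = abs(mulaw_expand(quantize(mulaw_compress(quiet))) - quiet)
print(f"uniform 8-bit error: {uniform_err:.6f}")
print(f"mu-law  8-bit error: {mulaw_err:.6f}")
```

Same 8 bits per sample either way; the companded version just spends more of its levels on the quiet part of the range, where speech spends most of its time.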

War story:

Once upon a time, a long time ago, the (DARPA) Network Speech
Compression project investigated sending speech over the (then
ARPA)net. We used a 12-point LPC (linear predictive coding) running
on a very advanced DEC PDP 11/45 with a butterfly box attached to do
the hard work. The LPC modeled the vocal tract (see, for example,
Markel and Gray). The digitized audio waveform (recorded in a very
quiet sound booth, more on that shortly) goes into the model. Rather
than sending the digitized audio over the net, the model parameters are
transmitted, resulting in an amazing drop in data rate. The LPC
modeling approach was also used by the TI Speak & Spell toys; a LOT of
preprocessing gives you a very low data rate. Reconstruction is fairly
easy.
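
For the curious, here is a minimal sketch of the analysis side. It is the textbook autocorrelation method with the Levinson-Durbin recursion, not the project's actual code. The frame length, the 8 kHz rate it implies, and the synthetic "speech" (seeded noise through a two-pole resonator) are all assumptions for illustration.

```python
import random

ORDER = 12       # "12-point" LPC, as described above
FRAME_LEN = 240  # assumed frame length (30 ms at an assumed 8 kHz)


def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Find a[1..order] so that x[n] ~ sum(a[k] * x[n-k]).

    Returns (coefficients, residual_energy).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this stage
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def synthetic_frame(n, seed=0):
    """Stand-in for speech: seeded noise through a 2-pole resonator."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(1.3 * x[-1] - 0.8 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]


frame = synthetic_frame(FRAME_LEN)
coeffs, residual = levinson_durbin(autocorr(frame, ORDER), ORDER)
print(f"{len(frame)} samples in, {len(coeffs)} coefficients out")
print("first two coefficients:", [round(c, 2) for c in coeffs[:2]])
```

A real vocoder also has to send pitch, voicing and gain for each frame, and it quantizes everything, but the shape of the saving is the same: a couple of hundred samples go in and a dozen or so slowly changing numbers come out.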

But there's a price to be paid, and it isn't just in the amount of
preprocessing required (which is trivial today, but was a pain in the ass
in the '70s). Remember, the LPC models the vocal tract. So if someone
comes into the booth and knocks a book into a metal trashcan, or slams
the door while the system is live, what the listener on the other end
gets is that sound -- as imitated by the human vocal tract! Two people
speaking at the same time? Comes out as one vocal tract trying to
create that sound (and not doing too well at it)!

Conference calls? No "easy" way to sum multiple LPC datastreams. Oh,
you can reconstruct them all to audio samples, and then sum, but you
can't re-encode and transmit as one LPC -- as all those sounds are
going to be modeled coming from *one* vocal tract. And when you
reconstruct multiple LPC streams and sum, it gets very confusing, since
a lot of what makes individual voices individual gets lost through the
LPC filtering process.

Still, it was a fun project and kept a bunch of us off the streets.
 

Joerg

Jan 1, 1970
Hi Paul,
In multiplex comms systems, what's the minimum sampling percentage of
plain speech necessary to make it intelligible at the other end?

Probably a cell phone network designer can answer that best since in
that business profits are inversely proportional to the number of data
packets sent per second for each connection. But they sometimes seem to
believe it can be even lower than the percentage required for
'intelligible'. I had conversations where I am certain I wouldn't have
understood much had I not known the person at the other end and the way
he or she usually speaks. For example, once I didn't even recognize who
was on the phone or what was said to me but after she handed the cell
phone to my wife I could understand. My wife had to repeat it all and
that way the provider could clock another minute. Ka-ching.

Anyway, it might make sense to use Google to obtain some numbers for
TDMA or GSM networks.

Regards, Joerg
 

Clarence

Jan 1, 1970
Joerg said:

8k samples/second at 2 bits each will give reasonable speech. As little as 8k
samples/second at one bit each will work, but gets pretty rough. Lower sample
rates will work, but limit the bandwidth even more than fewer bits. Normal
(ISDN) digital phones use 64 kb/s.
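
The arithmetic behind those figures, as a small sketch. The function names are made up; the 64 kb/s ISDN figure corresponds to 8k samples/second at 8 bits each.

```python
def bit_rate_kbps(sample_rate_hz, bits_per_sample):
    """Raw bit rate of uncompressed samples, in kb/s."""
    return sample_rate_hz * bits_per_sample / 1000.0


def max_audio_bandwidth_hz(sample_rate_hz):
    """Highest audio frequency the sample rate can carry (Nyquist)."""
    return sample_rate_hz / 2.0


for label, rate, bits in (
    ("1 bit/sample", 8000, 1),
    ("2 bits/sample", 8000, 2),
    ("8 bits/sample (ISDN)", 8000, 8),
):
    print(f"{label:22s} {bit_rate_kbps(rate, bits):5.1f} kb/s, "
          f"audio up to {max_audio_bandwidth_hz(rate):.0f} Hz")
```

Cutting bits per sample leaves the audio bandwidth alone and just adds quantization noise, while cutting the sample rate lowers the top frequency directly.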
 

Chris Carlen

Jan 1, 1970
Joerg said:
Hi Paul,


Probably a cell phone network designer can answer that best since in
that business profits are inversely proportional to the number of data
packets sent per second for each connection. But they sometimes seem to
believe it can be even lower than the percentage required for
'intelligible'. I had conversations where I am certain I wouldn't have
understood much had I not known the person at the other end and the way
he or she usually speaks. For example, once I didn't even recognize who
was on the phone or what was said to me but after she handed the cell
phone to my wife I could understand. My wife had to repeat it all and
that way the provider could clock another minute. Ka-ching.

Anyway, it might make sense to use Google to obtain some numbers for
TDMA or GSM networks.

Regards, Joerg


Yeah, cell phones really rot, IMHO. There might be a market for
"medium-fi" cell phones doing a solid 8ksps * 8bits.

Oh what the heck, let's start a 48ksps * 16 bits stereo cell phone network!


--
_______________________________________________________________________
Christopher R. Carlen
Principal Laser/Optical Technologist
Sandia National Laboratories CA USA
[email protected] -- NOTE: Remove "BOGUS" from email address to reply.
 

Joerg

Jan 1, 1970
Hi Chris,
Yeah, cell phones really rot, IMHO. There might be a market for
"medium-fi" cell phones doing a solid 8ksps * 8bits.


I believe it has gotten worse over the last few years. I used to never
have any problems understanding someone on a cell phone or even
recognizing the voice. Nowadays, when someone doesn't start with their
name but says "Hello Joerg" I often have to ask back "Who is it?".

Oh what the heck, let's start a 48ksps * 16 bits stereo cell phone
network!


Or three different plans. Basic: You'll be able to notice that someone
is telling you something. Intermediate: You can understand more than
50% or your money back. Premium: You can actually experience the voice
quality of a land line. Say, for just $9.95 more a month...

Just FYI: Your email address shows up non-munged at the bottom of your posts. That could invite spam crawlers to pick it up.


Regards, Joerg
 

Charles Schuler

Jan 1, 1970
My age is 63 and I find cell phones dicey, at best. Hearing tests show me
to be sort of average in acuity for my age. However, there are "notched
frequencies" that vary from one individual to another. Those darned hairs
in the transducer in our inner ear are sharply tuned! I'm guessing the
current number of bits is OK, but a higher sampling rate is in order. I'd
think a user boost in the frequencies around 2 to 3 kHz might be very
helpful for older folks.
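
A rough sketch of what such a boost could look like: a single peaking-filter biquad using the widely circulated Audio EQ Cookbook formulas. The 2.5 kHz center, 6 dB gain, Q of 1 and 8 kHz sample rate are all assumptions picked for illustration, not a recommendation.

```python
import cmath
import math


def peaking_eq(f0_hz, gain_db, q, fs_hz):
    """Biquad coefficients (b, a) for a peaking boost, with a[0] == 1."""
    amp = 10 ** (gain_db / 40.0)
    w0 = 2 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    return [c / a[0] for c in b], [c / a[0] for c in a]


def gain_db_at(b, a, f_hz, fs_hz):
    """Magnitude response of the biquad at one frequency, in dB."""
    zi = cmath.exp(-2j * math.pi * f_hz / fs_hz)  # z**-1 on the unit circle
    h = (b[0] + b[1] * zi + b[2] * zi * zi) / (a[0] + a[1] * zi + a[2] * zi * zi)
    return 20 * math.log10(abs(h))


def run_filter(b, a, samples):
    """Apply the biquad to a list of samples (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


FS = 8000.0  # assumed telephone-style sample rate
b, a = peaking_eq(2500.0, 6.0, 1.0, FS)
for f in (300.0, 1000.0, 2500.0, 3400.0):
    print(f"{f:6.0f} Hz: {gain_db_at(b, a, f, FS):+.2f} dB")
```

The filter leaves the low end alone and lifts the band around the center frequency, so a handset could in principle offer it as a user setting.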
 

Paul Burridge

Jan 1, 1970
My age is 63 and I find cell phones dicey, at best. Hearing tests show me
to be sort of average in acuity for my age. However, there are "notched
frequencies" that vary from one individual to another. Those darned hairs
in the transducer in our inner ear are sharply tuned! I'm guessing the
current number of bits is OK, but a higher sampling rate is in order. I'd
think a user boost in the frequencies around 2 to 3 kHz might be very
helpful for older folks.

Cheer up, Charles. You may not be able to hear what your interlocutor
is saying, but you'll be able to see him/her in much finer detail. All
the extra available bandwidth is going into the video side of it! :-/
 

Charles Schuler

Jan 1, 1970
Cheer up, Charles. You may not be able to hear what your interlocutor
is saying, but you'll be able to see him/her in much finer detail. All
the extra available bandwidth is going into the video side of it! :-/

Yeah, well I've known for several years now that most markets are driven by
youth and that youth is wasted on the young. Cheers.
 