
Voice Compression

Discussion in 'Electronic Design' started by Kunal, Jun 17, 2004.

  1. Kunal

    Kunal Guest

    I am building a digital dictation machine and need information on
    chips that can compress voice. I found the Voice Band Audio
    Processors on the TI website, which seem to be the right choice, but I'm
    not sure.
    These chips don't have an external memory interface, though, and I need
    to store the voice data on an external flash chip. The TMS320VC55x series
    does have an external memory interface, but then the voice compression
    needs to be coded in. Where can I get the program for that?
    Also, what is the latest technique for compressing voice? I know of
    a-law, u-law, and CELP. Are there any others out there?

    Kunal
     
  2. Tim Wescott

    Tim Wescott Guest

    A-law and mu-law are quite primitive, having been invented in the 1940s
    or 1950s. ADPCM is much better, and may be your best trade-off between
    design time and product cost. The best voice compression algorithms
    from the standpoint of compression ratio are the model-based ones used
    for digital cell phones. These algorithms can really crunch the data
    down, but they require lots of processing power and possibly engineering
    effort as well.
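
    For readers unfamiliar with the companding schemes mentioned above, here
    is a minimal Python sketch of the continuous mu-law curve. It assumes
    mu = 255 and a simple signed 8-bit quantizer, and it is not the bit-exact
    segmented encoding a telephony codec uses. The function names are
    illustrative only.

```python
import math

MU = 255.0  # assumed companding constant; the thread does not specify one


def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto the mu-law curve, also in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode_8bit(x: float) -> int:
    """Compress, then quantize to a signed code in -127..127."""
    return round(mulaw_compress(x) * 127)


def decode_8bit(code: int) -> float:
    """Undo the quantizer scaling, then expand back to a linear sample."""
    return mulaw_expand(code / 127)


def linear_8bit_roundtrip(x: float) -> float:
    """Plain 8-bit linear quantization, for comparison with the companded path."""
    return round(x * 127) / 127
```

    The point of the logarithmic curve is that quiet samples keep a roughly
    constant relative error, whereas plain 8-bit linear quantization loses
    most of its precision at low levels.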

    Post the algorithm question on comp.dsp; there are a lot of cell-phone
    people over there.

    If you're going to use a DSP and code the algorithm you can use just
    about any DSP; there may be more cost-effective ones than the TI
    processor, unless there's an algorithm that's already written
    specifically for that chip.
     
  3. Ian Stirling

    Ian Stirling Guest

    But then, do you care?
    If flash chips are $0.17/megabyte, then simple mu/a-law at 8 bits will
    get you around 20 minutes of high-quality speech for a couple of dollars.
    And the DSP needed is really, really minimal.
    I'm assuming that the price of small flash chips is proportional to
    the cost of larger CompactFlash ones.
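
    The arithmetic behind that estimate can be checked with a short sketch.
    The 8 kHz sample rate is an assumption (the post does not state one); the
    price per megabyte is the figure quoted above.

```python
SAMPLE_RATE_HZ = 8000   # assumed telephone-quality rate, not stated in the post
BYTES_PER_SAMPLE = 1    # 8-bit mu-law or a-law
PRICE_PER_MB = 0.17     # dollars per megabyte, as quoted in the post


def storage_bytes(minutes: float) -> float:
    """Bytes needed to store the given duration of companded speech."""
    return minutes * 60 * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE


def flash_cost(minutes: float) -> float:
    """Flash cost in dollars, taking one megabyte as 10**6 bytes."""
    return storage_bytes(minutes) / 1e6 * PRICE_PER_MB
```

    Under these assumptions, 20 minutes comes to 9.6 MB, or about $1.63 of
    flash, which matches "a couple of dollars".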
     
  4. Tim Auton

    Tim Auton Guest

    What a ludicrous assumption.


    Tim
     
  5. How times have changed. I remember when the Intel 28F256s were over
    $80 each. I fried about $600 worth in 2 days by not getting the
    write timing loops tight enough.

    -Chaud Lapin-
     
  6. Michael

    Michael Guest


    To me, "voice compression" conjures up "audio compression", i.e.
    near-constant loudness. Data compression, on the other hand, means to
    me making a digitized audio file smaller. Is the object of your quest
    near-constant loudness of your audio files or, rather, smaller data
    files? Maybe it's implied in the mention of "a-law" and "u-law", but
    those terms are new to me.
     
  7. Ian Stirling

    Ian Stirling Guest

    Probably. I was hoping someone would post more accurate figures.
     
  8. Speex, I said! <speex.org> It is very well documented and easy to
    embed on various platforms.

    Habib.
     
  9. onestone

    onestone Guest

    Unfortunately ADPCM gives pretty shitty quality.

    As for the best voice compression algorithms requiring lots of
    processing power: not as much as you might think. When I last worked
    commercially on voice compression, GSM (whatever version it was at the
    time) would run on an ADSP 2105 DSP with just 1k of code and 0.5k of
    data space.

    The DSP chip suggested by the OP is total overkill in my opinion.
    Although large memory modules such as SD cards are fairly cheap, the
    price doesn't scale down to smaller parts. It really depends on how much
    speech you want to store. Given the cost of DSP development tools, I'd
    personally look at something simpler. For example, you can run many
    simple algorithms on a low-end, low-cost micro. I often embed speech
    record/playback into an MSP430F149. You need about 2k per second of
    speech for decent quality. You could use an external chip, but you would
    need to make sure that you can write and read the data in real time.

    A very simple algorithm for the MSP430 can be found on the Yahoo MSP430
    group FTP site. Compress and decompress require a total of 51
    instructions. Whoops, it's gone, so here it is inline. This runs on the
    MSP430F149 with an 8 MHz clock and gives very tolerable results. You can
    modify it for just about any similarly capable micro. If you are
    interested, email me privately and I'll send you the documentation for
    it as well. This routine does NOT include the FLASH write routines.

    DIV2 EQU 8000H ;MULTIPLIERS TO EQUATE TO A DIVIDE
    DIV4 EQU 4000H
    DIV8 EQU 2000H
    DIV16 EQU 1000H
    DIV32 EQU 0800H
    DIV64 EQU 0400H
    MAXAD EQU 4096 ;EQUIVALENT TO THE MAXIMUM A/D VALUE DIVIDED
    MAX2 EQU 2048 ;BY THE BIT TIME CONSTANT (2-64)
    MAX4 EQU 1024
    MAX8 EQU 512
    MAX16 EQU 256
    MAX32 EQU 128
    MAX64 EQU 64
    SAMPLERATE EQU 20000
    PERIOD EQU 8000000/SAMPLERATE

    ENCODE:
    BIC #0001H,R11 ;PRE-CLEAR RESULT REGISTER
    MOV #MAX8,R9
    MOV R8,R10
    MOV R8,&MPY
    MOV #DIV8,&OP2 ;RESULT HOLDS MODEL/8
    BIT #8000H,&RESLO ;CHECK FOR ROUNDING
    JZ SQUARE
    INC &RESHI
    SQUARE:
    SUB &RESHI,R9 ;512-MODEL/8
    ADD R8,R9 ;(4096-MODEL)/8 + MODEL
    SUB &RESHI,R10 ;MODEL-MODEL/8
    SUB R9,R7 ;GET DIFFERENCE BETWEEN POS AND SAMPLE
    JC NOTMI ;IF CARRY IS SET RESULT WAS POSITIVE
    INV R7 ;NEGATE (INVERT, THEN ADD 1) TO GET ABSOLUTE VALUE
    INC R7
    NOTMI:
    SUB R10,R6 ;CALCULATE DIFFERENCE TO NEGATIVE PREDICTION
    JC NOTNEG ;RESULT WAS POSITIVE
    INV R6 ;2'S COMPLEMENT (INVERT, THEN ADD 1) TO GET ABSOLUTE VALUE
    INC R6
    NOTNEG:
    CMP R6,R7 ;COMPARE R7 TO R6
    JLO ISPOS ;IF R7 < R6 CLOSEST GUESS IS POSITIVE PREDICTION
    RLA R11 ;USE WORD STORAGE FOR COMPACTNESS
    MOV R10,R8
    RET
    ISPOS:
    BIS #0001H,R11
    RLA R11
    MOV R9,R8
    RET

    /************************************************************************

    VOICE CODEC BASED UPON ROMAN BLACK'S SINGLE BIT STUFF

    THIS IS THE PLAYBACK ROUTINE

    ON ENTRY:-
    R5 CONTAINS A POINTER TO THE CURRENT WORD IN THE DATA
    STORE
    R6 CONTAINS THE WORD CURRENTLY BEING OUTPUT
    R7 CONTAINS THE BITCOUNT OF SHIFTED BITS

    ************************************************************************/
    DATA_START EQU 04000H ;START OF STORED DATA IN MEMORY
    DATA_END EQU 0FE00H ;END OF STORED DATA

    MOV #DATA_START,R5 ;INITIALISE DATA POINTER
    BIS #CCIE,&CCTLB0 ;INITIALISE TB0 COMPARE INTERRUPT
    ADD #PERIOD,&CCRB0 ;SET TIMING INTERVAL
    BIC #CCIFG,&CCTLB0
    SPEAK:
    JMP SPEAK

    TB0_ISR:
    PUSH R8
    ADD #PERIOD,&CCRB0 ;SET NEXT INT TIME (TIMER TICKS PER SAMPLE)
    TST R7 ;IS THIS FIRST SHIFT
    JNZ GOT_SINE ;NO, SO NO NEED TO LOAD
    MOV @R5+,R6 ;LOAD NEXT WORD
    GOT_SINE:
    RLC R6 ;GET THE CURRENT MSB
    JC COSINE
    BIC.B #BUZZER,&P4OUT ;SET THE PORT PIN
    JMP DOSINE
    COSINE:
    BIS.B #BUZZER,&P4OUT ;AS REQUIRED
    DOSINE:
    INC R7 ;INCREMENT BIT COUNT
    AND #0FH,R7 ;BOUND LIMIT THE BIT COUNTER TO 16 BITS
    JNZ NOTBYTE ;STILL SHIFTING SAME BYTE
    CMP #DATA_END,R5 ;LAST BYTE?
    JNZ NOTBYTE
    BIC #CCIE,&CCTLB0 ;IF SO STOP PLAYBACK (DISABLE THE INTERRUPT)
    BIC.B #BUZZER,&P4OUT
    NOTBYTE:
    POP R8 ;RESTORE R8 PUSHED ON ENTRY
    RETI
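
    For anyone who does not read MSP430 assembly, here is a rough Python
    model of what the ENCODE routine above appears to do, going by its
    comments. Each sample is compared against two predictions, one moving the
    model an eighth of the way towards full scale and one decaying it by an
    eighth, and a single bit records which was closer. The starting model
    value of 2048, the software decoder (the listing simply drives a port
    pin), and all function names are assumptions for illustration.

```python
import math

FULL_SCALE = 4096  # MAXAD in the listing (12-bit A/D range)


def predictions(model: int) -> tuple[int, int]:
    """Return the (up, down) candidates for the next model value."""
    m8 = (model + 4) >> 3                # MODEL/8, rounded as the listing does
    up = model + (FULL_SCALE >> 3) - m8  # (4096 - MODEL)/8 + MODEL
    down = model - m8                    # MODEL - MODEL/8
    return up, down


def encode(samples, model: int = 2048):
    """One bit per sample: 1 if the upward prediction is closer, else 0."""
    bits = []
    for s in samples:
        up, down = predictions(model)
        if abs(s - up) < abs(s - down):  # ties go to the downward prediction
            bits.append(1)
            model = up
        else:
            bits.append(0)
            model = down
    return bits


def decode(bits, model: int = 2048):
    """Replay the bits through the same model to rebuild the waveform."""
    out = []
    for b in bits:
        up, down = predictions(model)
        model = up if b else down
        out.append(model)
    return out


# A 200 Hz tone sampled at the listing's 20 kHz rate, centred on mid-scale.
tone = [round(2048 + 1000 * math.sin(2 * math.pi * 200 * n / 20000))
        for n in range(400)]
rebuilt = decode(encode(tone))
worst = max(abs(a - b) for a, b in zip(tone, rebuilt))
```

    At one bit per sample and 20000 samples per second, this works out to
    2500 bytes per second, in line with the "about 2k per second" figure,
    and the 0x4000 to 0xFE00 data store in the listing would hold roughly
    19 seconds of speech.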

    Cheers

    Al
     
  10. Nico Coesel

    Nico Coesel Guest

    Dump the DSPs. Get a PC (or compatible) platform and compress the data
    using readily available software. For instance, Windows comes with all
    the major compression algorithms for free (or for a small extra charge),
    which saves you the hassle of dealing with several licensors.
    A Pentium 4 processor can easily do heavy compression on more than 100
    channels with a far from optimised algorithm, while the fastest
    DSPs get stuck at 20 to 30 channels max with an optimised algorithm.
     
  11. onestone

    onestone Guest

    48 channels on an ADSP2105 in 1993. Full duplex system as well.

    Al
     