DSPRelated.com
Forums

700 bps Speech Vocoder

Started by Unknown September 28, 2015
I recently wanted to use this vocoder in my own project, so I extracted it from the original program. I also separated out the coherent PSK frequency-division multiplex (FDM) modem, which is a neat 7-carrier design with another 7 carriers for diversity reception (HF).

I think the speech vocoder is quite understandable for communications-quality voice. The original is in the FreeDV application (amateur radio).

https://github.com/k5okc/Vocoder700
On Monday, September 28, 2015 at 10:07:29 AM UTC-4, coupay...@gmail.com wrote:
> I recently wanted to use this vocoder in my own project, so I extracted it from the original program. I also separated out the coherent PSK frequency-division multiplex (FDM) modem. Which is a neat 7 carrier design with another 7 carriers for diversity reception (HF).
>
> I think the speech vocoder is quite understandable for communications quality voice. The original is in the FreeDV application (Amateur Radio).
>
> https://github.com/k5okc/Vocoder700
Perhaps you should know that no one uses FFT for pitch estimation in speech nowadays: it is CPU-wasteful, very unreliable, and has horrible time resolution, to name just a few reasons.
I think he tried a lot of different pitch algorithms and chose that path because of background-noise issues, from what I remember. The original author has a lot of historical information in his blog.

http://www.rowetel.com/blog/

I think he's on a different path now, but still in the MATLAB/Octave stage.
On Monday, September 28, 2015 at 11:12:58 AM UTC-4, coupay...@gmail.com wrote:
> I think he tried a lot of different pitch algorithms, and chose that path based on background noise issues. (from what I remember). The original author has a lot of historical information in his blog.
>
> http://www.rowetel.com/blog/
>
> I think he's on a different path now, but still in the matlab/octave stage.
Nice dude, but who pays his mortgage and bills?
On Monday, September 28, 2015 at 11:12:58 AM UTC-4, coupay...@gmail.com wrote:
> I think he tried a lot of different pitch algorithms, and chose that path based on background noise issues. (from what I remember). The original author has a lot of historical information in his blog.
>
> http://www.rowetel.com/blog/
>
> I think he's on a different path now, but still in the matlab/octave stage.
Also, why 700 bps? Strange... usually it's 2400, 1200, 600... He might be better off "borrowing" source code for some "standard" speech codec and "creatively rewriting" it for his purposes.
> Why 700?
It worked out that he needed 27 bits, but he threw in a spare bit to be used as a text bit (slow-speed scrolling text, like a callsign or commands). Since the vocoder has a 40 ms frame (25 Hz), this results in 25 x 28 = 700. The text bit is interesting in that it uses a varicode.
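To make the arithmetic concrete, here is a small Python sketch of the frame budget described above. The 27 + 1 bits and the 40 ms frame come from the post; the varicode table is only an illustrative PSK31-style subset (codewords separated by "00"), and whether the vocoder uses exactly these codewords is an assumption on my part. All function names are hypothetical.

```python
# Frame parameters taken from the post above.
VOICE_BITS_PER_FRAME = 27
TEXT_BITS_PER_FRAME = 1
FRAME_MS = 40


def frames_per_second(frame_ms: int) -> float:
    """Number of frames sent each second for a given frame length."""
    return 1000 / frame_ms


def bit_rate(bits_per_frame: int, frame_ms: int) -> float:
    """Bits per second carried by a stream with this many bits per frame."""
    return bits_per_frame * frames_per_second(frame_ms)


# ASSUMPTION: illustrative PSK31-style varicode subset, not necessarily
# the table the vocoder itself uses. Shorter codewords go to the more
# common characters, and no codeword contains two zeros in a row.
VARICODE = {" ": "1", "e": "11", "t": "101", "o": "111", "a": "1011"}


def varicode_encode(text: str) -> str:
    """Encode text as a bit string, ending each codeword with '00'."""
    return "".join(VARICODE[ch] + "00" for ch in text)


def text_seconds(text: str) -> float:
    """Time to send text over the one-bit-per-frame text channel."""
    text_rate = bit_rate(TEXT_BITS_PER_FRAME, FRAME_MS)
    return len(varicode_encode(text)) / text_rate


total = bit_rate(VOICE_BITS_PER_FRAME + TEXT_BITS_PER_FRAME, FRAME_MS)
print(total)                # total channel rate in bit/s
print(text_seconds("tea"))  # seconds to scroll a short word
```

The text channel only gets 25 bit/s, which is why a variable-length code that keeps common characters short is a reasonable fit for it.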
On 09/28/2015 11:42 PM, angrydude wrote:
> On Monday, September 28, 2015 at 11:12:58 AM UTC-4, coupay...@gmail.com wrote:
>> I think he tried a lot of different pitch algorithms, and chose that path based on background noise issues. (from what I remember). The original author has a lot of historical information in his blog.
>>
>> http://www.rowetel.com/blog/
>>
>> I think he's on a different path now, but still in the matlab/octave stage.
>
> Also, why 700 bps ? Strange... usually its 2400, 1200, 600...
>
> He might be better off "borrowing" source code for some "standard" speech codec and "creatively rewriting" it for his purposes
How could he borrow from other sources, when there is nothing out there that does nearly as well at such low bit rates as his codec?
On Monday, September 28, 2015 at 12:57:34 PM UTC-4, Steve Underwood wrote:
> On 09/28/2015 11:42 PM, angrydude wrote:
> > On Monday, September 28, 2015 at 11:12:58 AM UTC-4, coupay...@gmail.com wrote:
> >> I think he tried a lot of different pitch algorithms, and chose that path based on background noise issues. (from what I remember). The original author has a lot of historical information in his blog.
> >>
> >> http://www.rowetel.com/blog/
> >>
> >> I think he's on a different path now, but still in the matlab/octave stage.
> >
> > Also, why 700 bps ? Strange... usually its 2400, 1200, 600...
> >
> > He might be better off "borrowing" source code for some "standard" speech codec and "creatively rewriting" it for his purposes
> >
> How could he borrow from other sources, when there is nothing out there
> that does nearly as well at such low bit rates as his codec?
Nothing, really? How about the MELPe codec at 600/1200/2400 bps? (No, I am not sending him the source code, but it's in relatively wide distribution.) The original MELP standard is well documented too. There are tons of other well-documented speech coding standards and open-source alternatives as well, at higher bit rates though... He is much better off interpolating between frames of a "standard", well-tested codec (e.g. 1200 or 2400 bps) than developing his own 700 bps codec from scratch, especially given that he uses FFT for pitch estimation... Just doing "objective" subjective testing and comparison of speech intelligibility at such low bit rates is a daunting proposition.
One of the problems with speech vocoders is that the field is heavily infested with lawyers. MELP was used early on for some radio projects, and the lawyers came out like a swarm of yellow jackets.

The original author (he's not responsible for my antics with the code here) made a personal decision to go with Open Source, Open Hardware. He sells a low-bit-rate digital voice modem for radio that is open to the public to do with as they see fit.

Burn your own PC boards, create your own alternative designs with it, etc.

The motto being: "We own the Stack"

Can't get that with patented, proprietary code.

Obviously this is not very interesting to people who just want to buy a chip for $10 and solder it in. It's a different tribe. Sort of like a bunch of hippies, only they are still contributing to society :-)
> ...CPU wasteful
You can get an STM32F4 development board for $15 and never run out of CPU or memory doing speech coding. You don't have to worry about fixed-point math, etc. The only requirement is that the analysis/synthesis code executes in less than 10 ms. We're no longer in the Z-80 days, kids! Blow the lid off your IEEE floating point paint cans...
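For a rough feel of how roomy that 10 ms budget is, here is a back-of-the-envelope Python sketch. The 10 ms figure is from the post; the 168 MHz core clock and the 8 kHz speech sample rate are my assumptions (STM32F4 parts run at various clock speeds, so plug in your own board's numbers). The function names are hypothetical.

```python
# ASSUMPTION: 168 MHz core clock; STM32F4 parts vary, adjust as needed.
ASSUMED_CLOCK_HZ = 168_000_000
# ASSUMPTION: 8 kHz speech sample rate, not stated in the thread.
ASSUMED_SAMPLE_RATE_HZ = 8_000
# Processing deadline quoted in the post.
BUDGET_MS = 10


def cycle_budget(clock_hz: int, budget_ms: int) -> int:
    """CPU cycles available inside a time budget given in milliseconds."""
    return clock_hz * budget_ms // 1000


def cycles_per_sample(clock_hz: int, sample_rate_hz: int) -> int:
    """CPU cycles available per audio sample at full real-time load."""
    return clock_hz // sample_rate_hz


print(cycle_budget(ASSUMED_CLOCK_HZ, BUDGET_MS))
print(cycles_per_sample(ASSUMED_CLOCK_HZ, ASSUMED_SAMPLE_RATE_HZ))
```

Under those assumptions the budget is on the order of a million cycles per deadline, which is the point being made about not needing to squeeze everything into fixed point.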