comp.dsp | Newbie Q: Goertzel threshold testing, speech detection

Hi, 

I'm currently using the Goertzel algorithm to do DTMF detection. 
Currently, I get the magnitude for each DTMF frequency, save the sums
of the magnitudes that correspond to each number, then I assume
whichever sum is the highest must be the most likely DTMF key. 
This works, but seems kludgy.  I do it this way because I'm having a
hard time determining an appropriate threshold value to use to
eliminate frequencies that aren't present (and I'm not even having to
add any robustness against noise yet!).  And although it probably
won't be a problem for my specific implementation, it also seems that
the threshold scale changes pretty dramatically when you move from one
block size to another.  How do you guys traditionally do threshold
testing for DFT coefficients?
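
A minimal sketch of why the threshold scale moves with block size, assuming 8 kHz sampling and a unit-amplitude test tone (both are illustrative assumptions, as are the function names). The raw Goertzel output for a sine of amplitude A over N samples is roughly (A*N/2)^2, so dividing by (N/2)^2 gives a number that stays near A^2 whatever N is:

```python
import math

def goertzel_power(samples, freq_hz, fs_hz):
    """Return |X(f)|^2 for one block using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / fs_hz)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def normalized_power(samples, freq_hz, fs_hz):
    """Scale the result so a full-block sine of amplitude A reads about A**2."""
    n = len(samples)
    return goertzel_power(samples, freq_hz, fs_hz) / (n / 2.0) ** 2

FS = 8000.0  # assumed telephony sample rate
for n in (205, 410):  # two example block sizes
    tone = [math.sin(2.0 * math.pi * 770.0 * i / FS) for i in range(n)]
    # The raw value grows with n squared; the normalized value does not.
    print(n, goertzel_power(tone, 770.0, FS), normalized_power(tone, 770.0, FS))
```

With this scaling, one threshold can be reused across block sizes, although the frequency resolution still changes with N.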

I'm also interested in doing speech detection with Goertzel. I've read
that in addition to getting the 8 coefficients for DTMF detection, you
can also detect the presence of speech with an additional 8. Does
anyone know what the additional 8 frequencies are, and are there any
specific ways I'd have to modify my decision logic to do this (i.e.,
do all eight frequencies have to be present, and what frequencies
would I have to compare against if I were doing relative threshold
testing)?

Thanks in advance,
Zack Angelo

Reply by Jim Thomas, April 12, 2004

Zack Angelo wrote:
> Hi, 
> 
> I'm currently using the Goertzel algorithm to do DTMF detection. 
> Currently, I get the magnitude for each DTMF frequency, save the sums
> of the magnitudes that correspond to each number, then I assume
> whichever sum is the highest must be the most likely DTMF key. 
> This works, but seems kludgy.  I do it this way because I'm having a
> hard time determining an appropriate threshold value to use to
> eliminate frequencies that aren't present (and I'm not even having to
> add any robustness against noise yet!).  And although it probably
> won't be a problem for my specific implementation, it also seems that
> the threshold scale changes pretty dramatically when you move from one
> block size to another.  How do you guys traditionally do threshold
> testing for DFT coefficients?

The way I've seen it done (and done it myself) is by making sure that 
only one row frequency and one column frequency are "present" - 
"present" being defined as "above a threshold."  The ITU specs (Q.23 and 
Q.24; there are two DTMF specs) will tell you what that threshold is 
(probably in dBm), and you'll have to figure out what that translates 
to in your codec (G.711 might help in that regard too, if you're using 
A-law or mu-law).
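
A minimal sketch of that decision rule, assuming the eight per-frequency powers have already been computed and scaled so that a single threshold applies to all of them. The function name and the threshold value are hypothetical; the actual level has to come from the specs and your codec scaling:

```python
ROWS = (697, 770, 852, 941)        # DTMF low-group frequencies in Hz
COLS = (1209, 1336, 1477, 1633)    # DTMF high-group frequencies in Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")

def decode_key(power, threshold):
    """Return the key if exactly one row and one column exceed threshold.

    `power` maps each of the eight frequencies to its measured power.
    Any other combination (none, or several in a group) returns None.
    """
    rows = [f for f in ROWS if power[f] > threshold]
    cols = [f for f in COLS if power[f] > threshold]
    if len(rows) != 1 or len(cols) != 1:
        return None
    return KEYPAD[ROWS.index(rows[0])][COLS.index(cols[0])]

# Usage with made-up powers: only 697 Hz and 1209 Hz are above threshold.
measured = {f: 0.001 for f in ROWS + COLS}
measured[697] = 0.5
measured[1209] = 0.4
print(decode_key(measured, threshold=0.05))
```

Unlike picking the largest sum, this rejects blocks where nothing, or more than one tone per group, is present.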

The DTMF specs also define a value for "twist" - I can't remember the 
value in the spec, but twist means that the two tones that make up the 
DTMF pair must be close in amplitude to one another.
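
A twist check could be sketched as below. The two limits are placeholder parameters, not values taken from the specs, and the function name is hypothetical:

```python
import math

def twist_ok(row_power, col_power,
             max_col_over_row_db=6.0, max_row_over_col_db=6.0):
    """Accept the pair only if the two tone powers are close enough.

    The 6 dB defaults are placeholders; substitute the limits your
    target spec or network requires.
    """
    if row_power <= 0.0 or col_power <= 0.0:
        return False
    # Positive when the column (high-group) tone is the stronger one.
    twist_db = 10.0 * math.log10(col_power / row_power)
    return -max_row_over_col_db <= twist_db <= max_col_over_row_db

print(twist_ok(1.0, 2.0))    # about 3 dB apart
print(twist_ok(1.0, 10.0))   # 10 dB apart
```

Keeping the two directions as separate parameters allows asymmetric limits if the spec you follow asks for them.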

> 
> I'm also interested in doing speech detection with Goertzel. I've read
> that in addition to getting the 8 coefficients for DTMF detection, you
> can also detect the presence of speech with an additional 8. Does
> anyone know what the additional 8 frequencies are, and are there any
> specific ways I'd have to modify my decision logic to do this (i.e.,
> do all eight frequencies have to be present, and what frequencies
> would I have to compare against if I were doing relative threshold
> testing)?

I think the speech detection you're talking about here is really for 
false-DTMF tone rejection.  The app notes from ADI and TI will have you 
check not only for the presence of two DTMF tones (one row, one column), 
but also the absence of their second harmonics.  If the DTMF frequency 
is present and if its second harmonic is also present, it should not be 
considered DTMF because it is more likely speech.  This isn't so much a 
measure of the likelihood of speech as it is a measure of the 
unlikelihood of DTMF.
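
A rough sketch of the second-harmonic test, assuming 8 kHz sampling (so even 2 x 1633 Hz stays below Nyquist). The ratio limit is a made-up placeholder and the function names are hypothetical; the app notes give their own values:

```python
import math

def goertzel_power(samples, freq_hz, fs_hz):
    """Return |X(f)|^2 for one block using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / fs_hz)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def second_harmonic_present(samples, tone_hz, fs_hz=8000.0, max_ratio=0.1):
    """True if the power at 2*tone_hz is large relative to the fundamental.

    max_ratio is an illustrative placeholder, not a spec value.
    """
    fundamental = goertzel_power(samples, tone_hz, fs_hz)
    harmonic = goertzel_power(samples, 2.0 * tone_hz, fs_hz)
    return harmonic > max_ratio * fundamental

FS, N = 8000.0, 205
clean = [math.sin(2 * math.pi * 697.0 * i / FS) for i in range(N)]
rich = [math.sin(2 * math.pi * 697.0 * i / FS)
        + 0.8 * math.sin(2 * math.pi * 1394.0 * i / FS) for i in range(N)]
print(second_harmonic_present(clean, 697.0))
print(second_harmonic_present(rich, 697.0))
```

A detected row/column pair would be discarded when either tone fails this check.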

-- 
Jim Thomas            Principal Applications Engineer  Bittware, Inc
jthomas@bittware.com  http://www.bittware.com          (703) 779-7770
Nothing is ever so bad that it can't get worse. - Calvin

Reply by Steve Underwood, April 13, 2004

Jim Thomas <jthomas@bittware.com> wrote in message news:<107l9h4eupiboaf@corp.supernews.com>...
> Zack Angelo wrote:
> > Hi, 
> > 
> > I'm currently using the Goertzel algorithm to do DTMF detection. 
> > Currently, I get the magnitude for each DTMF frequency, save the sums
> > of the magnitudes that correspond to each number, then I assume
> > whichever sum is the highest must be the most likely DTMF key. 
> > This works, but seems kludgy.  I do it this way because I'm having a
> > hard time determining an appropriate threshold value to use to
> > eliminate frequencies that aren't present (and I'm not even having to
> > add any robustness against noise yet!).  And although it probably
> > won't be a problem for my specific implementation, it also seems that
> > the threshold scale changes pretty dramatically when you move from one
> > block size to another.  How do you guys traditionally do threshold
> > testing for DFT coefficients?
> 
> The way I've seen it done (and done it myself) is by making sure that 
> only one row frequency and one column frequency are "present" - 
> "present" being defined as "above a threshold."  The ITU specs (Q.23 and 
> Q.24; there are two DTMF specs) will tell you what that threshold is 
> (probably in dBm), and you'll have to figure out what that translates 
> to in your codec (G.711 might help in that regard too, if you're using 
> A-law or mu-law).
> 
> The DTMF specs also define a value for "twist" - I can't remember the 
> value in the spec, but twist means that the two tones that make up the 
> DTMF pair must be close in amplitude to one another.

These specs actually vary a bit from country to country. Some
countries believe their phone networks are so bad they need greater
twist tolerance :-)

> > 
> > I'm also interested in doing speech detection with Goertzel. I've read
> > that in addition to getting the 8 coefficients for DTMF detection, you
> > can also detect the presence of speech with an additional 8. Does
> > anyone know what the additional 8 frequencies are, and are there any
> > specific ways I'd have to modify my decision logic to do this (i.e.,
> > do all eight frequencies have to be present, and what frequencies
> > would I have to compare against if I were doing relative threshold
> > testing)?
> 
> I think the speech detection you're talking about here is really for 
> false-DTMF tone rejection.  The app notes from ADI and TI will have you 
> check not only for the presence of two DTMF tones (one row, one column), 
> but also the absence of their second harmonics.  If the DTMF frequency 
> is present and if its second harmonic is also present, it should not be 
> considered DTMF because it is more likely speech.  This isn't so much a 
> measure of the likelihood of speech as it is a measure of the 
> unlikelihood of DTMF.

The 2nd harmonic test is a good way to do things if you are making a
detector for PSTN use, where it must tolerate some of your dial tone
spilled back from the far end. If it's more like an IVR app, with an
echo canceller, a better test is to check that the energy in the row
and column hits is a large percentage of the total signal energy. That
is extremely speech immune.
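
One way to sketch the energy-fraction test, assuming 8 kHz sampling and an unwindowed block. The function names are hypothetical, and the acceptance level you compare the fraction against is something you would have to tune:

```python
import math

def goertzel_power(samples, freq_hz, fs_hz):
    """Return |X(f)|^2 for one block using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / fs_hz)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def dtmf_energy_fraction(samples, row_hz, col_hz, fs_hz=8000.0):
    """Approximate share of the block energy carried by the two tones."""
    n = len(samples)
    total = sum(x * x for x in samples)
    if total == 0.0:
        return 0.0
    # A sine of amplitude A adds about A*A*n/2 to `total` and gives a
    # Goertzel power of about (A*n/2)**2, hence the 2/n conversion.
    tone_energy = (2.0 / n) * (goertzel_power(samples, row_hz, fs_hz)
                               + goertzel_power(samples, col_hz, fs_hz))
    return tone_energy / total

FS, N = 8000.0, 205
dtmf = [math.sin(2 * math.pi * 697.0 * i / FS)
        + math.sin(2 * math.pi * 1209.0 * i / FS) for i in range(N)]
# A strong 400 Hz component stands in for out-of-band speech energy.
mixed = [d + 2.0 * math.sin(2 * math.pi * 400.0 * i / FS)
         for i, d in enumerate(dtmf)]
print(dtmf_energy_fraction(dtmf, 697.0, 1209.0))
print(dtmf_energy_fraction(mixed, 697.0, 1209.0))
```

The fraction sits near 1 for a clean tone pair and drops as other energy appears in the block, which is what makes the test resistant to speech.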

Regards,
Steve
