comp.dsp | frequency detection (goertzel) - comparing detected tone energy vs. total signal energy

hi,

i'm trying to implement the goertzel algorithm to detect a small
number of frequencies in an audio signal (16-bit, 8kHz).

i'm using the approach/code outlined in this article:
http://www.embedded.com/story/OEG20020819S0057

i've seen a post on comp.dsp which suggests to compare the sum of
squares of the samples for the block used in the goertzel iteration
and compare it to the energy of the tone being detected in order to
calculate if the frequency represents the 'majority' of the signal's
energy...however, i'm having trouble making this calculation.

here's the snippet from a posting located at: http://www.dsprelated.com/showmessage/38661/1.php

[snip]
> Parseval's theorem relates the
> time domain and frequency domain energies associated with Fourier
> transforms. So you compute the time domain energy for a block of data -
> i.e., it is the sum of squares of the samples, and then you compare it
> with the energies in the two tones. If the agreement is good, then you
> have pure tones and not some wideband source...
[snip]

however, when i tried to implement this, i get a goertzel result that
is several times larger than the sum of squares of the samples in the
block. ...i was thinking that the goertzel result (relative magnitude
squared, as outlined in the article) divided by the sum of squares for
the block would lead to a value between 0.0 and 1.0.

i followed the source code from the article directly. a few things to
note about it (i'm not sure if this matters):

* the signal is not normalized (does this matter? the signal is 8kHz,
16bit signed values, and i'm operating w/float values when computing
the goertzel, etc.);
* i'm not doing any sort of windowing function on the signal;
* each time i calculate the goertzel, i slide the block of N by N/2
samples (does this matter?);
* i've tested the code with various generated signals of a fixed
frequency (e.g., from 500Hz - 1000Hz, with a duration of 1-2 seconds,
with pre- and post-silence of 1-2 seconds)) and varied my N value
(e.g., from 100 to 1000) and use a target frequency for the goertzel
of the same value i used to generate the signal;

...however, i can't seem to calculate the 'proportion' that the target
frequency's energy is vs. the total signal energy (e.g., a value
between 0.0 and 1.0).

i've played around with other signal energy/power calculations to see
if i could stumble on a result, but they don't see to be much help
when comparing to the goertzel 'relative magnitude squared' result
(though they are helpful for other purposes):

* the teager-kaiser energy calcuation, which i believe is (in pseudo-
code): energy[i] = (signal[i] * signal[i]) - (signal[i + 1] * signal[i
- 1]);

...where 'i' is the index into the signal's array of non-normalized 16-
bit values (stored as floats).

* signal power, which i believe is: power[i] = signal[i] * signal[i];

* a windowed power equation i came across: power[i] = power[i - 1] +
(signal[i] * signal[i]) - (signal[i - windowLength] * signal[i -
windowLength]);

* a 'smoothed power' equation: smoothedPower[i] = (weight *
smoothedPower[i - 1] + ((1 - weight) * power[i]);

any help would be greatly appreciated. sorry if there are novice
questions...

thanks, tom

Reply by Clay ●May 25, 20072007-05-25

On May 24, 7:52 pm, tj <tomjohns...@gmail.com> wrote:
> hi,
>
> i'm trying to implement the goertzel algorithm to detect a small
> number of frequencies in an audio signal (16-bit, 8kHz).
>
> i'm using the approach/code outlined in this article:http://www.embedded.com/story/OEG20020819S0057
>
> i've seen a post on comp.dsp which suggests to compare the sum of
> squares of the samples for the block used in the goertzel iteration
> and compare it to the energy of the tone being detected in order to
> calculate if the frequency represents the 'majority' of the signal's
> energy...however, i'm having trouble making this calculation.
>
> here's the snippet from a posting located at:http://www.dsprelated.com/showmessage/38661/1.php
>
> [snip]> Parseval's theorem relates the
> > time domain and frequency domain energies associated with Fourier
> > transforms. So you compute the time domain energy for a block of data -
> > i.e., it is the sum of squares of the samples, and then you compare it
> > with the energies in the two tones. If the agreement is good, then you
> > have pure tones and not some wideband source...
>
> [snip]
>
> any help would be greatly appreciated. sorry if there are novice
> questions...
>
> thanks, tom

Hello Tom,

Let Et be the sum of the squares of your data in the time domain.
Let Ew be the sum of the magnitudes squared of your data in the
frequency domain.
Let N be the number of samples in your data block.

Then the purity is simply (2*Ew)/(N*Et) (for f != 0 or 1/2 f_sample if
N is even)


For example let your data be xi=cos(2*pi*k*i/N) where i,k are in
[0,1,2,3,...,N-1]

Then Et = sum cos^2(2*pi*k*i/N) = N/2

For the frequency k, Hk=sum(xi*exp(2*pi*k*i/N)) = N/2, Thus Ew=(N/2)^2
= N^2 /4

p=2*((N^2)/4) / N*(N/2) = 1.00000

If the frequency of your data does not have an integral k, then the p
< 1.

I hope this helps,

Clay

Reply by Tim Wescott ●May 26, 20072007-05-26

On Thu, 24 May 2007 16:52:39 -0700, tj wrote:

> hi,
> 
> i'm trying to implement the goertzel algorithm to detect a small
> number of frequencies in an audio signal (16-bit, 8kHz).
> 
> i'm using the approach/code outlined in this article:
> http://www.embedded.com/story/OEG20020819S0057
> 
> i've seen a post on comp.dsp which suggests to compare the sum of
> squares of the samples for the block used in the goertzel iteration
> and compare it to the energy of the tone being detected in order to
> calculate if the frequency represents the 'majority' of the signal's
> energy...however, i'm having trouble making this calculation.
> 
> here's the snippet from a posting located at: http://www.dsprelated.com/showmessage/38661/1.php
> 
> [snip]
>> Parseval's theorem relates the
>> time domain and frequency domain energies associated with Fourier
>> transforms. So you compute the time domain energy for a block of data -
>> i.e., it is the sum of squares of the samples, and then you compare it
>> with the energies in the two tones. If the agreement is good, then you
>> have pure tones and not some wideband source...
> [snip]
> 
> however, when i tried to implement this, i get a goertzel result that
> is several times larger than the sum of squares of the samples in the
> block. ...i was thinking that the goertzel result (relative magnitude
> squared, as outlined in the article) divided by the sum of squares for
> the block would lead to a value between 0.0 and 1.0.

Indeed it should.  Chances are that you have a scaling problem in your
algorithm -- have you tried running a pure tone through it to get the
scaling factor?
> 
> i followed the source code from the article directly. a few things to
> note about it (i'm not sure if this matters):
> 
> * the signal is not normalized (does this matter? the signal is 8kHz,
> 16bit signed values, and i'm operating w/float values when computing the
> goertzel, etc.);

If you're using floats then you don't have to normalize, just keep track
of your magnitudes.

> * i'm not doing any sort of windowing function on the signal; * each
> time i calculate the goertzel, i slide the block of N by N/2 samples
> (does this matter?);

It's not clear what you mean here.  If you do a Goertzel algorithm for a
block of N points, then start out fresh on another block of N points you
should be OK.  If those two blocks should happen to overlap by N/2 that's
OK, too.  If you're somehow using the leftover state from the Goertzel
algorithm on the newly slid block -- then you're in weird math land, and
I'd have to expend effort to figure out if it's OK or not.

> * i've tested the code with various generated signals of a fixed
> frequency (e.g., from 500Hz - 1000Hz, with a duration of 1-2 seconds,
> with pre- and post-silence of 1-2 seconds)) and varied my N value (e.g.,
> from 100 to 1000) and use a target frequency for the goertzel of the
> same value i used to generate the signal;
> 
> ...however, i can't seem to calculate the 'proportion' that the target
> frequency's energy is vs. the total signal energy (e.g., a value between
> 0.0 and 1.0).
> 
> i've played around with other signal energy/power calculations to see if
> i could stumble on a result, but they don't see to be much help when
> comparing to the goertzel 'relative magnitude squared' result (though
> they are helpful for other purposes):
> 
> * the teager-kaiser energy calcuation, which i believe is (in pseudo-
> code): energy[i] = (signal[i] * signal[i]) - (signal[i + 1] * signal[i -
> 1]);
> 
> ...where 'i' is the index into the signal's array of non-normalized 16-
> bit values (stored as floats).
> 
> * signal power, which i believe is: power[i] = signal[i] * signal[i];
> 
> * a windowed power equation i came across: power[i] = power[i - 1] +
> (signal[i] * signal[i]) - (signal[i - windowLength] * signal[i -
> windowLength]);
> 
> * a 'smoothed power' equation: smoothedPower[i] = (weight *
> smoothedPower[i - 1] + ((1 - weight) * power[i]);
> 
> any help would be greatly appreciated. sorry if there are novice
> questions...
> 
> thanks, tom



-- 
Tim Wescott
Control systems and communications consulting
http://www.wescottdesign.com

Need to learn how to apply control theory in your embedded system?
"Applied Control Theory for Embedded Systems" by Tim Wescott
Elsevier/Newnes, http://www.wescottdesign.com/actfes/actfes.html

Reply by tj ●May 26, 20072007-05-26

On May 25, 3:10 pm, Clay <phys...@bellsouth.net> wrote:
> On May 24, 7:52 pm, tj <tomjohns...@gmail.com> wrote:
>
>
>
>
>
> > hi,
>
> > i'm trying to implement the goertzel algorithm to detect a small
> > number of frequencies in an audio signal (16-bit, 8kHz).
>
> > i'm using the approach/code outlined in this article:http://www.embedded.com/story/OEG20020819S0057
>
> > i've seen a post on comp.dsp which suggests to compare the sum of
> > squares of the samples for the block used in the goertzel iteration
> > and compare it to the energy of the tone being detected in order to
> > calculate if the frequency represents the 'majority' of the signal's
> > energy...however, i'm having trouble making this calculation.
>
> > here's the snippet from a posting located at:http://www.dsprelated.com/showmessage/38661/1.php
>
> > [snip]> Parseval's theorem relates the
> > > time domain and frequency domain energies associated with Fourier
> > > transforms. So you compute the time domain energy for a block of data -
> > > i.e., it is the sum of squares of the samples, and then you compare it
> > > with the energies in the two tones. If the agreement is good, then you
> > > have pure tones and not some wideband source...
>
> > [snip]
>
> > any help would be greatly appreciated. sorry if there are novice
> > questions...
>
> > thanks, tom
>
> Hello Tom,
>
> Let Et be the sum of the squares of your data in the time domain.
> Let Ew be the sum of the magnitudes squared of your data in the
> frequency domain.
> Let N be the number of samples in your data block.
>
> Then the purity is simply (2*Ew)/(N*Et) (for f != 0 or 1/2 f_sample if
> N is even)
>
> For example let your data be xi=cos(2*pi*k*i/N) where i,k are in
> [0,1,2,3,...,N-1]
>
> Then Et = sum cos^2(2*pi*k*i/N) = N/2
>
> For the frequency k, Hk=sum(xi*exp(2*pi*k*i/N)) = N/2, Thus Ew=(N/2)^2
> = N^2 /4
>
> p=2*((N^2)/4) / N*(N/2) = 1.00000
>
> If the frequency of your data does not have an integral k, then the p
> < 1.
>
> I hope this helps,
>
> Clay- Hide quoted text -
>
> - Show quoted text -

hi clay,

thanks for the info.

so, if i read your posting correctly, if i am detecting just a single
frequency using the goertzel algorithm and, after initializing the
goertzel algorithm and running it on N samples, i get its magnitude
squared (i.e., Ew), as well as the sum of squares of the block of N
non-normalized samples (i.e., Et) that the goertzel algorithm operated
on, then i should be able to calculated the purity by doing this:

purity = (2*Ew)/(N*Et)

and when you wrote "for f != 0 or 1/2 f_sample if N is even", i read
this as meaning that the 'frequency being detected must be non-zero
and less that 1/2 the sampling rate if N is even.' (if i'm off, please
let me know.)

thanks again for your help!

tom

Reply by tj ●May 26, 20072007-05-26

On May 25, 11:31 pm, Tim Wescott <t...@seemywebsite.com> wrote:
> On Thu, 24 May 2007 16:52:39 -0700, tj wrote:
> > hi,
>
> > i'm trying to implement the goertzel algorithm to detect a small
> > number of frequencies in an audio signal (16-bit, 8kHz).
>
> > i'm using the approach/code outlined in this article:
> >http://www.embedded.com/story/OEG20020819S0057
>
> > i've seen a post on comp.dsp which suggests to compare the sum of
> > squares of the samples for the block used in the goertzel iteration
> > and compare it to the energy of the tone being detected in order to
> > calculate if the frequency represents the 'majority' of the signal's
> > energy...however, i'm having trouble making this calculation.
>
> > here's the snippet from a posting located at:http://www.dsprelated.com/showmessage/38661/1.php
>
> > [snip]
> >> Parseval's theorem relates the
> >> time domain and frequency domain energies associated with Fourier
> >> transforms. So you compute the time domain energy for a block of data -
> >> i.e., it is the sum of squares of the samples, and then you compare it
> >> with the energies in the two tones. If the agreement is good, then you
> >> have pure tones and not some wideband source...
> > [snip]
>
> > however, when i tried to implement this, i get a goertzel result that
> > is several times larger than the sum of squares of the samples in the
> > block. ...i was thinking that the goertzel result (relative magnitude
> > squared, as outlined in the article) divided by the sum of squares for
> > the block would lead to a value between 0.0 and 1.0.
>
> Indeed it should.  Chances are that you have a scaling problem in your
> algorithm -- have you tried running a pure tone through it to get the
> scaling factor?
>
>
>
> > i followed the source code from the article directly. a few things to
> > note about it (i'm not sure if this matters):
>
> > * the signal is not normalized (does this matter? the signal is 8kHz,
> > 16bit signed values, and i'm operating w/float values when computing the
> > goertzel, etc.);
>
> If you're using floats then you don't have to normalize, just keep track
> of your magnitudes.
>
> > * i'm not doing any sort of windowing function on the signal; * each
> > time i calculate the goertzel, i slide the block of N by N/2 samples
> > (does this matter?);
>
> It's not clear what you mean here.  If you do a Goertzel algorithm for a
> block of N points, then start out fresh on another block of N points you
> should be OK.  If those two blocks should happen to overlap by N/2 that's
> OK, too.  If you're somehow using the leftover state from the Goertzel
> algorithm on the newly slid block -- then you're in weird math land, and
> I'd have to expend effort to figure out if it's OK or not.
>
>
>
>
>
> > * i've tested the code with various generated signals of a fixed
> > frequency (e.g., from 500Hz - 1000Hz, with a duration of 1-2 seconds,
> > with pre- and post-silence of 1-2 seconds)) and varied my N value (e.g.,
> > from 100 to 1000) and use a target frequency for the goertzel of the
> > same value i used to generate the signal;
>
> > ...however, i can't seem to calculate the 'proportion' that the target
> > frequency's energy is vs. the total signal energy (e.g., a value between
> > 0.0 and 1.0).
>
> > i've played around with other signal energy/power calculations to see if
> > i could stumble on a result, but they don't see to be much help when
> > comparing to the goertzel 'relative magnitude squared' result (though
> > they are helpful for other purposes):
>
> > * the teager-kaiser energy calcuation, which i believe is (in pseudo-
> > code): energy[i] = (signal[i] * signal[i]) - (signal[i + 1] * signal[i -
> > 1]);
>
> > ...where 'i' is the index into the signal's array of non-normalized 16-
> > bit values (stored as floats).
>
> > * signal power, which i believe is: power[i] = signal[i] * signal[i];
>
> > * a windowed power equation i came across: power[i] = power[i - 1] +
> > (signal[i] * signal[i]) - (signal[i - windowLength] * signal[i -
> > windowLength]);
>
> > * a 'smoothed power' equation: smoothedPower[i] = (weight *
> > smoothedPower[i - 1] + ((1 - weight) * power[i]);
>
> > any help would be greatly appreciated. sorry if there are novice
> > questions...
>
> > thanks, tom
>
> --
> Tim Wescott
> Control systems and communications consultinghttp://www.wescottdesign.com
>
> Need to learn how to apply control theory in your embedded system?
> "Applied Control Theory for Embedded Systems" by Tim Wescott
> Elsevier/Newnes,http://www.wescottdesign.com/actfes/actfes.html- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -

hi tim,

thanks for your response. i'll check my code on a pure tones for the
scaling factor.

also, regarding the 'sliding window' (so to speak), i'm resetting the
goertzel state before running it against the next block of N samples
(which i get by shifting 'forward' by N / 2 samples).

i've read that the size of N affects both the bin width of the
frequency as well as the duration of the signal being detected. the
reason i was thinking this 'sliding window' would be a good idea is if
a short signal starts and ends in consecutive blocks, this sort of
shifting might have a better chance of detecting the signal than by
just resetting/re-running the goertzel with the next N samples.

...i would think that looking at the magnituded squared values of the
goertzel for successive iterations would allow perhaps shorter-
duration signals to be detected when a higher N value is required in
order to minimize the bin width.

does that sound like a reasonable assumption?

thanks for your help!

tom

Reply by Tim Wescott ●May 27, 20072007-05-27

On Sat, 26 May 2007 10:38:11 -0700, tj wrote:

> On May 25, 11:31 pm, Tim Wescott <t...@seemywebsite.com> wrote:
>> On Thu, 24 May 2007 16:52:39 -0700, tj wrote:
>> > hi,
>>
>> > i'm trying to implement the goertzel algorithm to detect a small
>> > number of frequencies in an audio signal (16-bit, 8kHz).
>>
>> > i'm using the approach/code outlined in this article:
>> >http://www.embedded.com/story/OEG20020819S0057
>>
>> > i've seen a post on comp.dsp which suggests to compare the sum of
>> > squares of the samples for the block used in the goertzel iteration
>> > and compare it to the energy of the tone being detected in order to
>> > calculate if the frequency represents the 'majority' of the signal's
>> > energy...however, i'm having trouble making this calculation.
>>
-- snip --
> 
> hi tim,
> 
> thanks for your response. i'll check my code on a pure tones for the
> scaling factor.
> 
> also, regarding the 'sliding window' (so to speak), i'm resetting the
> goertzel state before running it against the next block of N samples
> (which i get by shifting 'forward' by N / 2 samples).

In that case you're effectively looking at overlapping blocks of N points
for your signal -- there's nothing wrong with that.  I was afraid that you
may be munging things together somehow by _not_ resetting the state.
> 
> i've read that the size of N affects both the bin width of the frequency
> as well as the duration of the signal being detected.

More like the size of N affects the bin width, which affects the duration
of signal that can be most effectively detected.  A short signal has it's
energy spread out over a larger portion of the spectrum, and a long
signal has its energy more concentrated; ideally you'd match that with
your filtering algorithm but close is often good enough.

> the reason i was
> thinking this 'sliding window' would be a good idea is if a short signal
> starts and ends in consecutive blocks, this sort of shifting might have
> a better chance of detecting the signal than by just
> resetting/re-running the goertzel with the next N samples.

You are correct.  In the worst case of doing this with no overlap you'd
have 1/2 of your signal in one block and the other half in the following
block -- this could lead you to miss the signal in both.
> 
> ...i would think that looking at the magnituded squared values of the
> goertzel for successive iterations would allow perhaps shorter- duration
> signals to be detected when a higher N value is required in order to
> minimize the bin width.
> 
> does that sound like a reasonable assumption?
> 
Yes, kind of.  Remember that if the signal is really short it'll be spread
out in the spectrum, and the way to catch it would ideally be with a short
length filter.  If you know you have lots of noise that's overlapping the
signal's energy, however, then a narrower filter (i.e. a larger N) becomes
the better one for discriminating between the signal and noise.

-- 
Tim Wescott
Control systems and communications consulting
http://www.wescottdesign.com

Need to learn how to apply control theory in your embedded system?
"Applied Control Theory for Embedded Systems" by Tim Wescott
Elsevier/Newnes, http://www.wescottdesign.com/actfes/actfes.html

frequency detection (goertzel) - comparing detected tone energy vs. total signal energy

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group