Forums

frequency detection (goertzel) - comparing detected tone energy vs. total signal energy

Started by tj May 24, 2007
hi,

i'm trying to implement the goertzel algorithm to detect a small
number of frequencies in an audio signal (16-bit, 8kHz).

i'm using the approach/code outlined in this article:
http://www.embedded.com/story/OEG20020819S0057

i've seen a post on comp.dsp which suggests to compare the sum of
squares of the samples for the block used in the goertzel iteration
and compare it to the energy of the tone being detected in order to
calculate if the frequency represents the 'majority' of the signal's
energy...however, i'm having trouble making this calculation.

here's the snippet from a posting located at: http://www.dsprelated.com/showmessage/38661/1.php

[snip]
> Parseval's theorem relates the > time domain and frequency domain energies associated with Fourier > transforms. So you compute the time domain energy for a block of data - > i.e., it is the sum of squares of the samples, and then you compare it > with the energies in the two tones. If the agreement is good, then you > have pure tones and not some wideband source...
[snip] however, when i tried to implement this, i get a goertzel result that is several times larger than the sum of squares of the samples in the block. ...i was thinking that the goertzel result (relative magnitude squared, as outlined in the article) divided by the sum of squares for the block would lead to a value between 0.0 and 1.0. i followed the source code from the article directly. a few things to note about it (i'm not sure if this matters): * the signal is not normalized (does this matter? the signal is 8kHz, 16bit signed values, and i'm operating w/float values when computing the goertzel, etc.); * i'm not doing any sort of windowing function on the signal; * each time i calculate the goertzel, i slide the block of N by N/2 samples (does this matter?); * i've tested the code with various generated signals of a fixed frequency (e.g., from 500Hz - 1000Hz, with a duration of 1-2 seconds, with pre- and post-silence of 1-2 seconds)) and varied my N value (e.g., from 100 to 1000) and use a target frequency for the goertzel of the same value i used to generate the signal; ...however, i can't seem to calculate the 'proportion' that the target frequency's energy is vs. the total signal energy (e.g., a value between 0.0 and 1.0). i've played around with other signal energy/power calculations to see if i could stumble on a result, but they don't see to be much help when comparing to the goertzel 'relative magnitude squared' result (though they are helpful for other purposes): * the teager-kaiser energy calcuation, which i believe is (in pseudo- code): energy[i] = (signal[i] * signal[i]) - (signal[i + 1] * signal[i - 1]); ...where 'i' is the index into the signal's array of non-normalized 16- bit values (stored as floats). * signal power, which i believe is: power[i] = signal[i] * signal[i]; * a windowed power equation i came across: power[i] = power[i - 1] + (signal[i] * signal[i]) - (signal[i - windowLength] * signal[i - windowLength]); * a 'smoothed power' equation: smoothedPower[i] = (weight * smoothedPower[i - 1] + ((1 - weight) * power[i]); any help would be greatly appreciated. sorry if there are novice questions... thanks, tom
On May 24, 7:52 pm, tj <tomjohns...@gmail.com> wrote:
> hi, > > i'm trying to implement the goertzel algorithm to detect a small > number of frequencies in an audio signal (16-bit, 8kHz). > > i'm using the approach/code outlined in this article:http://www.embedded.com/story/OEG20020819S0057 > > i've seen a post on comp.dsp which suggests to compare the sum of > squares of the samples for the block used in the goertzel iteration > and compare it to the energy of the tone being detected in order to > calculate if the frequency represents the 'majority' of the signal's > energy...however, i'm having trouble making this calculation. > > here's the snippet from a posting located at:http://www.dsprelated.com/showmessage/38661/1.php > > [snip]> Parseval's theorem relates the > > time domain and frequency domain energies associated with Fourier > > transforms. So you compute the time domain energy for a block of data - > > i.e., it is the sum of squares of the samples, and then you compare it > > with the energies in the two tones. If the agreement is good, then you > > have pure tones and not some wideband source... > > [snip] > > any help would be greatly appreciated. sorry if there are novice > questions... > > thanks, tom
Hello Tom, Let Et be the sum of the squares of your data in the time domain. Let Ew be the sum of the magnitudes squared of your data in the frequency domain. Let N be the number of samples in your data block. Then the purity is simply (2*Ew)/(N*Et) (for f != 0 or 1/2 f_sample if N is even) For example let your data be xi=cos(2*pi*k*i/N) where i,k are in [0,1,2,3,...,N-1] Then Et = sum cos^2(2*pi*k*i/N) = N/2 For the frequency k, Hk=sum(xi*exp(2*pi*k*i/N)) = N/2, Thus Ew=(N/2)^2 = N^2 /4 p=2*((N^2)/4) / N*(N/2) = 1.00000 If the frequency of your data does not have an integral k, then the p < 1. I hope this helps, Clay
On Thu, 24 May 2007 16:52:39 -0700, tj wrote:

> hi, > > i'm trying to implement the goertzel algorithm to detect a small > number of frequencies in an audio signal (16-bit, 8kHz). > > i'm using the approach/code outlined in this article: > http://www.embedded.com/story/OEG20020819S0057 > > i've seen a post on comp.dsp which suggests to compare the sum of > squares of the samples for the block used in the goertzel iteration > and compare it to the energy of the tone being detected in order to > calculate if the frequency represents the 'majority' of the signal's > energy...however, i'm having trouble making this calculation. > > here's the snippet from a posting located at: http://www.dsprelated.com/showmessage/38661/1.php > > [snip] >> Parseval's theorem relates the >> time domain and frequency domain energies associated with Fourier >> transforms. So you compute the time domain energy for a block of data - >> i.e., it is the sum of squares of the samples, and then you compare it >> with the energies in the two tones. If the agreement is good, then you >> have pure tones and not some wideband source... > [snip] > > however, when i tried to implement this, i get a goertzel result that > is several times larger than the sum of squares of the samples in the > block. ...i was thinking that the goertzel result (relative magnitude > squared, as outlined in the article) divided by the sum of squares for > the block would lead to a value between 0.0 and 1.0.
Indeed it should. Chances are that you have a scaling problem in your algorithm -- have you tried running a pure tone through it to get the scaling factor?
> > i followed the source code from the article directly. a few things to > note about it (i'm not sure if this matters): > > * the signal is not normalized (does this matter? the signal is 8kHz, > 16bit signed values, and i'm operating w/float values when computing the > goertzel, etc.);
If you're using floats then you don't have to normalize, just keep track of your magnitudes.
> * i'm not doing any sort of windowing function on the signal; * each > time i calculate the goertzel, i slide the block of N by N/2 samples > (does this matter?);
It's not clear what you mean here. If you do a Goertzel algorithm for a block of N points, then start out fresh on another block of N points you should be OK. If those two blocks should happen to overlap by N/2 that's OK, too. If you're somehow using the leftover state from the Goertzel algorithm on the newly slid block -- then you're in weird math land, and I'd have to expend effort to figure out if it's OK or not.
> * i've tested the code with various generated signals of a fixed > frequency (e.g., from 500Hz - 1000Hz, with a duration of 1-2 seconds, > with pre- and post-silence of 1-2 seconds)) and varied my N value (e.g., > from 100 to 1000) and use a target frequency for the goertzel of the > same value i used to generate the signal; > > ...however, i can't seem to calculate the 'proportion' that the target > frequency's energy is vs. the total signal energy (e.g., a value between > 0.0 and 1.0). > > i've played around with other signal energy/power calculations to see if > i could stumble on a result, but they don't see to be much help when > comparing to the goertzel 'relative magnitude squared' result (though > they are helpful for other purposes): > > * the teager-kaiser energy calcuation, which i believe is (in pseudo- > code): energy[i] = (signal[i] * signal[i]) - (signal[i + 1] * signal[i - > 1]); > > ...where 'i' is the index into the signal's array of non-normalized 16- > bit values (stored as floats). > > * signal power, which i believe is: power[i] = signal[i] * signal[i]; > > * a windowed power equation i came across: power[i] = power[i - 1] + > (signal[i] * signal[i]) - (signal[i - windowLength] * signal[i - > windowLength]); > > * a 'smoothed power' equation: smoothedPower[i] = (weight * > smoothedPower[i - 1] + ((1 - weight) * power[i]); > > any help would be greatly appreciated. sorry if there are novice > questions... > > thanks, tom
-- Tim Wescott Control systems and communications consulting http://www.wescottdesign.com Need to learn how to apply control theory in your embedded system? "Applied Control Theory for Embedded Systems" by Tim Wescott Elsevier/Newnes, http://www.wescottdesign.com/actfes/actfes.html
On May 25, 3:10 pm, Clay <phys...@bellsouth.net> wrote:
> On May 24, 7:52 pm, tj <tomjohns...@gmail.com> wrote: > > > > > > > hi, > > > i'm trying to implement the goertzel algorithm to detect a small > > number of frequencies in an audio signal (16-bit, 8kHz). > > > i'm using the approach/code outlined in this article:http://www.embedded.com/story/OEG20020819S0057 > > > i've seen a post on comp.dsp which suggests to compare the sum of > > squares of the samples for the block used in the goertzel iteration > > and compare it to the energy of the tone being detected in order to > > calculate if the frequency represents the 'majority' of the signal's > > energy...however, i'm having trouble making this calculation. > > > here's the snippet from a posting located at:http://www.dsprelated.com/showmessage/38661/1.php > > > [snip]> Parseval's theorem relates the > > > time domain and frequency domain energies associated with Fourier > > > transforms. So you compute the time domain energy for a block of data - > > > i.e., it is the sum of squares of the samples, and then you compare it > > > with the energies in the two tones. If the agreement is good, then you > > > have pure tones and not some wideband source... > > > [snip] > > > any help would be greatly appreciated. sorry if there are novice > > questions... > > > thanks, tom > > Hello Tom, > > Let Et be the sum of the squares of your data in the time domain. > Let Ew be the sum of the magnitudes squared of your data in the > frequency domain. > Let N be the number of samples in your data block. > > Then the purity is simply (2*Ew)/(N*Et) (for f != 0 or 1/2 f_sample if > N is even) > > For example let your data be xi=cos(2*pi*k*i/N) where i,k are in > [0,1,2,3,...,N-1] > > Then Et = sum cos^2(2*pi*k*i/N) = N/2 > > For the frequency k, Hk=sum(xi*exp(2*pi*k*i/N)) = N/2, Thus Ew=(N/2)^2 > = N^2 /4 > > p=2*((N^2)/4) / N*(N/2) = 1.00000 > > If the frequency of your data does not have an integral k, then the p > < 1. > > I hope this helps, > > Clay- Hide quoted text - > > - Show quoted text -
hi clay, thanks for the info. so, if i read your posting correctly, if i am detecting just a single frequency using the goertzel algorithm and, after initializing the goertzel algorithm and running it on N samples, i get its magnitude squared (i.e., Ew), as well as the sum of squares of the block of N non-normalized samples (i.e., Et) that the goertzel algorithm operated on, then i should be able to calculated the purity by doing this: purity = (2*Ew)/(N*Et) and when you wrote "for f != 0 or 1/2 f_sample if N is even", i read this as meaning that the 'frequency being detected must be non-zero and less that 1/2 the sampling rate if N is even.' (if i'm off, please let me know.) thanks again for your help! tom
On May 25, 11:31 pm, Tim Wescott <t...@seemywebsite.com> wrote:
> On Thu, 24 May 2007 16:52:39 -0700, tj wrote: > > hi, > > > i'm trying to implement the goertzel algorithm to detect a small > > number of frequencies in an audio signal (16-bit, 8kHz). > > > i'm using the approach/code outlined in this article: > >http://www.embedded.com/story/OEG20020819S0057 > > > i've seen a post on comp.dsp which suggests to compare the sum of > > squares of the samples for the block used in the goertzel iteration > > and compare it to the energy of the tone being detected in order to > > calculate if the frequency represents the 'majority' of the signal's > > energy...however, i'm having trouble making this calculation. > > > here's the snippet from a posting located at:http://www.dsprelated.com/showmessage/38661/1.php > > > [snip] > >> Parseval's theorem relates the > >> time domain and frequency domain energies associated with Fourier > >> transforms. So you compute the time domain energy for a block of data - > >> i.e., it is the sum of squares of the samples, and then you compare it > >> with the energies in the two tones. If the agreement is good, then you > >> have pure tones and not some wideband source... > > [snip] > > > however, when i tried to implement this, i get a goertzel result that > > is several times larger than the sum of squares of the samples in the > > block. ...i was thinking that the goertzel result (relative magnitude > > squared, as outlined in the article) divided by the sum of squares for > > the block would lead to a value between 0.0 and 1.0. > > Indeed it should. Chances are that you have a scaling problem in your > algorithm -- have you tried running a pure tone through it to get the > scaling factor? > > > > > i followed the source code from the article directly. a few things to > > note about it (i'm not sure if this matters): > > > * the signal is not normalized (does this matter? the signal is 8kHz, > > 16bit signed values, and i'm operating w/float values when computing the > > goertzel, etc.); > > If you're using floats then you don't have to normalize, just keep track > of your magnitudes. > > > * i'm not doing any sort of windowing function on the signal; * each > > time i calculate the goertzel, i slide the block of N by N/2 samples > > (does this matter?); > > It's not clear what you mean here. If you do a Goertzel algorithm for a > block of N points, then start out fresh on another block of N points you > should be OK. If those two blocks should happen to overlap by N/2 that's > OK, too. If you're somehow using the leftover state from the Goertzel > algorithm on the newly slid block -- then you're in weird math land, and > I'd have to expend effort to figure out if it's OK or not. > > > > > > > * i've tested the code with various generated signals of a fixed > > frequency (e.g., from 500Hz - 1000Hz, with a duration of 1-2 seconds, > > with pre- and post-silence of 1-2 seconds)) and varied my N value (e.g., > > from 100 to 1000) and use a target frequency for the goertzel of the > > same value i used to generate the signal; > > > ...however, i can't seem to calculate the 'proportion' that the target > > frequency's energy is vs. the total signal energy (e.g., a value between > > 0.0 and 1.0). > > > i've played around with other signal energy/power calculations to see if > > i could stumble on a result, but they don't see to be much help when > > comparing to the goertzel 'relative magnitude squared' result (though > > they are helpful for other purposes): > > > * the teager-kaiser energy calcuation, which i believe is (in pseudo- > > code): energy[i] = (signal[i] * signal[i]) - (signal[i + 1] * signal[i - > > 1]); > > > ...where 'i' is the index into the signal's array of non-normalized 16- > > bit values (stored as floats). > > > * signal power, which i believe is: power[i] = signal[i] * signal[i]; > > > * a windowed power equation i came across: power[i] = power[i - 1] + > > (signal[i] * signal[i]) - (signal[i - windowLength] * signal[i - > > windowLength]); > > > * a 'smoothed power' equation: smoothedPower[i] = (weight * > > smoothedPower[i - 1] + ((1 - weight) * power[i]); > > > any help would be greatly appreciated. sorry if there are novice > > questions... > > > thanks, tom > > -- > Tim Wescott > Control systems and communications consultinghttp://www.wescottdesign.com > > Need to learn how to apply control theory in your embedded system? > "Applied Control Theory for Embedded Systems" by Tim Wescott > Elsevier/Newnes,http://www.wescottdesign.com/actfes/actfes.html- Hide quoted text - > > - Show quoted text -- Hide quoted text - > > - Show quoted text -
hi tim, thanks for your response. i'll check my code on a pure tones for the scaling factor. also, regarding the 'sliding window' (so to speak), i'm resetting the goertzel state before running it against the next block of N samples (which i get by shifting 'forward' by N / 2 samples). i've read that the size of N affects both the bin width of the frequency as well as the duration of the signal being detected. the reason i was thinking this 'sliding window' would be a good idea is if a short signal starts and ends in consecutive blocks, this sort of shifting might have a better chance of detecting the signal than by just resetting/re-running the goertzel with the next N samples. ...i would think that looking at the magnituded squared values of the goertzel for successive iterations would allow perhaps shorter- duration signals to be detected when a higher N value is required in order to minimize the bin width. does that sound like a reasonable assumption? thanks for your help! tom
On Sat, 26 May 2007 10:38:11 -0700, tj wrote:

> On May 25, 11:31 pm, Tim Wescott <t...@seemywebsite.com> wrote: >> On Thu, 24 May 2007 16:52:39 -0700, tj wrote: >> > hi, >> >> > i'm trying to implement the goertzel algorithm to detect a small >> > number of frequencies in an audio signal (16-bit, 8kHz). >> >> > i'm using the approach/code outlined in this article: >> >http://www.embedded.com/story/OEG20020819S0057 >> >> > i've seen a post on comp.dsp which suggests to compare the sum of >> > squares of the samples for the block used in the goertzel iteration >> > and compare it to the energy of the tone being detected in order to >> > calculate if the frequency represents the 'majority' of the signal's >> > energy...however, i'm having trouble making this calculation. >>
-- snip --
> > hi tim, > > thanks for your response. i'll check my code on a pure tones for the > scaling factor. > > also, regarding the 'sliding window' (so to speak), i'm resetting the > goertzel state before running it against the next block of N samples > (which i get by shifting 'forward' by N / 2 samples).
In that case you're effectively looking at overlapping blocks of N points for your signal -- there's nothing wrong with that. I was afraid that you may be munging things together somehow by _not_ resetting the state.
> > i've read that the size of N affects both the bin width of the frequency > as well as the duration of the signal being detected.
More like the size of N affects the bin width, which affects the duration of signal that can be most effectively detected. A short signal has it's energy spread out over a larger portion of the spectrum, and a long signal has its energy more concentrated; ideally you'd match that with your filtering algorithm but close is often good enough.
> the reason i was > thinking this 'sliding window' would be a good idea is if a short signal > starts and ends in consecutive blocks, this sort of shifting might have > a better chance of detecting the signal than by just > resetting/re-running the goertzel with the next N samples.
You are correct. In the worst case of doing this with no overlap you'd have 1/2 of your signal in one block and the other half in the following block -- this could lead you to miss the signal in both.
> > ...i would think that looking at the magnituded squared values of the > goertzel for successive iterations would allow perhaps shorter- duration > signals to be detected when a higher N value is required in order to > minimize the bin width. > > does that sound like a reasonable assumption? >
Yes, kind of. Remember that if the signal is really short it'll be spread out in the spectrum, and the way to catch it would ideally be with a short length filter. If you know you have lots of noise that's overlapping the signal's energy, however, then a narrower filter (i.e. a larger N) becomes the better one for discriminating between the signal and noise. -- Tim Wescott Control systems and communications consulting http://www.wescottdesign.com Need to learn how to apply control theory in your embedded system? "Applied Control Theory for Embedded Systems" by Tim Wescott Elsevier/Newnes, http://www.wescottdesign.com/actfes/actfes.html