# selective fft

Started by September 21, 2004

Clay Turner wrote:
> Steven,
> see at the bottom
>
>
> "Steven G. Johnson" <stevenj@alum.mit.edu> wrote in message
>
>>r.lyons@_BOGUS_ieee.org (Rick Lyons) wrote...
>>
>>>   your post is interesting.  I've modeled the
>>>Goertzel algorithm (using the floating-point
>>>precision of MATLAB) and have not seen the
>>>numerical errors that you mention.
>>>
>>>I'm wondering under what condition(s), or
>>>implementation(s), the Goertzel algorithm exhibits
>>>*much* worse numerical errors than FFT computations.
>>>Can I bug you tell us a little more?
>>
>>See:
>>
>>   W. M. Gentleman, "An error analysis of Goertzel's (Watt's) method
>>   for computing Fourier coefficients," Computer Journal 12 (2),
>>   160-165 (1969).
>>
>>   Abstract:  Goertzel's method is commonly used when only a small
>>   number of coefficients is desired for a given sequence. The
>>   paper gives a floating-point error analysis of the technique,
>>   and shows why it should be avoided,
>>   particularly for low frequencies.
>>
>>His error analysis indicates that Goertzel's method leads to floating
>>point errors that have an upper bound O(N^(5/2)), with the worst case
>>being low frequencies.  Compare this to an error bound of O(log N) for
>>the FFT, and mean errors of O(sqrt(log N)).
>>
>>It's perhaps not that surprising -- Goertzel's algorithm does not
>>explicitly compute N twiddle factors exp(ikn/N), but instead contains
>>something very similar to a trigonometric recurrence formula...and the
>>recurrence that it most resembles has horrible O(N^2) relative error
>>growth (both worst case *and* mean).
>>
>>Here are some sample numbers I got from a C implementation of
>>Goertzel's algorithm, compiled with gcc on a Pentium 4 (double
>>precision; code is structured to *not* stay within x86
>>extended-precision registers).  Computing the first 10 bins of a
>>size-N DFT with Goertzel and calculating the maximum relative error
>>w.r.t. an FFT (defined as max |difference| / max |value|) for uniform
>>random inputs, I get:
>>
>>N, relative error
>>100, 1.2e-14
>>1000, 9.3e-13
>>10000, 1.4e-10
>>100000, 3.3e-8
>>1000000, 7.7e-7
>>10000000, 7.1e-5
>>
>>which is almost exactly N^2 growth in the error.  In contrast, the
>>errors accumulated by the naive DFT formula in a similar experiment
>>grow only as ~sqrt(N) as expected, and are still only 1.2e-13 for
>>N=10^7 (but are still worse than the FFT's O(sqrt(log N))).
>>
>>I'd be interested to hear if there were some trick to make Goertzel's
>>algorithm more accurate without sacrificing performance (or if your
>>Matlab code achieves better errors for a similar experiment to the
>>above).  Of course, if you use e.g. extended-precision registers on
>>the Pentium, matters will improve somewhat (by a constant factor).
>>
>>Cordially,
>>Steven G. Johnson
>
>
> Hello Steven,
>
> A trick I use for low frequencies that is Goertzel like uses a near
> quadrature oscillator. The "c" like code is below: It does cost two mults
> per iter, but like the Goertzel algo, the twiddle factors are also computed
> on the fly. The advantage of this method is when viewed in matrix form, the
> condition number for the matrix  approaches 1 for low frequencies. Goerzel
> uses a biquad oscillator which needs to step 90 degrees per sample to
> achieve the low condition number.
>
> The input data is in x[], the data has N samples. And the bin number is w.
>
> y1=0;
> y2=0;
> k=2*sin(pi*w/N);             // not 2 pi !!
>
> for (j=0;j<N;j++) {
>       y2=y2-k*y1+x[j];
>       y1=y1+k*y2;
>     }
>
> And the energy is simply
>
> E = y1*y1 + y2*y2 - k*y1*y2;
>
> I would be interested in how this algo fairs in your comparison tests.
>
> Clay S. Turner
>

The "standard" recurrence (writing e(x) = e^(i 2 pi x) where i is
the imaginary constant also known a j in engineering) is

e((n+1)x) = e(nx) e(x).

For x small, e(x) is close to 1. Write e(x)=1+(e(x)-1) to have two
terms. One is 1 which is exactly determined in floating point and
e(x)-1 is small so that its relative error corresponds to a much
smaller absolute error. Now the recurrence becomes

e((n+1)x) = e(nx) + e(nx)(e(x)-1)

This alternate marching formula was pointed out by Dick Singleton (then
of SRI) many years ago for use in calculating the twiddle factors. One
would expect that it could be applied to the Goertzel algorithm as well.
Additional arithmetic is used to keep the update terms small. It could
be applied whenever e(x) is close to a simple value like 1, -1, i, -i.

This in another example of the well worn mathematical trick of adding
an awkward representation of zero into a formula in order to find a
"new" result.


"Gordon Sande" <g.sande@worldnet.att.net> wrote in message
news:Eqy6d.74086$vO1.403236@nnrp1.uunet.ca... > > The "standard" recurrence (writing e(x) = e^(i 2 pi x) where i is > the imaginary constant also known a j in engineering) is > > e((n+1)x) = e(nx) e(x). > > For x small, e(x) is close to 1. Write e(x)=1+(e(x)-1) to have two > terms. One is 1 which is exactly determined in floating point and > e(x)-1 is small so that its relative error corresponds to a much > smaller absolute error. Now the recurrence becomes > > e((n+1)x) = e(nx) + e(nx)(e(x)-1) > > This alternate marching formula was pointed out by Dick Singleton (then > of SRI) many years ago for use in calculating the twiddle factors. One > would expect that it could be applied to the Goertzel algorithm as well. > Additional arithmetic is used to keep the update terms small. It could > be applied whenever e(x) is close to a simple value like 1, -1, i, -i. > > This in another example of the well worn mathematical trick of adding > an awkward representation of zero into a formula in order to find a > "new" result. > > Hello Gordon, Thanks for pointing a way to another potential Goertzel method. The method I posted however doesn't use complex numbers in its recursion. It is based on a staggered update oscillator. It requires only 2 multiplies per iteration. The complex recursion that you cited will need 4 multiplies per iteration. Goertzel's method needs only one multiply per iteration. I chose the staggered update oscillator structure since the state variables become orthogonal in the limit as the step angle goes small. This is the region where Steven said the Goertzel method has difficulty. But even for nonsmall step angles, the determinant of the staggered update oscillator state matrix remains one. The importance in the near orthogonality is keeping the matrix's condition number small. Thus the accumulation of errors is kept small. A cool idea is basically any 2 term recurance relation that obeys the discrete Barkhausen criteria can be made into a Goertzel type iteration. A Digital Waveguide oscillator, which has only one multiply per iteration, works pretty well. The Goertzel method itself is based on using a Biquad oscillator. The Coupled Quadrature oscillator works fine, but it should at 4 multiplies per iteration! Of course all of these approaches are the same old idea[1] in different clothes. However, from a computational point of view, some of these forms are very advantageous when compared to others. Clay S. Turner [1] The "idea" is the DFT coef (single frequency bin) may be found by finding the particular solution to a driven simple harmonic oscillator.  On 28 Sep 2004 19:03:26 -0700, stevenj@alum.mit.edu (Steven G. Johnson) wrote: >r.lyons@_BOGUS_ieee.org (Rick Lyons) wrote... >> your post is interesting. I've modeled the >> Goertzel algorithm (using the floating-point >> precision of MATLAB) and have not seen the >> numerical errors that you mention. >> >> I'm wondering under what condition(s), or >> implementation(s), the Goertzel algorithm exhibits >> *much* worse numerical errors than FFT computations. >> Can I bug you tell us a little more? > >See: > > W. M. Gentleman, "An error analysis of Goertzel's (Watt's) method > for computing Fourier coefficients," Computer Journal 12 (2), > 160-165 (1969). > > Abstract: Goertzel's method is commonly used when only a small > number of coefficients is desired for a given sequence. The > paper gives a floating-point error analysis of the technique, > and shows why it should be avoided, > particularly for low frequencies. > >His error analysis indicates that Goertzel's method leads to floating >point errors that have an upper bound O(N^(5/2)), with the worst case >being low frequencies. Compare this to an error bound of O(log N) for >the FFT, and mean errors of O(sqrt(log N)). > >It's perhaps not that surprising -- Goertzel's algorithm does not >explicitly compute N twiddle factors exp(ikn/N), but instead contains >something very similar to a trigonometric recurrence formula...and the >recurrence that it most resembles has horrible O(N^2) relative error >growth (both worst case *and* mean). > >Here are some sample numbers I got from a C implementation of >Goertzel's algorithm, compiled with gcc on a Pentium 4 (double >precision; code is structured to *not* stay within x86 >extended-precision registers). Computing the first 10 bins of a >size-N DFT with Goertzel and calculating the maximum relative error >w.r.t. an FFT (defined as max |difference| / max |value|) for uniform >random inputs, I get: > >N, relative error >100, 1.2e-14 >1000, 9.3e-13 >10000, 1.4e-10 >100000, 3.3e-8 >1000000, 7.7e-7 >10000000, 7.1e-5 > >which is almost exactly N^2 growth in the error. In contrast, the >errors accumulated by the naive DFT formula in a similar experiment >grow only as ~sqrt(N) as expected, and are still only 1.2e-13 for >N=10^7 (but are still worse than the FFT's O(sqrt(log N))). > >I'd be interested to hear if there were some trick to make Goertzel's >algorithm more accurate without sacrificing performance (or if your >Matlab code achieves better errors for a similar experiment to the >above). Of course, if you use e.g. extended-precision registers on >the Pentium, matters will improve somewhat (by a constant factor). > >Cordially, >Steven G. Johnson Hi Steven, I just saw you post. I'll study your numbers to see if I can duplicate them (or something close to them). Thanks for the trouble. [-Rick-]  "Rick Lyons" <r.lyons@_BOGUS_ieee.org> wrote in message news:415c8a34.257585156@news.sf.sbcglobal.net... >> > Hi Steven, > > I just saw you post. I'll study your numbers to > see if I can duplicate them (or something close to them). > > Thanks for the trouble. > > [-Rick-] > Hello Rick, If you can, also try running the alog I posted and see how it compares - especially for low frequencies. My small number of tests indicate it is quite a bit better than the Goertzel method for low freqs. Clay  On Tue, 28 Sep 2004 14:15:23 -0500, Richard Owlett <rowlett@atlascomm.net> wrote: >Rick Lyons wrote: > >> ... DFT allows you to compute >> single-bin DFT samples. The nice part is that >> upon arrival of a new time sample, the new single-bin >> DFT sample can be computed with (roughly) >> a half dozen arithmetic operations. >> > >The question was posed for case of only a few frequencies being of interest. > >The answer got me musing. > >Having already calculated an m point DFT/FFT for points n thru n+m > >when point n+m+1 arrives > >is there a simple algorithm for calculating DFT/FFT of points n+1 thru >n+m+1? Sorry Richard, I'd help ya' if I could, but I don't completely follow your question. Are those "n" & "m" indices in the time or frequency domain? [-Rick-]   Rick Lyons wrote: (snip) >>Having already calculated an m point DFT/FFT for points >> n thru n+m when point n+m+1 arrives >>is there a simple algorithm for calculating DFT/FFT of >> points n+1 thru n+m+1? > Sorry Richard, > I'd help ya' if I could, but I don't completely > follow your question. Are those "n" & "m" indices > in the time or frequency domain? It seems to be the extension of moving average to frequencies higher than zero. For a moving average of the last m+1 points (maybe you should have n+1 to n+m so it is m points) you subtract X(n) and add X(n+m+1) to the previous average. It does seem that it might work. Though it would seem that an extra phase term comes in from the moving part, which may or may not be wanted. If I have four point transforms, do I want the tranforms of {0,1,2,3}, {1,2,3,4}, {2,3,4,5} or, {0,1,2,3}, {4,1,2,3}, {4,5,2,3}. It would seem easier to calculate the latter, as it is more like a moving average, subtract the old point add the new point. -- glen  Clay, > A cool idea is basically any 2 term recurance relation that obeys the > discrete Barkhausen criteria can > be made into a Goertzel type iteration. A Digital Waveguide oscillator, > which has only one multiply per > iteration, works pretty well. The Goertzel method itself is based on using a > Biquad oscillator. The > Coupled Quadrature oscillator works fine, but it should at 4 multiplies per > iteration! I know that I am kind of a newbie, but I've never heard of the discrete Barkhausen criteria. Where could I find out what you are talking about. Regards, Phil  "phil" <phil_simulink@sympatico.ca> wrote in message news:cjjruv$52b\$1@news.storm.ca...
> Clay,
>
> > A cool idea is basically any 2 term recurance relation that obeys the
> > discrete Barkhausen criteria can
> > be made into a Goertzel type iteration. A Digital Waveguide oscillator,
> > which has only one multiply per
> > iteration, works pretty well. The Goertzel method itself is based on
using
> a
> > Coupled Quadrature oscillator works fine, but it should at 4 multiplies
> per
> > iteration!
>
> I know that I am kind of a newbie, but I've never heard of the discrete
> Barkhausen criteria.  Where could I find out what you are talking about.
>
> Regards,
> Phil
>
>

Hello Phil,

http://personal.atl.bellsouth.net/p/h/physics/2nd_OSC_paper.pdf

Basically when one restricts oneself to using real valued numbers, then
discrete-time sinusoids can be formed by using a two term recurrance
relation. This two term relation can be formulated into a matrix form, where
the two state variables are updated by being multiplied by a 2 by 2 matrix.

I.e.,

[ x(n+1) ]       [  a       b ] [x(n)]
[            ]  =   [              ] [      ]
[ y(n+1) ]       [  c       d ] [y(n)]

For oscillation, this matrix needs to satisfy two criteria, which I call the
discrete Barkhausen criteria:

1)    ad-bc = 1    (the determinant is one)

2)   | a +d | < 2   ( this forces complex eigenvalues which causes
periodicity)

Also the oscillator's properties can be gleaned just from the matrix's
values.

Each of the state variables will produce a sinusoid, where both have the
same frequency.

If  a==d, the outputs will be 90 degrees out of phase with each other.

If b== -c, the outputs will have the same amplitude

The details are in my article.

Clay


"Clay Turner" <physics@bellsouth.net> wrote...
> A trick I use for low frequencies that is Goertzel like uses a near
> quadrature oscillator. The "c" like code is below: It does cost two mults
> per iter, but like the Goertzel algo, the twiddle factors are also computed
> on the fly.

Hi Clay,

Looking a little more closely, your algorithm seems related to a
Goertzel variant mentioned by Gentleman at the end of the 1969 paper I
referenced, and attribruted to Reinsch.  Gentleman wrote: "Reinsch
(unpublished) has suggested that for small \omega we calculate [...]
by the recurrence rewritten as" (in your notation):

k = (2*sin(pi*w))^2  [ w is the bin number ]
y1 = y2 = 0
for each input x[j]:
y2 = x[j] + y2 - k*y1
y1 = y1 + y2
Fourier coefficient = y2 + y1 * (exp(2*pi*i * w/N) - 1)

This has two extra additions per input compared to Goertzel's original
recurrence (which has one add and one multiply per input), unlike
yours which has both two extra additions and one extra multiplication
(although your extra multiply is free on an architecture with fma
instructions).

On the other hand, there is some evidence (below) that your variant is
more accurate than Reinsch's for low frequencies.  Both yours and
Reinsch's variants seem like they may be related to the Singleton
trigonometric recurrence that Gordon mentioned, since Singleton's
improved recurrence also uses sin(theta/2)^2.  (Singleton's recurrence
has O(sqrt(N)) mean errors and O(N) worst-case errors.)

I ran a similar test as before (gcc on Pentium III, double precision,
code structured so that recurrence does *not* stay within
extended-precision registers), to test the accuracy, but this time I
ran more N's to get a better fit and also ran two kinds of test.
Again, I also compared to a naive DFT formula with trig. factors
precomputed by standard library functions.

Test 1: compute first 10 DFT bins of uniform random inputs, take
maximum relative error (max |delta amplitude| / max |amplitude|)
compared to FFT.

Test 2: compute relative error in bin # N/10, compared to FFT.  This
corresponds to looking at a fixed frequency, while increasing the
number of samples.

(Data appended below.)

SUMMARY: In test 1 (lowest bins), Goertzel errors go as O(N^2), the
naive DFT and Goertzel-Turner errors are similar and go as O(sqrt(N)),
and it seems like the Goertzel-Reinsch errors go as O(N).  In test 2
(fixed frequency), on the other hand, all three of the Goertzel
variants have comparable errors that go as O(N), while the naive DFT
errors go as O(sqrt(N)).

Cordially,
Steven G. Johnson

---- Test 1 data (relative errors) ----
N, Naive, Goertzel, Goertzel-Reinsch, Goertzel-Turner
100, 5.768927e-16, 1.196684e-14, 3.244047e-15, 3.077573e-15
135, 3.663846e-16, 2.436090e-14, 3.134493e-15, 3.518056e-15
180, 4.514673e-16, 5.156812e-14, 4.145421e-15, 3.554201e-15
243, 9.965601e-16, 8.246100e-14, 3.286302e-15, 2.702573e-15
336, 7.803204e-16, 1.213349e-13, 1.550977e-15, 2.303929e-15
441, 1.603280e-15, 1.060468e-13, 6.939733e-15, 6.398852e-15
588, 8.975389e-16, 3.288405e-13, 1.558879e-15, 2.227132e-15
800, 1.090185e-15, 1.004612e-12, 2.509171e-15, 1.822233e-15
1080, 1.989567e-15, 3.112465e-12, 3.930957e-15, 2.845538e-15
1440, 1.984501e-15, 2.060528e-12, 2.459822e-15, 1.234903e-15
1920, 2.206150e-15, 1.130077e-11, 4.659866e-15, 2.461344e-15
2592, 2.043751e-15, 1.698036e-11, 3.249364e-15, 3.517311e-15
3456, 1.887078e-15, 2.494378e-11, 6.040271e-15, 2.241936e-15
4704, 2.400950e-15, 3.533572e-11, 3.507036e-15, 2.787962e-15
6250, 4.244307e-15, 2.835717e-11, 3.697269e-15, 3.146765e-15
8400, 3.231661e-15, 6.446742e-11, 8.990599e-15, 3.025899e-15
11340, 4.980775e-15, 1.325334e-10, 1.501639e-14, 4.255555e-15
15120, 4.562540e-15, 2.723181e-10, 2.302641e-14, 5.638064e-15
20412, 4.666504e-15, 3.225444e-10, 2.240962e-14, 3.120578e-15
27440, 5.536901e-15, 1.390847e-09, 5.264460e-14, 6.966537e-15
36750, 6.116477e-15, 2.462942e-09, 9.030457e-14, 7.758204e-15
49392, 1.180802e-14, 5.226243e-09, 5.678643e-14, 6.202364e-15
66150, 8.572646e-15, 9.450081e-09, 1.778278e-13, 7.885333e-15
89600, 1.186221e-14, 5.215808e-09, 4.900840e-14, 9.480164e-15
120000, 1.673760e-14, 1.830124e-08, 1.459358e-13, 1.295942e-14
161280, 1.702221e-14, 5.001014e-08, 1.962782e-13, 1.329992e-14
216000, 2.315709e-14, 1.539818e-07, 6.199681e-13, 2.843360e-14
290304, 1.801393e-14, 1.805272e-07, 7.545335e-13, 2.380428e-14
388962, 2.188741e-14, 2.343163e-07, 2.777119e-13, 2.240325e-14
524288, 2.035763e-14, 4.099524e-07, 1.143497e-12, 2.481492e-14
702464, 3.547538e-14, 3.486261e-07, 1.153292e-12, 3.585512e-14
944784, 3.028070e-14, 6.025130e-08, 7.409850e-14, 3.424375e-14
1270080, 4.060311e-14, 1.239544e-06, 9.309389e-13, 3.821647e-14
1714608, 6.904300e-14, 2.770016e-06, 3.068086e-12, 7.592274e-14
2286144, 4.737357e-14, 5.179142e-06, 4.160372e-12, 6.964357e-14
3072000, 7.973153e-14, 6.899510e-06, 2.542530e-12, 3.085472e-14
4128768, 1.011534e-13, 1.080983e-05, 3.762273e-12, 7.639704e-14
5556600, 8.030392e-14, 3.185682e-05, 1.220811e-11, 1.127271e-13
7464960, 5.260649e-14, 7.445618e-05, 2.017173e-11, 1.126379e-13
10000000, 1.144018e-13, 7.108695e-05, 4.971694e-12, 1.152115e-13

---- Test 2 data (relative errors) ----
N, Naive, Goertzel, Goertzel-Reinsch, Goertzel-Turner
100, 5.675134e-16, 2.728265e-15, 2.887572e-15, 3.638779e-15
135, 2.578197e-16, 4.813018e-15, 2.882512e-16, 7.516671e-16
180, 1.629837e-15, 2.808489e-14, 2.717018e-14, 3.614662e-14
243, 5.877135e-16, 9.890752e-15, 3.471437e-15, 4.718005e-15
336, 1.812251e-16, 1.093871e-15, 7.555112e-16, 1.756528e-16
441, 5.084716e-16, 5.745407e-15, 3.013960e-16, 5.032918e-16
588, 7.121250e-16, 1.719098e-15, 1.603577e-14, 1.281074e-14
800, 1.481899e-15, 2.628206e-14, 2.602363e-14, 3.459358e-14
1080, 3.601888e-16, 3.340275e-14, 3.203904e-14, 4.295586e-14
1440, 8.549557e-16, 7.074344e-14, 6.897564e-14, 9.081948e-14
1920, 2.296447e-15, 7.057616e-13, 6.893980e-13, 9.058316e-13
2592, 2.720932e-15, 7.614149e-14, 7.636443e-14, 4.812211e-14
3456, 2.128757e-15, 2.931276e-13, 1.930663e-13, 2.201816e-13
4704, 3.615979e-15, 3.317943e-13, 1.361347e-13, 1.080912e-13
6250, 2.345693e-15, 9.757064e-14, 9.933851e-14, 1.285021e-13
8400, 3.091087e-16, 1.084564e-13, 1.069234e-13, 1.375604e-13
11340, 5.370556e-16, 2.911501e-13, 2.906855e-13, 3.736852e-13
15120, 1.538081e-15, 3.238659e-13, 3.202911e-13, 4.121667e-13
20412, 9.563202e-15, 1.941979e-13, 1.918723e-13, 2.024850e-13
27440, 4.309886e-15, 7.528745e-13, 7.465632e-13, 9.609820e-13
36750, 8.013434e-15, 9.269283e-13, 9.275204e-13, 1.197884e-12
49392, 8.670708e-15, 2.289540e-12, 1.038763e-12, 5.536034e-13
66150, 1.030750e-14, 1.617214e-12, 1.626313e-12, 2.104541e-12
89600, 7.509249e-15, 2.346569e-12, 2.361969e-12, 3.061123e-12
120000, 1.734127e-14, 4.118063e-12, 4.136743e-12, 5.357115e-12
161280, 5.440572e-14, 9.841797e-12, 9.881918e-12, 1.277817e-11
216000, 4.660105e-14, 1.029317e-11, 1.032748e-11, 1.334444e-11
290304, 2.463539e-14, 8.124890e-12, 2.205229e-11, 2.145427e-11
388962, 1.696306e-14, 2.259415e-11, 5.004758e-12, 1.714555e-12
524288, 2.387831e-14, 4.850925e-12, 1.153522e-11, 4.506709e-12
702464, 4.082871e-14, 1.262675e-11, 1.262873e-11, 1.206775e-12
944784, 3.850889e-14, 3.550530e-11, 9.711823e-11, 1.267819e-10
1270080, 2.295474e-14, 2.979076e-11, 2.976344e-11, 3.836333e-11
1714608, 8.975055e-15, 1.295532e-13, 2.054187e-11, 1.065351e-11
2286144, 4.720868e-14, 3.349752e-11, 3.353434e-11, 5.824715e-11
3072000, 5.272088e-14, 8.178050e-11, 8.168906e-11, 1.053857e-10
4128768, 1.529778e-13, 5.321170e-10, 2.641123e-10, 2.754986e-10
5556600, 4.128449e-14, 1.297618e-10, 1.297337e-10, 1.674373e-10
7464960, 3.174057e-14, 1.909011e-10, 1.908910e-10, 2.463832e-10
10000000, 4.978507e-14, 2.538476e-10, 2.539156e-10, 3.276861e-10

Hello Steven,

Thanks for running the tests and doing the comparisons. To see the
structural difference between the method I proposed and Reinsch's method is
most easily understood via oscillator theory. See my paper for details.
http://personal.atl.bellsouth.net/p/h/physics/2nd_OSC_paper.pdf

The matrix form for Reinsch's method is:

[ 1-k    1 ]
[              ]
[ -k      1 ]

In the limit of small k this is seen to be quadrature but nonequiamplitude.

The matrix form for my proposal is:

[ 1-k^2   k ]
[                 ]
[ -k         1 ]

In the limit of small k, this is both quadrature and equiamplitude.

Both methods are similar in that they use staggered updating - I.e., the
matrix factors into a product of two triangular matrices.

One of my philosophies in life is when multiple elements are working
together to acheive a goal, that each element should contribute equally to
the load. With both of these methods (in the limit of small k, hence low
frequency) both state variables are 90 degrees out of phase. This is why
these do better than the biquad structure (standard Goertzel). The state
variables in the biquad differ in phase by an amount equal to the step
angle. So when handling low frequencies, the biquad's two state variables
tend to hold almost identical info.

The method I proposed is equiamplitude whereas Reinsch's method is not. And
I believe this in addition to the orthogonality at low frequency is what
helps minimize the errors. I predict other structures would be better near
either 1/4 or 1/2 of the sampling rate.

Certainly I think one could have quite a bit of fun picking resonator
structures that meet the discrete Barkhausen criteria and testing their
driven responses to see what their relative error functions are.

Thanks,
Clay S. Turner

"Steven G. Johnson" <stevenj@alum.mit.edu> wrote in message
> "Clay Turner" <physics@bellsouth.net> wrote...
> > A trick I use for low frequencies that is Goertzel like uses a near
> > quadrature oscillator. The "c" like code is below: It does cost two
mults
> > per iter, but like the Goertzel algo, the twiddle factors are also
computed
> > on the fly.
>
> Hi Clay,
>
> Looking a little more closely, your algorithm seems related to a
> Goertzel variant mentioned by Gentleman at the end of the 1969 paper I
> referenced, and attribruted to Reinsch.  Gentleman wrote: "Reinsch
> (unpublished) has suggested that for small \omega we calculate [...]
> by the recurrence rewritten as" (in your notation):
>
>     k = (2*sin(pi*w))^2  [ w is the bin number ]
>     y1 = y2 = 0
>     for each input x[j]:
>        y2 = x[j] + y2 - k*y1
>        y1 = y1 + y2
>     Fourier coefficient = y2 + y1 * (exp(2*pi*i * w/N) - 1)
>
> This has two extra additions per input compared to Goertzel's original
> recurrence (which has one add and one multiply per input), unlike
> yours which has both two extra additions and one extra multiplication
> (although your extra multiply is free on an architecture with fma
> instructions).
>
> On the other hand, there is some evidence (below) that your variant is
> more accurate than Reinsch's for low frequencies.  Both yours and
> Reinsch's variants seem like they may be related to the Singleton
> trigonometric recurrence that Gordon mentioned, since Singleton's
> improved recurrence also uses sin(theta/2)^2.  (Singleton's recurrence
> has O(sqrt(N)) mean errors and O(N) worst-case errors.)
>
> I ran a similar test as before (gcc on Pentium III, double precision,
> code structured so that recurrence does *not* stay within
> extended-precision registers), to test the accuracy, but this time I
> ran more N's to get a better fit and also ran two kinds of test.
> Again, I also compared to a naive DFT formula with trig. factors
> precomputed by standard library functions.
>
> Test 1: compute first 10 DFT bins of uniform random inputs, take
> maximum relative error (max |delta amplitude| / max |amplitude|)
> compared to FFT.
>
> Test 2: compute relative error in bin # N/10, compared to FFT.  This
> corresponds to looking at a fixed frequency, while increasing the
> number of samples.
>
> (Data appended below.)
>
> SUMMARY: In test 1 (lowest bins), Goertzel errors go as O(N^2), the
> naive DFT and Goertzel-Turner errors are similar and go as O(sqrt(N)),
> and it seems like the Goertzel-Reinsch errors go as O(N).  In test 2
> (fixed frequency), on the other hand, all three of the Goertzel
> variants have comparable errors that go as O(N), while the naive DFT
> errors go as O(sqrt(N)).
>
> Cordially,
> Steven G. Johnson
>
> ---- Test 1 data (relative errors) ----
>  N, Naive, Goertzel, Goertzel-Reinsch, Goertzel-Turner
>  100, 5.768927e-16, 1.196684e-14, 3.244047e-15, 3.077573e-15
>  135, 3.663846e-16, 2.436090e-14, 3.134493e-15, 3.518056e-15
>  180, 4.514673e-16, 5.156812e-14, 4.145421e-15, 3.554201e-15
>  243, 9.965601e-16, 8.246100e-14, 3.286302e-15, 2.702573e-15
>  336, 7.803204e-16, 1.213349e-13, 1.550977e-15, 2.303929e-15
>  441, 1.603280e-15, 1.060468e-13, 6.939733e-15, 6.398852e-15
>  588, 8.975389e-16, 3.288405e-13, 1.558879e-15, 2.227132e-15
>  800, 1.090185e-15, 1.004612e-12, 2.509171e-15, 1.822233e-15
>  1080, 1.989567e-15, 3.112465e-12, 3.930957e-15, 2.845538e-15
>  1440, 1.984501e-15, 2.060528e-12, 2.459822e-15, 1.234903e-15
>  1920, 2.206150e-15, 1.130077e-11, 4.659866e-15, 2.461344e-15
>  2592, 2.043751e-15, 1.698036e-11, 3.249364e-15, 3.517311e-15
>  3456, 1.887078e-15, 2.494378e-11, 6.040271e-15, 2.241936e-15
>  4704, 2.400950e-15, 3.533572e-11, 3.507036e-15, 2.787962e-15
>  6250, 4.244307e-15, 2.835717e-11, 3.697269e-15, 3.146765e-15
>  8400, 3.231661e-15, 6.446742e-11, 8.990599e-15, 3.025899e-15
>  11340, 4.980775e-15, 1.325334e-10, 1.501639e-14, 4.255555e-15
>  15120, 4.562540e-15, 2.723181e-10, 2.302641e-14, 5.638064e-15
>  20412, 4.666504e-15, 3.225444e-10, 2.240962e-14, 3.120578e-15
>  27440, 5.536901e-15, 1.390847e-09, 5.264460e-14, 6.966537e-15
>  36750, 6.116477e-15, 2.462942e-09, 9.030457e-14, 7.758204e-15
>  49392, 1.180802e-14, 5.226243e-09, 5.678643e-14, 6.202364e-15
>  66150, 8.572646e-15, 9.450081e-09, 1.778278e-13, 7.885333e-15
>  89600, 1.186221e-14, 5.215808e-09, 4.900840e-14, 9.480164e-15
>  120000, 1.673760e-14, 1.830124e-08, 1.459358e-13, 1.295942e-14
>  161280, 1.702221e-14, 5.001014e-08, 1.962782e-13, 1.329992e-14
>  216000, 2.315709e-14, 1.539818e-07, 6.199681e-13, 2.843360e-14
>  290304, 1.801393e-14, 1.805272e-07, 7.545335e-13, 2.380428e-14
>  388962, 2.188741e-14, 2.343163e-07, 2.777119e-13, 2.240325e-14
>  524288, 2.035763e-14, 4.099524e-07, 1.143497e-12, 2.481492e-14
>  702464, 3.547538e-14, 3.486261e-07, 1.153292e-12, 3.585512e-14
>  944784, 3.028070e-14, 6.025130e-08, 7.409850e-14, 3.424375e-14
>  1270080, 4.060311e-14, 1.239544e-06, 9.309389e-13, 3.821647e-14
>  1714608, 6.904300e-14, 2.770016e-06, 3.068086e-12, 7.592274e-14
>  2286144, 4.737357e-14, 5.179142e-06, 4.160372e-12, 6.964357e-14
>  3072000, 7.973153e-14, 6.899510e-06, 2.542530e-12, 3.085472e-14
>  4128768, 1.011534e-13, 1.080983e-05, 3.762273e-12, 7.639704e-14
>  5556600, 8.030392e-14, 3.185682e-05, 1.220811e-11, 1.127271e-13
>  7464960, 5.260649e-14, 7.445618e-05, 2.017173e-11, 1.126379e-13
>  10000000, 1.144018e-13, 7.108695e-05, 4.971694e-12, 1.152115e-13
>
> ---- Test 2 data (relative errors) ----
>  N, Naive, Goertzel, Goertzel-Reinsch, Goertzel-Turner
>  100, 5.675134e-16, 2.728265e-15, 2.887572e-15, 3.638779e-15
>  135, 2.578197e-16, 4.813018e-15, 2.882512e-16, 7.516671e-16
>  180, 1.629837e-15, 2.808489e-14, 2.717018e-14, 3.614662e-14
>  243, 5.877135e-16, 9.890752e-15, 3.471437e-15, 4.718005e-15
>  336, 1.812251e-16, 1.093871e-15, 7.555112e-16, 1.756528e-16
>  441, 5.084716e-16, 5.745407e-15, 3.013960e-16, 5.032918e-16
>  588, 7.121250e-16, 1.719098e-15, 1.603577e-14, 1.281074e-14
>  800, 1.481899e-15, 2.628206e-14, 2.602363e-14, 3.459358e-14
>  1080, 3.601888e-16, 3.340275e-14, 3.203904e-14, 4.295586e-14
>  1440, 8.549557e-16, 7.074344e-14, 6.897564e-14, 9.081948e-14
>  1920, 2.296447e-15, 7.057616e-13, 6.893980e-13, 9.058316e-13
>  2592, 2.720932e-15, 7.614149e-14, 7.636443e-14, 4.812211e-14
>  3456, 2.128757e-15, 2.931276e-13, 1.930663e-13, 2.201816e-13
>  4704, 3.615979e-15, 3.317943e-13, 1.361347e-13, 1.080912e-13
>  6250, 2.345693e-15, 9.757064e-14, 9.933851e-14, 1.285021e-13
>  8400, 3.091087e-16, 1.084564e-13, 1.069234e-13, 1.375604e-13
>  11340, 5.370556e-16, 2.911501e-13, 2.906855e-13, 3.736852e-13
>  15120, 1.538081e-15, 3.238659e-13, 3.202911e-13, 4.121667e-13
>  20412, 9.563202e-15, 1.941979e-13, 1.918723e-13, 2.024850e-13
>  27440, 4.309886e-15, 7.528745e-13, 7.465632e-13, 9.609820e-13
>  36750, 8.013434e-15, 9.269283e-13, 9.275204e-13, 1.197884e-12
>  49392, 8.670708e-15, 2.289540e-12, 1.038763e-12, 5.536034e-13
>  66150, 1.030750e-14, 1.617214e-12, 1.626313e-12, 2.104541e-12
>  89600, 7.509249e-15, 2.346569e-12, 2.361969e-12, 3.061123e-12
>  120000, 1.734127e-14, 4.118063e-12, 4.136743e-12, 5.357115e-12
>  161280, 5.440572e-14, 9.841797e-12, 9.881918e-12, 1.277817e-11
>  216000, 4.660105e-14, 1.029317e-11, 1.032748e-11, 1.334444e-11
>  290304, 2.463539e-14, 8.124890e-12, 2.205229e-11, 2.145427e-11
>  388962, 1.696306e-14, 2.259415e-11, 5.004758e-12, 1.714555e-12
>  524288, 2.387831e-14, 4.850925e-12, 1.153522e-11, 4.506709e-12
>  702464, 4.082871e-14, 1.262675e-11, 1.262873e-11, 1.206775e-12
>  944784, 3.850889e-14, 3.550530e-11, 9.711823e-11, 1.267819e-10
>  1270080, 2.295474e-14, 2.979076e-11, 2.976344e-11, 3.836333e-11
>  1714608, 8.975055e-15, 1.295532e-13, 2.054187e-11, 1.065351e-11
>  2286144, 4.720868e-14, 3.349752e-11, 3.353434e-11, 5.824715e-11
>  3072000, 5.272088e-14, 8.178050e-11, 8.168906e-11, 1.053857e-10
>  4128768, 1.529778e-13, 5.321170e-10, 2.641123e-10, 2.754986e-10
>  5556600, 4.128449e-14, 1.297618e-10, 1.297337e-10, 1.674373e-10
>  7464960, 3.174057e-14, 1.909011e-10, 1.908910e-10, 2.463832e-10
>  10000000, 4.978507e-14, 2.538476e-10, 2.539156e-10, 3.276861e-10