## Purity > 1? Started by 2 weeks ago20 replieslatest reply 6 days ago112 views
I have an application where I am computing a purity value, because I'm interested in how much noise is present.  Since I don't have a full spectrum measurement (I'm using the generalized Goertzel https://doi.org/10.1186/1687-6180-2012-56 to compute a single DTFT), I use Parseval's energy theorem to compute full signal energy in the time domain.  This normally works very well, but occasionally I get a purity slightly greater than 1.  Is there some good reason for this, or should I start looking for an error in the computation?

I have compared my Goertzel implementation to FFTW so I know there is minimal error there: https://bitbucket.org/hexamer/testgoertzelresonato...

[ - ] Please give some description  and expression for calculation of the purity.

[ - ] "Total energy" from time domain is the summation of all the squares of samples Σ(x[n])^2

"DTFT energy" is (2/N)*(DTFT magnitude)^2

Purity = "DTFT energy" / "Total energy"

[ - ] How much is the difference?   If it is a factor of two or a factor of sqrt(2) or their inverses, there is an arbitrary scaling factor that can be applied.   If it is something very close to 1.0, it is probably just a scale factor error (maybe due to quantization) in the transform coefficients.

[ - ] Its normally very close to 1 (e.g in a low noise scenario, it's typical for the purity to be > 0.999).

Goertzel is known to have numerical errors with large input sequences, near theta=0 (up to N^2 error growth), but I've applied the Reinsch modifications to mitigate this.  With the size of sequences I use and these mitigations, these errors are < 1e-10, as confirmed in this project:https://bitbucket.org/hexamer/testgoertzelresonato...

[ - ] I think DTFT energy should be (1/N)*(DTFT)^2  .

It all depends as to how DTFT is calculated.  Please compare the DTFT with the FFT output.  To obtain the Fourier coefficients from FFT output, the DC and Nyquist terms have to be divided by N/2.  The other terms by N.  Parseval's theorem holds for the Fourier coefficients.

[ - ] I had been studying this which also notes the special treatment of the 0 and nyquist bins: http://www.dspguide.com/ch10/7.htm

I can definitely see similar effects in my application - as the DTFT is approaching Nyquist, the purity can be near 2.

In the dspguide the factor for the adjusted DFT is 2/N.  In my application, for low noise signals the purity is very near (and occasionally more than) 1, so I assume 2/N is right.

[ - ] If you have access to the input time series, you can apply FFT and get the specific Fourier component you calculate with Goertzel.  Do they differ by a factor?

Does most of the energy in your time series reside in the DTFT component?  Normally the purity for a single frequency should be well below 1, and as you have noted, not exceed 1.

I suspect you may missing another factor of (1/N),

[ - ] DFT is defined in math as: Thus there is no down scaling of accumulated sum (hence output is scaled up by N relative to input). For iDFT the accumulator result is scaled back by 1/N. Apparently the purpose is to make sure if you apply DFT followed by iDFT you get back to your original signal.

This scaling is arbitrary and is not respected by most hardware fft ip vendors (Intel,xilinx...etc). Instead they output(in both cases of fft/ifft) scaling up by sqrt(N).

Either way it is correct. You only need to take scaling effect out of the analysis. accumulation of N samples scales result N times and you might conclude that you need to divide back by N but this approach causes trouble to power results. Matlab fft follows math rule and outputs scaling by N (i.e. just accumulates) and it outputs ifft as 1/N of accumulated result. So if you want to see fft power in matlab you need divide it back by sqrt(N). Tools, books & platforms are not that in sync.

For the case of purity of single tone, that depends on DFT smearing and noise but it can't be > 1 based on the posted definition because you get out what you have put in.

[ - ] Thank you, Kaz.  This is kind of what I suspected.  Purity > 1 makes no sense.  Wondering if this has anything to do with the use of DTFT vs DFT.

[ - ] The factor of 2 in the purity expression is not correct. The outside factor should be (1/N)

Please see the simple derivation of Parseval's theorem in

[ - ] It’s pretty correct. OP is only calculating a single bin value. There would be an identical magnitude value at the negative bin, hence the factor of two

[ - ] Maybe a dumb question, but are you using the same N for the Goerztel and signal energies?  If they were mismatched, it would mostly work on average, but it could for sure cause excursions above 1

[ - ] Not a dumb question at all, in fact it's been on my mind too.  As you probably know, to compute a DFT coefficient with Goertzel, you iterate one final time with a 0 input.  So, in my case to compute the signal energy, the factor I use is actually 2/(N+1), not 2/N.  That last sample does not affect the time domain energy since it's just a summation of the samples squared.  However, if I were to use N instead of N+1, it would only make the purity greater.  And, I don't know is skipping the last iteration with 0 input is valid for my applicaiton.

[ - ] Aaron45, are you using 64-bit numerical values in your computations? Also, what is the nature of your input test signal?

[ - ] Hi Rick,

The inputs are all 16 bit samples, and all subsequent computations are done with doubles.  Input signals are very often pure sinusoids with some low background noise, besides quantization.

I'm starting to think that I should start generating some simulated ideal inputs to see if I can characterize when purity goes greater than 1.

I've also started to ponder how using a DTFT is affecting things.  With an FFT with discrete bins, and the 0 and fs/2 bins requiring special treatment for for Parseval to hold true, what might that mean for DTFT where there aren't necessarily discrete bins?

Aaron.

[ - ] Hi Aaron45. I see what's happening here!

The version of Parseval's Theorem that applies to digital signal processing is:

Σ(x[n])^2 = 1/N*Σ(|X[m]|)^2         (1)

where |X[m]| is the N spectral magnitude samples computed using an N-point DFT (or FFT). The above Equation (1) says that the total time-domain energy will be equal to the total spectral magnitude energy divided by N.

Your computation of the value you call "Purity" will only be correct if: (1) the single input sinusoid's frequency is exactly at a DFT bin center (no spectral leakage), and (2) the input sinusoid is noise free.

So, ...for a noise-contaminated signal you must compute an N-point DFT, compute the |X[m]| magnitude sequence, and then use the quantities in my above Equation (1). And then your Purity value will always be unity.

SIDE NOTE: Aaron45, the Wiki page for the Goertzel algorithm states that poles on the z-plane's unit circle make the Goertzel algorithm "marginally stable and vulnerable to numerical-error accumulation". That widely-believed statement is NOT true. The Goertzel algorithm is always guaranteed stable!

[ - ] Thank you for this insight, Rick.  One reason I use the generalized Goertzel (a DTFT and not DFT) is so I don't have to worry about bin centering.  But I think you're hinting that the use of the DTFT is a reason purity does not seem to work.  I have some ideas for an experiment to see how the DTFT affects purity - stay tuned :)

On Goertzel, I agree with you that the marginally stable part is wrong .. I've read your excellent article on that matter.  The part on numerical-error accumulation correct.  Not that I really needed to given the amount of literature on the matter, but I independently confirmed it: https://www.dsprelated.com/thread/767/error-growth...

[ - ] Hi Aaron45. I think of the numerical-error accumulation as being caused by the fact that we cannot place the Goertzel network's poles *exactly* at the desired angles on the z-plane's unit circle due to quantized coefficients. With quantized network coefficients the poles will always be located on the unit circle, but at angles that are very slightly different than what we desire. (By the way, when the desired pole angles are 0, ±pi/2, or pi radians then no coefficient quantization error occurs.)

Aaron45, my previous November 20, 2020 "Reply" was based on my assumption that in your expression:

"DTFT energy" is (2/N)*(DTFT magnitude)^2

the term "DTFT magnitude" was a single number. Was my assumption correct?

[ - ]  