DSPRelated.com
Forums

stft vs fft resolution

Started by Adam Chapman January 3, 2009
Does a short-time fourier transform have greater frequency resolution
that a normal fft?

the code at
http://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
would suggest not since it's only using a normal fft on a frame of
data rather than the whole dataset.
On Jan 4, 12:30�pm, Adam Chapman
<adam.chap...@student.manchester.ac.uk> wrote:
> Does a short-time fourier transform have greater frequency resolution > that a normal fft? > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html > would suggest not since it's only using a normal fft on a frame of > data rather than the whole dataset.
Resolution is sampling freq/Number of points for a normal FFT. Making it short-time only makes it have the ability to track time-varying components more accurately. H.
On 4 Jan, 00:30, Adam Chapman <adam.chap...@student.manchester.ac.uk>
wrote:
> Does a short-time fourier transform have greater frequency resolution > that a normal fft?
No. The DFT produces a set of coefficients equidistant in frequency domain, that are spaced Fs/N frequency units apart. So since the STFT contains fewer samples than the full DFT of the same data, the frequency resolution is in fact coarser. However, the full DFT contains no time information. If you are looking at a dynamic signal, say a cirp, this fact will not be visile in the DFT. By computing a number of shorter DFTs, based on data frames slightly offset in time, one might track the dynamic evolution of the signal. So the STFT trades frequency resolution for time evolution. Rune
On Jan 3, 6:30&#4294967295;pm, Adam Chapman
<adam.chap...@student.manchester.ac.uk> wrote:
> Does a short-time fourier transform have greater frequency resolution > that a normal fft?
now, while i don't disagree strictly with Hardy and Rune, i will say this: with adjacent frames in the STFT, you can get greater frequency resolution (better than Nyquist/(N/2)) by observing the change in phase of a spectral peak from one frame to the next. this is key in Jean Laroche's phase vocoder implementation (Jean is well known to us audio heads). it's important to estimate the frequency to a better resolution than what you get with N points in an FFT.
> the code at http://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
i noticed what i thought is a small mistake in Julius's MATLAB snippet. i think it should be: ____________________________________ M = length(w); % M = (even) window length, N = FFT length Mo2 = M/2; nframes = ceil(length(x)/R); % R = STFT hop-size x = [x; zeros(R+M-1,1)]; % make sure we don't run offa de end s = linspace(0,j*2*pi*(N-1)/N,N)'; Xtwz = zeros(N,nframes); % pre-allocate STFT output array zp = zeros(N-M,1); % zero padding (to be inserted) xoff = 0; % current offset in input signal x for m=1:nframes xt = x(xoff+1:xoff+M); % extract frame of input data xtw = w .* xt; % apply window to current frame xtwz = [xtw(Mo2+1:M); zp; xtw(1:Mo2)]; % windowed, zero padded Xtwz(:,m) = exp(-xoff*s) .* fft(xtwz); % STFT frame m, compensate neg delay xoff = xoff + R; % advance in-pointer by hop-size R end ____________________________________ note the "exp(-xoff*s)" compensation for phase. i think that is necessary. r b-j
On Jan 3, 3:30 pm, Adam Chapman
<adam.chap...@student.manchester.ac.uk> wrote:
> Does a short-time fourier transform have greater frequency resolution > that a normal fft? > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html > would suggest not since it's only using a normal fft on a frame of > data rather than the whole dataset.
Adam To keep us all talking about the same things, it may be useful to separate spectrum analysis schemes into three parts: 1) data blocking 2) transform (including transform size) 3) post processing The data blocking consists of operations to prepare sets of data for transformation. This may include weighting (windowing), folding, zero extension, overlap and other functions. The input data blocks may be larger, smaller or the same size as the transform. The data blocks may be adjacent, overlapped, or have gaps between them. The transform takes real or complex time domain samples and produces frequency domain samples, usually complex. Post processing may consist of power spectrum conversion, simply passing on the complex coefficients or many other operations such as averaging multiple outputs. With respect to the transform portion, the fft is an algorithm that usually provides an efficient means to calculate the STFT. The reference you cite happens to include zero sampling as part of the data blocking performed before the transform. Any question of 'frequency resolution' in comp.dsp needs a definition of which 'frequency resolution' you mean. a) If you mean 'frequency difference between centers of adjacent bins', that is determined by transform size and sampling frequency, not any of the data blocking functions like zero filling. b) If you mean the 'window variance' or 'effective noise bandwidth' definition of resolution, it is a function of the window length and weights. c) If your definition relates to 'variance of the estimation of an output parameter', you need to define the parameter; then it will depend on the data blocking, the transform, the post processing algorithm and the signal to noise ratio of the input signal. d) If your definition of resolution relates to 'resolving two tones' then you must specify the post processing functions you will use to decide when there are two tones and any other conditions, thresholds or probabilities for your definition. Few OPs are looking for this analysis. Unfortunately, even experienced responders on comp.dsp will make their own best guess about the OP's meaning and reply without stating their assumptions. If you get multiple unrelated responses, this may be why. HardySpicer seems to have made the a) answer. Rune seems to discuss effects of data blocking on the interpretation of your question. robert seems to have suggested the advantages of a post processing scheme. So, Adam, which if any of these did you mean? Would you like to state a more specific question? Good luck! Dale B. Dalrymple http://dbdimages.com
On Jan 5, 7:06&#4294967295;am, dbd <d...@ieee.org> wrote:
> On Jan 3, 3:30 pm, Adam Chapman > > <adam.chap...@student.manchester.ac.uk> wrote: > > Does a short-time fourier transform have greater frequency resolution > > that a normal fft? > > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html > > would suggest not since it's only using a normal fft on a frame of > > data rather than the whole dataset. > > Adam > > To keep us all talking about the same things, it may be useful to > separate spectrum analysis schemes into three parts: > 1) data blocking > 2) transform (including transform size) > 3) post processing > > The data blocking consists of operations to prepare sets of data for > transformation. This may include weighting (windowing), folding, zero > extension, overlap and other functions. The input data blocks may be > larger, smaller or the same size as the transform. The data blocks may > be adjacent, overlapped, or have gaps between them. > > The transform takes real or complex time domain samples and produces > frequency domain samples, usually complex. > > Post processing may consist of power spectrum conversion, simply > passing on the complex coefficients or many other operations such as > averaging multiple outputs. > > With respect to the transform portion, the fft is an algorithm that > usually provides an efficient means to calculate the STFT. The > reference you cite happens to include zero sampling as part of the > data blocking performed before the transform. > > Any question of 'frequency resolution' in comp.dsp needs a definition > of which 'frequency resolution' you mean. > > a) If you mean 'frequency difference between centers of adjacent > bins', that is determined by transform size and sampling frequency, > not any of the data blocking functions like zero filling. > > b) If you mean the 'window variance' or 'effective noise bandwidth' > definition of resolution, it is a function of the window length and > weights. > > c) If your definition relates to 'variance of the estimation of an > output parameter', you need to define the parameter; then it will > depend on the data blocking, the transform, the post processing > algorithm and the signal to noise ratio of the input signal. > > d) If your definition of resolution relates to 'resolving two tones' > then you must specify the post processing functions you will use to > decide when there are two tones and any other conditions, thresholds > or probabilities for your definition. > > Few OPs are looking for this analysis. Unfortunately, even experienced > responders on comp.dsp will make their own best guess about the OP's > meaning and reply without stating their assumptions. If you get > multiple unrelated responses, this may be why. > > HardySpicer seems to have made the a) answer. > Rune seems to discuss effects of data blocking on the interpretation > of your question. > robert seems to have suggested the advantages of a post processing > scheme. > > So, Adam, which if any of these did you mean? Would you like to state > a more specific question? > > Good luck! > > Dale B. Dalrymplehttp://dbdimages.com
With post-processing you can of course average successive FFT and get a smooth result for noisy transforms. Smoothness in my book however does not imply greater resolution, just the ability to read the result better with less noise present. Averaging is after all just integrating of a sort and hence a low-pass filter. I often use for the frequency vector X (conjugate X*) S(i)=betaS(i-1) + (1-beta)XX* where S(i) is the periodogram at frame i and beta (<1) is a forgetting factor. This is just a low-pass filter of course. I think this is known as the Welch method. H.
On Jan 4, 3:37&#4294967295;pm, HardySpicer <gyansor...@gmail.com> wrote:
> On Jan 5, 7:06&#4294967295;am, dbd <d...@ieee.org> wrote: > > > > > On Jan 3, 3:30 pm, Adam Chapman > > > <adam.chap...@student.manchester.ac.uk> wrote: > > > Does a short-time fourier transform have greater frequency resolution > > > that a normal fft? > > > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html > > > would suggest not since it's only using a normal fft on a frame of > > > data rather than the whole dataset. > > > Adam > > > To keep us all talking about the same things, it may be useful to > > separate spectrum analysis schemes into three parts: > > 1) data blocking > > 2) transform (including transform size) > > 3) post processing > > > The data blocking consists of operations to prepare sets of data for > > transformation. This may include weighting (windowing), folding, zero > > extension, overlap and other functions. The input data blocks may be > > larger, smaller or the same size as the transform. The data blocks may > > be adjacent, overlapped, or have gaps between them. > > > The transform takes real or complex time domain samples and produces > > frequency domain samples, usually complex. > > > Post processing may consist of power spectrum conversion, simply > > passing on the complex coefficients or many other operations such as > > averaging multiple outputs. > > > With respect to the transform portion, the fft is an algorithm that > > usually provides an efficient means to calculate the STFT. The > > reference you cite happens to include zero sampling as part of the > > data blocking performed before the transform. > > > Any question of 'frequency resolution' in comp.dsp needs a definition > > of which 'frequency resolution' you mean. > > > a) If you mean 'frequency difference between centers of adjacent > > bins', that is determined by transform size and sampling frequency, > > not any of the data blocking functions like zero filling. > > > b) If you mean the 'window variance' or 'effective noise bandwidth' > > definition of resolution, it is a function of the window length and > > weights. > > > c) If your definition relates to 'variance of the estimation of an > > output parameter', you need to define the parameter; then it will > > depend on the data blocking, the transform, the post processing > > algorithm and the signal to noise ratio of the input signal. > > > d) If your definition of resolution relates to 'resolving two tones' > > then you must specify the post processing functions you will use to > > decide when there are two tones and any other conditions, thresholds > > or probabilities for your definition. > > > Few OPs are looking for this analysis. Unfortunately, even experienced > > responders on comp.dsp will make their own best guess about the OP's > > meaning and reply without stating their assumptions. If you get > > multiple unrelated responses, this may be why. > > > HardySpicer seems to have made the a) answer. > > Rune seems to discuss effects of data blocking on the interpretation > > of your question. > > robert seems to have suggested the advantages of a post processing > > scheme. > > > So, Adam, which if any of these did you mean? Would you like to state > > a more specific question? > > > Good luck! > > > Dale B. Dalrymplehttp://dbdimages.com > > With post-processing you can of course average successive FFT and get > a smooth result for noisy transforms. Smoothness in my book however > does not imply greater resolution, just the ability to read the result > better with less noise present. Averaging is after all just > integrating of a sort and hence a low-pass filter. I often use for the > frequency vector X (conjugate X*) > > S(i)=betaS(i-1) + (1-beta)XX* > > where S(i) is the periodogram at frame i and beta (<1) is a forgetting > factor. This is just a low-pass filter of course. > I think this is known as the Welch method. > > H.
Welch's method uses an equal-weight average of all the squared FFT magnitudes. John
On Jan 4, 3:37&#4294967295;pm, HardySpicer <gyansor...@gmail.com> wrote:
> On Jan 5, 7:06&#4294967295;am, dbd <d...@ieee.org> wrote: > > > > > On Jan 3, 3:30 pm, Adam Chapman > > > <adam.chap...@student.manchester.ac.uk> wrote: > > > Does a short-time fourier transform have greater frequency resolution > > > that a normal fft? > > > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html > > > would suggest not since it's only using a normal fft on a frame of > > > data rather than the whole dataset. > > > Adam > > > To keep us all talking about the same things, it may be useful to > > separate spectrum analysis schemes into three parts: > > 1) data blocking > > 2) transform (including transform size) > > 3) post processing > > > The data blocking consists of operations to prepare sets of data for > > transformation. This may include weighting (windowing), folding, zero > > extension, overlap and other functions. The input data blocks may be > > larger, smaller or the same size as the transform. The data blocks may > > be adjacent, overlapped, or have gaps between them. > > > The transform takes real or complex time domain samples and produces > > frequency domain samples, usually complex. > > > Post processing may consist of power spectrum conversion, simply > > passing on the complex coefficients or many other operations such as > > averaging multiple outputs. > > > With respect to the transform portion, the fft is an algorithm that > > usually provides an efficient means to calculate the STFT. The > > reference you cite happens to include zero sampling as part of the > > data blocking performed before the transform. > > > Any question of 'frequency resolution' in comp.dsp needs a definition > > of which 'frequency resolution' you mean. > > > a) If you mean 'frequency difference between centers of adjacent > > bins', that is determined by transform size and sampling frequency, > > not any of the data blocking functions like zero filling. > > > b) If you mean the 'window variance' or 'effective noise bandwidth' > > definition of resolution, it is a function of the window length and > > weights. > > > c) If your definition relates to 'variance of the estimation of an > > output parameter', you need to define the parameter; then it will > > depend on the data blocking, the transform, the post processing > > algorithm and the signal to noise ratio of the input signal. > > > d) If your definition of resolution relates to 'resolving two tones' > > then you must specify the post processing functions you will use to > > decide when there are two tones and any other conditions, thresholds > > or probabilities for your definition. > > > Few OPs are looking for this analysis. Unfortunately, even experienced > > responders on comp.dsp will make their own best guess about the OP's > > meaning and reply without stating their assumptions. If you get > > multiple unrelated responses, this may be why. > > > HardySpicer seems to have made the a) answer. > > Rune seems to discuss effects of data blocking on the interpretation > > of your question. > > robert seems to have suggested the advantages of a post processing > > scheme. > > > So, Adam, which if any of these did you mean? Would you like to state > > a more specific question? > > > Good luck! > > > Dale B. Dalrymplehttp://dbdimages.com > > With post-processing you can of course average successive FFT and get > a smooth result for noisy transforms. Smoothness in my book however > does not imply greater resolution, just the ability to read the result > better with less noise present. Averaging is after all just > integrating of a sort and hence a low-pass filter. I often use for the > frequency vector X (conjugate X*) > > S(i)=betaS(i-1) + (1-beta)XX* > > where S(i) is the periodogram at frame i and beta (<1) is a forgetting > factor. This is just a low-pass filter of course. > I think this is known as the Welch method. > > H.
That sounds like the "exponential" averaging mode that you might see on your spectrum analyzer. Jason
On Jan 4, 6:06&#4294967295;pm, dbd <d...@ieee.org> wrote:
> On Jan 3, 3:30 pm, Adam Chapman > > <adam.chap...@student.manchester.ac.uk> wrote: > > Does a short-time fourier transform have greater frequency resolution > > that a normal fft? > > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html > > would suggest not since it's only using a normal fft on a frame of > > data rather than the whole dataset. > > Adam > > To keep us all talking about the same things, it may be useful to > separate spectrum analysis schemes into three parts: > 1) data blocking > 2) transform (including transform size) > 3) post processing > > The data blocking consists of operations to prepare sets of data for > transformation. This may include weighting (windowing), folding, zero > extension, overlap and other functions. The input data blocks may be > larger, smaller or the same size as the transform. The data blocks may > be adjacent, overlapped, or have gaps between them. > > The transform takes real or complex time domain samples and produces > frequency domain samples, usually complex. > > Post processing may consist of power spectrum conversion, simply > passing on the complex coefficients or many other operations such as > averaging multiple outputs. > > With respect to the transform portion, the fft is an algorithm that > usually provides an efficient means to calculate the STFT. The > reference you cite happens to include zero sampling as part of the > data blocking performed before the transform. > > Any question of 'frequency resolution' in comp.dsp needs a definition > of which 'frequency resolution' you mean. > > a) If you mean 'frequency difference between centers of adjacent > bins', that is determined by transform size and sampling frequency, > not any of the data blocking functions like zero filling. > > b) If you mean the 'window variance' or 'effective noise bandwidth' > definition of resolution, it is a function of the window length and > weights. > > c) If your definition relates to 'variance of the estimation of an > output parameter', you need to define the parameter; then it will > depend on the data blocking, the transform, the post processing > algorithm and the signal to noise ratio of the input signal. > > d) If your definition of resolution relates to 'resolving two tones' > then you must specify the post processing functions you will use to > decide when there are two tones and any other conditions, thresholds > or probabilities for your definition. > > Few OPs are looking for this analysis. Unfortunately, even experienced > responders on comp.dsp will make their own best guess about the OP's > meaning and reply without stating their assumptions. If you get > multiple unrelated responses, this may be why. > > HardySpicer seems to have made the a) answer. > Rune seems to discuss effects of data blocking on the interpretation > of your question. > robert seems to have suggested the advantages of a post processing > scheme. > > So, Adam, which if any of these did you mean? Would you like to state > a more specific question? > > Good luck! > > Dale B. Dalrymplehttp://dbdimages.com
My question was a) 'frequency difference between centers of adjacent bins' It seems the number of bins in matlab's fft is equal to the size of the Hamming window used when getting data from an audio stream. When you perform an fft on audio, you only use the first half f the result (since it's reflected anyway), but do you assume the highest frequency is now Fs or Fs/2? Wouldn't fft data from frequncies above Nyquist be inaccurate? I'm trying to find the fundamental frequency of voice signals, and though maybe I could get greater stft frequency resolution somehow. Sometimes the method I'm using estimates f0 as the actual 2nd harmonic of the real f0, but I've found a paper that adds extra steps to the REPS algorithm to resolve this. Jean Laroche's ideas look pretty interesting, I'll check them out soon. Thanks for all the responses too :)
On Jan 5, 11:57 am, Adam Chapman
<adam.chap...@student.manchester.ac.uk> wrote:
> On Jan 4, 6:06 pm, dbd <d...@ieee.org> wrote: > > > > > On Jan 3, 3:30 pm, Adam Chapman > > > <adam.chap...@student.manchester.ac.uk> wrote: > > > Does a short-time fourier transform have greater frequency resolution > > > that a normal fft? > > > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html > > > would suggest not since it's only using a normal fft on a frame of > > > data rather than the whole dataset. > > > Adam > > > To keep us all talking about the same things, it may be useful to > > separate spectrum analysis schemes into three parts: > > 1) data blocking > > 2) transform (including transform size) > > 3) post processing > > > The data blocking consists of operations to prepare sets of data for > > transformation. This may include weighting (windowing), folding, zero > > extension, overlap and other functions. The input data blocks may be > > larger, smaller or the same size as the transform. The data blocks may > > be adjacent, overlapped, or have gaps between them. > > > The transform takes real or complex time domain samples and produces > > frequency domain samples, usually complex. > > > Post processing may consist of power spectrum conversion, simply > > passing on the complex coefficients or many other operations such as > > averaging multiple outputs. > > > With respect to the transform portion, the fft is an algorithm that > > usually provides an efficient means to calculate the STFT. The > > reference you cite happens to include zero sampling as part of the > > data blocking performed before the transform. > > > Any question of 'frequency resolution' in comp.dsp needs a definition > > of which 'frequency resolution' you mean. > > > a) If you mean 'frequency difference between centers of adjacent > > bins', that is determined by transform size and sampling frequency, > > not any of the data blocking functions like zero filling. > > > b) If you mean the 'window variance' or 'effective noise bandwidth' > > definition of resolution, it is a function of the window length and > > weights. > > > c) If your definition relates to 'variance of the estimation of an > > output parameter', you need to define the parameter; then it will > > depend on the data blocking, the transform, the post processing > > algorithm and the signal to noise ratio of the input signal. > > > d) If your definition of resolution relates to 'resolving two tones' > > then you must specify the post processing functions you will use to > > decide when there are two tones and any other conditions, thresholds > > or probabilities for your definition. > > > Few OPs are looking for this analysis. Unfortunately, even experienced > > responders on comp.dsp will make their own best guess about the OP's > > meaning and reply without stating their assumptions. If you get > > multiple unrelated responses, this may be why. > > > HardySpicer seems to have made the a) answer. > > Rune seems to discuss effects of data blocking on the interpretation > > of your question. > > robert seems to have suggested the advantages of a post processing > > scheme. > > > So, Adam, which if any of these did you mean? Would you like to state > > a more specific question? > > > Good luck! > > > Dale B. Dalrymplehttp://dbdimages.com > > My question was a) 'frequency difference between centers of adjacent > bins' > > It seems the number of bins in matlab's fft is equal to the size of > the Hamming window used when getting data from an audio stream.
You always have the option of appending zeros to your windowed data set and performing a larger transform than the size of your data. You can also use a smaller fft by splitting your data into blocks that you sum element by element before transforming. This is less common.
> When > you perform an fft on audio, you only use the first half f the result > (since it's reflected anyway), but do you assume the highest frequency > is now Fs or Fs/2?
Fs/2 Wouldn't fft data from frequncies above Nyquist be
> inaccurate? > > I'm trying to find the fundamental frequency of voice signals, and > though maybe I could get greater stft frequency resolution somehow. > Sometimes the method I'm using estimates f0 as the actual 2nd harmonic > of the real f0, but I've found a paper that adds extra steps to the > REPS algorithm to resolve this.
There's no guarantee that the first harmonic is the most energetic on speech and music signals.
> > Jean Laroche's ideas look pretty interesting, I'll check them out > soon. > > Thanks for all the responses too :)
Dale B. Dalrymple