comp.dsp | stft vs fft resolution

Does a short-time fourier transform have greater frequency resolution
that a normal fft?

the code at
http://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
would suggest not since it's only using a normal fft on a frame of
data rather than the whole dataset.

Reply by HardySpicer ●January 3, 20092009-01-03

On Jan 4, 12:30&#4294967295;pm, Adam Chapman
<adam.chap...@student.manchester.ac.uk> wrote:
> Does a short-time fourier transform have greater frequency resolution
> that a normal fft?
>
> the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
> would suggest not since it's only using a normal fft on a frame of
> data rather than the whole dataset.

Resolution is sampling freq/Number of points for a normal FFT. Making
it short-time only makes it have the ability to track time-varying
components more accurately.

H.

Reply by Rune Allnor ●January 3, 20092009-01-03

On 4 Jan, 00:30, Adam Chapman <adam.chap...@student.manchester.ac.uk>
wrote:
> Does a short-time fourier transform have greater frequency resolution
> that a normal fft?

No. The DFT produces a set of coefficients equidistant in
frequency domain, that are spaced Fs/N frequency units apart.
So since the STFT contains fewer samples than the full DFT
of the same data, the frequency resolution is in fact coarser.

However, the full DFT contains no time information. If you
are looking at a dynamic signal, say a cirp, this fact will
not be visile in the DFT. By computing a number of shorter
DFTs, based on data frames slightly offset in time, one might
track the dynamic evolution of the signal.

So the STFT trades frequency resolution for time evolution.

Rune

Reply by robert bristow-johnson ●January 4, 20092009-01-04

On Jan 3, 6:30&#4294967295;pm, Adam Chapman
<adam.chap...@student.manchester.ac.uk> wrote:
> Does a short-time fourier transform have greater frequency resolution
> that a normal fft?

now, while i don't disagree strictly with Hardy and Rune, i will say
this:

with adjacent frames in the STFT, you can get greater frequency
resolution (better than Nyquist/(N/2)) by observing the change in
phase of a spectral peak from one frame to the next.  this is key in
Jean Laroche's phase vocoder implementation (Jean is well known to us
audio heads).  it's important to estimate the frequency to a better
resolution than what you get with N points in an FFT.

> the code at http://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html

i noticed what i thought is a small mistake in Julius's MATLAB
snippet.

i think it should be:

____________________________________

M = length(w);           % M = (even) window length, N = FFT length
Mo2 = M/2;
nframes = ceil(length(x)/R);  % R = STFT hop-size
x = [x; zeros(R+M-1,1)]; % make sure we don't run offa de end
s = linspace(0,j*2*pi*(N-1)/N,N)';
Xtwz = zeros(N,nframes); % pre-allocate STFT output array
zp = zeros(N-M,1);       % zero padding (to be inserted)
xoff = 0;                % current offset in input signal x
for m=1:nframes
   xt = x(xoff+1:xoff+M); % extract frame of input data
   xtw = w .* xt;         % apply window to current frame
   xtwz = [xtw(Mo2+1:M); zp; xtw(1:Mo2)]; % windowed, zero padded
   Xtwz(:,m) = exp(-xoff*s) .* fft(xtwz);  % STFT frame m, compensate
neg delay
   xoff = xoff + R;       % advance in-pointer by hop-size R
end

____________________________________

note the "exp(-xoff*s)" compensation for phase.  i think that is
necessary.

r b-j

Reply by dbd ●January 4, 20092009-01-04

On Jan 3, 3:30 pm, Adam Chapman
<adam.chap...@student.manchester.ac.uk> wrote:
> Does a short-time fourier transform have greater frequency resolution
> that a normal fft?
>
> the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
> would suggest not since it's only using a normal fft on a frame of
> data rather than the whole dataset.

Adam

To keep us all talking about the same things, it may be useful to
separate spectrum analysis schemes into three parts:
1) data blocking
2) transform (including transform size)
3) post processing

The data blocking consists of operations to prepare sets of data for
transformation. This may include weighting (windowing), folding, zero
extension, overlap and other functions. The input data blocks may be
larger, smaller or the same size as the transform. The data blocks may
be adjacent, overlapped, or have gaps between them.

The transform takes real or complex time domain samples and produces
frequency domain samples, usually complex.

Post processing may consist of power spectrum conversion, simply
passing on the complex coefficients or many other operations such as
averaging multiple outputs.

With respect to the transform portion, the fft is an algorithm that
usually provides an efficient means to calculate the STFT. The
reference you cite happens to include zero sampling as part of the
data blocking performed before the transform.

Any question of 'frequency resolution' in comp.dsp needs a definition
of which 'frequency resolution' you mean.

a) If you mean 'frequency difference between centers of adjacent
bins', that is determined by transform size and sampling frequency,
not any of the data blocking functions like zero filling.

b) If you mean the 'window variance' or 'effective noise bandwidth'
definition of resolution, it is a function of the window length and
weights.

c) If your definition relates to 'variance of the estimation of an
output parameter', you need to define the parameter; then it will
depend on the data blocking, the transform, the post processing
algorithm and the signal to noise ratio of the input signal.

d) If your definition of resolution relates to 'resolving two tones'
then you must specify the post processing functions you will use to
decide when there are two tones and any other conditions, thresholds
or probabilities for your definition.

Few OPs are looking for this analysis. Unfortunately, even experienced
responders on comp.dsp will make their own best guess about the OP's
meaning and reply without stating their assumptions. If you get
multiple unrelated responses, this may be why.

HardySpicer seems to have made the a) answer.
Rune seems to discuss effects of data blocking on the interpretation
of your question.
robert seems to have suggested the advantages of a post processing
scheme.

So, Adam, which if any of these did you mean? Would you like to state
a more specific question?

Good luck!

Dale B. Dalrymple
http://dbdimages.com

Reply by HardySpicer ●January 4, 20092009-01-04

On Jan 5, 7:06&#4294967295;am, dbd <d...@ieee.org> wrote:
> On Jan 3, 3:30 pm, Adam Chapman
>
> <adam.chap...@student.manchester.ac.uk> wrote:
> > Does a short-time fourier transform have greater frequency resolution
> > that a normal fft?
>
> > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
> > would suggest not since it's only using a normal fft on a frame of
> > data rather than the whole dataset.
>
> Adam
>
> To keep us all talking about the same things, it may be useful to
> separate spectrum analysis schemes into three parts:
> 1) data blocking
> 2) transform (including transform size)
> 3) post processing
>
> The data blocking consists of operations to prepare sets of data for
> transformation. This may include weighting (windowing), folding, zero
> extension, overlap and other functions. The input data blocks may be
> larger, smaller or the same size as the transform. The data blocks may
> be adjacent, overlapped, or have gaps between them.
>
> The transform takes real or complex time domain samples and produces
> frequency domain samples, usually complex.
>
> Post processing may consist of power spectrum conversion, simply
> passing on the complex coefficients or many other operations such as
> averaging multiple outputs.
>
> With respect to the transform portion, the fft is an algorithm that
> usually provides an efficient means to calculate the STFT. The
> reference you cite happens to include zero sampling as part of the
> data blocking performed before the transform.
>
> Any question of 'frequency resolution' in comp.dsp needs a definition
> of which 'frequency resolution' you mean.
>
> a) If you mean 'frequency difference between centers of adjacent
> bins', that is determined by transform size and sampling frequency,
> not any of the data blocking functions like zero filling.
>
> b) If you mean the 'window variance' or 'effective noise bandwidth'
> definition of resolution, it is a function of the window length and
> weights.
>
> c) If your definition relates to 'variance of the estimation of an
> output parameter', you need to define the parameter; then it will
> depend on the data blocking, the transform, the post processing
> algorithm and the signal to noise ratio of the input signal.
>
> d) If your definition of resolution relates to 'resolving two tones'
> then you must specify the post processing functions you will use to
> decide when there are two tones and any other conditions, thresholds
> or probabilities for your definition.
>
> Few OPs are looking for this analysis. Unfortunately, even experienced
> responders on comp.dsp will make their own best guess about the OP's
> meaning and reply without stating their assumptions. If you get
> multiple unrelated responses, this may be why.
>
> HardySpicer seems to have made the a) answer.
> Rune seems to discuss effects of data blocking on the interpretation
> of your question.
> robert seems to have suggested the advantages of a post processing
> scheme.
>
> So, Adam, which if any of these did you mean? Would you like to state
> a more specific question?
>
> Good luck!
>
> Dale B. Dalrymplehttp://dbdimages.com

With post-processing you can of course average successive FFT and get
a smooth result for noisy transforms. Smoothness in my book however
does not imply greater resolution, just the ability to read the result
better with less noise present. Averaging is after all just
integrating of a sort and hence a low-pass filter. I often use for the
frequency vector X (conjugate X*)

S(i)=betaS(i-1) + (1-beta)XX*

where S(i) is the periodogram at frame i and beta (<1) is a forgetting
factor. This is just a low-pass filter of course.
I think this is known as the Welch method.

H.

Reply by John ●January 4, 20092009-01-04

On Jan 4, 3:37&#4294967295;pm, HardySpicer <gyansor...@gmail.com> wrote:
> On Jan 5, 7:06&#4294967295;am, dbd <d...@ieee.org> wrote:
>
>
>
> > On Jan 3, 3:30 pm, Adam Chapman
>
> > <adam.chap...@student.manchester.ac.uk> wrote:
> > > Does a short-time fourier transform have greater frequency resolution
> > > that a normal fft?
>
> > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
> > > would suggest not since it's only using a normal fft on a frame of
> > > data rather than the whole dataset.
>
> > Adam
>
> > To keep us all talking about the same things, it may be useful to
> > separate spectrum analysis schemes into three parts:
> > 1) data blocking
> > 2) transform (including transform size)
> > 3) post processing
>
> > The data blocking consists of operations to prepare sets of data for
> > transformation. This may include weighting (windowing), folding, zero
> > extension, overlap and other functions. The input data blocks may be
> > larger, smaller or the same size as the transform. The data blocks may
> > be adjacent, overlapped, or have gaps between them.
>
> > The transform takes real or complex time domain samples and produces
> > frequency domain samples, usually complex.
>
> > Post processing may consist of power spectrum conversion, simply
> > passing on the complex coefficients or many other operations such as
> > averaging multiple outputs.
>
> > With respect to the transform portion, the fft is an algorithm that
> > usually provides an efficient means to calculate the STFT. The
> > reference you cite happens to include zero sampling as part of the
> > data blocking performed before the transform.
>
> > Any question of 'frequency resolution' in comp.dsp needs a definition
> > of which 'frequency resolution' you mean.
>
> > a) If you mean 'frequency difference between centers of adjacent
> > bins', that is determined by transform size and sampling frequency,
> > not any of the data blocking functions like zero filling.
>
> > b) If you mean the 'window variance' or 'effective noise bandwidth'
> > definition of resolution, it is a function of the window length and
> > weights.
>
> > c) If your definition relates to 'variance of the estimation of an
> > output parameter', you need to define the parameter; then it will
> > depend on the data blocking, the transform, the post processing
> > algorithm and the signal to noise ratio of the input signal.
>
> > d) If your definition of resolution relates to 'resolving two tones'
> > then you must specify the post processing functions you will use to
> > decide when there are two tones and any other conditions, thresholds
> > or probabilities for your definition.
>
> > Few OPs are looking for this analysis. Unfortunately, even experienced
> > responders on comp.dsp will make their own best guess about the OP's
> > meaning and reply without stating their assumptions. If you get
> > multiple unrelated responses, this may be why.
>
> > HardySpicer seems to have made the a) answer.
> > Rune seems to discuss effects of data blocking on the interpretation
> > of your question.
> > robert seems to have suggested the advantages of a post processing
> > scheme.
>
> > So, Adam, which if any of these did you mean? Would you like to state
> > a more specific question?
>
> > Good luck!
>
> > Dale B. Dalrymplehttp://dbdimages.com
>
> With post-processing you can of course average successive FFT and get
> a smooth result for noisy transforms. Smoothness in my book however
> does not imply greater resolution, just the ability to read the result
> better with less noise present. Averaging is after all just
> integrating of a sort and hence a low-pass filter. I often use for the
> frequency vector X (conjugate X*)
>
> S(i)=betaS(i-1) + (1-beta)XX*
>
> where S(i) is the periodogram at frame i and beta (<1) is a forgetting
> factor. This is just a low-pass filter of course.
> I think this is known as the Welch method.
>
> H.

Welch's method uses an equal-weight average of all the squared FFT
magnitudes.

John

Reply by ●January 4, 20092009-01-04

On Jan 4, 3:37&#4294967295;pm, HardySpicer <gyansor...@gmail.com> wrote:
> On Jan 5, 7:06&#4294967295;am, dbd <d...@ieee.org> wrote:
>
>
>
> > On Jan 3, 3:30 pm, Adam Chapman
>
> > <adam.chap...@student.manchester.ac.uk> wrote:
> > > Does a short-time fourier transform have greater frequency resolution
> > > that a normal fft?
>
> > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
> > > would suggest not since it's only using a normal fft on a frame of
> > > data rather than the whole dataset.
>
> > Adam
>
> > To keep us all talking about the same things, it may be useful to
> > separate spectrum analysis schemes into three parts:
> > 1) data blocking
> > 2) transform (including transform size)
> > 3) post processing
>
> > The data blocking consists of operations to prepare sets of data for
> > transformation. This may include weighting (windowing), folding, zero
> > extension, overlap and other functions. The input data blocks may be
> > larger, smaller or the same size as the transform. The data blocks may
> > be adjacent, overlapped, or have gaps between them.
>
> > The transform takes real or complex time domain samples and produces
> > frequency domain samples, usually complex.
>
> > Post processing may consist of power spectrum conversion, simply
> > passing on the complex coefficients or many other operations such as
> > averaging multiple outputs.
>
> > With respect to the transform portion, the fft is an algorithm that
> > usually provides an efficient means to calculate the STFT. The
> > reference you cite happens to include zero sampling as part of the
> > data blocking performed before the transform.
>
> > Any question of 'frequency resolution' in comp.dsp needs a definition
> > of which 'frequency resolution' you mean.
>
> > a) If you mean 'frequency difference between centers of adjacent
> > bins', that is determined by transform size and sampling frequency,
> > not any of the data blocking functions like zero filling.
>
> > b) If you mean the 'window variance' or 'effective noise bandwidth'
> > definition of resolution, it is a function of the window length and
> > weights.
>
> > c) If your definition relates to 'variance of the estimation of an
> > output parameter', you need to define the parameter; then it will
> > depend on the data blocking, the transform, the post processing
> > algorithm and the signal to noise ratio of the input signal.
>
> > d) If your definition of resolution relates to 'resolving two tones'
> > then you must specify the post processing functions you will use to
> > decide when there are two tones and any other conditions, thresholds
> > or probabilities for your definition.
>
> > Few OPs are looking for this analysis. Unfortunately, even experienced
> > responders on comp.dsp will make their own best guess about the OP's
> > meaning and reply without stating their assumptions. If you get
> > multiple unrelated responses, this may be why.
>
> > HardySpicer seems to have made the a) answer.
> > Rune seems to discuss effects of data blocking on the interpretation
> > of your question.
> > robert seems to have suggested the advantages of a post processing
> > scheme.
>
> > So, Adam, which if any of these did you mean? Would you like to state
> > a more specific question?
>
> > Good luck!
>
> > Dale B. Dalrymplehttp://dbdimages.com
>
> With post-processing you can of course average successive FFT and get
> a smooth result for noisy transforms. Smoothness in my book however
> does not imply greater resolution, just the ability to read the result
> better with less noise present. Averaging is after all just
> integrating of a sort and hence a low-pass filter. I often use for the
> frequency vector X (conjugate X*)
>
> S(i)=betaS(i-1) + (1-beta)XX*
>
> where S(i) is the periodogram at frame i and beta (<1) is a forgetting
> factor. This is just a low-pass filter of course.
> I think this is known as the Welch method.
>
> H.

That sounds like the "exponential" averaging mode that you might see
on your spectrum analyzer.

Jason

Reply by Adam Chapman ●January 5, 20092009-01-05

On Jan 4, 6:06&#4294967295;pm, dbd <d...@ieee.org> wrote:
> On Jan 3, 3:30 pm, Adam Chapman
>
> <adam.chap...@student.manchester.ac.uk> wrote:
> > Does a short-time fourier transform have greater frequency resolution
> > that a normal fft?
>
> > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
> > would suggest not since it's only using a normal fft on a frame of
> > data rather than the whole dataset.
>
> Adam
>
> To keep us all talking about the same things, it may be useful to
> separate spectrum analysis schemes into three parts:
> 1) data blocking
> 2) transform (including transform size)
> 3) post processing
>
> The data blocking consists of operations to prepare sets of data for
> transformation. This may include weighting (windowing), folding, zero
> extension, overlap and other functions. The input data blocks may be
> larger, smaller or the same size as the transform. The data blocks may
> be adjacent, overlapped, or have gaps between them.
>
> The transform takes real or complex time domain samples and produces
> frequency domain samples, usually complex.
>
> Post processing may consist of power spectrum conversion, simply
> passing on the complex coefficients or many other operations such as
> averaging multiple outputs.
>
> With respect to the transform portion, the fft is an algorithm that
> usually provides an efficient means to calculate the STFT. The
> reference you cite happens to include zero sampling as part of the
> data blocking performed before the transform.
>
> Any question of 'frequency resolution' in comp.dsp needs a definition
> of which 'frequency resolution' you mean.
>
> a) If you mean 'frequency difference between centers of adjacent
> bins', that is determined by transform size and sampling frequency,
> not any of the data blocking functions like zero filling.
>
> b) If you mean the 'window variance' or 'effective noise bandwidth'
> definition of resolution, it is a function of the window length and
> weights.
>
> c) If your definition relates to 'variance of the estimation of an
> output parameter', you need to define the parameter; then it will
> depend on the data blocking, the transform, the post processing
> algorithm and the signal to noise ratio of the input signal.
>
> d) If your definition of resolution relates to 'resolving two tones'
> then you must specify the post processing functions you will use to
> decide when there are two tones and any other conditions, thresholds
> or probabilities for your definition.
>
> Few OPs are looking for this analysis. Unfortunately, even experienced
> responders on comp.dsp will make their own best guess about the OP's
> meaning and reply without stating their assumptions. If you get
> multiple unrelated responses, this may be why.
>
> HardySpicer seems to have made the a) answer.
> Rune seems to discuss effects of data blocking on the interpretation
> of your question.
> robert seems to have suggested the advantages of a post processing
> scheme.
>
> So, Adam, which if any of these did you mean? Would you like to state
> a more specific question?
>
> Good luck!
>
> Dale B. Dalrymplehttp://dbdimages.com

My question was  a) 'frequency difference between centers of adjacent
bins'

It seems the number of bins in matlab's fft is equal to the size of
the Hamming window used when getting data from an audio stream. When
you perform an fft on audio, you only use the first half f the result
(since it's reflected anyway), but do you assume the highest frequency
is now Fs or Fs/2? Wouldn't fft data from frequncies above Nyquist be
inaccurate?

I'm trying to find the fundamental frequency of voice signals, and
though maybe I could get greater stft frequency resolution somehow.
Sometimes the method I'm using estimates f0 as the actual 2nd harmonic
of the real f0, but I've found a paper that adds extra steps to the
REPS algorithm to resolve this.

Jean Laroche's ideas look pretty interesting, I'll check them out
soon.

Thanks for all the responses too :)

Reply by dbd ●January 5, 20092009-01-05

On Jan 5, 11:57 am, Adam Chapman
<adam.chap...@student.manchester.ac.uk> wrote:
> On Jan 4, 6:06 pm, dbd <d...@ieee.org> wrote:
>
>
>
> > On Jan 3, 3:30 pm, Adam Chapman
>
> > <adam.chap...@student.manchester.ac.uk> wrote:
> > > Does a short-time fourier transform have greater frequency resolution
> > > that a normal fft?
>
> > > the code athttp://ccrma.stanford.edu/~jos/sasp/STFT_Matlab.html
> > > would suggest not since it's only using a normal fft on a frame of
> > > data rather than the whole dataset.
>
> > Adam
>
> > To keep us all talking about the same things, it may be useful to
> > separate spectrum analysis schemes into three parts:
> > 1) data blocking
> > 2) transform (including transform size)
> > 3) post processing
>
> > The data blocking consists of operations to prepare sets of data for
> > transformation. This may include weighting (windowing), folding, zero
> > extension, overlap and other functions. The input data blocks may be
> > larger, smaller or the same size as the transform. The data blocks may
> > be adjacent, overlapped, or have gaps between them.
>
> > The transform takes real or complex time domain samples and produces
> > frequency domain samples, usually complex.
>
> > Post processing may consist of power spectrum conversion, simply
> > passing on the complex coefficients or many other operations such as
> > averaging multiple outputs.
>
> > With respect to the transform portion, the fft is an algorithm that
> > usually provides an efficient means to calculate the STFT. The
> > reference you cite happens to include zero sampling as part of the
> > data blocking performed before the transform.
>
> > Any question of 'frequency resolution' in comp.dsp needs a definition
> > of which 'frequency resolution' you mean.
>
> > a) If you mean 'frequency difference between centers of adjacent
> > bins', that is determined by transform size and sampling frequency,
> > not any of the data blocking functions like zero filling.
>
> > b) If you mean the 'window variance' or 'effective noise bandwidth'
> > definition of resolution, it is a function of the window length and
> > weights.
>
> > c) If your definition relates to 'variance of the estimation of an
> > output parameter', you need to define the parameter; then it will
> > depend on the data blocking, the transform, the post processing
> > algorithm and the signal to noise ratio of the input signal.
>
> > d) If your definition of resolution relates to 'resolving two tones'
> > then you must specify the post processing functions you will use to
> > decide when there are two tones and any other conditions, thresholds
> > or probabilities for your definition.
>
> > Few OPs are looking for this analysis. Unfortunately, even experienced
> > responders on comp.dsp will make their own best guess about the OP's
> > meaning and reply without stating their assumptions. If you get
> > multiple unrelated responses, this may be why.
>
> > HardySpicer seems to have made the a) answer.
> > Rune seems to discuss effects of data blocking on the interpretation
> > of your question.
> > robert seems to have suggested the advantages of a post processing
> > scheme.
>
> > So, Adam, which if any of these did you mean? Would you like to state
> > a more specific question?
>
> > Good luck!
>
> > Dale B. Dalrymplehttp://dbdimages.com
>
> My question was  a) 'frequency difference between centers of adjacent
> bins'
>
> It seems the number of bins in matlab's fft is equal to the size of
> the Hamming window used when getting data from an audio stream.

You always have the option of appending zeros to your windowed data
set and performing a larger transform than the size of your data. You
can also use a smaller fft by splitting your data into blocks that you
sum element by element before transforming. This is less common.

>                                                                                         When
> you perform an fft on audio, you only use the first half f the result
> (since it's reflected anyway), but do you assume the highest frequency
> is now Fs or Fs/2?

Fs/2

Wouldn't fft data from frequncies above Nyquist be
> inaccurate?
>
> I'm trying to find the fundamental frequency of voice signals, and
> though maybe I could get greater stft frequency resolution somehow.
> Sometimes the method I'm using estimates f0 as the actual 2nd harmonic
> of the real f0, but I've found a paper that adds extra steps to the
> REPS algorithm to resolve this.

There's no guarantee that the first harmonic is the most energetic on
speech and music signals.

>
> Jean Laroche's ideas look pretty interesting, I'll check them out
> soon.
>
> Thanks for all the responses too :)

Dale B. Dalrymple

stft vs fft resolution

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group