FFT of audio signal resolution

Started by vivu91 September 25, 2010
Hi,
I have a doubt which might be very basic. I am currently working on a
project in which I input a test audio signal of 1.5 seconds, of which the
actual audio is only 0.5 seconds or so. The sampling frequency is 44100 Hz.
I am currently using a Hanning window and the pwelch command in MATLAB to
find the frequency spectrum. However, I am using a Hanning window of size
128 and that gives me a resolution of only 173 Hz or so. I would like to
improve the resolution to 1 Hz or so. I am only interested in the
frequencies at which the peaks occur; the actual magnitude is irrelevant.

I thought of increasing the window size but was not sure whether that would
improve the resolution or increase noise.

Any suggestions or help along these lines would be extremely helpful.
I tried reading through a few books, and they all seem to suggest padding
with zeros to improve resolution. The problem for me is that I have a
long enough signal but am not sure whether taking a large window size
improves the resolution or adds noise.

Please help, I need the solution for this as quickly as possible.

Vivek


On 9/25/2010 11:18 AM, vivu91 wrote:
> [...]
You want the window to be as long as the sample sequence unless you want
to analyze a shorter sequence with lower resolution.
The purpose of the window is to reduce spectral spreading, not to reduce
noise as if it were a lowpass or bandpass filter.

I figure that in 1.5 seconds you have 66,150 samples, is that right? So
the window would be 66,150 samples long as well.
The resolution should be 0.67 Hz with this sequence length.
Or, if you have 0.5 seconds of real data and the rest are zeros, then the
apparent resolution is as above and the real resolution would be 2 Hz.

And, you're multiplying the samples by the window in the time domain, yes?

Now, if this resolution is too high / if the number of samples in the
sequence is too great for some reason, then you can take fewer samples
and adjust the resolution. In your context that may be equivalent to
using a shorter window.

I'm not sure of your numbers given:
44,100 Hz has a sample interval of 0.0227 msec
128 samples has a duration of about 0.0029 secs
This corresponds to a resolution of 344 Hz, not 173 Hz (the reciprocal of
the duration).
Anyway, this is the math that relates resolution to duration and you can
adjust as needed or desired.

As a test, try some sequences without the window. Same
duration/resolution numbers as above in rough terms. The more
aggressive the window, the less spread energy in the spectrum and a
little worse resolution.

Note: "resolution" can mean the distance between frequency samples. But
if you have appended zeros to the sequence, then the distance between
frequency samples will be less than the "real" resolution - it will be
over-resolved. You can demonstrate this for yourself by taking a sum of
two sinusoids that are, say, 5% apart in frequency. Use their sum as the
input and sample them at some reasonable rate so that they can be
resolved in frequency (say, sample at 8X the difference in their
frequencies for starters).
Then, zero out half the samples in a contiguous fashion (keeping the
number of total samples the same) and analyze again.
Repeat zeroing out half of the samples remaining - until you can no
longer resolve the two sinusoids. At each step, the frequency
information becomes "smoother" until the two sinusoids in frequency
become one peak. That will demonstrate the point that the apparent
frequency resolution (determined by the total duration NT as 1/NT) is
better than the "real" frequency resolution, which does indeed change as
the nonzero duration is reduced. This is very much like using windows
of different lengths (assuming the window and the sample duration match
in each case).

I hope this helps.

Fred
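
The halving experiment described in this reply can be sketched in a few lines. This is only a minimal sketch in Python (not the MATLAB used in the project), under assumptions that are not from the thread: 256 total samples, two cosines centered on bins 60 and 64, no window, and a hypothetical `dip_ratio` helper that compares the magnitude midway between the tones with the weaker of the two tone bins.

```python
import cmath
import math


def dft_bin(x, k):
    """Return the k-th DFT coefficient of the sequence x (direct sum)."""
    n_total = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * n / n_total)
               for n, v in enumerate(x))


def dip_ratio(n_nonzero, n_total=256, k1=60, k2=64):
    """Magnitude midway between two tones relative to the weaker tone bin.

    Keeps the first n_nonzero samples of a two-tone signal and zeros the
    rest, so the DFT length (and the bin spacing) never changes.
    A value near 0 means two separate peaks; a value near 1 means the
    spectrum no longer dips between the tones.
    """
    x = [(math.cos(2 * math.pi * k1 * n / n_total) +
          math.cos(2 * math.pi * k2 * n / n_total)) if n < n_nonzero else 0.0
         for n in range(n_total)]
    mid = abs(dft_bin(x, (k1 + k2) // 2))
    return mid / min(abs(dft_bin(x, k1)), abs(dft_bin(x, k2)))


# Halve the nonzero stretch repeatedly while the total length stays at 256.
ratios = {n: dip_ratio(n) for n in (256, 128, 64, 32, 16)}
for n, r in ratios.items():
    print(f"{n:4d} nonzero samples of 256: dip ratio {r:.3f}")
```

The total length stays at 256 in every run, so the spacing between frequency samples never changes; only the nonzero duration does. In this sketch the ratio stays near 0 while the nonzero stretch is long and comes out at roughly 1 once that stretch is much shorter than the reciprocal of the tone spacing, which is the "two peaks become one" effect.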
> [...]
Thank you very much for the input. However, I had a doubt as to whether the
resolution is to be calculated using fs or fs/2, since on performing the
FFT we only obtain meaningful values up to fs/2, right? Due to the Nyquist
criterion?

Vivek

vivu91 wrote:

> Hi,
> I have a doubt which might be very basic.

[...]

> Please help, I need the solution for this as quickly as possible.
>
> Vivek

Vivek,

If not for this last demanding phrase, I might have helped you.
Do you have money to pay for the solution as quickly as possible?

VLV

On 9/25/2010 11:02 PM, vivu91 wrote:

> Thank you very much for the input. However, I had a doubt as to whether the
> resolution is to be calculated using fs or fs/2, since on performing the
> FFT we only obtain meaningful values up to fs/2, right? Due to the Nyquist
> criterion?
>
> Vivek

Perhaps more than you need, but just to put things in context:

Often we normalize on one or more of the parameters to make the
discussion and some of the analysis independent of the absolute values
of time and frequency. Then, when the numbers get "real", we put the
unnormalized values to those parameters. Then we can talk about "40% of
fs" and so forth, and fs/2 might be simply "0.5".

I think of the parameters as:

N - the number of samples in both time and frequency if one is doing
DFTs or FFTs.
T - the time interval between temporal samples. (Uncommonly, one could
define an "F" as the frequency interval between spectral samples, but we
are usually OK with 1/NT.) When normalized, T=1 is a very typical
approach.
fs - the sample rate = 1/T. So, if T=1 then fs=1 as well in a normalized
case.

Then:
- NT is the time duration
- 1/NT = fs/N is the frequency sample interval ... "resolution"

So, that's the answer to your question. Note that none of this has much
to do with Nyquist or "meaningful". That's another consideration of
course, but not one that affects these definitions.

As before, if you take a situation:
fs = 1/T
N samples
So the duration in time is NT,
then the "frequency sample interval / resolution" is 1/NT.

Now, let's append N zeros in time to get 2N samples. Now we get:
fs = 1/T as before
2N samples, now doubled
So the duration in time is 2NT,
and the "frequency sample interval / resolution" is 1/(2NT).

BUT... because, by adding the zeros, we still have but a duration of NT
seconds of nonzero signal values, we still have a resolution of 1/NT Hz
from the perspective of information. The frequency samples have been
interpolated but without adding any "new" information - thus no
improvement in real resolution. Nonetheless, this can be a handy thing
to do if one is wanting to visualize or plot something like the
reconstructed spectrum. And, in fact, that can be "information" to
*you*. An example might be the DFT of a rectangular window. What are
its spectral spreading properties?

I hope this helps.

Fred

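
To put numbers on the definitions in this reply, here is a small sketch using the figures from the thread (fs = 44100, a 128-sample window, a 1.5 s record holding about 0.5 s of audio). The helper name `freq_sample_interval` is made up for illustration, and Python is used here only as a calculator.

```python
def freq_sample_interval(fs_hz, n_samples):
    """Spacing between DFT frequency samples in Hz: fs/N, i.e. 1/(N*T)."""
    return fs_hz / n_samples


fs = 44100.0  # sample rate from the original post, Hz

# 128-sample window: spacing from fs/N, next to the fs/2 variant asked about.
spacing_128 = freq_sample_interval(fs, 128)        # 344.53125 Hz
half_rate_128 = freq_sample_interval(fs / 2, 128)  # 172.265625 Hz

# Whole 1.5 s record (66,150 samples) with only 0.5 s (22,050 samples)
# of nonzero signal: the zeros shrink the spacing but add no information.
apparent = freq_sample_interval(fs, 66150)  # about 0.667 Hz
real = freq_sample_interval(fs, 22050)      # 2.0 Hz

print(f"128 samples: fs/N = {spacing_128:.2f} Hz, (fs/2)/N = {half_rate_128:.2f} Hz")
print(f"1.5 s record: apparent {apparent:.3f} Hz, real {real:.1f} Hz")
```

The (fs/2)/N value lands close to the 173 Hz quoted in the original post, which suggests (but does not confirm) that fs/2 was used there; the definition in this reply uses fs/N.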
On Sun, 26 Sep 2010 10:54:13 -0500, Vladimir Vassilevsky
<nospam@nowhere.com> wrote:

>vivu91 wrote:
>
>> Hi,
>> I have a doubt which might be very basic.
>
>[...]
>
>> Please help, I need the solution for this as quickly as possible.
>>
>> Vivek
>
>If not for this last demanding phrase, I might have helped you.
>Do you have money to pay for the solution as quickly as possible?
>
>VLV

Hi Vladimir,
Ha ha. I agree with you.

What's the Russian word for "manners"?

No need to answer, I couldn't pronounce the
answer anyway.

[-Rick-]

vivu91 <vivu91@n_o_s_p_a_m.gmail.com> wrote:
(snip)

> Thank you very much for the input. However, I had a doubt as to whether the
> resolution is to be calculated using fs or fs/2, since on performing the
> FFT we only obtain meaningful values up to fs/2, right? Due to the Nyquist
> criterion?

There is a very mysterious 2 that keeps floating around these
calculations.

OK, start with the DST and DCT (sine and cosine transform).

The sine transform (continuous or discrete) has the boundary condition
that the function, f, goes to zero at the boundary. That implies that
the basis functions (those being summed to generate f) also go to
zero at the boundary. Sine goes to zero every half cycle.
So, for a given transform length, T, with f(0)=f(T)=0, a sine with an
integer number of half cycles will match the boundary conditions.
Because of that half, the Nth basis function, for an N point
transform, will have frequency (about) Fs/2.

The DCT (cosine transform) works in a similar way, with the
derivative going to zero at the boundary: f'(0)=f'(T)=0.
Again, multiples of one half cycle fit the boundary conditions,
and the highest frequency is (about) Fs/2.

For the DFT (FFT) it gets more interesting. Periodic boundary
conditions f(0)=f(T), f'(0)=f'(T) allow for only whole cycles,
but either sine or cosine. One can use as basis functions
sines and cosines from 0 up to (about) Fs. It is more usual,
though, to use those from -Fs/2 up to Fs/2. Now it seems
to happily satisfy Nyquist, but where do those negative
frequencies come from? Is it all a trick to get a factor of
two where there is no other reason for it?

Since cos(x)=cos(-x), a positive frequency looks exactly
the same as a negative frequency. For complex functions,
you need them to allow for the imaginary part, but not
for real functions. So, as before, only up to Fs/2.

The actual basis functions for the DFT are exp(iwt), and
again, for a real transform, frequencies from -Fs/2 to Fs/2
are needed, depending on the phase.

-- glen

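
The point about negative frequencies adding nothing new for real input can be checked numerically. The following is a minimal sketch, assuming an 8-point real test signal chosen purely for illustration (a cosine in bin 1 plus a half-amplitude sine in bin 3) and a direct-sum `dft` helper written for this example rather than a library FFT.

```python
import cmath
import math


def dft(x):
    """Direct-sum DFT of the sequence x, returning all len(x) coefficients."""
    n_total = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * n / n_total)
                for n, v in enumerate(x))
            for k in range(n_total)]


N = 8
# Real-valued test signal: cosine in bin 1 plus a weaker sine in bin 3.
x = [math.cos(2 * math.pi * 1 * n / N) + 0.5 * math.sin(2 * math.pi * 3 * n / N)
     for n in range(N)]
X = dft(x)

# Bins above N/2 correspond to negative frequencies (bin k maps to k - N).
# For real input each one is the complex conjugate of its positive partner.
mirror_ok = all(abs(X[N - k] - X[k].conjugate()) < 1e-9 for k in range(1, N))

for k in range(N):
    print(f"bin {k}: |X| = {abs(X[k]):.3f}")
print("X[N-k] == conj(X[k]) for all k:", mirror_ok)
```

Because the upper half of the bins mirrors the lower half, only bins 0 through N/2 carry independent information for a real signal, which is where the Fs/2 limit comes from. The spacing between bins is still Fs/N.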
Hey, thank you so much for all the answers. And I am sorry, it was kind of
wrong to add the bit about needing the answer as quickly as possible. I
apologize. The problem was that I was working on the project and fell sick,
so I was lagging behind on the deadline. Sorry, it was still wrong.

Anyway, thank you so much for the responses.

"Rick Lyons" <R.Lyons@_BOGUS_ieee.org> wrote in message 
news:leqv96l1pgqif3pqgq8rebsj6q142m4952@4ax.com...
> On Sun, 26 Sep 2010 10:54:13 -0500, Vladimir Vassilevsky
> <nospam@nowhere.com> wrote:
>
>>vivu91 wrote:
>>
>>> Hi,
>>> I have a doubt which might be very basic.
>>
>>[...]
>>
>>> Please help, I need the solution for this as quickly as possible.
>>>
>>> Vivek
>>
>>If not for this last demanding phrase, I might have helped you.
>>Do you have money to pay for the solution as quickly as possible?
>>
>>VLV
>
> Hi Vladimir,
> Ha ha. I agree with you.
>
> What's the Russian word for "manners"?
>
> No need to answer, I couldn't pronounce the
> answer anyway.
>
> [-Rick-]

He just sounded a little desperate to me, and he did say 'please.'

Calling other people stupid, or often replying with money demands, is an
example of good manners, I suppose?

This is not called 'pro.comp.dsp' - if you don't want to discuss with or
help people, then simply don't reply. Thanks.

Rick Lyons wrote:
> On Sun, 26 Sep 2010 10:54:13 -0500, Vladimir Vassilevsky
> <nospam@nowhere.com> wrote:
>
>>vivu91 wrote:
>>
>>>Hi,
>>>I have a doubt which might be very basic.
>>
>>[...]
>>
>>>Please help, I need the solution for this as quickly as possible.
>>>
>>>Vivek
>>
>>If not for this last demanding phrase, I might have helped you.
>>Do you have money to pay for the solution as quickly as possible?
>
> Hi Vladimir,
> Ha ha. I agree with you.
>
> What's the Russian word for "manners"?
> No need to answer, I couldn't pronounce the
> answer anyway.

Same as in English. I think it is originally a French word.

VLV