DSPRelated.com
Forums

What is the result of FFT exactly?

Started by Rock Lobster June 22, 2007
Hello,

I'm rather new to DSP stuff, so I've got some little questions.

At the moment, I coded a little audio player which transforms the audio
data to a spectrum using the FFT and then back to sample data before
playing it back.

So far, so good. And the way I understand it, the result of the FFT is
an array of amplitudes for every possible sine wave. But how should
those numbers be interpreted? Are these linear values?

To experiment a little, I multiplied each value by a slowly increasing
factor to create a fade-in (starting from 0.0f and then increasing by
0.005f each cycle). The fade seems to be linear (to my ears at least),
so I assume the amplitudes are linear as well?

There's another thing that's weird to me: as soon as the factor gets
bigger than 1.0f, the song starts clipping immediately. Then again, if I
increase the volume with an audio tool to 120%, there's no clipping (even
though I enabled the option "allow clipping"). Since I increase EVERY
single value of my result array (and of course all of them by multiplying
with the same value), I expect it to come out just as perfect as with the
audio tool, but it won't.

And the third thing that I don't understand is the following: I picked
out a little frequency band and left the values inside it unchanged, but
I changed the rest of the signal to zero. Looking at my sonogram, I'd
expect to see a little colorful stripe while the rest is pitch-black.
The stripe is there in fact, but the rest isn't black; it ranges from
dark purple to dark blue. Why is there any signal when I set all the
amplitudes to zero?

I hope my questions are understandable :)
Thank you in advance

Chris


On 22 Jun, 12:58, "Rock Lobster" <e...@christian-gleinser.de> wrote:
> ...
> So far, so good. And the way I understand, the result of the FFT is an
> array of amplitudes for every possible sine wave.
Not every *possible* sine wave, but close enough... by selecting the
parameters of the FFT one can tune the resolution of the spectrum, i.e.
how many sines are computed inside a given bandwidth. But leave that for
now.
> But how should those numbers be interpreted? Are these linear values?
Eh, yes, inasmuch as the FFT is linear. Or do you mean "dB" versus a
linear scale? What comes out of the FFT is linear. There may be
conversions to a logarithmic dB scale between the FFT and the display,
though.

Rune
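Rune's distinction (the FFT output itself is linear; dB only appears if you convert for display) can be sketched in a few lines of plain Python. This uses a naive DFT rather than a real FFT library, and the 8-point cosine is just an illustrative signal, not anything from the thread:

```python
import cmath, math

def dft(x):
    """Naive DFT; fine for illustrating scale, far too slow for real audio."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)) for k in range(N)]

N = 8
x = [math.cos(2 * math.pi * n / N) for n in range(N)]  # amplitude-1.0 cosine
X = dft(x)

linear_mag = abs(X[1])                # linear amplitude (N/2 = 4.0 here)
db_mag = 20 * math.log10(linear_mag)  # dB only exists after this conversion

# Linearity: doubling the input doubles the linear magnitude (adds ~6 dB).
X2 = dft([2.0 * s for s in x])
assert abs(abs(X2[1]) - 2.0 * linear_mag) < 1e-9
```

Whether your display looks linear or logarithmic depends entirely on whether a conversion like `db_mag` sits between the FFT and the screen.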
"Rock Lobster" <email@christian-gleinser.de> wrote in message 
news:ncudnRVCq_GmNubbnZ2dnUVZ_ompnZ2d@giganews.com...
...............

> To experiment a little, I multiplied each value with a slowly increasing
> factor to create a fade in (starting from 0.0f and then increasing by
> 0.005f each cycle). The fade seems to be linear (to my ears at least), so
> I assume, the amplitudes are linear as well?
>
> There's another thing that's weird to me: as soon as the factor gets
> bigger than 1.0f, the song starts clipping immediately.
***This isn't clear to me because you don't say if the multiplication is on the time samples or the frequency samples. And, what is "f"?
> Then again, if I increase the volume with an audio tool to 120%, there's
> no clipping (even though I enabled the option "allow clipping"). Since I
> increase EVERY single value of my result array (and of course all of them
> by multiplying with the same value), I expect it to come out just as
> perfect as with the audio tool, but it won't.
***The clipping suggests that you've increased the signal amplitude considerably - but since the process is unclear, one couldn't say why. The 120% scaling suggests that there remains adequate dynamic range to do that without much clipping.
> And the third thing that I don't understand is the following: I picked out
> a little frequency band and left the values inside it unchanged, but I
> changed the rest of the signal to zero. Looking at my sonagram, I'd expect
> to have a little colorful stripe while the rest being pitch-black. The
> stripe is there in fact, but the rest isn't black, it's ranging from dark
> purple to dark blue. Why is there any signal, when I set all the
> amplitudes to zero?
You don't say how you "picked out" the band. If you zeroed it in the
frequency domain, and you're also plotting in the frequency domain, then
zeros should stay zeros, shouldn't they? How colors are assigned is just
a detail.

Fred
On Jun 22, 3:58 am, "Rock Lobster" <e...@christian-gleinser.de> wrote:
> ...
> So far, so good. And the way I understand, the result of the FFT is an
> array of amplitudes for every possible sine wave.
Not every possible sine wave, but only sinusoids (consisting of a mix of
sine waves and cosine waves) whose periods are exact submultiples of the
FFT width. There are only a finite number of these sinusoid frequencies
exactly represented in the result (let's call them "bin" frequencies).

So what happens to the periodic waveforms that lie between these bin
frequencies? They get broken up into components and splattered all over
the FFT result (not just the closest bin). So if you play with only one
bin frequency, you are only playing with a fractional portion of some
sine wave (except in the rare case when that sinusoid's period is an
exact match with the FFT width).

That's why when you zero a range of bins, you don't zero all the
frequencies in that range: the portions of those frequencies which are
not exactly centered in those bins are splattered elsewhere, and thus
will still show up in the result. You will also end up munging
frequencies well away from those bins, since portions of them will be
splattered into the modified bins.

IMHO. YMMV.

--
rhn A.T nicholson d.0.t C-o-M
http://www.nicholson.com/rhn/dsp.html
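Ron's bin-frequency point is easy to demonstrate numerically: a sinusoid whose period exactly divides the FFT width lands in a single bin pair, while one between bins splatters into essentially every bin. A rough sketch with a naive plain-Python DFT (the frequencies 5.0 and 5.5 cycles per frame are arbitrary examples):

```python
import cmath, math

def dft(x):
    """Naive DFT, adequate for a 64-point demonstration."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)) for k in range(N)]

N = 64
on_bin = [math.sin(2 * math.pi * 5.0 * n / N) for n in range(N)]   # period divides N
off_bin = [math.sin(2 * math.pi * 5.5 * n / N) for n in range(N)]  # between bins 5 and 6

def significant_bins(X, thresh=1e-6):
    """Count bins whose magnitude is above a small numerical threshold."""
    return sum(1 for c in X if abs(c) > thresh)

on_count = significant_bins(dft(on_bin))    # 2: only bins 5 and N-5
off_count = significant_bins(dft(off_bin))  # far more: energy splatters everywhere
```

The on-bin sine occupies exactly one positive/negative frequency pair; the half-bin-offset sine leaves measurable energy in every bin, which is the "splatter" that survives when you try to zero a band.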
"Ron N." <rhnlogic@yahoo.com> wrote in message 
news:1182529924.762171.13430@z28g2000prd.googlegroups.com...
> ...
> That's why when you zero a range of bins, you don't zero all
> the frequencies in that range, since portions of those
> frequencies which are not exactly centered in those bins is
> splattered elsewhere, and thus will still show up in the
> result. You will also end up munging frequencies well away
> from those bins, since portions of them will be spattered into
> the modified bins.
Ron,

I think I know what you're referring to, but zeroed samples are zeros
nonetheless, and there should be no contribution to the other samples
from zeroing. Now, if you want to talk about what splattering happens in
the time domain as a result, then yes.

Fred
On Jun 24, 1:48 pm, "Fred Marshall" <fmarshallx@remove_the_x.acm.org>
wrote:
> "Ron N." <rhnlo...@yahoo.com> wrote in message > news:1182529924.762171.13430@z28g2000prd.googlegroups.com...
...
> ...
> Ron,
>
> I think I know what you're referring to but zeroed samples are zeroes
> nonetheless. And there should be no contribution to the other samples by
> zeroing. Now, if you want to talk about what splattering happens in the
> time domain as a result, then yes.
If you zero a sample either in the time domain or the frequency domain,
you might not change the value of the continuous time or spectrum
waveform as it passes through the other sample points, but you could
cause that time-domain or spectral waveform to bounce around wildly
between sample points, perhaps even at places far removed from the
one(s) you've zeroed.

The "splattering" (window convolution) happens almost identically in
both the time and frequency domains (unless you happen to have a DFT/FFT
aperture exactly synchronized to the waveform periodicities).

IMHO. YMMV.

--
rhn A.T nicholson d.0.t C-o-M
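A minimal sketch of that effect, again with a naive plain-Python DFT/IDFT (the off-bin test signal and the choice of which bin pair to zero are arbitrary): zeroing a single bin pair of an off-bin sine and transforming back changes essentially every time-domain sample, not just a few.

```python
import cmath, math

def dft(x, sign=-1):
    """Naive DFT (sign=-1) or un-normalized inverse kernel (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    """Inverse DFT with the 1/N normalization."""
    return [c / len(X) for c in dft(X, sign=+1)]

N = 32
x = [math.sin(2 * math.pi * 3.3 * n / N) for n in range(N)]  # off-bin sine
X = dft(x)
X[7] = 0.0      # zero one bin...
X[N - 7] = 0.0  # ...and its mirror, so the inverse transform stays real
y = idft(X)

# Nearly every time-domain sample moves, not just samples near some edge:
changed = sum(1 for a, b in zip(x, y) if abs(a - b.real) > 1e-9)
```

Removing one bin pair subtracts a full-length sinusoid from the frame, so the change is spread across all 32 samples.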
First of all, thanks for your answers!

Well, since I posted the thread, I experimented a little more, and now I
managed to build a little equalizer which uses Gaussian curves, and the
clipping was reduced a little.


To clarify my questions a bit:
1) The main question was whether the frequencies in my output array are
linear (which should mean that multiplying them by 3 should give 300%
volume), or logarithmic (dB-like, though I'm not too familiar with
that)? Now I assume it's the latter, judging from my equalizing
experiment, but I'm not quite sure, since from my first experiments I
thought they were linear.

2) The clipping occurred when I multiplied the frequency values, not the
sample points. I programmed a loop that multiplied each frequency band
value by a factor that slowly increased by 0.0005f (the f simply stands
for float), and once the factor was bigger than 1.0f (which would mean
100% (original) volume), I got massive clipping.

3) For the zeroing thing, I divided my frequency array into six parts.
Let's say the whole array carries 4096 floats; then I first divided it
into two parts with 2048 floats each, since the FFT result is symmetric.
Then each of those parts was again divided into three parts, of which
one was left completely unchanged (let's say the floats from 128 to 256
and their corresponding ones in the other part, from 3840 to 3968), and
the other parts' float values were simply set to zero. That way, I saw a
nice thin frequency band in a sonogram view, but the rest wasn't black;
it was purple and blue (meaning there's very low activity), which wasn't
what I expected.

I hope now it's a little bit clearer. The most important part for me would
be 1), but I'd be glad if someone could explain the other two as well.

Since I want to further develop my equalizer, it's actually important to
know if the frequency values are linear or logarithmic, so that my
parameters do not change their characteristics once the master volume is
changed.

Thank you very much!
On Jun 24, 11:07 pm, "Rock Lobster" <e...@christian-gleinser.de>
wrote:
> ...
> Since I want to further develop my equalizer, it's actually important to
> know if the frequency values are linear or logarithmic, so that my
> parameters do not change their characteristics once the master volume is
> changed.
Common FFT code is linear in both the time and frequency domains. Human
hearing is closer to logarithmic in response.

You should end up with two FFT result arrays, real and imaginary (or
cosine and sine correlations). The sine array should be antisymmetric,
not symmetric.

Clipping due to small changes in a multiplier around 1.0 could indicate
an arithmetic, type-conversion or numeric-format bug of some sort.

A Gaussian filter curve will result in much less frequency-domain
"splatter" than zeroing selected bins, so your sonogram should look
quieter in the far stop band with a Gaussian equalizer.

IMHO. YMMV.

--
rhn A.T nicholson d.0.t C-o-M
http://www.nicholson.com/rhn/dsp.html
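The linearity and symmetry Ron describes can be checked directly. A sketch with a naive plain-Python DFT on a random real "audio" frame (the frame length and seed are arbitrary): for real input, the real (cosine) parts are symmetric and the imaginary (sine) parts antisymmetric, and scaling the input scales every bin by the same factor.

```python
import cmath, math, random

def dft(x):
    """Naive DFT, adequate for a 16-point check."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)) for k in range(N)]

N = 16
random.seed(1)
x = [random.uniform(-1.0, 1.0) for _ in range(N)]  # random real "audio" frame
X = dft(x)

for k in range(1, N):
    assert abs(X[k].real - X[N - k].real) < 1e-9  # cosine part: symmetric
    assert abs(X[k].imag + X[N - k].imag) < 1e-9  # sine part: antisymmetric

# Linearity: halving every input sample halves every bin, so scaling the
# spectrum by 0.5 really is exactly half amplitude after the inverse FFT.
X_half = dft([0.5 * s for s in x])
assert all(abs(a - 0.5 * b) < 1e-9 for a, b in zip(X_half, X))
```

This is the mathematical half of the OP's volume question; perceived loudness is a separate, roughly logarithmic matter, as Ron notes.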
Ahh yes, I just tried the frequency band thing again with my Gaussian
equalizer, and it's indeed much more what I'd expect ;)

Another question to linearity:
The frequencies are linear (from 0 to 22 kHz), and as you said, the
volume as well. Would that mean that multiplying the entire array by
0.5f is exactly half volume, and multiplying by 2.0f would be double
volume? Or is that just mathematical, and would the human ear judge
otherwise?

And since the frequencies are linear, but a doubled frequency means an
increase of one octave, my equalizer should in fact work non-linearly
(at least in the frequency domain). How could I accomplish this? The
Gaussian bell curve should get wider the higher the basic frequency
gets. And depending on my above question (about linearity of volume),
the height should also be affected the higher the overall volume gets.
What would be the correct factors to calculate that?


About the array:
In my case, I use an FFT method that I didn't code myself, and it just
returns a symmetric float array. The method is called smsFft(), but I
don't know what sms stands for in that case.
On Jun 25, 12:24 am, "Rock Lobster" <e...@christian-gleinser.de>
wrote:
> About the array:
> In my case, I use a FFT method that I didn't code by myself, and it just
> returns a symmetric float array. The method is called smsFft() but I don't
> know what sms stands for in that case.
There is a routine on the web named smsFft() which takes and returns
vectors with the complex components interleaved, in both the time and
frequency domains, e.g.: real[0], imag[0], real[1], imag[1]... So your
vector may actually be data pairs, and your real time-domain sound data
should be in every other vector element.

IMHO. YMMV.
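If the routine really does interleave real and imaginary parts as Ron suggests, deinterleaving is straightforward. The numbers below are made-up example data, not output of any actual smsFft() call; check your routine's documentation for its real layout:

```python
# Example interleaved buffer: real[0], imag[0], real[1], imag[1], ...
interleaved = [1.0, 0.0, 0.5, -0.25, 0.0, 2.0, -1.0, 0.125]

reals = interleaved[0::2]  # even indices: cosine (real) correlations
imags = interleaved[1::2]  # odd indices:  sine (imaginary) correlations
bins = [complex(r, i) for r, i in zip(reals, imags)]

magnitudes = [abs(c) for c in bins]  # linear amplitude per bin
```

Treating such a buffer as plain real magnitudes (or as a "symmetric float array") would explain some of the confusing symmetry observations: what looks like one 4096-entry spectrum may really be 2048 complex pairs.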