Hi everyone, Okay so I just plotted the spectrogram of the superposition of two pure tones - a 1khz and a 3 khz, using my FFT/spectrogram code.. I'm not evne sure if its right - it looks right I see two lines - one at the 3khz makr and the other at the 1khz mark which go across, however they seem to be fading in and out at the edges.. It doesnt look bad but i'm just wondering if this is accurate for the case of 2 pure tones..?? shoudlnt it be just a solid line , or is htis the effect of the Hann window?? Then I tried getting rid of the overlapping Hann window and instead just took FFT's of one time segment to the next , i.e. from 0-1023, then from 1024 to 2047 etc.. (so no overlaps). I didnt plot this but I just looked at the individual power spectra, and was assuming I would see only 2 non-zero values , i.e. one at the 1khz index and the other where teh 3khz index.. but instead I see whole bunch values that are to the 10^8, and then around the 1 and 3khz index, they are 10^10 or something.. I cant seem to figure out what I'm doing wrong and I really dont have intuition wiht this stuff maybe its a problem wiht kiss_fft though i would assume that its reliable?? I never figured the answer to the following which someone posted though:>Are you using kissfft in floating point or fixed mode? Is it consistent >throughout the build?but I'm not sure how to check.. Thanks again.. LD
producing spectrogram from audio file.. confusion with FFT
Started by ●October 10, 2006
Reply by ●October 13, 20062006-10-13
Reply by ●October 13, 20062006-10-13
Hi again, Okay actually nevermind about the float/fixed point issue - I am using floating point and I believe this is the default with kissfft so this shouldnlt be the problem.. Basically this is what I do to get my spectrogram: fftsize = 1024; buffsize = 441000; hopsize = 512; for( index=0; index<buffsize-fftsize; index+=hopsize ) { //-------------- Apply the Hann window -----------------// for( s=0; s<fftsize; s++ ) { input_window[s] = audiobuffer[index+s]*hanning[s]; rin[s] = input_window[s]; } //------------------ Perform FFT -----------------------// kiss_fftr( kiss_fftr_state, rin, sout ); //------------------ Compute Magnitude -----------------// for( s=0; s<fftsize; s++ ) { output_frame[s] = powf( sout[s].r , 2.0 ) + powf( sout[s].i , 2.0 ); } //----------------- Assort power values ----------------// for( s=0; s<fftsize/2; s++ ) { assorted_output[s+fftsize/2] = output_frame[s]; } for( s=fftsize; s>=fftsize/2; s-- ) { assorted_output[s-fftsize/2] = output_frame[s]; } //------------------ Write out to file -----------------// //---writing out--// //------------------ Increment loop -------------------// lp++; } I can't see any flaw with it? But there must be a bug in here because my power spectra don't look completely right as I described in my last post.. especially for the case when I dont overlap with Hann and just do: for( index=0; index<buffsize-fftsize; index+=fftsize ) { for( s=0; s<fftsize; s++ ) { input_window[s] = audiobuffer[index+s]; rin[s] = input_window[s]; } // etc... }
Reply by ●October 13, 20062006-10-13
"louis" <lost_bits1110@hotmail.com> writes:> for( s=fftsize; s>=fftsize/2; s-- ) > { > assorted_output[s-fftsize/2] = output_frame[s]; > }I haven't been following the thread, but in any case you have an error here. Should be for( s=fftsize-1; s>=fftsize/2; s-- ) { assorted_output[s-fftsize/2] = output_frame[s]; } -- % Randy Yates % "So now it's getting late, %% Fuquay-Varina, NC % and those who hesitate %%% 919-577-9882 % got no one..." %%%% <yates@ieee.org> % 'Waterfall', *Face The Music*, ELO http://home.earthlink.net/~yatescr
Reply by ●October 13, 20062006-10-13
Thanks so much, you're right! I added that '-1'.. however the output still seems wrong.. :S I guess I'll have to try fftw or something (though I'm sure there shouldn't be anything wrong with kissfft??) but i guess with programming you can never be 100% sure.. sigh.>"louis" <lost_bits1110@hotmail.com> writes: > >> for( s=fftsize; s>=fftsize/2; s-- ) >> { >> assorted_output[s-fftsize/2] = output_frame[s]; >> } > >I haven't been following the thread, but in any case you >have an error here. Should be > > for( s=fftsize-1; s>=fftsize/2; s-- ) > { > assorted_output[s-fftsize/2] = output_frame[s]; > }
Reply by ●October 13, 20062006-10-13
"louis" <lost_bits1110@hotmail.com> writes:> Thanks so much, you're right! I added that '-1'.. however the output still > seems wrong.. :S > I guess I'll have to try fftw or something (though I'm sure there > shouldn't be anything wrong with kissfft??) but i guess with programming > you can never be 100% sure.. sigh.Hey Louis, You're welcome! However, I feel like I could help you more (and there's probably others here too) if you would include the entire listing instead of a piece. For example, I don't know what data types you're using for s, fftsize, etc. I also have no idea what the lp++ is for at the end of the loop. Also, what are you using input_window[] for? It looks useless from the code you've provided. Your basic approach seems sound - there may be a problem in the coding, or in your understanding of what to expect. For the latter, I suggest you use something like octave (http://www.octave.org) to take the fft of your input data and plot the resulting magnitude squared. --Randy> >>"louis" <lost_bits1110@hotmail.com> writes: >> >>> for( s=fftsize; s>=fftsize/2; s-- ) >>> { >>> assorted_output[s-fftsize/2] = output_frame[s]; >>> } >> >>I haven't been following the thread, but in any case you >>have an error here. Should be >> >> for( s=fftsize-1; s>=fftsize/2; s-- ) >> { >> assorted_output[s-fftsize/2] = output_frame[s]; >> } > > >-- % Randy Yates % "Rollin' and riding and slippin' and %% Fuquay-Varina, NC % sliding, it's magic." %%% 919-577-9882 % %%%% <yates@ieee.org> % 'Living' Thing', *A New World Record*, ELO http://home.earthlink.net/~yatescr
Reply by ●October 13, 20062006-10-13
>Hey Louis, > >You're welcome! However, I feel like I could help you more (and there's >probably others here too) if you would include the entire listinginstead>of a piece. For example, I don't know what data types you're using for >s, fftsize, etc. I also have no idea what the lp++ is for at the end >of the loop. Also, what are you using input_window[] for? It looks >useless from the code you've provided. > >Your basic approach seems sound - there may be a problem in the coding, >or in your understanding of what to expect. For the latter, I >suggest you use something like octave (http://www.octave.org) to >take the fft of your input data and plot the resulting magnitude >squared.Hi Randy, Thanks so much for your help! (I can't believe I'm having so much trouble getting a silly spectrogram!! argh!) Here is my full code programmed in VC++ using a current version of kissfft: #include <iostream> #include <fstream> #include <string> using namespace std; #include <math.h> #include <stdlib.h> #include "_kiss_fft_guts.h" #include "kiss_fft.h" #include "kiss_fftr.h" const int fftsize = 1024; const int hopsize = 512; // buffsize hardcoded for now: 10 audio cycles x 44100 sampling rate x 0.92 seconds per audio cycle const int buffsize = 405720; const float filter=40; // all values less than 40 Hz should be set to 0. const int samplerate= 44100; float hanning[fftsize]; float input_window[fftsize]; float output_frame[fftsize]; float assorted_output[fftsize]; float audiobuffer[buffsize]; kiss_fft_scalar rin[fftsize+2]; kiss_fft_cpx sout[fftsize]; int main() { kiss_fftr_cfg kiss_fftr_state; int temp, loops, i, j, k, lp, s, index, bins; char outfilename[128]; float frequencyResolution; ifstream audiofile; ofstream powerspectfile; kiss_fftr_state = kiss_fftr_alloc(fftsize,0,0,0); kiss_fft_scalar zero; memset(&zero,0,sizeof(zero) ); filename = "audio.txt"; audiofile.open( filename, ios::in ); //******* Create Hann Window ****************// for (k = 0; k < fftsize; k++) { hanning[k] = 0.5 - 0.5 * cos(6.2832 * k / fftsize); } //****** Read-in samples from ascii audio file ******// for( i=0; i<buffsize; i++ ) { audiofile>>audiobuffer[i]; } audiofile.close(); loops = buffsize/hopsize; lp = 0; //********Loop FFT **************************// for( index=0; index<buffsize-fftsize; index+=hopsize ) { //-------------- Apply the Hann window -----------------// for( s=0; s<fftsize; s++ ) { input_window[s] = audiobuffer[index+s]*hanning[s]; rin[s] = input_window[s]; } //------------------ Perform FFT -----------------------// kiss_fftr( kiss_fftr_state, rin, sout ); //------------------ Compute Magnitude -----------------// for( s=0; s<fftsize; s++ ) { output_frame[s] = powf( sout[s].r , 2.0 ) + powf( sout[s].i , 2.0 ); } //----------------- Assort power values ----------------// for( s=0; s<fftsize/2; s++ ) { assorted_output[s+fftsize/2] = output_frame[s]; } for( s=fftsize-1; s>=fftsize/2; s-- ) { assorted_output[s-fftsize/2] = output_frame[s]; } //-----------Remove frequencies less than 40 Hz -------// frequencyResolution = (float)samplerate/(float)fftsize; bins = ceil(filter/frequencyResolution); for( j=0; j<=wallbins; j++ ) { assorted_output[ fftsize/2 - j ] = 0; assorted_output[ fftsize/2 + j ] = 0; } //------------------ Write out to file -----------------// sprintf( outfilename, "%s.powspec%5.5d", "cca", lp ); powerspectfile.open(outfilename, ios::out); for ( j = 0; j < fftsize; j++ ) { powerspectfile<<assorted_output[j]<<endl; } powerspectfile.close(); //------------------ Increment loop -------------------// lp++; } free(kiss_fft_state); free(kiss_fftr_state); return 0; }
Reply by ●October 13, 20062006-10-13
btw.. "loop" is useless in the code i just posted, that was just for output purposes. So just to clarify, that code i posted basically creates a power spectra at each "lp" value. Then I have other code which combines all the powerspectra to make the spectrogram.. I can't figure out why its so noisy.. the audio doesn't sound noisy at all.. but my spectrogram shows otherwise.. :(
Reply by ●October 14, 20062006-10-14
louis wrote: [...]> //------------------ Perform FFT -----------------------// > kiss_fftr( kiss_fftr_state, rin, sout ); > > //------------------ Compute Magnitude -----------------// > for( s=0; s<fftsize; s++ ) > { > output_frame[s] = powf( sout[s].r , 2.0 ) + powf( sout[s].i , 2.0 ); > }The output of the real forward fft is roughly half the FFT [0:pi]. The transform is conjugate symmetric. The second half can be easily inferred from the first half. So your frequency domain calculations should be using (fftsize/2+1) complex pairs, not fftsize. -- Mark Borgerding
Reply by ●October 14, 20062006-10-14
Thanks for your response, you're totally right. I changed that but I still see the same spectrogram :s btw I made a mistake when posting my code, below "wallbins" should be just "bins" so: "//-----------Remove frequencies less than 40 Hz -------// frequencyResolution = (float)samplerate/(float)fftsize; bins = ceil(filter/frequencyResolution); for( j=0; j<=bins; j++ ) //<----------- { assorted_output[ fftsize/2 - j ] = 0; assorted_output[ fftsize/2 + j ] = 0; }" not a huge deal but just in case there was confusion.
Reply by ●October 14, 20062006-10-14
louis skrev:> Hi everyone, > > Okay so I just plotted the spectrogram of the superposition of two pure > tones - a 1khz and a 3 khz, using my FFT/spectrogram code.. > > I'm not evne sure if its right - it looks right I see two lines - one at > the 3khz makr and the other at the 1khz mark which go across, however they > seem to be fading in and out at the edges.. It doesnt look bad but i'm just > wondering if this is accurate for the case of 2 pure tones..?? shoudlnt it > be just a solid line , or is htis the effect of the Hann window??It is not an effect of the Hann window, it might be an effect of partially filled buffers at the ends. Rune