
Creating raw sound data from IFFT - how to append sound chunks

Started by louis June 22, 2005
Hello, 

I was wondering if any of you knew anything about this, or could point me
to some resources.

So I have spectral data at about 100 time points which was taken over a
one second time period.  (This was obtained originally by doing a forward
FFT on chunks of one second of sound data)

Now for each of these spectrograms, where I have power vs frequency info,
I perform a 1024 point inverse FFT.

Then all I do - after an IFFT is done for that time point - is append this
chunk of data to the previous chunk, until all 100 time points are done,
and until I have an array that is 100*1024 elements long. This must be the
wrong way of putting together a sound file? Could you let me know how
exactly I am supposed to composite all these sound chunks ?

I tested out the Inverse FFT by producing just pure tones, and summations
of a few simple frequencies, and when I output and plot these they look
and sound right

For example, to produce a 1 kHz tone, I assign a positive amplitude to the
bin corresponding to this frequency in my 1024 element array (since I'm
doing a 1024 pt. IFFT), and all the other elements are set to zero. Then I
do an IFFT, and plot this using 'snd-7' on Linux, and I see the amplitude
vs time plot and can play the sound back, and everything looks/sounds
right.

Anyway, so basically I know the inverse FFT works. But when I apply it
to my spectral data, I don't get the correct sound output. It seems
like the problem is in the way I am compiling the sound chunks.

If you could please help me understand how I'm supposed to put these chunks
together, that would be great.

What are the implications of the size of the data (for instance I am
dealing with 1024x100 elements) for how long my sound will play,
and how is this related to the sampling rate I've chosen (I'm dealing with
a 44.1 kHz sampling rate)?

I am confused as to how I can compile these sound chunks and have the result
play for exactly one second like it's supposed to.

Thank you, this would really help me a lot! Thank you again,
LD

(Note: I was describing mono mode above, so one channel.)
Below are some fragments of my code to give you a better idea, if you like.
I am actually dealing with stereo mode:

---------------------------------------------------------------------------------------------------------------
for ( t=0; t< 100; t++ )
{
        for( k=0; k<fft_size/2; k++)
        {
                fourierArray_leftchannel[k] = power value of positive frequencies
                fourierArray_rightchannel[k] = power value of negative frequencies
        }

        four1( fourierArray_leftchannel, 1024, -1);   // does a 1024 point inverse FFT
        four1( fourierArray_rightchannel, 1024, -1);

        // -------- Writing Audio ------------//

        for (i=0; i< fft_size; i++)
        {
                buf_audio[ 2*i ] = (short int)fourierArray_leftchannel[2*i];   // including only reals from fourierArray
                buf_audio[ (2*i) + 1 ] = (short int)fourierArray_rightchannel[2*i];
        }

        for ( i = (t*2048), j=0; j < 2*fft_size ; i++, j++ )
        {
                large_buffer_audio[ i ] = buf_audio[j];
        }
}

// I then output large_buffer_audio


		
This message was sent using the Comp.DSP web interface on
www.DSPRelated.com
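
For the duration question in the post above, the arithmetic can be sketched as follows. This is only an illustration, not code from the thread: the function names `total_samples` and `duration_seconds` are hypothetical, and the idea of a "hop" (the spacing between the starts of consecutive frames) is an assumption about how the frames were produced. Plain appending corresponds to a hop equal to the frame length, so 100 frames of 1024 samples give 102400 samples, which is about 2.32 s at 44.1 kHz rather than one second.

```c
#include <stddef.h>

/* Number of output samples when num_frames frames of frame_len samples
 * are laid down every hop samples.
 * hop == frame_len means plain appending with no overlap. */
size_t total_samples(size_t num_frames, size_t frame_len, size_t hop)
{
    if (num_frames == 0)
        return 0;
    return (num_frames - 1) * hop + frame_len;
}

/* Playback time in seconds for n_samples mono samples at fs_hz. */
double duration_seconds(size_t n_samples, double fs_hz)
{
    return (double)n_samples / fs_hz;
}
```

For 100 frames to span one second at 44.1 kHz, the hop would have to be about 441 samples (44100 / 100), in which case `total_samples(100, 1024, 441)` is 44683 samples, roughly 1.01 s. That hop value is an inference from the numbers in the post, not something the post states.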
Louis,

take it one step at a time. You say:

> So I have spectral data at about 100 time points which was taken over a
> one second time period.

I gather this means that you have 100 chunks of spectral data of length 1024, right? Are you absolutely sure that these chunks were generated by taking 100 FFTs of length 1024 with no overlap? Really, really sure?

If so, just take the inverse FFT of the first chunk (do it with Matlab, I believe you have several errors in your C code). You now have 1024 time domain samples. Now take the inverse FFT of the second chunk, and see what it looks like if you append it to the one you already have. Does it look as expected, or is there a discontinuity at the splice point? Does it look like the time domain data was windowed? In that case you have to de-window the data.

On the other hand, if the two time domain data chunks have a common section, the original data was gathered using overlapped FFTs, in which case you have to take care to reconstruct the time domain data by scrapping every other overlap section. Once you have your algorithm working in Matlab, you can then go on to coding it in C.

And Louis, you write:

> fourierArray_leftchannel[k] = power value of positive frequencies

I have already mentioned that a power value (= magnitude squared) does not fully specify a frequency domain vector. Furthermore, the Numerical Recipes FFT routines require interleaved real / imaginary floats, but if I were you I wouldn't bother correcting the C code but rather get the algorithm working in Matlab (or Scilab or Octave) beforehand.

Regards,
Andor
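
If the frames do turn out to overlap, one common way to recombine them is overlap-add, which sums the overlapping regions instead of discarding them. Note that this is a different technique from the "scrap every other overlap section" suggestion in the reply above. A minimal sketch, assuming the inverse-transformed frames are stored back to back in one array and the hop size is known (the name `overlap_add` and its parameters are hypothetical):

```c
#include <stddef.h>

/* Overlap-add reconstruction.
 * frames: num_frames consecutive blocks of frame_len samples each.
 * Frame t is added into out starting at sample t * hop.
 * out must have room for (num_frames - 1) * hop + frame_len samples. */
void overlap_add(const double *frames, size_t num_frames,
                 size_t frame_len, size_t hop, double *out)
{
    size_t out_len = num_frames ? (num_frames - 1) * hop + frame_len : 0;

    /* Clear the output so overlapping frames can be accumulated. */
    for (size_t i = 0; i < out_len; i++)
        out[i] = 0.0;

    /* Sum each frame into its position on the output timeline. */
    for (size_t t = 0; t < num_frames; t++)
        for (size_t n = 0; n < frame_len; n++)
            out[t * hop + n] += frames[t * frame_len + n];
}
```

When the window and hop are chosen so that shifted copies of the window sum to a constant (a periodic Hann window at 50% overlap is one such combination), overlap-adding the windowed frames restores the original amplitude wherever the frames fully overlap. Setting `hop` equal to `frame_len` reduces this to plain appending.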
Louis,

If you can combine spectral tones back into the time domain flawlessly, then
I'd say that there is nothing wrong with your approach or code. (Flawlessly
meaning perfect phase alignment.) The fact that you don't have access to the
original samples, and only the FFTs, makes me think that you might be
reverse engineering something. To reverse engineer a spectral plot is not
simple if you don't know the sample rate or the number of samples processed
(a.k.a. the resolution bandwidth).


Thomas

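The link between sample rate, number of samples and frequency resolution can be made concrete with a short sketch. The helper names `bin_spacing_hz` and `nearest_bin` are hypothetical, and the numbers are the ones quoted in the original question (44.1 kHz, 1024 points). It shows that a 1024-point transform at 44.1 kHz has bins about 43.07 Hz apart, so a 1 kHz tone falls nearest to bin 23, whose centre is close to 990.5 Hz rather than exactly 1 kHz.

```c
/* Spacing between adjacent FFT bins in Hz. */
double bin_spacing_hz(double fs_hz, unsigned fft_size)
{
    return fs_hz / (double)fft_size;
}

/* Index of the bin whose centre frequency is closest to freq_hz.
 * Assumes 0 <= freq_hz <= fs_hz / 2. */
unsigned nearest_bin(double freq_hz, double fs_hz, unsigned fft_size)
{
    /* Adding 0.5 before truncating rounds to the nearest integer. */
    return (unsigned)(freq_hz / bin_spacing_hz(fs_hz, fft_size) + 0.5);
}
```

For example, `nearest_bin(1000.0, 44100.0, 1024)` evaluates to 23.
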
Thank you Andor and Thomas for your responses, 

So to clarify a few things:

> So I have spectral data at about 100 time points which was taken over a
> one second time period.

---you wrote---
I gather this means that you have 100 chunks of spectral data of length 1024, right? Are you absolutely sure that these chunks were generated by taking 100 FFTs of length 1024 with no overlap? Really, really sure?
---

So to clarify, the spectral data that I am transforming to audio is actually a simulated spectrogram. We validated these spectrograms by comparing them to real spectrograms which were generated by doing a 1024 point FFT using a 512 Hann window. So our spectrograms are in fact realistic when compared to the real ones, but they are still simulated, and therefore are not exact replicas. I am now trying to simulate the audio output that corresponds to this spectrogram.

---you wrote---
And Louis, you write:
> fourierArray_leftchannel[k] = power value of positive frequencies
I have already mentioned that a power value (= magnitude squared) does not fully specify a frequency domain vector. Furthermore, the Numerical Recipes FFT routines require interleaved real / imaginary floats, but if I were you I wouldn't bother correcting the C code but rather get the algorithm working in Matlab (or Scilab or Octave) beforehand.
---

Regarding the fact that I have no phase info, I am actually treating my values as real: the real part of the Numerical Recipes array is assigned my power values, and the imaginary parts are set to zero. Then I perform the IFFT, use only the real values, and ignore the imaginaries.

Thank you again for your assistance!
LD
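
The "real part set, imaginary part zero" approach can be written out as a packing step for an FFT routine that expects interleaved real / imaginary floats. This is a rough sketch under stated assumptions, not the poster's code: the name `pack_zero_phase` is hypothetical, the input is assumed to already be magnitudes (the square root of the power values, since power is magnitude squared), the array is indexed from 0 (check the indexing convention of your `four1` version), and zero phase is an assumption standing in for the phase information the spectrogram does not contain.

```c
#include <stddef.h>

/* Fill data[0 .. 2*n-1] with interleaved (re, im) pairs for an n-point
 * inverse FFT, given magnitudes mag[0 .. n/2] for bins 0 .. n/2.
 * Phase is assumed to be zero because the true phase is unknown.
 * Bin n-k mirrors bin k so that the inverse transform comes out real. */
void pack_zero_phase(const float *mag, size_t n, float *data)
{
    /* Bins 0 .. n/2: DC, positive frequencies, and Nyquist. */
    for (size_t k = 0; k <= n / 2; k++) {
        data[2 * k] = mag[k];       /* real part */
        data[2 * k + 1] = 0.0f;     /* imaginary part */
    }
    /* Bins n/2+1 .. n-1: negative frequencies, mirrored from bins 1 .. n/2-1. */
    for (size_t k = 1; k < n / 2; k++) {
        data[2 * (n - k)] = mag[k];
        data[2 * (n - k) + 1] = 0.0f;
    }
}
```

A real, symmetric spectrum like this always inverse-transforms to a real signal that is symmetric about sample 0, so every frame built this way has the same mirrored shape regardless of what the original audio looked like. That is one reason the zero-phase assumption, not just the splicing, can make the result sound wrong.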