DSPRelated.com
Forums

looking for straightforward pruned IDFT algorithm reference

Started by all4dsp January 28, 2011
>On Jan 28, 1:32=A0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net> wrote: >> >On Jan 28, 1:11=3DA0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net>
wrot=
>e: >> >> >On Jan 28, 12:43=3D3DA0am, "all4dsp"
<all4dsp@n_o_s_p_a_m.comcast.net=
>> >> >> >wrote: >> >> >> Hi, >> >> >> >> I need to IDFT a large column array where only something like >> 0.00001% >> >> of >> >> >> the FFT coefficients are non-zero. I've tried using the
definition
>> IDF=3D >> >T >> >> b=3D3D >> >> >ut >> >> >> find that Matlab (or, FFTW) IFFT function is still faster. >> >> >> >> As both of these methods are still too slow, I wonder if anyone >> could >> >> >> recommend a pruned IDFT algorithm to speed things up while >> maintaining >> >> >> accuracy. My largest array size would be 536M rows x 1 column,
and
>> up >> >> to >> >> >> 4000 non-zero FFT coefficients. The data is all real. I'll
implemen=
>t >> i=3D >> >n >> >> >> Matlab as well as C. >> >> >> >is there any rhyme or reason to which 4K bins outa 536M are being >> >> >used? =3DA0is there any grouping or pattern or are they at "random" >> >> >frequencies? >> >> >> >r b-j >> >> >> >> Thanks in advance for any comments. >> >> >> There are two applications: >> >> >> The first application has chronological groupings of 4-20 bins, each >> grou=3D >> >p >> >> appearing randomly throughout the spectrum. >> >> >do you mean "contiguous" groupings of 4-20 bins? >> >> >> For example if there are 100M >> >> FFT coefficients, the first three groups have bin indices 104-108, >> >> 42812-42822, 89352-89359 (etc., there may be 200 hundred or so
groups)
>> ar=3D >> >e >> >> non-zero. >> >> >can two (or more) groupings overlap? =A0if they do, are they
identified
>> >simply as a single grouping? >> >> >> The second application would be one spectrum containing only one >> group's >> >> indices. For example, the first spectrum would contain only bins >> 104-108 >> >> non-zero. A separate analysis/spectrum would contain only bins >> 42812-4282=3D >> >2 >> >> bins non-zero. etc. For each group there would be a separate
analysis
>> >> containing only that groups non-zero indices. >> >> >i dunno, perhaps having separate, optimized FFTs for each possible >> >group size (like a little FFT for 4 bins, another for 8 bins, another >> >for 16 bins, and others for 6 or 12 or 20 bins). =A0you can FFT the >> >group (as if it started at bin 0), and then adjust the group's phase >> >(with one big phase offset for each group), depending on where the >> >group is offset into the whole input spectrum. >> >> >r b-j >> >> >do you mean "contiguous" groupings of 4-20 bins? >> >> YES >> >> >can two (or more) groupings overlap? =A0if they do, are they
identified
>> >simply as a single grouping? >> >> NO >> >> (note, I have the FFT coefficients and I'm trying to get the IDFT >> computed) >> >> Interestingly, the dominant time consumption in computing the IFFT
using
>> FFTW is creating the large complex array with mostly zeros. It takes
long=
>er >> to do this than compute the IFFT. Even if I could optimize the IFFT time
=
>to >> be very short, I'd still have the overhead of creating the input array. >> Maybe this is a Matlab issue, and it'll speed up once in C (anyone know > >In MATLAB, even if X is complex, do not initialize with > >clear all >tic >for i =3D 1:100 > X =3D zeros(20e6,1); >end >toc % elapsed_time =3D 25.891 > >%Instead, just use > >clear all >tic >for i =3D 1:100 > X(20e6,1) =3D 0; >end >toc % elapsed_time =3D 0.25 > >Then fill in the small number of nonzero complex components. > >Hope this helps. > >Greg >
Thanks Greg, I tried your suggestion but it didn't make much of a difference. From the best I can tell, regardless of how the mostly-zero array is created, when this array is sent into ifft (or fft), matlab converts all the zeros to complex zeros and this is the dominant time consumer. It seems one can either create a complex array of mostly zeros and populate with the few non-zero complex numbers (in which case the major time-hit occurs up front), or one can create a fast non-complex array and then populate it with complex numbers (which also if fast), but then the ifft process converts the zeros to complex zeros (and the time-hit occurs at this point).
On Jan 30, 4:46&#4294967295;pm, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net> wrote:
> >On Jan 28, 1:32=A0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net> wrote: > >> >On Jan 28, 1:11=3DA0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net> > wrot= > >e: > >> >> >On Jan 28, 12:43=3D3DA0am, "all4dsp" > > <all4dsp@n_o_s_p_a_m.comcast.net= > > > > > > > > >> >> >wrote: > >> >> >> Hi, > > >> >> >> I need to IDFT a large column array where only something like > >> 0.00001% > >> >> of > >> >> >> the FFT coefficients are non-zero. I've tried using the > definition > >> IDF=3D > >> >T > >> >> b=3D3D > >> >> >ut > >> >> >> find that Matlab (or, FFTW) IFFT function is still faster. > > >> >> >> As both of these methods are still too slow, I wonder if anyone > >> could > >> >> >> recommend a pruned IDFT algorithm to speed things up while > >> maintaining > >> >> >> accuracy. My largest array size would be 536M rows x 1 column, > and > >> up > >> >> to > >> >> >> 4000 non-zero FFT coefficients. The data is all real. I'll > implemen= > >t > >> i=3D > >> >n > >> >> >> Matlab as well as C. > > >> >> >is there any rhyme or reason to which 4K bins outa 536M are being > >> >> >used? =3DA0is there any grouping or pattern or are they at "random" > >> >> >frequencies? > > >> >> >r b-j > > >> >> >> Thanks in advance for any comments. > > >> >> There are two applications: > > >> >> The first application has chronological groupings of 4-20 bins, each > >> grou=3D > >> >p > >> >> appearing randomly throughout the spectrum. > > >> >do you mean "contiguous" groupings of 4-20 bins? > > >> >> For example if there are 100M > >> >> FFT coefficients, the first three groups have bin indices 104-108, > >> >> 42812-42822, 89352-89359 (etc., there may be 200 hundred or so > groups) > >> ar=3D > >> >e > >> >> non-zero. > > >> >can two (or more) groupings overlap? =A0if they do, are they > identified > >> >simply as a single grouping? > > >> >> The second application would be one spectrum containing only one > >> group's > >> >> indices. For example, the first spectrum would contain only bins > >> 104-108 > >> >> non-zero. A separate analysis/spectrum would contain only bins > >> 42812-4282=3D > >> >2 > >> >> bins non-zero. etc. For each group there would be a separate > analysis > >> >> containing only that groups non-zero indices. > > >> >i dunno, perhaps having separate, optimized FFTs for each possible > >> >group size (like a little FFT for 4 bins, another for 8 bins, another > >> >for 16 bins, and others for 6 or 12 or 20 bins). =A0you can FFT the > >> >group (as if it started at bin 0), and then adjust the group's phase > >> >(with one big phase offset for each group), depending on where the > >> >group is offset into the whole input spectrum. > > >> >r b-j > > >> >do you mean "contiguous" groupings of 4-20 bins? > > >> YES > > >> >can two (or more) groupings overlap? =A0if they do, are they > identified > >> >simply as a single grouping? > > >> NO > > >> (note, I have the FFT coefficients and I'm trying to get the IDFT > >> computed) > > >> Interestingly, the dominant time consumption in computing the IFFT > using > >> FFTW is creating the large complex array with mostly zeros. It takes > long= > >er > >> to do this than compute the IFFT. Even if I could optimize the IFFT time > = > >to > >> be very short, I'd still have the overhead of creating the input array. > >> Maybe this is a Matlab issue, and it'll speed up once in C (anyone know > > >In MATLAB, even if X is complex, do not initialize with > > >clear all > >tic > >for i =3D 1:100 > > &#4294967295; X =3D zeros(20e6,1); > >end > >toc &#4294967295; &#4294967295;% elapsed_time =3D 25.891 > > >%Instead, just use > > >clear all > >tic > >for i =3D 1:100 > > &#4294967295; X(20e6,1) =3D 0; > >end > >toc &#4294967295; % elapsed_time =3D 0.25 > > >Then fill in the small number of nonzero complex components. > > >Hope this helps. > > >Greg > > Thanks Greg, > > I tried your suggestion but it didn't make much of a difference. From the > best I can tell, regardless of how the mostly-zero array is created, when > this array is sent into ifft (or fft), matlab converts all the zeros to > complex zeros and this is the dominant time consumer. > > It seems one can either create a complex array of mostly zeros and populate > with the few non-zero complex numbers (in which case the major time-hit > occurs up front), or one can create a fast non-complex array and then > populate it with complex numbers (which also if fast), but then the ifft > process converts the zeros to complex zeros (and the time-hit occurs at > this point).- Hide quoted text - > > - Show quoted text -
There is still a significant time difference between just intializing the last element with, approximately, a complex zero and initializing all of the elements with complex zeros. The approximation is needed because if just the last element is initialized as a comlex zero, MATLAB creates a real zero matrix. % Instead of using clc, clear all tic for i = 1:100 X3 = complex(zeros(10e6,1),zeros(10e6,1)); end toc % elapsed_time = 62.375 whos % just use clear all tic for i = 1:100 X4(10e6,1) = 0+j*realmin; % realmin = 2.2251e-308 end toc % elapsed_time = 4.437 whos Hope this helps. Greg