comp.dsp | looking for straightforward pruned IDFT algorithm reference

Hi,

I need to IDFT a large column array where only something like 0.00001% of
the FFT coefficients are non-zero. I've tried using the definition IDFT but
find that Matlab (or, FFTW) IFFT function is still faster. 

As both of these methods are still too slow, I wonder if anyone could
recommend a pruned IDFT algorithm to speed things up while maintaining
accuracy. My largest array size would be 536M rows x 1 column, and up to
4000 non-zero FFT coefficients. The data is all real. I'll implement in
Matlab as well as C.

Thanks in advance for any comments.

Reply by robert bristow-johnson ●January 28, 20112011-01-28

On Jan 28, 12:43&#4294967295;am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net>
wrote:
> Hi,
>
> I need to IDFT a large column array where only something like 0.00001% of
> the FFT coefficients are non-zero. I've tried using the definition IDFT but
> find that Matlab (or, FFTW) IFFT function is still faster.
>
> As both of these methods are still too slow, I wonder if anyone could
> recommend a pruned IDFT algorithm to speed things up while maintaining
> accuracy. My largest array size would be 536M rows x 1 column, and up to
> 4000 non-zero FFT coefficients. The data is all real. I'll implement in
> Matlab as well as C.

is there any rhyme or reason to which 4K bins outa 536M are being
used?  is there any grouping or pattern or are they at "random"
frequencies?

r b-j


> Thanks in advance for any comments.

Reply by all4dsp ●January 28, 20112011-01-28

>On Jan 28, 12:43=A0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net>
>wrote:
>> Hi,
>>
>> I need to IDFT a large column array where only something like 0.00001%
of
>> the FFT coefficients are non-zero. I've tried using the definition IDFT
b=
>ut
>> find that Matlab (or, FFTW) IFFT function is still faster.
>>
>> As both of these methods are still too slow, I wonder if anyone could
>> recommend a pruned IDFT algorithm to speed things up while maintaining
>> accuracy. My largest array size would be 536M rows x 1 column, and up
to
>> 4000 non-zero FFT coefficients. The data is all real. I'll implement in
>> Matlab as well as C.
>
>is there any rhyme or reason to which 4K bins outa 536M are being
>used?  is there any grouping or pattern or are they at "random"
>frequencies?
>
>r b-j
>
>
>> Thanks in advance for any comments.
>
>

There are two applications:

The first application has chronological groupings of 4-20 bins, each group
appearing randomly throughout the spectrum. For example if there are 100M
FFT coefficients, the first three groups have bin indices 104-108,
42812-42822, 89352-89359 (etc., there may be 200 hundred or so groups) are
non-zero.

The second application would be one spectrum containing only one group's
indices. For example, the first spectrum would contain only bins 104-108
non-zero. A separate analysis/spectrum would contain only bins 42812-42822
bins non-zero. etc. For each group there would be a separate analysis
containing only that groups non-zero indices.

Reply by robert bristow-johnson ●January 28, 20112011-01-28

On Jan 28, 1:11&#4294967295;am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net> wrote:
> >On Jan 28, 12:43=A0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net>
> >wrote:
> >> Hi,
>
> >> I need to IDFT a large column array where only something like 0.00001%
> of
> >> the FFT coefficients are non-zero. I've tried using the definition IDFT
> b=
> >ut
> >> find that Matlab (or, FFTW) IFFT function is still faster.
>
> >> As both of these methods are still too slow, I wonder if anyone could
> >> recommend a pruned IDFT algorithm to speed things up while maintaining
> >> accuracy. My largest array size would be 536M rows x 1 column, and up
> to
> >> 4000 non-zero FFT coefficients. The data is all real. I'll implement in
> >> Matlab as well as C.
>
> >is there any rhyme or reason to which 4K bins outa 536M are being
> >used? &#4294967295;is there any grouping or pattern or are they at "random"
> >frequencies?
>
> >r b-j
>
> >> Thanks in advance for any comments.
>
> There are two applications:
>
> The first application has chronological groupings of 4-20 bins, each group
> appearing randomly throughout the spectrum.

do you mean "contiguous" groupings of 4-20 bins?

> For example if there are 100M
> FFT coefficients, the first three groups have bin indices 104-108,
> 42812-42822, 89352-89359 (etc., there may be 200 hundred or so groups) are
> non-zero.

can two (or more) groupings overlap?  if they do, are they identified
simply as a single grouping?

> The second application would be one spectrum containing only one group's
> indices. For example, the first spectrum would contain only bins 104-108
> non-zero. A separate analysis/spectrum would contain only bins 42812-42822
> bins non-zero. etc. For each group there would be a separate analysis
> containing only that groups non-zero indices.

i dunno, perhaps having separate, optimized FFTs for each possible
group size (like a little FFT for 4 bins, another for 8 bins, another
for 16 bins, and others for 6 or 12 or 20 bins).  you can FFT the
group (as if it started at bin 0), and then adjust the group's phase
(with one big phase offset for each group), depending on where the
group is offset into the whole input spectrum.

r b-j

Reply by all4dsp ●January 28, 20112011-01-28

>On Jan 28, 1:11=A0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net> wrote:
>> >On Jan 28, 12:43=3DA0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net>
>> >wrote:
>> >> Hi,
>>
>> >> I need to IDFT a large column array where only something like
0.00001%
>> of
>> >> the FFT coefficients are non-zero. I've tried using the definition
IDF=
>T
>> b=3D
>> >ut
>> >> find that Matlab (or, FFTW) IFFT function is still faster.
>>
>> >> As both of these methods are still too slow, I wonder if anyone
could
>> >> recommend a pruned IDFT algorithm to speed things up while
maintaining
>> >> accuracy. My largest array size would be 536M rows x 1 column, and
up
>> to
>> >> 4000 non-zero FFT coefficients. The data is all real. I'll implement
i=
>n
>> >> Matlab as well as C.
>>
>> >is there any rhyme or reason to which 4K bins outa 536M are being
>> >used? =A0is there any grouping or pattern or are they at "random"
>> >frequencies?
>>
>> >r b-j
>>
>> >> Thanks in advance for any comments.
>>
>> There are two applications:
>>
>> The first application has chronological groupings of 4-20 bins, each
grou=
>p
>> appearing randomly throughout the spectrum.
>
>do you mean "contiguous" groupings of 4-20 bins?
>
>> For example if there are 100M
>> FFT coefficients, the first three groups have bin indices 104-108,
>> 42812-42822, 89352-89359 (etc., there may be 200 hundred or so groups)
ar=
>e
>> non-zero.
>
>can two (or more) groupings overlap?  if they do, are they identified
>simply as a single grouping?
>
>> The second application would be one spectrum containing only one
group's
>> indices. For example, the first spectrum would contain only bins
104-108
>> non-zero. A separate analysis/spectrum would contain only bins
42812-4282=
>2
>> bins non-zero. etc. For each group there would be a separate analysis
>> containing only that groups non-zero indices.
>
>i dunno, perhaps having separate, optimized FFTs for each possible
>group size (like a little FFT for 4 bins, another for 8 bins, another
>for 16 bins, and others for 6 or 12 or 20 bins).  you can FFT the
>group (as if it started at bin 0), and then adjust the group's phase
>(with one big phase offset for each group), depending on where the
>group is offset into the whole input spectrum.
>
>r b-j
>




>do you mean "contiguous" groupings of 4-20 bins?

YES

>can two (or more) groupings overlap?  if they do, are they identified
>simply as a single grouping?

NO


(note, I have the FFT coefficients and I'm trying to get the IDFT
computed)

Interestingly, the dominant time consumption in computing the IFFT using
FFTW is creating the large complex array with mostly zeros. It takes longer
to do this than compute the IFFT. Even if I could optimize the IFFT time to
be very short, I'd still have the overhead of creating the input array.
Maybe this is a Matlab issue, and it'll speed up once in C (anyone know?).

Reply by Tim Wescott ●January 28, 20112011-01-28

On 01/27/2011 09:43 PM, all4dsp wrote:
> Hi,
>
> I need to IDFT a large column array where only something like 0.00001% of
> the FFT coefficients are non-zero. I've tried using the definition IDFT but
> find that Matlab (or, FFTW) IFFT function is still faster.
>
> As both of these methods are still too slow, I wonder if anyone could
> recommend a pruned IDFT algorithm to speed things up while maintaining
> accuracy. My largest array size would be 536M rows x 1 column, and up to
> 4000 non-zero FFT coefficients. The data is all real. I'll implement in
> Matlab as well as C.
>
> Thanks in advance for any comments.

Are you trying to do your IDFT in Matlab code vs. Matlab's built-in 
IFFT?  If so, you should see significant speed up using C.

I'm not sure if you could significantly speed this up with some sort of 
bastardized IFFT routine that kept track of the butterflies with 
non-zero entries -- at some point the effort of book-keeping would 
overwhelm the savings of not having to go through everything.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html

Reply by Tim Wescott ●January 28, 20112011-01-28

On 01/27/2011 10:32 PM, all4dsp wrote:
>> On Jan 28, 1:11=A0am, "all4dsp"<all4dsp@n_o_s_p_a_m.comcast.net>  wrote:
>>>> On Jan 28, 12:43=3DA0am, "all4dsp"<all4dsp@n_o_s_p_a_m.comcast.net>
>>>> wrote:
>>>>> Hi,
>>>
>>>>> I need to IDFT a large column array where only something like
> 0.00001%
>>> of
>>>>> the FFT coefficients are non-zero. I've tried using the definition
> IDF=
>> T
>>> b=3D
>>>> ut
>>>>> find that Matlab (or, FFTW) IFFT function is still faster.
>>>
>>>>> As both of these methods are still too slow, I wonder if anyone
> could
>>>>> recommend a pruned IDFT algorithm to speed things up while
> maintaining
>>>>> accuracy. My largest array size would be 536M rows x 1 column, and
> up
>>> to
>>>>> 4000 non-zero FFT coefficients. The data is all real. I'll implement
> i=
>> n
>>>>> Matlab as well as C.
>>>
>>>> is there any rhyme or reason to which 4K bins outa 536M are being
>>>> used? =A0is there any grouping or pattern or are they at "random"
>>>> frequencies?
>>>
>>>> r b-j
>>>
>>>>> Thanks in advance for any comments.
>>>
>>> There are two applications:
>>>
>>> The first application has chronological groupings of 4-20 bins, each
> grou=
>> p
>>> appearing randomly throughout the spectrum.
>>
>> do you mean "contiguous" groupings of 4-20 bins?
>>
>>> For example if there are 100M
>>> FFT coefficients, the first three groups have bin indices 104-108,
>>> 42812-42822, 89352-89359 (etc., there may be 200 hundred or so groups)
> ar=
>> e
>>> non-zero.
>>
>> can two (or more) groupings overlap?  if they do, are they identified
>> simply as a single grouping?
>>
>>> The second application would be one spectrum containing only one
> group's
>>> indices. For example, the first spectrum would contain only bins
> 104-108
>>> non-zero. A separate analysis/spectrum would contain only bins
> 42812-4282=
>> 2
>>> bins non-zero. etc. For each group there would be a separate analysis
>>> containing only that groups non-zero indices.
>>
>> i dunno, perhaps having separate, optimized FFTs for each possible
>> group size (like a little FFT for 4 bins, another for 8 bins, another
>> for 16 bins, and others for 6 or 12 or 20 bins).  you can FFT the
>> group (as if it started at bin 0), and then adjust the group's phase
>> (with one big phase offset for each group), depending on where the
>> group is offset into the whole input spectrum.
>>
>> r b-j
>>
>
>
>
>
>> do you mean "contiguous" groupings of 4-20 bins?
>
> YES
>
>> can two (or more) groupings overlap?  if they do, are they identified
>> simply as a single grouping?
>
> NO
>
>
> (note, I have the FFT coefficients and I'm trying to get the IDFT
> computed)
>
> Interestingly, the dominant time consumption in computing the IFFT using
> FFTW is creating the large complex array with mostly zeros. It takes longer
> to do this than compute the IFFT. Even if I could optimize the IFFT time to
> be very short, I'd still have the overhead of creating the input array.
> Maybe this is a Matlab issue, and it'll speed up once in C (anyone know?).

Matlab has a sparse array format -- how quick does that work?  It'll 
have to convert for IFFT, but it wouldn't have to for your sped-up version.

How are you creating the array?  In Scilab it'd be

my_array = zeros(bazzilion, 1);

with bazzilion equal to your desired (large) array size.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html

Reply by all4dsp ●January 28, 20112011-01-28

>On 01/27/2011 10:32 PM, all4dsp wrote:
>>> On Jan 28, 1:11=A0am, "all4dsp"<all4dsp@n_o_s_p_a_m.comcast.net> 
wrote:
>>>>> On Jan 28, 12:43=3DA0am, "all4dsp"<all4dsp@n_o_s_p_a_m.comcast.net>
>>>>> wrote:
>>>>>> Hi,
>>>>
>>>>>> I need to IDFT a large column array where only something like
>> 0.00001%
>>>> of
>>>>>> the FFT coefficients are non-zero. I've tried using the definition
>> IDF=
>>> T
>>>> b=3D
>>>>> ut
>>>>>> find that Matlab (or, FFTW) IFFT function is still faster.
>>>>
>>>>>> As both of these methods are still too slow, I wonder if anyone
>> could
>>>>>> recommend a pruned IDFT algorithm to speed things up while
>> maintaining
>>>>>> accuracy. My largest array size would be 536M rows x 1 column, and
>> up
>>>> to
>>>>>> 4000 non-zero FFT coefficients. The data is all real. I'll
implement
>> i=
>>> n
>>>>>> Matlab as well as C.
>>>>
>>>>> is there any rhyme or reason to which 4K bins outa 536M are being
>>>>> used? =A0is there any grouping or pattern or are they at "random"
>>>>> frequencies?
>>>>
>>>>> r b-j
>>>>
>>>>>> Thanks in advance for any comments.
>>>>
>>>> There are two applications:
>>>>
>>>> The first application has chronological groupings of 4-20 bins, each
>> grou=
>>> p
>>>> appearing randomly throughout the spectrum.
>>>
>>> do you mean "contiguous" groupings of 4-20 bins?
>>>
>>>> For example if there are 100M
>>>> FFT coefficients, the first three groups have bin indices 104-108,
>>>> 42812-42822, 89352-89359 (etc., there may be 200 hundred or so
groups)
>> ar=
>>> e
>>>> non-zero.
>>>
>>> can two (or more) groupings overlap?  if they do, are they identified
>>> simply as a single grouping?
>>>
>>>> The second application would be one spectrum containing only one
>> group's
>>>> indices. For example, the first spectrum would contain only bins
>> 104-108
>>>> non-zero. A separate analysis/spectrum would contain only bins
>> 42812-4282=
>>> 2
>>>> bins non-zero. etc. For each group there would be a separate analysis
>>>> containing only that groups non-zero indices.
>>>
>>> i dunno, perhaps having separate, optimized FFTs for each possible
>>> group size (like a little FFT for 4 bins, another for 8 bins, another
>>> for 16 bins, and others for 6 or 12 or 20 bins).  you can FFT the
>>> group (as if it started at bin 0), and then adjust the group's phase
>>> (with one big phase offset for each group), depending on where the
>>> group is offset into the whole input spectrum.
>>>
>>> r b-j
>>>
>>
>>
>>
>>
>>> do you mean "contiguous" groupings of 4-20 bins?
>>
>> YES
>>
>>> can two (or more) groupings overlap?  if they do, are they identified
>>> simply as a single grouping?
>>
>> NO
>>
>>
>> (note, I have the FFT coefficients and I'm trying to get the IDFT
>> computed)
>>
>> Interestingly, the dominant time consumption in computing the IFFT
using
>> FFTW is creating the large complex array with mostly zeros. It takes
longer
>> to do this than compute the IFFT. Even if I could optimize the IFFT time
to
>> be very short, I'd still have the overhead of creating the input array.
>> Maybe this is a Matlab issue, and it'll speed up once in C (anyone
know?).
>
>Matlab has a sparse array format -- how quick does that work?  It'll 
>have to convert for IFFT, but it wouldn't have to for your sped-up
version.
>
>How are you creating the array?  In Scilab it'd be
>
>my_array = zeros(bazzilion, 1);
>
>with bazzilion equal to your desired (large) array size.
>
>-- 
>
>Tim Wescott
>Wescott Design Services
>http://www.wescottdesign.com
>
>Do you need to implement control loops in software?
>"Applied Control Theory for Embedded Systems" was written for you.
>See details at http://www.wescottdesign.com/actfes/actfes.html


I keep it pretty simple. The IDFT definition implementation in Matlab looks
something like:

for mm = 1 : length(bins) 
    ttrend =  ttrend + 2*real( fftcoef(mm).*(
exp(1j*2*pi*(bins(mm)-1)*(nn-1)/FFTn) ));
end

I could even vectorize the loop using a matrix:

ttrend =  2*real( ( exp(1j*2*pi*(nn-1)*(bins-1)/FFTn) )*fftcoef);

but I found filling the memory with the matrix increases execution time 5x,
so I leave it as a for loop above. Either way there's no need for sparse
arrays/matrices. Keeping the exp function is faster than expanding into sin
and cos, even if you only keep the two real terms after complex
multiplication with fftcoef.

The FFTW way is:

fft_dj = zeros(FFTn, 1);  % preallocate; array turns complex once populate
it with complex numbers
fft_dj(bins) = fftcoef; % populate first half of spectrum including Nyquist
bin
fft_dj(FFTn-bins+2) = conj(fft_dj(bins)); % populate other half of
spectrum
timetrend = real(ifft(fft_dj));

Unfortunately I don't know C so I'm having trouble creating a mex file for
the IDFT definition method. If anyone's willing to convert it I can supply
the full code, which is only 10 lines or so.

Reply by Tim Wescott ●January 28, 20112011-01-28

On 01/28/2011 08:36 AM, all4dsp wrote:
>> On 01/27/2011 10:32 PM, all4dsp wrote:
>>>> On Jan 28, 1:11=A0am, "all4dsp"<all4dsp@n_o_s_p_a_m.comcast.net>
> wrote:
>>>>>> On Jan 28, 12:43=3DA0am, "all4dsp"<all4dsp@n_o_s_p_a_m.comcast.net>
>>>>>> wrote:
>>>>>>> Hi,
>>>>>
>>>>>>> I need to IDFT a large column array where only something like
>>> 0.00001%
>>>>> of
>>>>>>> the FFT coefficients are non-zero. I've tried using the definition
>>> IDF=
>>>> T
>>>>> b=3D
>>>>>> ut
>>>>>>> find that Matlab (or, FFTW) IFFT function is still faster.
>>>>>
>>>>>>> As both of these methods are still too slow, I wonder if anyone
>>> could
>>>>>>> recommend a pruned IDFT algorithm to speed things up while
>>> maintaining
>>>>>>> accuracy. My largest array size would be 536M rows x 1 column, and
>>> up
>>>>> to
>>>>>>> 4000 non-zero FFT coefficients. The data is all real. I'll
> implement
>>> i=
>>>> n
>>>>>>> Matlab as well as C.
>>>>>
>>>>>> is there any rhyme or reason to which 4K bins outa 536M are being
>>>>>> used? =A0is there any grouping or pattern or are they at "random"
>>>>>> frequencies?
>>>>>
>>>>>> r b-j
>>>>>
>>>>>>> Thanks in advance for any comments.
>>>>>
>>>>> There are two applications:
>>>>>
>>>>> The first application has chronological groupings of 4-20 bins, each
>>> grou=
>>>> p
>>>>> appearing randomly throughout the spectrum.
>>>>
>>>> do you mean "contiguous" groupings of 4-20 bins?
>>>>
>>>>> For example if there are 100M
>>>>> FFT coefficients, the first three groups have bin indices 104-108,
>>>>> 42812-42822, 89352-89359 (etc., there may be 200 hundred or so
> groups)
>>> ar=
>>>> e
>>>>> non-zero.
>>>>
>>>> can two (or more) groupings overlap?  if they do, are they identified
>>>> simply as a single grouping?
>>>>
>>>>> The second application would be one spectrum containing only one
>>> group's
>>>>> indices. For example, the first spectrum would contain only bins
>>> 104-108
>>>>> non-zero. A separate analysis/spectrum would contain only bins
>>> 42812-4282=
>>>> 2
>>>>> bins non-zero. etc. For each group there would be a separate analysis
>>>>> containing only that groups non-zero indices.
>>>>
>>>> i dunno, perhaps having separate, optimized FFTs for each possible
>>>> group size (like a little FFT for 4 bins, another for 8 bins, another
>>>> for 16 bins, and others for 6 or 12 or 20 bins).  you can FFT the
>>>> group (as if it started at bin 0), and then adjust the group's phase
>>>> (with one big phase offset for each group), depending on where the
>>>> group is offset into the whole input spectrum.
>>>>
>>>> r b-j
>>>>
>>>
>>>
>>>
>>>
>>>> do you mean "contiguous" groupings of 4-20 bins?
>>>
>>> YES
>>>
>>>> can two (or more) groupings overlap?  if they do, are they identified
>>>> simply as a single grouping?
>>>
>>> NO
>>>
>>>
>>> (note, I have the FFT coefficients and I'm trying to get the IDFT
>>> computed)
>>>
>>> Interestingly, the dominant time consumption in computing the IFFT
> using
>>> FFTW is creating the large complex array with mostly zeros. It takes
> longer
>>> to do this than compute the IFFT. Even if I could optimize the IFFT time
> to
>>> be very short, I'd still have the overhead of creating the input array.
>>> Maybe this is a Matlab issue, and it'll speed up once in C (anyone
> know?).
>>
>> Matlab has a sparse array format -- how quick does that work?  It'll
>> have to convert for IFFT, but it wouldn't have to for your sped-up
> version.
>>
>> How are you creating the array?  In Scilab it'd be
>>
>> my_array = zeros(bazzilion, 1);
>>
>> with bazzilion equal to your desired (large) array size.
>>
>> --
>>
>> Tim Wescott
>> Wescott Design Services
>> http://www.wescottdesign.com
>>
>> Do you need to implement control loops in software?
>> "Applied Control Theory for Embedded Systems" was written for you.
>> See details at http://www.wescottdesign.com/actfes/actfes.html
>
>
> I keep it pretty simple. The IDFT definition implementation in Matlab looks
> something like:
>
> for mm = 1 : length(bins)
>      ttrend =  ttrend + 2*real( fftcoef(mm).*(
> exp(1j*2*pi*(bins(mm)-1)*(nn-1)/FFTn) ));
> end
>
> I could even vectorize the loop using a matrix:
>
> ttrend =  2*real( ( exp(1j*2*pi*(nn-1)*(bins-1)/FFTn) )*fftcoef);
>
> but I found filling the memory with the matrix increases execution time 5x,
> so I leave it as a for loop above. Either way there's no need for sparse
> arrays/matrices. Keeping the exp function is faster than expanding into sin
> and cos, even if you only keep the two real terms after complex
> multiplication with fftcoef.
>
> The FFTW way is:
>
> fft_dj = zeros(FFTn, 1);  % preallocate; array turns complex once populate
> it with complex numbers
> fft_dj(bins) = fftcoef; % populate first half of spectrum including Nyquist
> bin
> fft_dj(FFTn-bins+2) = conj(fft_dj(bins)); % populate other half of
> spectrum
> timetrend = real(ifft(fft_dj));
>
> Unfortunately I don't know C so I'm having trouble creating a mex file for
> the IDFT definition method. If anyone's willing to convert it I can supply
> the full code, which is only 10 lines or so.
>
Unless there's someone out there that does MEX files all the time 
this'll take an inordinate amount of time -- there's probably an hour or 
less to actually implement the algorithm, but possibly a day in all 
screwing around getting everything to compile correctly.

It might be time to learn C...

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html

Reply by Greg Heath ●January 30, 20112011-01-30

On Jan 28, 1:32&#4294967295;am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net> wrote:
> >On Jan 28, 1:11=A0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net> wrote:
> >> >On Jan 28, 12:43=3DA0am, "all4dsp" <all4dsp@n_o_s_p_a_m.comcast.net>
> >> >wrote:
> >> >> Hi,
>
> >> >> I need to IDFT a large column array where only something like
> 0.00001%
> >> of
> >> >> the FFT coefficients are non-zero. I've tried using the definition
> IDF=
> >T
> >> b=3D
> >> >ut
> >> >> find that Matlab (or, FFTW) IFFT function is still faster.
>
> >> >> As both of these methods are still too slow, I wonder if anyone
> could
> >> >> recommend a pruned IDFT algorithm to speed things up while
> maintaining
> >> >> accuracy. My largest array size would be 536M rows x 1 column, and
> up
> >> to
> >> >> 4000 non-zero FFT coefficients. The data is all real. I'll implement
> i=
> >n
> >> >> Matlab as well as C.
>
> >> >is there any rhyme or reason to which 4K bins outa 536M are being
> >> >used? =A0is there any grouping or pattern or are they at "random"
> >> >frequencies?
>
> >> >r b-j
>
> >> >> Thanks in advance for any comments.
>
> >> There are two applications:
>
> >> The first application has chronological groupings of 4-20 bins, each
> grou=
> >p
> >> appearing randomly throughout the spectrum.
>
> >do you mean "contiguous" groupings of 4-20 bins?
>
> >> For example if there are 100M
> >> FFT coefficients, the first three groups have bin indices 104-108,
> >> 42812-42822, 89352-89359 (etc., there may be 200 hundred or so groups)
> ar=
> >e
> >> non-zero.
>
> >can two (or more) groupings overlap? &#4294967295;if they do, are they identified
> >simply as a single grouping?
>
> >> The second application would be one spectrum containing only one
> group's
> >> indices. For example, the first spectrum would contain only bins
> 104-108
> >> non-zero. A separate analysis/spectrum would contain only bins
> 42812-4282=
> >2
> >> bins non-zero. etc. For each group there would be a separate analysis
> >> containing only that groups non-zero indices.
>
> >i dunno, perhaps having separate, optimized FFTs for each possible
> >group size (like a little FFT for 4 bins, another for 8 bins, another
> >for 16 bins, and others for 6 or 12 or 20 bins). &#4294967295;you can FFT the
> >group (as if it started at bin 0), and then adjust the group's phase
> >(with one big phase offset for each group), depending on where the
> >group is offset into the whole input spectrum.
>
> >r b-j
>
> >do you mean "contiguous" groupings of 4-20 bins?
>
> YES
>
> >can two (or more) groupings overlap? &#4294967295;if they do, are they identified
> >simply as a single grouping?
>
> NO
>
> (note, I have the FFT coefficients and I'm trying to get the IDFT
> computed)
>
> Interestingly, the dominant time consumption in computing the IFFT using
> FFTW is creating the large complex array with mostly zeros. It takes longer
> to do this than compute the IFFT. Even if I could optimize the IFFT time to
> be very short, I'd still have the overhead of creating the input array.
> Maybe this is a Matlab issue, and it'll speed up once in C (anyone know

In MATLAB, even if X is complex, do not initialize with

clear all
tic
for i = 1:100
   X = zeros(20e6,1);
end
toc    % elapsed_time = 25.891

%Instead, just use

clear all
tic
for i = 1:100
   X(20e6,1) = 0;
end
toc   % elapsed_time = 0.25

Then fill in the small number of nonzero complex components.

Hope this helps.

Greg

Previous12 Next

looking for straightforward pruned IDFT algorithm reference

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group