comp.dsp | Large FFT --- NOT radix-2

Any pointers to doing non-radix 2 FFT's.
Specifically I wish to do 44100 pt FFT's on 16 bit data.
Simplicity of coding OUTWEIGHS speed!
[ Scilab is slow but adequate ]

Reply by Rick Lyons ●January 1, 20042004-01-01

On Thu, 01 Jan 2004 18:39:31 -0600, Richard Owlett
<rowlett@atlascomm.net> wrote:

>Any pointers to doing non-radix 2 FFT's.
>Specifically I wish to do 44100 pt FFT's on 16 bit data.
>Simplicity of coding OUTWEIGHS speed!
>[ Scilab is slow but adequate ]

Hi Richard,
  I'm ignorant of these schemes,
but ya' might do an Internet search 
on "Prime Factor" and "Split-radix"
FFTs.
Good Luck,

[-Rick-]

Reply by Fred Marshall ●January 2, 20042004-01-02

"Richard Owlett" <rowlett@atlascomm.net> wrote in message
news:vv9fams78p7438@corp.supernews.com...
> Any pointers to doing non-radix 2 FFT's.
> Specifically I wish to do 44100 pt FFT's on 16 bit data.
> Simplicity of coding OUTWEIGHS speed!
> [ Scilab is slow but adequate ]
>

Doesn't SCILAB just do them?  I tried it and it seems to.  It took 0.421
seconds on a dual processor Pentium 200MHz.

Fred

Reply by Richard Owlett ●January 2, 20042004-01-02

Fred Marshall wrote:

> "Richard Owlett" <rowlett@atlascomm.net> wrote in message
> news:vv9fams78p7438@corp.supernews.com...
> 
>>Any pointers to doing non-radix 2 FFT's.
>>Specifically I wish to do 44100 pt FFT's on 16 bit data.
>>Simplicity of coding OUTWEIGHS speed!
>>[ Scilab is slow but adequate ]
>>
> 
> 
> Doesn't SCILAB just do them?  I tried it and it seems to.  It took 0.421
> seconds on a dual processor Pentium 200MHz.
> 
> Fred
> 
> 

I'm going to have a second look at the framework I was using. My 
Pentium 4 1.6 GHz seems to do it no faster than your machine. Well, I 
did do ~1800 of them in a batch ;}

BTW, it's already been pointed out to me in another forum that I could 
make my life simpler by padding to 64k ;/

Reply by Richard Owlett ●January 2, 20042004-01-02

Richard Owlett wrote:
> Fred Marshall wrote:
> 
>> "Richard Owlett" <rowlett@atlascomm.net> wrote in message
>> news:vv9fams78p7438@corp.supernews.com...
>>
>>> Any pointers to doing non-radix 2 FFT's.
>>> Specifically I wish to do 44100 pt FFT's on 16 bit data.
>>> Simplicity of coding OUTWEIGHS speed!
>>> [ Scilab is slow but adequate ]
>>>
>>
>>
>> Doesn't SCILAB just do them?  I tried it and it seems to.  It took 0.421
>> seconds on a dual processor Pentium 200MHz.
>>
>> Fred
>>
>>
> 
> I'm going to have a second look at the framework I was using. My Pentium 
> 4 1.6 GHz seems to do it no faster than your machine. Well, I did do 
> ~1800 of them in a batch ;}
> 
> BTW, it's already been pointed out to me in another forum that I could 
> make my life simpler by padding to 64k ;/
> 
> 

Just did a timing check using Scilab's timer(). My FFT's usually take 
.14 seconds with a few as fast as .125 seconds. Not the speed 
improvement I would expect. What's your OS? I'm using Winxp Pro SP1.

Doing the timing pointed out another problem with my setup. It takes 
.94 seconds to move the 44k samples from the input data array to the 
array on which the FFT is performed.

Reply by Alex Gibson ●January 2, 20042004-01-02

"Richard Owlett" <rowlett@atlascomm.net> wrote in message
news:vvar7epmfqmuee@corp.supernews.com...
> Richard Owlett wrote:
> > Fred Marshall wrote:
> >
> >> "Richard Owlett" <rowlett@atlascomm.net> wrote in message
> >> news:vv9fams78p7438@corp.supernews.com...
> >>
> >>> Any pointers to doing non-radix 2 FFT's.
> >>> Specifically I wish to do 44100 pt FFT's on 16 bit data.
> >>> Simplicity of coding OUTWEIGHS speed!
> >>> [ Scilab is slow but adequate ]
> >>>
> >>
> >>
> >> Doesn't SCILAB just do them?  I tried it and it seems to.  It took
0.421
> >> seconds on a dual processor Pentium 200MHz.
> >>
> >> Fred
> >>
> >>
> >
> > I'm going to have a second look at the framework I was using. My Pentium
> > 4 1.6 GHz seems to do it no faster than your machine. Well, I did do
> > ~1800 of them in a batch ;}
> >
> > BTW, it's already been pointed out to me in another forum that I could
> > make my life simpler by padding to 64k ;/
> >
> >
>
> Just did a timing check using Scilab's timer(). My FFT's usually take
> .14 seconds with a few as fast as .125 seconds. Not the speed
> improvement I would expect. What's your OS? I'm using Winxp Pro SP1.
>
> Doing the timing pointed out another problem with my setup. It takes
> .94 seconds to move the 44k samples from the input data array to the
> array on which the FFT is performed.
>

Well early p4's (most p4's) aren't very good at floating point
unless your code is recompiled for sse2.

Athlon's do a lot better with x87 floating point.

(athlon64 + 1 or 2 GB ram  are looking like a real nice upgrade
for a lot of things not just gaming)

A p3 at 1.3GHz will beat a p4 1.6GHz
on somethings about equal on others.

A p4 should beat a dual pentium 200 on most things
(should even beat dual pentium pro 200)

How much memory do you have installed ?
With xp of any sort 256MB is the absolute minimum for any
serious sort of work.
Especially as xp itself usually grabs around 128MB.

Also what else is running in the background ?
Did you kill off fast.exe which is the background loader for internet
explorer.

Been a while since I last used / ran scilab.

Does scilab use atlas ?
If so is it configured for your machine ?
http://math-atlas.sourceforge.net/
I known matlab uses it for some things.

Or have your recompiled scilab to tune it for
the machine your on ?
Shouldn't be necessary for most machines.

Alex Gibson

Reply by Fred Marshall ●January 2, 20042004-01-02

"Richard Owlett" <rowlett@atlascomm.net> wrote in message
news:vvar7epmfqmuee@corp.supernews.com...
> Richard Owlett wrote:
> > Fred Marshall wrote:
> >
> >> "Richard Owlett" <rowlett@atlascomm.net> wrote in message
> >> news:vv9fams78p7438@corp.supernews.com...
> >>
> >>> Any pointers to doing non-radix 2 FFT's.
> >>> Specifically I wish to do 44100 pt FFT's on 16 bit data.
> >>> Simplicity of coding OUTWEIGHS speed!
> >>> [ Scilab is slow but adequate ]
> >>>
> >>
> >>
> >> Doesn't SCILAB just do them?  I tried it and it seems to.  It took
0.421
> >> seconds on a dual processor Pentium 200MHz.
> >>
> >> Fred
> >>
> >>
> >
> > I'm going to have a second look at the framework I was using. My Pentium
> > 4 1.6 GHz seems to do it no faster than your machine. Well, I did do
> > ~1800 of them in a batch ;}
> >
> > BTW, it's already been pointed out to me in another forum that I could
> > make my life simpler by padding to 64k ;/
> >
> >
>
> Just did a timing check using Scilab's timer(). My FFT's usually take
> .14 seconds with a few as fast as .125 seconds. Not the speed
> improvement I would expect. What's your OS? I'm using Winxp Pro SP1.
>
> Doing the timing pointed out another problem with my setup. It takes
> .94 seconds to move the 44k samples from the input data array to the
> array on which the FFT is performed.

Richard,

First, the test I did was structured like this:

time=0
i=1
p(1:44100)=1
while i<=100
   timer();
   pf=fft(p,-1);
   time=time+timer()
   i=i+1
end

So, there's no moving of arrrays in this code unless it's being done in the
fft - it's just a simple timing loop.
The machine I ran it on runs NT4sp6a.  Well it *is* only a 200MHz machine!
But it may be able to use both processors in this case - I don't know and
would frankly be a little surprised if it did.  Oh, but it appears that it
is using both processors! So, either the speed difference would be 4
assuming you have a single processor machine and we're seeing 3.5 which is
close enough to 4 for me.  Maybe XP is using more resources than NT.

I do note that 44100 breaks into a series of twice repeated primes 2,3,5,7 -
so if you want to do something fancier, maybe you can take advantage of
this - but I'm not at all sure it will be worth it.  You *did* say that
speed isn't the first issue.

Just for fun, I tried decreasing the array size to
32768 -> 0.235 secs.
44100 -> 0.437 secs.
65536 -> 0.781 secs.

I don't know, it looks like there must be some swapping going on for the
largest array:

44100/32768 = 1.346    .437/.235 = 1.859  1.859/1.346 = 1.381 factor over
linear increase

65536/44100 = 1.486    .781/.437 = 1.787  1.787/1.486 = 1.2 factor over
linear increase

65536/32768 = 2        .781/.235 = 3.32   3.32/2 = 1.66 factor over linear
increase.

N*log2(N) @ 32768 = 32768*15 = 491,520
          @ 44100 = 44100*15.43 = 680,463 (just for reference even though
it's not a power of 2)
          @ 65536 = 65536*16 = 1,048,576

44/32 > 680,463/491,520   = 1.384  predicted vs. 1.859 measured; or 1.34
higher than predicted increase
65/44 > 1,048,576/680,463 = 1.54   predicted vs. 1.787 measured; or 1.16
higher than predicted increase
65/32 > 1,048,576/491,520 = 2.133  predicted vs. 3.32  measured; or 1.56
higher than predicted increase

If we assume that 44100 would be less efficient, then it makes sense that
44/32 is higher than 65/44 doesn't it?  (That is, if 44100 is less efficient
it should be closer to 65K and further from 32K).  What I don't figure yet
is why 65/32 would be the worst situation.  Perhaps it's because there's
swapping going on with such a large array - I don't know, this system has
192MB of memory.

Fred

Reply by Fred Marshall ●January 2, 20042004-01-02

"Richard Owlett" <rowlett@atlascomm.net> wrote in message
news:vvar7epmfqmuee@corp.supernews.com...
> Richard Owlett wrote:
> > Fred Marshall wrote:
> >
> >> "Richard Owlett" <rowlett@atlascomm.net> wrote in message
> >> news:vv9fams78p7438@corp.supernews.com...
> >>
> >>> Any pointers to doing non-radix 2 FFT's.
> >>> Specifically I wish to do 44100 pt FFT's on 16 bit data.
> >>> Simplicity of coding OUTWEIGHS speed!
> >>> [ Scilab is slow but adequate ]
> >>>
> >>
> >>
> >> Doesn't SCILAB just do them?  I tried it and it seems to.  It took
0.421
> >> seconds on a dual processor Pentium 200MHz.
> >>
> >> Fred
> >>
> >>
> >
> > I'm going to have a second look at the framework I was using. My Pentium
> > 4 1.6 GHz seems to do it no faster than your machine. Well, I did do
> > ~1800 of them in a batch ;}
> >
> > BTW, it's already been pointed out to me in another forum that I could
> > make my life simpler by padding to 64k ;/
> >
> >

Richard,

I forgot to address the padding to 64K.  I don't see why you'd want to do
that unless there performance was greatly affected or unless you need to
zero pad for other reasons - like for circular convolution considerations.
44100 is a product of repeat numbers of primes: 2x2x3x3x5x5x7x7 so its
performance might be expected to be OK if the implemenation is good.  It
appears it is in Scilab.  On the other hand, try 4091 or 4093!!

Here's the recent code I used.  The runs times come out a little bit longer:

function [rowtime,coltime]=fftloop(N,loops)
i=1;
rowtime=0;
coltime=0;
while i<=loops
   p=rand(1,N);
   timer()
   pf=fft(p,-1);
   rowtime=rowtime+timer();
   i=i+1;
end
rowtime=rowtime/loops
i=1;
while i<=loops
   q=rand(N,1);
   timer();
   pf=fft(q,-1);
   coltime=coltime+timer();
   i=i+1;
end
coltime=coltime/loops

Fred

Reply by Steven G. Johnson ●January 2, 20042004-01-02

r.lyons@REMOVE.ieee.org (Rick Lyons) wrote...
> Richard Owlett <rowlett@atlascomm.net> wrote:
> >Any pointers to doing non-radix 2 FFT's.
> >Specifically I wish to do 44100 pt FFT's on 16 bit data.
> >Simplicity of coding OUTWEIGHS speed!
> 
>   I'm ignorant of these schemes,
> but ya' might do an Internet search 
> on "Prime Factor" and "Split-radix"
> FFTs.

The split-radix algorithm is essentially a combination of radices 2
and 4, and only works for power-of-two sizes.  The Prime-Factor
Algorithm (PFA) only decomposes a DFT into relatively prime factors
(so e.g. it cannot further decompose the factor of 7^2 in 44100) and
is fairly complicated to implement.

What the original poster wants is a "mixed-radix" FFT, which is simply
the generalized version of the standard Cooley-Tukey algorithm to
arbitrary factorizations.  Many implementations can be found online
(see www.fftw.org/links.html)

Of course, for spectral estimation you should also be doing windowing,
etcetera, in order to reduce edge artifacts.  Ditto if you are doing
zero-padding.  See e.g. Oppenheim and Schafer, "Digital Signal
Processing".

Cordially,
Steven G. Johnson

Reply by Mark Borgerding ●January 4, 20042004-01-04

Richard Owlett <rowlett@atlascomm.net> wrote in message news:<vv9fams78p7438@corp.supernews.com>...
> Any pointers to doing non-radix 2 FFT's.
> Specifically I wish to do 44100 pt FFT's on 16 bit data.
> Simplicity of coding OUTWEIGHS speed!
> [ Scilab is slow but adequate ]

I'm not sure what form of solution you seek.  A script? A program? A
manual method?

If you want to make a C program to accomplish the task, you could use
my KISS FFT library ( http://sourceforge.net/projects/kissfft/ ),
specifically designed for simplicity.

If you know Python (great lang imho), you could use the FFT module,
bundled with NumPy/Numeric.

Another scripting alternative is octave, which mostly works like
Matlab.

Good Luck,
Mark Borgerding

Previous12 3 Next

Large FFT --- NOT radix-2

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group