comp.dsp | fixed-point real FFT for embedded

Hi !

I'm working on real-time speech recongnition application for embedded
systems (with ARM compatible processor).
I'm looking for a very fast (for real-time) real FFT fixed-point
version, with 16-32 bit input percision, up to 512 array-size.

I found fftw to be one of the fastest, but no fixed-point version :(

help would be very appreciated.

Regards,
Shlomi

Reply by Mark Borgerding ●April 4, 20042004-04-04

Shlomi wrote:
> Hi !
> 
> I'm working on real-time speech recongnition application for embedded
> systems (with ARM compatible processor).
> I'm looking for a very fast (for real-time) real FFT fixed-point
> version, with 16-32 bit input percision, up to 512 array-size.
> 
> I found fftw to be one of the fastest, but no fixed-point version :(
> 
> help would be very appreciated.
> 
> Regards,
> Shlomi

Check out KISSFFT on sourceforge.net:
* ANSI C
* floating or fixed point processing
* optional code for real-only fft (see tools/kiss_fftr.c )
* small code size
* BSD open source license

The fixed point code currently uses Q15 format.  It could easily be made 
to work with Q31.  Beware, Q31 multiplies may be very slow without a 64 
bit accumulator.

There should be a version coming out rsn that will be more 
embedded-friendly, thanks to Frank van der Hulst.  He is using kissfft 
on a microcontroller that makes many ARMs look like supercomputers 
(Renesas M16c).

-- Mark Borgerding

Reply by Mark Borgerding ●April 4, 20042004-04-04

 >Shlomi wrote:
 >
 > Hi !
 >
 > I'm working on real-time speech recongnition application for embedded
 > systems (with ARM compatible processor).
 > I'm looking for a very fast (for real-time) real FFT fixed-point
 > version, with 16-32 bit input percision, up to 512 array-size.

I just released KISSFFT version 1.2.1, which focuses on embedded & code 
size issues.

See
http://sourceforge.net/projects/kissfft/

At the risk of dissuading you from using my fine little library ...
  Have you looked on developer.intel.com for FFT libraries for your 
flavor of ARM?  If they provide an FFT, it is *gulp* probably better 
than portable C code such as kissfft.  Using assembly code, one can most 
efficiently implement bit-reversed indexing, overflow managment, 
rounding, scaling, etc.

-- Mark Borgerding

Reply by Shlomi ●April 7, 20042004-04-07

Mark Borgerding <mark@borgerding.net> wrote in message news:<og0cc.116831$8G2.81413@fe3.columbus.rr.com>...
> >Shlomi wrote:
>  >
>  > Hi !
>  >
>  > I'm working on real-time speech recongnition application for embedded
>  > systems (with ARM compatible processor).
>  > I'm looking for a very fast (for real-time) real FFT fixed-point
>  > version, with 16-32 bit input percision, up to 512 array-size.
> 
> 
> I just released KISSFFT version 1.2.1, which focuses on embedded & code 
> size issues.
> 
> See
> http://sourceforge.net/projects/kissfft/
> 
> 
> At the risk of dissuading you from using my fine little library ...
>   Have you looked on developer.intel.com for FFT libraries for your 
> flavor of ARM?  If they provide an FFT, it is *gulp* probably better 
> than portable C code such as kissfft.  Using assembly code, one can most 
> efficiently implement bit-reversed indexing, overflow managment, 
> rounding, scaling, etc.
> 
> -- Mark Borgerding

Thanks Mark, 
I'm going to try your library, though with 32 bits accuracy,
and check it on the iPaq.
but I need some other fixed-point sources to compare it with.

I think we should do a contest for fixed-point FFT by benchmarking on
iPaq/Palm (Windows CE/Palm OS) etc..
what do you think ? you have a good starting point with your _small_
library !

if anyone is up to the challenge please post here,
I believe the embedded community will definitely harvest the fruits.

good luck !

Reply by Robert Scott ●April 7, 20042004-04-07

Mark Borgerding wrote:

>...Using assembly code, one can most 
>efficiently implement bit-reversed indexing, overflow managment, 
>rounding, scaling, etc.

As for bit-reversed indexing, here is a C-implementation that is as
fast as can be because it uses table look-up for most of the
reversals:

// Reorder the DataArray[] array according to bit-reversed indices.
baseN = dataN/128;
for(RegularBase=0; RegularBase<baseN; RegularBase+=2)
{
    unsigned int RemainingRegularBase;
    unsigned int Regular;
    int BitsToMove,seven;
 static const unsigned char bitrev128[] = {
  0,128, 64,192, 32,160, 96,224, 16,144, 80,208, 48,176,112,240,
  8,136, 72,200, 40,168,104,232, 24,152, 88,216, 56,184,120,248,
  4,132, 68,196, 36,164,100,228, 20,148, 84,212, 52,180,116,244,
 12,140, 76,204, 44,172,108,236, 28,156, 92,220, 60,188,124,252,
  2,130, 66,194, 34,162, 98,226, 18,146, 82,210, 50,178,114,242,
 10,138, 74,202, 42,170,106,234, 26,154, 90,218, 58,186,122,250,
  6,134, 70,198, 38,166,102,230, 22,150, 86,214, 54,182,118,246,
 14,142, 78,206, 46,174,110,238, 30,158, 94,222, 62,190,126,254};
ReversedBase=0;
RemainingRegularBase = RegularBase>>1;
BitsToMove = nBits - 8;
while(BitsToMove >= 7)
{
  ReversedBase = (ReversedBase<<7) | 
    bitrev128[RemainingRegularBase & 127];
  RemainingRegularBase >>=7;
  BitsToMove -= 7;
}
while(BitsToMove)
{
  ReversedBase = (ReversedBase|(RemainingRegularBase&1)) << 1;
  RemainingRegularBase >>= 1;
  BitsToMove--;
}
ReversedBase <<= 7;
for(seven=0,Regular=RegularBase; seven<128; seven++,Regular+=baseN)
{
    unsigned int Reversed;
  Reversed = ReversedBase | bitrev128[seven];
  if(Regular < Reversed)
  {
    DataType t;
    t = DataPtr[Regular];
    DataPtr[Regular] = DataPtr[Reversed];
    DataPtr[Reversed] = t;
    t = DataPtr[Regular+1];
    DataPtr[Regular+1] = DataPtr[Reversed+1];
    DataPtr[Reversed+1] = t;
  }
}


-Robert Scott
 Ypsilanti, Michigan
(Reply through this forum, not by direct e-mail to me, as automatic reply address is fake.)

fixed-point real FFT for embedded

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group