Hi ! I'm working on real-time speech recongnition application for embedded systems (with ARM compatible processor). I'm looking for a very fast (for real-time) real FFT fixed-point version, with 16-32 bit input percision, up to 512 array-size. I found fftw to be one of the fastest, but no fixed-point version :( help would be very appreciated. Regards, Shlomi
fixed-point real FFT for embedded
Started by ●April 4, 2004
Reply by ●April 4, 20042004-04-04
Shlomi wrote:> Hi ! > > I'm working on real-time speech recongnition application for embedded > systems (with ARM compatible processor). > I'm looking for a very fast (for real-time) real FFT fixed-point > version, with 16-32 bit input percision, up to 512 array-size. > > I found fftw to be one of the fastest, but no fixed-point version :( > > help would be very appreciated. > > Regards, > ShlomiCheck out KISSFFT on sourceforge.net: * ANSI C * floating or fixed point processing * optional code for real-only fft (see tools/kiss_fftr.c ) * small code size * BSD open source license The fixed point code currently uses Q15 format. It could easily be made to work with Q31. Beware, Q31 multiplies may be very slow without a 64 bit accumulator. There should be a version coming out rsn that will be more embedded-friendly, thanks to Frank van der Hulst. He is using kissfft on a microcontroller that makes many ARMs look like supercomputers (Renesas M16c). -- Mark Borgerding
Reply by ●April 4, 20042004-04-04
>Shlomi wrote: > > Hi ! > > I'm working on real-time speech recongnition application for embedded > systems (with ARM compatible processor). > I'm looking for a very fast (for real-time) real FFT fixed-point > version, with 16-32 bit input percision, up to 512 array-size. I just released KISSFFT version 1.2.1, which focuses on embedded & code size issues. See http://sourceforge.net/projects/kissfft/ At the risk of dissuading you from using my fine little library ... Have you looked on developer.intel.com for FFT libraries for your flavor of ARM? If they provide an FFT, it is *gulp* probably better than portable C code such as kissfft. Using assembly code, one can most efficiently implement bit-reversed indexing, overflow managment, rounding, scaling, etc. -- Mark Borgerding
Reply by ●April 7, 20042004-04-07
Mark Borgerding <mark@borgerding.net> wrote in message news:<og0cc.116831$8G2.81413@fe3.columbus.rr.com>...> >Shlomi wrote: > > > > Hi ! > > > > I'm working on real-time speech recongnition application for embedded > > systems (with ARM compatible processor). > > I'm looking for a very fast (for real-time) real FFT fixed-point > > version, with 16-32 bit input percision, up to 512 array-size. > > > I just released KISSFFT version 1.2.1, which focuses on embedded & code > size issues. > > See > http://sourceforge.net/projects/kissfft/ > > > At the risk of dissuading you from using my fine little library ... > Have you looked on developer.intel.com for FFT libraries for your > flavor of ARM? If they provide an FFT, it is *gulp* probably better > than portable C code such as kissfft. Using assembly code, one can most > efficiently implement bit-reversed indexing, overflow managment, > rounding, scaling, etc. > > -- Mark BorgerdingThanks Mark, I'm going to try your library, though with 32 bits accuracy, and check it on the iPaq. but I need some other fixed-point sources to compare it with. I think we should do a contest for fixed-point FFT by benchmarking on iPaq/Palm (Windows CE/Palm OS) etc.. what do you think ? you have a good starting point with your _small_ library ! if anyone is up to the challenge please post here, I believe the embedded community will definitely harvest the fruits. good luck !
Reply by ●April 7, 20042004-04-07
Mark Borgerding wrote:>...Using assembly code, one can most >efficiently implement bit-reversed indexing, overflow managment, >rounding, scaling, etc.As for bit-reversed indexing, here is a C-implementation that is as fast as can be because it uses table look-up for most of the reversals: // Reorder the DataArray[] array according to bit-reversed indices. baseN = dataN/128; for(RegularBase=0; RegularBase<baseN; RegularBase+=2) { unsigned int RemainingRegularBase; unsigned int Regular; int BitsToMove,seven; static const unsigned char bitrev128[] = { 0,128, 64,192, 32,160, 96,224, 16,144, 80,208, 48,176,112,240, 8,136, 72,200, 40,168,104,232, 24,152, 88,216, 56,184,120,248, 4,132, 68,196, 36,164,100,228, 20,148, 84,212, 52,180,116,244, 12,140, 76,204, 44,172,108,236, 28,156, 92,220, 60,188,124,252, 2,130, 66,194, 34,162, 98,226, 18,146, 82,210, 50,178,114,242, 10,138, 74,202, 42,170,106,234, 26,154, 90,218, 58,186,122,250, 6,134, 70,198, 38,166,102,230, 22,150, 86,214, 54,182,118,246, 14,142, 78,206, 46,174,110,238, 30,158, 94,222, 62,190,126,254}; ReversedBase=0; RemainingRegularBase = RegularBase>>1; BitsToMove = nBits - 8; while(BitsToMove >= 7) { ReversedBase = (ReversedBase<<7) | bitrev128[RemainingRegularBase & 127]; RemainingRegularBase >>=7; BitsToMove -= 7; } while(BitsToMove) { ReversedBase = (ReversedBase|(RemainingRegularBase&1)) << 1; RemainingRegularBase >>= 1; BitsToMove--; } ReversedBase <<= 7; for(seven=0,Regular=RegularBase; seven<128; seven++,Regular+=baseN) { unsigned int Reversed; Reversed = ReversedBase | bitrev128[seven]; if(Regular < Reversed) { DataType t; t = DataPtr[Regular]; DataPtr[Regular] = DataPtr[Reversed]; DataPtr[Reversed] = t; t = DataPtr[Regular+1]; DataPtr[Regular+1] = DataPtr[Reversed+1]; DataPtr[Reversed+1] = t; } } -Robert Scott Ypsilanti, Michigan (Reply through this forum, not by direct e-mail to me, as automatic reply address is fake.)