>>glen herrmannsfeldt wrote: >>> Jerry Avins <jya@ieee.org> wrote: >>> (big snip) >>> >>>> Zero padding doesn't change the returned values. The extra zeros are>>>> place holders and can be removed from the final result, which are >>>> otherwise bit exact. You should do a little reading (or use FFTW, >>>> reading the directions carefully and taking no steps for granted. >>> >>> It seems to me that in some cases zero padding is right, and other >>> times it is wrong. >>> >>> In this case, it would seem to me that you get the frequencies >>> at different points in the zero padded case. You could interpolate >>> them, assuming that it approximates a continuous spectrum. >>> >>> If you have 120 sample points over some time period, the FFT bins >>> will be 0, 1, 2, up to 119 cycles over that time period. >>> (or -60 to 59 if you like that better). >>> >>> If you zero pad, the frequency bins will be cycles over the >>> now extended time period. If you want to graph the spectrum >>> then that is probably fine. Also, if the samples represent one >>> period of a periodic signal, zero padding will give a different >>> result. >> >>I'm simply wrong. I was thinking FFT all along, not IFFT. Some days are>>like that. I owe apologies all around. Dspdummy should go with FFTW.I'm> >>sure Gordon Sande will offer advice if that becomes necessary. >> >>Jerry >>-- >>Engineering is the art of making what you want from things you can get. >>??????????????????????????????????????????????????????????????????????? >> >Hello Jerry, >No pb. I was talking fft and not ifft at the beginning because of their >mathematical similarites but I DO need an IFFT. I'll check out that FFTW >but who is Gordon Sande? >Thanks >BillHello again: That FFTW material is quite challenging to understand quickly but I'll go at it again later. In the meantime, I did an experiment using 64 15-pt dft's and then post twiddling the outputs for several passes to get the 960 pts. It works OK but of course the number of operations is probably not optimal. MIPS are based on number of mac required per second. Theoretical MIPS (log2(N)*N/2) =.2 15-pt dft experiment MIPS = .8 Theoretical 960-pt dft MIPS = 43 If you're interested I am including my Matlab scripts. I would be grateful for any comments on this experiment especially on how I can easily decrease the MIPS. Bill Matlab stuff: clear close all N=960; Nprime=15; randn('seed',0); x=randn(N,1); Ndft=N/Nprime; Npass=log2(Ndft); %----------------- do all the prime dft's -------- stride=Ndft; out1=[]; primecycles=0; for iter=0:Ndft-1 startadr=bitrev(iter,Ndft); for iter1=0:Nprime-1 indx=startadr+iter1*stride+1; dftin(iter1+1)=x(indx); end out1=[out1,dftin*dftmtx(Nprime)]; primecycles=primecycles+Nprime^2; end %---------------- do post twiddling for each pass ------------- Ntwid=Ndft/2; Ncoef=Nprime; for iter=1:Npass coef1=exp(-j*2*pi*[0:(Ncoef)-1].'/(2*Ncoef)); startindx=0; for iter1=1:Ntwid for iter2=1:Ncoef indx=startindx+iter2; temp1=out1(indx); temp2=out1(indx+Ncoef); out1(indx)= temp1+temp2*coef1(iter2); out1(indx+Ncoef)=temp1-temp2*coef1(iter2); primecycles=primecycles+1; end startindx=startindx+2*Ncoef; end Ntwid=Ntwid/2; Ncoef=Ncoef*2; end %-------------------------- check results ------------------------------- X = x'*dftmtx(N); err=max(abs(out1-X)); fftcycles=floor(log2(N)*N/2); dftcycles=N^2; frameduration=1024/48000; fftmips=fftcycles/(frameduration*1e6); primemips=primecycles/(frameduration*1e6); dftmips=dftcycles/(frameduration*1e6); err fftmips primemips dftmips function [output] = bitrev(input,N); %---------------------------------------- % function [output] = bitrev(input,N); % N is the max. range (power of 2) %---------------------------------------- nbits=log2(N); inval=input; outval=0; for i=1:nbits outval=outval*2; if (rem(inval,2)~=0) outval=outval+1; end inval=floor(inval/2); end output=outval;
N-pt DFT where n != power of 2
Started by ●August 12, 2009
Reply by ●August 14, 20092009-08-14
Reply by ●August 14, 20092009-08-14
>>glen herrmannsfeldt wrote: >>> Jerry Avins <jya@ieee.org> wrote: >>> (big snip) >>> >>>> Zero padding doesn't change the returned values. The extra zeros are>>>> place holders and can be removed from the final result, which are >>>> otherwise bit exact. You should do a little reading (or use FFTW, >>>> reading the directions carefully and taking no steps for granted. >>> >>> It seems to me that in some cases zero padding is right, and other >>> times it is wrong. >>> >>> In this case, it would seem to me that you get the frequencies >>> at different points in the zero padded case. You could interpolate >>> them, assuming that it approximates a continuous spectrum. >>> >>> If you have 120 sample points over some time period, the FFT bins >>> will be 0, 1, 2, up to 119 cycles over that time period. >>> (or -60 to 59 if you like that better). >>> >>> If you zero pad, the frequency bins will be cycles over the >>> now extended time period. If you want to graph the spectrum >>> then that is probably fine. Also, if the samples represent one >>> period of a periodic signal, zero padding will give a different >>> result. >> >>I'm simply wrong. I was thinking FFT all along, not IFFT. Some days are>>like that. I owe apologies all around. Dspdummy should go with FFTW.I'm> >>sure Gordon Sande will offer advice if that becomes necessary. >> >>Jerry >>-- >>Engineering is the art of making what you want from things you can get. >>??????????????????????????????????????????????????????????????????????? >> >Hello Jerry, >No pb. I was talking fft and not ifft at the beginning because of their >mathematical similarites but I DO need an IFFT. I'll check out that FFTW >but who is Gordon Sande? >Thanks >BillHello again: That FFTW material is quite challenging to understand quickly but I'll go at it again later. In the meantime, I did an experiment using 64 15-pt dft's and then post twiddling the outputs for several passes to get the 960 pts. It works OK but of course the number of operations is probably not optimal. MIPS are based on number of mac required per second. Theoretical MIPS (log2(N)*N/2) =.2 15-pt dft experiment MIPS = .8 Theoretical 960-pt dft MIPS = 43 If you're interested I am including my Matlab scripts. I would be grateful for any comments on this experiment especially on how I can easily decrease the MIPS. Bill Matlab stuff: clear close all N=960; Nprime=15; randn('seed',0); x=randn(N,1); Ndft=N/Nprime; Npass=log2(Ndft); %----------------- do all the prime dft's -------- stride=Ndft; out1=[]; primecycles=0; for iter=0:Ndft-1 startadr=bitrev(iter,Ndft); for iter1=0:Nprime-1 indx=startadr+iter1*stride+1; dftin(iter1+1)=x(indx); end out1=[out1,dftin*dftmtx(Nprime)]; primecycles=primecycles+Nprime^2; end %---------------- do post twiddling for each pass ------------- Ntwid=Ndft/2; Ncoef=Nprime; for iter=1:Npass coef1=exp(-j*2*pi*[0:(Ncoef)-1].'/(2*Ncoef)); startindx=0; for iter1=1:Ntwid for iter2=1:Ncoef indx=startindx+iter2; temp1=out1(indx); temp2=out1(indx+Ncoef); out1(indx)= temp1+temp2*coef1(iter2); out1(indx+Ncoef)=temp1-temp2*coef1(iter2); primecycles=primecycles+1; end startindx=startindx+2*Ncoef; end Ntwid=Ntwid/2; Ncoef=Ncoef*2; end %-------------------------- check results ------------------------------- X = x'*dftmtx(N); err=max(abs(out1-X)); fftcycles=floor(log2(N)*N/2); dftcycles=N^2; frameduration=1024/48000; fftmips=fftcycles/(frameduration*1e6); primemips=primecycles/(frameduration*1e6); dftmips=dftcycles/(frameduration*1e6); err fftmips primemips dftmips function [output] = bitrev(input,N); %---------------------------------------- % function [output] = bitrev(input,N); % N is the max. range (power of 2) %---------------------------------------- nbits=log2(N); inval=input; outval=0; for i=1:nbits outval=outval*2; if (rem(inval,2)~=0) outval=outval+1; end inval=floor(inval/2); end output=outval;
Reply by ●August 14, 20092009-08-14
dspdummy wrote: ...> Hello Jerry, > No pb. I was talking fft and not ifft at the beginning because of their > mathematical similarites but I DO need an IFFT. I'll check out that FFTW > but who is Gordon Sande?Gordon Sande and Steven Johnson maintain and improve FFTW. Both have contributed to this thread, so you can probably see their email addresses. *Netiquette: don't bug them privately unless invited to.* Jerry -- Engineering is the art of making what you want from things you can get.
Reply by ●August 14, 20092009-08-14
On 2009-08-14 10:49:41 -0300, Jerry Avins <jya@ieee.org> said:> dspdummy wrote: > > ... > >> Hello Jerry, >> No pb. I was talking fft and not ifft at the beginning because of their >> mathematical similarites but I DO need an IFFT. I'll check out that FFTW >> but who is Gordon Sande? > > Gordon Sande and Steven Johnson maintain and improve FFTW. Both have > contributed to this thread, so you can probably see their email > addresses. *Netiquette: don't bug them privately unless invited to.* > > JerryGive credit for FFTW to Steven Johnson. My code is much older as it dates to 1966. It is used in places like Xray crystallography on heavy iron if I understand what I am told by my Xtalgraphy friends. FFTW is for things like workstations like Suns. It is heavy on the software engineering of dealing with limited caches.
Reply by ●August 14, 20092009-08-14
>On 2009-08-14 10:49:41 -0300, Jerry Avins <jya@ieee.org> said: > >> dspdummy wrote: >> >> ... >> >>> Hello Jerry, >>> No pb. I was talking fft and not ifft at the beginning because oftheir>>> mathematical similarites but I DO need an IFFT. I'll check out thatFFTW>>> but who is Gordon Sande? >> >> Gordon Sande and Steven Johnson maintain and improve FFTW. Both have >> contributed to this thread, so you can probably see their email >> addresses. *Netiquette: don't bug them privately unless invited to.* >> >> Jerry > >Give credit for FFTW to Steven Johnson. My code is much older as it >dates to 1966. >It is used in places like Xray crystallography on heavy iron if Iunderstand>what I am told by my Xtalgraphy friends. FFTW is for things likeworkstations>like Suns. It is heavy on the software engineering of dealing with >limited caches. > > >Hello again again Still trying to understand FFTW but not quite there yet. My final question concerning my problem is: How can I efficiently(low MIPS) perform a 15-pt fft instead of a dft? Prime factor or other algorithm? It would be great to see the algorithm coded in Matlab or C but only for this specific case. Any suggestions or examples? Thanks Bill
Reply by ●August 15, 20092009-08-15
On Aug 14, 4:33�pm, "dspdummy" <chris.gl...@orange.fr> wrote:> Still trying to understand FFTW but not quite there yet. > My final question concerning my problem is: > How can I efficiently(low MIPS) perform a 15-ptfftinstead of a dft? > Prime factor or other algorithm? > It would be great to see the algorithm coded in Matlab or C but only for > this specific case.You can use mixed-radix Cooley-Tukey for that case, or you can use the prime-factor algorithm (commonly confused with mixed-radix C-T, but not the same thing; google it). And there are a few more exotic algorithms, too, but I can't recommend them as a practical matter. C code for a (prime-factor based) size-15 FFT, for purely real input data, with 64 real additions and 25 real multiplications can be found in the FFTW 3.2.2 package, in the file rdft/scalar/r2cf/r2cf_15.c for example. (This is designed to be used internally within FFTW, so you'll have to decipher a few macros, but it's not too bad.) I don't claim that this is the lowest possible operation count for that size, but it shouldn't be too far off. (There is also the complex-data case in dft/scalar/codelets/n1_15.c, but now that I look at it I notice that the operation counts seem to be slightly suboptimal; we may want to tweak our generator for the next release.) But a size-15 hard-coded FFT is still a little too complicated to easily make sense of by hand; writing the mixed-radix Cooley-Tukey algorithm for that size as loops (rather than unrolled as in FFTW) may be easier if you want to follow the algorithm. Regards, Steven G. Johnson
Reply by ●August 15, 20092009-08-15
On Aug 13, 5:59�pm, Ian Shef <inva...@avoiding.spam> wrote:> "dspdummy" <chris.gl...@orange.fr> wrote in news: > Are you certain that you want bit-exact? �There is evidence that (other > things being equal)FFTresults are more "correct" than DFT because of the > reduction in the number of operations where roundoff takes place.There is not just evidence, there is rigorous theory (dating back to Gordon Sande's classic 1966 paper) explaining how and why FFT algorithms (at least, Cooley-Tukey and several others) have vastly better rounding characteristics than the O(N^2) naive algorithm with floating-point arithmetic. The FFT rounding errors grow as O(sqrt(log N)) on average, while for the O(N^2) algorithm they grow as O(sqrt (N)). (It is also easy to verify this experimentally; see fftw.org/ accuracy) It's not really a matter of the reduced number of operations, it's really a property of the ordering. This is easy to see if you compute only one of the FFT outputs and prune away all the other operations, say the DC output, in which case it requires exactly the same number of operations as the naive algorithms (just summing the N inputs in sequence for the DC output), but the error is still vastly better [sqrt (log N) vs. sqrt(N)]. There is some discussion of this in the penultimate section of our chapter of Sid Burrus' book (http://cnx.org/ content/m16336/latest/), with references. (Fixed point is another matter entirely; roundoff errors are not nearly so good there.) Regards, Steven G. Johnson
Reply by ●August 15, 20092009-08-15
On Aug 13, 7:12�pm, Oli Charlesworth <ca...@olifilth.co.uk> wrote:> Isn't there only one rounding point in a DFT? �(At the end of the > multiply-accumulate for each output bin.) �This is versus log(N) > rounding points in anFFT.No.
Reply by ●August 15, 20092009-08-15
On Aug 13, 7:12�pm, Oli Charlesworth <ca...@olifilth.co.uk> wrote:> Isn't there only one rounding point in a DFT? �(At the end of the > multiply-accumulate for each output bin.) �This is versus log(N) > rounding points in anFFT.(I suppose you're assuming that the intermediate calculations for the naive DFT are done in greater precision and are only rounded at the end. Of course, you could do the same thing for an FFT, albeit requiring a bit more storage. But this seems like comparing apples and oranges to me, and anyway most people on general-purpose CPUs compute everything in double precision these days.)
Reply by ●August 15, 20092009-08-15
>On Aug 14, 4:33=A0pm, "dspdummy" <chris.gl...@orange.fr> wrote: >> Still trying to understand FFTW but not quite there yet. >> My final question concerning my problem is: >> How can I efficiently(low MIPS) perform a 15-ptfftinstead of a dft? >> Prime factor or other algorithm? >> It would be great to see the algorithm coded in Matlab or C but onlyfor>> this specific case. > >You can use mixed-radix Cooley-Tukey for that case, or you can use the >prime-factor algorithm (commonly confused with mixed-radix C-T, but >not the same thing; google it). And there are a few more exotic >algorithms, too, but I can't recommend them as a practical matter. > >C code for a (prime-factor based) size-15 FFT, for purely real input >data, with 64 real additions and 25 real multiplications can be found >in the FFTW 3.2.2 package, in the file rdft/scalar/r2cf/r2cf_15.c for >example. (This is designed to be used internally within FFTW, so >you'll have to decipher a few macros, but it's not too bad.) I don't >claim that this is the lowest possible operation count for that size, >but it shouldn't be too far off. (There is also the complex-data >case in dft/scalar/codelets/n1_15.c, but now that I look at it I >notice that the operation counts seem to be slightly suboptimal; we >may want to tweak our generator for the next release.) > >But a size-15 hard-coded FFT is still a little too complicated to >easily make sense of by hand; writing the mixed-radix Cooley-Tukey >algorithm for that size as loops (rather than unrolled as in FFTW) may >be easier if you want to follow the algorithm. > >Regards, >Steven G. JohnsonHello Steven OK thanks. I'll try and decipher that fftw code with macros next step unless someone out there has already done it with Matlab for a 15-pt fft. thanks Bill