I don't see this anywhere on dspguru. So here it is. Feedback
welcome. I used this durning my masters project.
============================================================
--- DSP Trick Submission Form ---
============================================================
THIS WORK IS PLACED IN THE PUBLIC DOMAIN
Name: The Two FFTs Trick
Category: Algorithmic Trick.
Application: I used this this trick to make fast convolution even
faster.
Introduction: Use this trick to perform two simultaneous real-valued
FFTs using only a single complex-valued FFT.
The Trick:
You can take two real-valued FFTs at the same time by making use of
the imaginary part of the transform.
Suppose we have the time-indexed series
X[n] = a[n] + jb[n]
The transform is (r is real and i is imaginary)
X[k] = Xr[k] + jXi[j]
By splitting both the real and imaginary parts into even and odd
(e and o) parts, you can retrieve the transforms of a and b.
Xa[k] = Xre[k] + jXio[k]
Xb[k] = (Xro[k] + jXie[k])/j
I divided Xb by j because it's the imaginary part, but we want it to
be real. If you put a real-valued sequence, a[n], in the real part of
x[n] and
a real-valued sequence, b[n], in the the real part of x[n], you'll get
the transforms in Xa[k] and Xb[k].
An example of circular convolution using this trick in matlab is
below. It's very easy to see what's going on using this high level
syntax. Remember that the even and odd parts are
Xe[n] = (X[n] + X[-n])/2
Xo[n] = (X[n] - X[-n])/2
The -n is simply an axis reversal. You won't see any divide by twos.
That's because I've multiplied them and moved the result outside the
final
inverse transform. The reason for this is that division is slow. If
you remember that the definition of the Inverse DFT has a division as
the last step, this is easy to understand. You can merge the divide
by four into this division. Then, with a little effort, you can get
two fast convolutions for the cost of two FFTs, one IFFT, and some
extra additions.
%code
a=rand(1,6);b=rand(1,6);c=rand(1,6);d=rand(1,6);
ifft(fft(a).*fft(c))
ifft(fft(b).*fft(d))
xtmp1=fft(a+b*j);
xtmp2=fft(c+d*j);
x1=xtmp1(2:end);
x2=xtmp2(2:end);
re1=real(x1+x1(end:-1:1))+imag(x1-x1(end:-1:1))*j;
im1=real(x1-x1(end:-1:1))/(j)+imag(x1+x1(end:-1:1));
re2=real(x2+x2(end:-1:1))+imag(x2-x2(end:-1:1))*j;
im2=real(x2-x2(end:-1:1))/(j)+imag(x2+x2(end:-1:1));
y1=[real(xtmp1(1))*4 re1].*[real(xtmp2(1)) re2];
y2=[imag(xtmp1(1))*4 im1].*[imag(xtmp2(1)) im2];
ifft(y1+y2*j)/4
www.signalsguru.net
first post - first trick
Started by ●November 9, 2005
Reply by ●November 10, 20052005-11-10
Hello Goldfita,> Name: The Two FFTs TrickYour trick is a good one! However, it has been around for a while already, together with a two other methods to compute DFTs on real vectors (odd/even interleaving in time or frequency domain) using complex FFT algorithms on half-length vectors.> You won't see any divide by twos. ... The reason for this is that division is slow.Division by a positive or negative power of two is a one-instruction process, provided you can access the exponent of a floating-point number on a given hardware. On DSPs, this is usually the case. It can't be guaranteed that the compiler recognizes this, however, so being careful is always a good idea. Regards, Andor
Reply by ●November 10, 20052005-11-10
I thought the goal was to document known tricks (http://dspguru.com/comp.dsp/index.htm). About the division - it's still better to do it this way. Otherwise you'll need an extra loop the length of the DFT. But I never considered changing the exponent. Is this something you can do on a c67? I often find myself wishing I could do a bit shift on a floating point number.
Reply by ●November 10, 20052005-11-10
> I thought the goal was to document known tricks > (http://dspguru.com/comp.dsp/index.htm).Indeed. Thanks for taking the time.> About the division - it's still better to do it this way. Otherwise > you'll need an extra loop the length of the DFT. But I never > considered changing the exponent. Is this something you can do on a > c67? I often find myself wishing I could do a bit shift on a floating > point number.I'm not familiar with the TI DSPs, but you can do it on the ADI floating-point DSPs. There is a specific instruction to shift the exponent of floating-point numbers - a reason to switch to ADI :-) ? Regards, Andor
Reply by ●November 10, 20052005-11-10
abariska@student.ethz.ch wrote:>>I thought the goal was to document known tricks >>(http://dspguru.com/comp.dsp/index.htm). > > > Indeed. Thanks for taking the time. > > >>About the division - it's still better to do it this way. Otherwise >>you'll need an extra loop the length of the DFT. But I never >>considered changing the exponent. Is this something you can do on a >>c67? I often find myself wishing I could do a bit shift on a floating >>point number. > > > I'm not familiar with the TI DSPs, but you can do it on the ADI > floating-point DSPs. There is a specific instruction to shift the > exponent of floating-point numbers - a reason to switch to ADI :-) ? > > Regards, > AndorOn TI's 'C3x, the exponent is also available. Halving it makes a fine trial guess for square roots. Jerry -- Engineering is the art of making what you want from things you can get. �����������������������������������������������������������������������
Reply by ●November 10, 20052005-11-10
Jerry Avins wrote:> abariska@student.ethz.ch wrote: > >>> I thought the goal was to document known tricks >>> (http://dspguru.com/comp.dsp/index.htm). >> >> >> >> Indeed. Thanks for taking the time. >> >> >>> About the division - it's still better to do it this way. Otherwise >>> you'll need an extra loop the length of the DFT. But I never >>> considered changing the exponent. Is this something you can do on a >>> c67? I often find myself wishing I could do a bit shift on a floating >>> point number. >> >> >> >> I'm not familiar with the TI DSPs, but you can do it on the ADI >> floating-point DSPs. There is a specific instruction to shift the >> exponent of floating-point numbers - a reason to switch to ADI :-) ? >> >> Regards, >> Andor > > > On TI's 'C3x, the exponent is also available. Halving it makes a fine > trial guess for square roots. > > JerryYes it does, and if you follow up by halving the non-sign part of the mantissa (leaving the implied 'most significant non-sign bit' intact) then you generate an even better guess. Regards, John
Reply by ●November 11, 20052005-11-11
>I'm not familiar with the TI DSPs, but you can do it on the ADI >floating-point DSPs. There is a specific instruction to shift the >exponent of floating-point numbers - a reason to switch to ADI :-) ? > >Regards, >AndorHear about the new kid in the block? Meet RCPSP(Reciprocal Single Precision)! Hail TMS320C6000! --Bhooshan
Reply by ●November 11, 20052005-11-11
John Monro wrote:> Jerry Avins wrote:> >> I'm not familiar with the TI DSPs, but you can do it on the ADI > >> floating-point DSPs. There is a specific instruction to shift the > >> exponent of floating-point numbers - a reason to switch to ADI :-) ? > >> > >> Regards, > >> Andor > > > > > > On TI's 'C3x, the exponent is also available. Halving it makes a fine > > trial guess for square roots. > > > > Jerry > > Yes it does, and if you follow up by halving the non-sign part of the > mantissa (leaving the implied 'most significant non-sign bit' intact) > then you generate an even better guess.Ah yes, there is a single cycle instruction available on the SHARC DSPs to do that as well :-)
Reply by ●November 11, 20052005-11-11
> Hear about the new kid in the block? Meet RCPSP(Reciprocal Single > Precision)! Hail TMS320C6000!That doesn't sound like arbitrary exponent shifts ...
Reply by ●November 11, 20052005-11-11
I believe that's how the optimized mixed radix ifft works on c67. It uses a single precision, single cycle, approximate inverse (rcpsp?) instruction. It's supposed to be correct for powers of two. Since the ifft has to be a power of two, this is all you need for the division. I just multiplied the constant by four before the division. I hadn't thought about how it actually finds the approximate inverse. But it seems you don't get access to the exponent. The instruction does all the work for you.






