
Calculate MACs in MATLAB

Started by pikey April 29, 2008
I am doing a project in MATLAB to compare the computational complexities of
two systems. The problem I have is that I need to work out how many multiplies
and adds a 150-point IFFT uses in MATLAB. From what I can work out, MATLAB
uses the prime-factor algorithm to do this, but I am having trouble working
out how many MACs are done. Does MATLAB have the ability to feed back this
information?

Thanks

Pikey 
> I am doing a project in matlab to compare the computational complexities of
> two systems the problem I have is that I need to work out how many multiply
> and adds a 150 point ifft uses in matlab from what I can work out MATLAB
> uses the Prime Factor algorithm to do this but I am having trouble working
> out how many MACS are done. Does matlab have the ability to feed back this
> information?
>
> Thanks
>
> Pikey
Well, MATLAB used to allow you to calculate the number of flops (floating-point operations), but as the help for flops indicates, since the inclusion of LAPACK it is no longer practical. You can time the two functions using tic (before the function call) and toc (after the function call) and compare times, which should give you an approximation of relative complexity, assuming the arrays have the same memory structure.

Mark
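The tic/toc timing pattern Mark describes (in MATLAB: `tic; y = ifft(x); toc`) can be sketched in Python/NumPy for a self-contained illustration; the loop count and variable names here are my own choices, and a single 150-point transform is far too fast to time individually:

```python
import time
import numpy as np

# Time a 150-point inverse FFT, analogous to MATLAB's
#   tic; y = ifft(x); toc
x = np.random.randn(150) + 1j * np.random.randn(150)

reps = 1000                      # repeat: one 150-point IFFT is too fast to time alone
t0 = time.perf_counter()
for _ in range(reps):
    y = np.fft.ifft(x)
elapsed = (time.perf_counter() - t0) / reps

print(f"average time per 150-point IFFT: {elapsed * 1e6:.2f} us")
```

As the thread goes on to note, wall-clock time only gives a crude estimate of relative complexity, since memory behavior and pipelining matter as much as arithmetic counts.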
I agree the tic/toc approach is useful, but I am doing one method by hand and the other practically in MATLAB, so I can't use it.

Pikey
> Agree the tic toc is useful but I am doing one method by hand and the
> other practically on matlab so I can't use it.
Well, you can time your functions then use an approximation to calculate the equivalent FLOP count. FFTW uses MFLOPS = 5*N*log2(N)/time (in us) for complex data; divide by 2 for real data. You can force MATLAB to use FFTW, though I do not know if there are any timing differences between FFTs within MATLAB, or whether the command-line executable for FFTW is faster or slower.

Mark
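The FFTW benchmark convention Mark quotes turns a measured time into an equivalent flop rate. A small sketch of that arithmetic (the function name is my own, and the 3.0 us timing is purely illustrative):

```python
import math

def fftw_mflops(n, time_us, complex_data=True):
    """Equivalent MFLOPS per the FFTW benchmark convention:
    5*N*log2(N) nominal flops for a complex transform, half that for real data.
    Since time is in microseconds, flops/us comes out directly in MFLOPS."""
    flops = 5 * n * math.log2(n)
    if not complex_data:
        flops /= 2
    return flops / time_us

# e.g. a 150-point complex FFT measured at 3.0 us (illustrative number):
print(round(fftw_mflops(150, 3.0)))
```

Note this is a *nominal* flop count used for cross-algorithm comparison, not the number of operations any particular FFT actually performs.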
That's OK for an approximation. I think I might ignore the MATLAB part and instead try to find the least number of MACs for a 150-point FFT using complex data, which will solve my problem(s). If anyone knows the answer I would appreciate it.

Cheers,
Pikey
> thats ok for an approximation I think I might ignore the matlab part and
> strive to find out what the least number of MACS for a 150 point fft using
> complex data which will solve my problem(s) If any one know the answer I
> would appreciate it
>
> Pikey
5*N*log2(N) is the asymptotic number of FLOPS for a radix-2 Cooley-Tukey algorithm (it approaches this value for large N). Different algorithms perform (speed-wise) differently, sometimes using more FLOPS but running faster. This is usually due to memory architecture and the number of reads/writes your algorithm requires.

There are many ways to do a 150-point FFT, so there's no single answer. N = 150 decomposes into 5*5*3*2, and you can configure your FFT in just about any combination of these primes; each probably has a different number of FLOPS.

Good luck. :)

Mark
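The decomposition Mark mentions, and the rule-of-thumb flop estimate, can be checked mechanically. A quick sketch (the helper name is my own):

```python
import math

def prime_factors(n):
    """Trial-division prime factorization, smallest factors first."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

# The radices available to a mixed-radix FFT of length 150:
print(prime_factors(150))                 # -> [2, 3, 5, 5]

# Generic radix-2 rule-of-thumb flop count for N = 150:
print(round(5 * 150 * math.log2(150)))    # -> 5422
```

Any ordering and grouping of these factors (e.g. 150 = 6*25 = 10*15 = 2*75) gives a valid mixed-radix plan, generally with a different arithmetic count.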
On Apr 29, 1:53 pm, "pikey" <pi...@amor888.fsnet.co.uk> wrote:
> I am doing a project in matlab to compare the computational complexities of
> two systems the problem I have is that I need to work out how many multiply
> and adds a 150 point ifft uses in matlab from what I can work out MATLAB
> uses the Prime Factor algorithm to do this but I am having trouble working
> out how many MACS are done. Does matlab have the ability to feed back this
> information?
You don't need to try to reverse-engineer what algorithm Matlab's FFT is using; you can just download the source code from fftw.org (or just read the papers on the web site, which describe the algorithms used). Actually, "doc fft" within Matlab has some information on the algorithms too.

Short answer: FFTW uses the prime-factor algorithm for small sizes (<= 16) within its codelets at the leaves of its recursion. Larger composite sizes are broken down into smaller sizes using mixed-radix Cooley-Tukey. It is O(N log N) for all N.

If you call FFTW from C, there is a function fftw_flops to get a count of arithmetic operations if you really need that for some reason. (Some other posters have suggested timing the FFT, but that's pretty useless except for very crude estimates of flop counts.) Running it on my machine, FFTW 3.1.2 reports that it requires 3100 additions and 1480 multiplications for an N=150 complex-data FFT.

If you really care about N=150, you can generate a hard-coded routine using FFTW's code generator. For complex data it generates a routine with 3012 additions and 1304 multiplications. (This is not optimal, although it should be reasonably close. If I remember correctly, it turned out that the algorithm with the lowest-known arithmetic count for size 5 actually performed worse in practice than an algorithm with more arithmetic, as the latter pipelined better last we checked.)

Regards,
Steven G. Johnson
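For scale, the operation counts Steven reports can be set against the 5*N*log2(N) rule of thumb quoted earlier in the thread. This is just a quick comparison of the numbers already given above, not a call into FFTW itself:

```python
import math

n = 150
estimate = 5 * n * math.log2(n)          # generic radix-2 rule of thumb, ~5422

# Arithmetic counts reported above for FFTW 3.1.2 at N = 150 (complex data):
planner = 3100 + 1480                    # adds + multiplies from the planner's plan
codelet = 3012 + 1304                    # adds + multiplies from the code generator

print(f"5*N*log2(N) estimate : {estimate:.0f}")
print(f"FFTW planner count   : {planner}")
print(f"hard-coded routine   : {codelet}")
```

Both real counts come in under the radix-2 estimate, which is expected: the rule of thumb is a nominal benchmark figure, and mixed-radix plans for highly composite N typically need less arithmetic.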
On May 2, 1:27 pm, "Steven G. Johnson" <stev...@alum.mit.edu> wrote:
> If you really care about N=150, you can generate a hard-coded routine
> using FFTW's code generator. For complex data it generates a routine
> with 3012 additions and 1304 multiplications. (This is not optimal,
> although it should be reasonably close. If I remember correctly, it
> turned out that the algorithm with the lowest-known arithmetic count
> for size 5 actually performed worse in practice than an algorithm with
> more arithmetic, as the latter pipelined better last we checked.)
Actually, I take that back; the difference was for sizes 7 and 11. For sizes 3 and 5 our kernels match the lowest known flop count, as far as I can tell (although there is no proof that any of these algorithms are optimal). Steven
> (although there is no proof that any of these
> algorithms are optimal).
Yeah, as you note, even "optimal" in terms of the fewest FLOPS is not necessarily "optimal" in terms of speed. I had a professor once who used to get quite ticked every time he heard somebody mention "optimal" without reference to some criterion.

Set your minimum-FLOP-count algorithm up in such a way that the next instruction is always waiting on completion of the current instruction (a pipeline stall) and you'll find it may not work very well compared to another algorithm with more required FLOPS but a better overall flow.

Mark
Steven G. Johnson wrote:
(snip)

> Short answer: FFTW uses the prime-factor algorithm for small sizes (<=
> 16) within its codelets at the leaves of its recursion. Larger
> composite sizes are broken down into smaller sizes using mixed-radix
> Cooley-Tukey. It is O(N log N) for all N.
An interesting statement. O() notation gives the order as N goes to infinity, and it may or may not be of any use for small N. Now, is infinity prime or composite?

-- glen