Hi, I am searching for a fast convolution algorithm with order less than O(MN). I have already tried the FFT. Is there any other method available to compute the correlation in a faster manner? Awaiting response.. venkat Mistakes are not end of the world but repeating them is
Fast convolution
Started by ●February 23, 2005
Reply by ●February 24, 2005
Venkat-- Can I ask you, what do you mean by O(MN)? I am not sure fast convolution -- which is normally made of 3 FFTs + N multiplies and a little more computation -- can be described as O(MN). Either you are off the mark there or I am not understanding your notation.

Anyway, my understanding is that fast convolution is a better method (duh!) to do convolution when compared to the direct, time-domain version, which costs on the order of N^2 (almost never possible to realize for real systems). But this is true only when you have a fairly large N, and that N itself is unknown a priori, which is mostly the case.

The fast convolution itself costs in the range of N + 3*(N/2)*log N *complex* multiplies and 3*(N log N) *complex* adds. And if the impulse response H(k) can be calculated a priori, it is just the cost of 2 FFTs + N multiplies to deal with. I know of no better method than this to achieve convolution.

--Bhooshan

venkat ramanan <> wrote:
> Hi, I am searching for a fast convolution algorithm with order less than O(MN). I have already tried the FFT. Is there any other method available to compute the correlation in a faster manner?
> Awaiting response..
> venkat
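The recipe above (forward FFTs of the zero-padded input and impulse response, N pointwise multiplies, one inverse FFT) can be sketched in a few lines. This is an illustrative pure-Python version, not production DSP code; the function names are my own:

```python
import cmath

def fft(a):
    # Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of 2.
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def ifft(a):
    # Inverse FFT via conjugation: ifft(a) = conj(fft(conj(a))) / n.
    n = len(a)
    y = fft([x.conjugate() for x in a])
    return [x.conjugate() / n for x in y]

def fast_convolve(x, h):
    # Linear convolution via zero-padded FFTs:
    # 2 forward FFTs + N pointwise multiplies + 1 inverse FFT.
    out_len = len(x) + len(h) - 1
    n = 1
    while n < out_len:
        n *= 2
    X = fft(list(x) + [0.0] * (n - len(x)))
    H = fft(list(h) + [0.0] * (n - len(h)))
    y = ifft([X[k] * H[k] for k in range(n)])
    return [y[k].real for k in range(out_len)]

print(fast_convolve([1, 2, 3], [1, 1]))  # approximately [1.0, 3.0, 5.0, 3.0]
```

If H(k) is precomputed once for a fixed impulse response, each new block costs only one forward FFT, the pointwise multiplies, and one inverse FFT, as noted above.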
Reply by ●February 24, 2005
Bhooshan-

> I know of no better method than this to achieve convolution.

Search for "fast Hartley transform", or DHT. Input length has to be a power of 2, but in some cases it seems the DHT might be faster for convolution. Also, from the planet's FFT expert: http://www.fftw.org/burrus-notes.html

"A very interesting general length FFT system called the FFTW has been developed by Frigo and Johnson at MIT which uses a library of efficient "codelets" which are composed for a very efficient calculation of the DFT on a wide variety of computers [6, 7]. For most lengths and on most computers, this is the fastest FFT today"

-Jeff
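For reference, the DHT route sketched in pure Python. It uses the Hartley convolution theorem, Y[k] = (X[k](H[k]+H[N-k]) + X[N-k](H[k]-H[N-k]))/2, with a naive O(N^2) transform for clarity; a real implementation would substitute a fast Hartley transform. The function names are mine, not from any library:

```python
import math

def dht(x):
    # Naive O(N^2) discrete Hartley transform; kernel cas(t) = cos(t) + sin(t).
    n = len(x)
    return [sum(x[m] * (math.cos(2 * math.pi * m * k / n)
                        + math.sin(2 * math.pi * m * k / n))
                for m in range(n))
            for k in range(n)]

def idht(X):
    # The DHT is its own inverse up to a factor of 1/N.
    n = len(X)
    return [v / n for v in dht(X)]

def dht_circular_convolve(x, h):
    # Circular convolution via the Hartley convolution theorem:
    # Y[k] = (X[k](H[k]+H[N-k]) + X[N-k](H[k]-H[N-k])) / 2
    n = len(x)
    X, H = dht(x), dht(h)
    Y = [(X[k] * (H[k] + H[(n - k) % n])
          + X[(n - k) % n] * (H[k] - H[(n - k) % n])) / 2
         for k in range(n)]
    return idht(Y)
```

Note that unlike the FFT route, every intermediate array here is real, which is where the potential per-operation savings for real inputs come from.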
Reply by ●February 26, 2005
Dear Jeff, Venkat, Shree,

I am putting below a few random notes on numerical convolution (so it's not about an integral k(x-y)f(y)dy):

> Date: 25 Feb 2005 03:31:46 -0000
> From: "venkat ramanan" <>
> Subject: Re: Re: [speechcoding] Fast convolution
>
> I have a difficulty in taking the FFT. I have to take a 40-point FFT (all the inputs are
> always real). So I padded 24 zeros to make it a power of 2 and took a 64-point FFT of
> both the impulse response and the input, then multiplied the two and took the inverse
> transform. But the result is not very accurate. (I am using fixed-point arithmetic.)
> Can anybody suggest a very good way of implementing a 40-point FFT?

Try modeling the process in floating point, in either single or double precision, and estimate the errors of the fixed-point version against it. If the errors are big, try to increase the bit length of the fixed-point representation. If the results exceed the dynamic range of the fixed-point system, do increase the range at the expense of a smaller fraction part. A fixed-point representation that keeps a reasonable number of fractional bits might not have enough dynamic range anyway, so you might want to use floating-point numbers in your production code.

> Date: Thu, 24 Feb 2005 20:10:42 -0600
> From: Jeff Brower <>
> Subject: Re: Fast convolution
>
> Search for "fast Hartley transform", or DHT. Input length has to be a power of 2,
> but in some cases it seems the DHT might be faster for convolution.

I am not sure this holds for the actual number of arithmetic operations, but it might well be true for the actual runtime, due to e.g. better locality/cacheability of the FHT (is that true?). Otherwise, the only difference of the DHT from the DFT is in its kernel function, which is defined as cos(x) + sin(x) instead of cos(x) + i*sin(x); therefore the DHT of a real input is always real. There is a simple formula to transform a DHT into a DFT. There is also a relation between convolution in the function domain and the DHT image domain.
I don't remember it exactly, but since the DHT and DFT are mathematically somewhat equivalent, I suppose the relation must be similar, so the number of arithmetic operations would probably be the same up to an O(1) term.

> Date: Thu, 24 Feb 2005 20:46:12 -0800 (PST)
> From: Shree Jaisimha <>
> Subject: Re: Fast convolution
>
> You mentioned you are doing a 40 point FFT. Please note that any type
> of fast convolution has some disadvantages.

Just like any tool we might use. In Russian they say (in loose translation) that it is not practical to shoot sparrows with an 18" gun :)

> They are as follows:
> (i) Increased memory locations on your DSP RAM to do the block
> processing, example: Y(w) = H(w)X(w)

Yep, but memory is a cheap resource these days.

> (ii) There is also inherent processing delay. You need to store the
> frame and make sure it's full before you do the frequency-domain
> multiplication, as compared to time-domain convolution, which is sample by
> sample.

I guess the delay would be the same as in the function domain. Once you have collected a buffer of the right size to perform the convolution, you can always apply the circular-buffer method to perform a "sliding" FFT by deleting the "oldest" sample and adding the "latest" sample, thus obtaining the convolution result sample by sample.

> With regards to computational savings, as Bhooshan pointed out, with fast convolution
> it's O(4*(3N/2 lg2(N) + N)) while normal time-domain convolution will use up
> 2N - 1, assuming equal-length sequences of length N.

I am not sure if you meant N^2 here instead of 2N? Numerically, the convolution is a dot product of two vectors of length N. It might be calculated for a sequence of samples of length M, but for every output sample the dot-product complexity is O(N), which gives a total complexity of O(MN). If M = N, we end up with O(N^2) operations.
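The O(MN) sliding-dot-product view of direct convolution described above can be written out literally; a minimal sketch:

```python
def direct_convolve(x, h):
    # Direct time-domain convolution: each of the M + N - 1 output samples
    # is a dot product of up to N terms, giving O(M*N) operations total.
    m, n = len(x), len(h)
    return [sum(h[j] * x[i - j] for j in range(n) if 0 <= i - j < m)
            for i in range(m + n - 1)]

print(direct_convolve([1, 2, 3], [1, 1]))  # -> [1, 3, 5, 3]
```

Counting the multiply-accumulates in the inner sum makes the M = N case visibly O(N^2), which is the quantity the fast-convolution cost has to beat.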
> Fast convolution is ONLY faster if the following inequality is satisfied:
>
> 6N log2(N) + 4N < N^2

Assuming the inequality is written correctly.

> In your case N = 40, please do the computation above and see if the
> inequality is satisfied. If not, fast convolution is not for your end
> application, as it's not going to help in terms of computational savings.
> However, since N must be a power of 2, you need to append zeroes up to the
> nearest N = 64.

Let's check it: log2(64) = 6, thus 6*64*6 + 4*64 = (36 + 4)*64 = 40*64 = 2560, which is less than 64^2 = 4096. So even for small N, convolution in the image domain seems to be profitable, and more and more so for larger N.

Sorry for the long post.

Andrew
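Shree's inequality can be checked for several block sizes with a couple of lines; a small sketch confirming Andrew's arithmetic (among powers of two, the crossover lands at N = 64 under this operation count):

```python
import math

def fast_conv_wins(n):
    # Shree's criterion: fast convolution pays off when
    # 6*N*log2(N) + 4*N < N^2.
    return 6 * n * math.log2(n) + 4 * n < n * n

for n in [8, 16, 32, 64, 128]:
    print(n, fast_conv_wins(n))
```

For N = 32 the left side is 6*32*5 + 128 = 1088 versus 32^2 = 1024, so the inequality just barely fails; at N = 64 it holds, matching the check above.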