# fastest cyclic convolution algorithm

Started by February 9, 2006
```Hello everyone:

First of all, thanks to all those who have posted before and for saving
the archives.  As you know this is my first post here.  I searched the
archives by terms and by circular convolution but did not find anything,
hope I didn't miss a post. You see, I'm trying to implement a parallel
cyclic convolution algorithm and was wondering if someone can point out
the fastest cyclic convolution algorithm available for a composite length,
that is N=R*S where both R and S can also be composite integers.  I know we
can use the FFT as:

x cconv h =invFFT{ FFT(x) .* FFT(h) }

but is there anything else.  I'm planning to use typical lengths but will
increase it as much as I can.  Any help will be appreciated, thanks

Ivan

```
```Ivan72 wrote:
> I know we can use the FFT as:
>
> x cconv h =invFFT{ FFT(x) .* FFT(h) }
>
> but is there anything else.

No, that is basically it. If you find anything else, please report!

```
```>
>Ivan72 wrote:
>> I know we can use the FFT as:
>>
>> x cconv h =invFFT{ FFT(x) .* FFT(h) }
>>
>> but is there anything else.
>
>No, that is basically it. If you find anything else, please report!
>
>
Trust me I will.

One more question,
Besides the obvious N^2 multiplications, what options would you consider
if you don't want to go to the Frequency domain?.  Thanks,

Ivan
```
```Ivan72 wrote:
> >
> >Ivan72 wrote:
> >> I know we can use the FFT as:
> >>
> >> x cconv h =invFFT{ FFT(x) .* FFT(h) }
> >>
> >> but is there anything else.
> >
> >No, that is basically it. If you find anything else, please report!
> >
> >
> Trust me I will.
>
> One more question,
> Besides the obvious N^2 multiplications, what options would you consider
> if you don't want to go to the Frequency domain?.

There don't seem to be any other algorithmic tricks that work in
general. Perhaps you can reduce the size of the filter h by using a
sparse approximation that can be computed more efficiently. That
depends on what you want to do, and how accurate the result has to be.

```
```"Ivan72" <utility72@hotmail.com> wrote in message
news:eoWdnecePvIoyXbenZ2dnUVZ_sednZ2d@giganews.com...
> >
>>Ivan72 wrote:
>>> I know we can use the FFT as:
>>>
>>> x cconv h =invFFT{ FFT(x) .* FFT(h) }
>>>
>>> but is there anything else.
>>
>>No, that is basically it. If you find anything else, please report!
>>
>>
> Trust me I will.
>
> One more question,
> Besides the obvious N^2 multiplications, what options would you consider
> if you don't want to go to the Frequency domain?.  Thanks,
>
> Ivan

There are a few tricks:

- don't continually FFT the filter.  Just do it once and for all.  The
formulation above implies that you're doing FFTs all the time.  Better like
this:

HF=FFT(h)

[new x loop]
XF=FFT(x)
ZF=XF.*HF
z=ifft(ZF)

- if you're thinking of zero-stuffing to increase the sample rate and
planning to FFT thereafter, don't.  Simply replicate the FFT after it's
computed.  The result is exactly the same and replication can be done with
indexing.  Computational efficiency results vary according to processor and
algorithm.

- if you're thinking of filtering by multiplication as above and if the
filter has large stop bands and if the data has natural locations where the
data is zero or nearly so, then consider using a "gate" filter instead.
That is, replace everything in the stop band with zeros.  This saves
multiplies.  It works best if the edges of the gate (perfect 1,0 filter) are
located where the data is also zero or very small.  It causes temporal
aliasing if the edges are on top of or near nonzero data - so use prudently.

(all of the tricks above can be used together for interpolation)

Fred

```
```There are numerous methods for fast convolutions besides the usual FFT
idea.

For example, one simple variation is that you can perform a cyclic
convolution exactly by zero-padding it to a cyclic convolution of a
larger size.  This is useful if your original cyclic convolution size
has large prime factors.

If you are convolving a long sequence with a short one, you should
investigate overlap-add/save methods (to do a sequence of short
convolutions instead of one long one).

There are other transforms that have convolution theorems similar to
the DFT.  e.g. the Discrete Hartley Transform, etc., although it's not
clear that these have any convincing advantages.  Number-theoretic
transforms are somewhat attractive for integer convolutions, especially
if you don't have floating-point hardware.

Depending on the transform size, there are convolution algorithms that
are not O(N log N) but are better than O(N^2), such as Karatsuba's
algorithm and Toom's algorithm. These can have better performance for
small-to-moderate N due to the constant factors, and are especially
useful for linear convolutions.

As JVB pointed out in another thread recently, you can also apply an
FFT algorithm "partway" so that you reduce the problem to a set of
smaller convolutions, which you may then evaluate by another method
and/or hard-coded convolutions of small sizes.  (The usual FFT
convolution method is the special case where you reduce to a set of
convolutions of length 1, i.e. multiplications.)

More generally, FFTs and many other fast cyclic convolution algorithms
(see e.g. the book by Nussbaumer) can be expressed via polynomial
products that are recursively reduced modulo factorizations of z^N - 1
and related polynomials.  The Cooley-Tukey FFT corresponds to one
factorization, but there are others due to Bruun, Winograd, etcetera.

Cordially,
Steven G. Johnson

```