# FFT convolution giving different results from convolution

Started by July 25, 2005
```lol, yeah i know 3 samples for a FIR is nothing, but it's only to test
if my convolutin formula is giving valid results. when it'll work right
i'll use it with FIRs near a million samples (cuz i'll need very sharp
filters). the guy before was talking about powers of two, i was just
wondering if there was something wrong with the sizes that i got. of
course i wouldn't use a FFT convolution if i really needed to convolve
with 3 samples filters. i'm just still trying to figure out why I still
got my aliasing problem

Jerry Avins wrote:
> Michel Rouzic wrote:
> > um... i'm trying to do a convolution with a input signal of 2,103,154
> > samples and a kernel filter of 3 samples, both brought back to
> > 2,103,156 samples. is that wrong?
>
> Yes. FFT convolution is useful for long filter kernels; the signal
> length is nearly immaterial. You are going after mosquitoes with a
> rifle, wondering if there may be better bullets. Your swatter is a
> simple MAC.
>
> Jerry
> --
> Engineering is the art of making what you want from things you can get.
> =AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=
=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=
=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF

```
```Michel Rouzic wrote:
> lol, yeah i know 3 samples for a FIR is nothing, but it's only to test
> if my convolutin formula is giving valid results. when it'll work right

...

Oops! Pardon me.

Jerry
--
Engineering is the art of making what you want from things you can get.
&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;&#4294967295;
```
```
Michel Rouzic wrote:
> um... i'm trying to do a convolution with a input signal of 2,103,154
> samples and a kernel filter of 3 samples, both brought back to
> 2,103,156 samples. is that wrong?

If you are under the constraint of dealing with arbitrary
length sequences then for the FFT to work, padding to a
power of two greater than or equal to the sum of the lengths
is correct.  Your numbers sum to just a bit greater than
2^21 requiring you to pad both to 2^24 (4,194,204) to do it
directly without an overlap add/save type implementation of
the convolution.  If one sequence is always going to be a
good deal smaller than the other, using overlap add/save
will give signifigant savings.

Actually, generalized FFT's can be done on arbitary lengths
but the saving in calculation is highly variable and depends
on the number of factors of the number.  The more the
better.  If your length is prime, the generalized FFT won't
be any faster than a naive DFT.

Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
```
```
Bob Cain wrote:
>
>
> Michel Rouzic wrote:
>
>> um... i'm trying to do a convolution with a input signal of 2,103,154
>> samples and a kernel filter of 3 samples, both brought back to
>> 2,103,156 samples. is that wrong?
>
>
> If you are under the constraint of dealing with arbitrary length
> sequences then for the FFT to work, padding to a power of two greater
> than or equal to the sum of the lengths is correct.  Your numbers sum to
> just a bit greater than 2^21 requiring you to pad both to 2^24
> (4,194,204) to do it directly without an overlap add/save type
> implementation of the convolution.  If one sequence is always going to
> be a good deal smaller than the other, using overlap add/save will give
> signifigant savings.
>
> Actually, generalized FFT's can be done on arbitary lengths but the
> saving in calculation is highly variable and depends on the number of
> factors of the number.  The more the better.  If your length is prime,
> the generalized FFT won't be any faster than a naive DFT.
>

Good FFTs based on 2s, 3s and 5s are easy to come by once you realize
that they exist. The next composite number having only factors of 2, 3
and 5 and that is larger than 2,000,000 will almost surely be less than
5% larger. If I recall the details, by the time your get to around
10,000 the increase is always below 5% and it only gets better as the
size goes up. And you only need the sum of the supports of the two
sequences to avoid the circular convolution artifacts.

>
> Bob
```
```"Bob Cain" <arcane@arcanemethods.com> wrote in message
news:dcbb5u0g85@enews3.newsguy.com...

>          If your length is prime, the generalized FFT won't
> be any faster than a naive DFT.

No.  Compare arithmetic operation counts of

http://home.comcast.net/~kmbtib/fft257.i90

with FFTs of nearby orders.  (Test program at

http://home.comcast.net/~kmbtib/fft257.f90 )

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

```
```OK, so I see that my problem has nothing to do with the length of my
signals. so, do you have an idea on what I gotta do so I obtian a M+L-1
length output signal without the time-domain aliasing problem that I
can't get rid of?

Bob Cain wrote:
> Michel Rouzic wrote:
> > um... i'm trying to do a convolution with a input signal of 2,103,154
> > samples and a kernel filter of 3 samples, both brought back to
> > 2,103,156 samples. is that wrong?
>
> If you are under the constraint of dealing with arbitrary
> length sequences then for the FFT to work, padding to a
> power of two greater than or equal to the sum of the lengths
> is correct.  Your numbers sum to just a bit greater than
> 2^21 requiring you to pad both to 2^24 (4,194,204) to do it
> directly without an overlap add/save type implementation of
> the convolution.  If one sequence is always going to be a
> good deal smaller than the other, using overlap add/save
> will give signifigant savings.
>
> Actually, generalized FFT's can be done on arbitary lengths
> but the saving in calculation is highly variable and depends
> on the number of factors of the number.  The more the
> better.  If your length is prime, the generalized FFT won't
> be any faster than a naive DFT.
>
>
> Bob
> --
>
> "Things should be described as simply as possible, but no
> simpler."
>
>                                               A. Einstein

```
```
Michel Rouzic wrote:
> OK, so I see that my problem has nothing to do with the length of my
> signals. so, do you have an idea on what I gotta do so I obtian a M+L-1
> length output signal without the time-domain aliasing problem that I
> can't get rid of?

If you want to do it in one shot, pad both with zeros to the
first power of two greater than or equal to M+L-1.

If you want to do it more efficiently, use overlap add or
overlap save with the small one, M, padded to the next power
of two and the longer, L, padded to the next power of two
greater than M+L-1.  For OA/S you may need to do a bit of
studying.  While not terribly difficult, it's a bit more
involved than should be described here.

As to choosing among the two ways of doing overlap, add is a
bit easier to understand and save is a bit more efficient.

Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
```
```ok thank you now that's really what i wanted to know from the start of
only one i understood, so it is the one i'm going to implement. thank
you very much for help

Bob Cain wrote:
> Michel Rouzic wrote:
> > OK, so I see that my problem has nothing to do with the length of my
> > signals. so, do you have an idea on what I gotta do so I obtian a M+L-1
> > length output signal without the time-domain aliasing problem that I
> > can't get rid of?
>
> If you want to do it in one shot, pad both with zeros to the
> first power of two greater than or equal to M+L-1.
>
> If you want to do it more efficiently, use overlap add or
> overlap save with the small one, M, padded to the next power
> of two and the longer, L, padded to the next power of two
> greater than M+L-1.  For OA/S you may need to do a bit of
> studying.  While not terribly difficult, it's a bit more
> involved than should be described here.
>
> As to choosing among the two ways of doing overlap, add is a
> bit easier to understand and save is a bit more efficient.
>
>
> Bob
> --
>
> "Things should be described as simply as possible, but no
> simpler."
>
>                                               A. Einstein

```
```
Michel Rouzic wrote:
> ok thank you now that's really what i wanted to know from the start of
> only one i understood, so it is the one i'm going to implement. thank
> you very much for help

Actually, on rereading it I see that I was quite wrong about
the lengths to be used in OA/S; don't know what I was

The short argument, M, is zero extended to the first power
of two greater than or equal to twice it's length and the
long argument, L, is processed in blocks of half that length
zero padded to that length.  You seem to understand the
processing that is then applied.  The extra half block at
the end, catenated to the output rather than added to the
next buffer, will accomodate the M+L-1 result and then some.

Hope I haven't utterly confused you.

Bob
--

"Things should be described as simply as possible, but no
simpler."

A. Einstein
```
```it's ok, i'm used to be confused. but to make it sure i got it right,
i'll take the example i gave before to see if i got it right.

so I got my short signal, of length 3, that I have to zero pad to 8,
right?

and my long signal, of length 2,103,154, has to be cut into 525,789
blocks of length 4, each zero-padded to 8, right?

and as for you were saying the the last half of the last block, i think
you meant I didn't have to add it to anything and just have to use it
as the last samples of my output signal which will thus have a length
of 2,103,156, which is not different in anyway to way i was thinkin

i dont think I need to ask about the rest of what has to be done cuz i
think i understood the rest of what has to be done.

Bob Cain wrote:
> Michel Rouzic wrote:
> > ok thank you now that's really what i wanted to know from the start of
> > only one i understood, so it is the one i'm going to implement. thank
> > you very much for help
>
> Actually, on rereading it I see that I was quite wrong about
> the lengths to be used in OA/S; don't know what I was
>
> The short argument, M, is zero extended to the first power
> of two greater than or equal to twice it's length and the
> long argument, L, is processed in blocks of half that length
> zero padded to that length.  You seem to understand the
> processing that is then applied.  The extra half block at
> the end, catenated to the output rather than added to the
> next buffer, will accomodate the M+L-1 result and then some.
>
> Hope I haven't utterly confused you.
>
>
> Bob
> --
>
> "Things should be described as simply as possible, but no
> simpler."
>
>                                               A. Einstein

```