DSPRelated.com
Forums

FFT convolution giving different results from convolution

Started by Michel Rouzic July 25, 2005
lol, yeah i know 3 samples for a FIR is nothing, but it's only to test
if my convolutin formula is giving valid results. when it'll work right
i'll use it with FIRs near a million samples (cuz i'll need very sharp
filters). the guy before was talking about powers of two, i was just
wondering if there was something wrong with the sizes that i got. of
course i wouldn't use a FFT convolution if i really needed to convolve
with 3 samples filters. i'm just still trying to figure out why I still
got my aliasing problem

Jerry Avins wrote:
> Michel Rouzic wrote: > > um... i'm trying to do a convolution with a input signal of 2,103,154 > > samples and a kernel filter of 3 samples, both brought back to > > 2,103,156 samples. is that wrong? > > Yes. FFT convolution is useful for long filter kernels; the signal > length is nearly immaterial. You are going after mosquitoes with a > rifle, wondering if there may be better bullets. Your swatter is a > simple MAC. > > Jerry > -- > Engineering is the art of making what you want from things you can get. > =AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=
=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF= =AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF=AF
Michel Rouzic wrote:
> lol, yeah i know 3 samples for a FIR is nothing, but it's only to test > if my convolutin formula is giving valid results. when it'll work right
... Oops! Pardon me. Jerry -- Engineering is the art of making what you want from things you can get. �����������������������������������������������������������������������

Michel Rouzic wrote:
> um... i'm trying to do a convolution with a input signal of 2,103,154 > samples and a kernel filter of 3 samples, both brought back to > 2,103,156 samples. is that wrong?
If you are under the constraint of dealing with arbitrary length sequences then for the FFT to work, padding to a power of two greater than or equal to the sum of the lengths is correct. Your numbers sum to just a bit greater than 2^21 requiring you to pad both to 2^24 (4,194,204) to do it directly without an overlap add/save type implementation of the convolution. If one sequence is always going to be a good deal smaller than the other, using overlap add/save will give signifigant savings. Actually, generalized FFT's can be done on arbitary lengths but the saving in calculation is highly variable and depends on the number of factors of the number. The more the better. If your length is prime, the generalized FFT won't be any faster than a naive DFT. Bob -- "Things should be described as simply as possible, but no simpler." A. Einstein

Bob Cain wrote:
> > > Michel Rouzic wrote: > >> um... i'm trying to do a convolution with a input signal of 2,103,154 >> samples and a kernel filter of 3 samples, both brought back to >> 2,103,156 samples. is that wrong? > > > If you are under the constraint of dealing with arbitrary length > sequences then for the FFT to work, padding to a power of two greater > than or equal to the sum of the lengths is correct. Your numbers sum to > just a bit greater than 2^21 requiring you to pad both to 2^24 > (4,194,204) to do it directly without an overlap add/save type > implementation of the convolution. If one sequence is always going to > be a good deal smaller than the other, using overlap add/save will give > signifigant savings. > > Actually, generalized FFT's can be done on arbitary lengths but the > saving in calculation is highly variable and depends on the number of > factors of the number. The more the better. If your length is prime, > the generalized FFT won't be any faster than a naive DFT. >
Good FFTs based on 2s, 3s and 5s are easy to come by once you realize that they exist. The next composite number having only factors of 2, 3 and 5 and that is larger than 2,000,000 will almost surely be less than 5% larger. If I recall the details, by the time your get to around 10,000 the increase is always below 5% and it only gets better as the size goes up. And you only need the sum of the supports of the two sequences to avoid the circular convolution artifacts.
> > Bob
"Bob Cain" <arcane@arcanemethods.com> wrote in message
news:dcbb5u0g85@enews3.newsguy.com...

> If your length is prime, the generalized FFT won't > be any faster than a naive DFT.
No. Compare arithmetic operation counts of http://home.comcast.net/~kmbtib/fft257.i90 with FFTs of nearby orders. (Test program at http://home.comcast.net/~kmbtib/fft257.f90 ) -- write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, & 6.0134700243160014d-154/),(/'x'/)); end
OK, so I see that my problem has nothing to do with the length of my
signals. so, do you have an idea on what I gotta do so I obtian a M+L-1
length output signal without the time-domain aliasing problem that I
can't get rid of?

Bob Cain wrote:
> Michel Rouzic wrote: > > um... i'm trying to do a convolution with a input signal of 2,103,154 > > samples and a kernel filter of 3 samples, both brought back to > > 2,103,156 samples. is that wrong? > > If you are under the constraint of dealing with arbitrary > length sequences then for the FFT to work, padding to a > power of two greater than or equal to the sum of the lengths > is correct. Your numbers sum to just a bit greater than > 2^21 requiring you to pad both to 2^24 (4,194,204) to do it > directly without an overlap add/save type implementation of > the convolution. If one sequence is always going to be a > good deal smaller than the other, using overlap add/save > will give signifigant savings. > > Actually, generalized FFT's can be done on arbitary lengths > but the saving in calculation is highly variable and depends > on the number of factors of the number. The more the > better. If your length is prime, the generalized FFT won't > be any faster than a naive DFT. > > > Bob > -- > > "Things should be described as simply as possible, but no > simpler." > > A. Einstein

Michel Rouzic wrote:
> OK, so I see that my problem has nothing to do with the length of my > signals. so, do you have an idea on what I gotta do so I obtian a M+L-1 > length output signal without the time-domain aliasing problem that I > can't get rid of?
If you want to do it in one shot, pad both with zeros to the first power of two greater than or equal to M+L-1. If you want to do it more efficiently, use overlap add or overlap save with the small one, M, padded to the next power of two and the longer, L, padded to the next power of two greater than M+L-1. For OA/S you may need to do a bit of studying. While not terribly difficult, it's a bit more involved than should be described here. As to choosing among the two ways of doing overlap, add is a bit easier to understand and save is a bit more efficient. Bob -- "Things should be described as simply as possible, but no simpler." A. Einstein
ok thank you now that's really what i wanted to know from the start of
this thread. and as for overlap add or save, well, overlap-add is the
only one i understood, so it is the one i'm going to implement. thank
you very much for help

Bob Cain wrote:
> Michel Rouzic wrote: > > OK, so I see that my problem has nothing to do with the length of my > > signals. so, do you have an idea on what I gotta do so I obtian a M+L-1 > > length output signal without the time-domain aliasing problem that I > > can't get rid of? > > If you want to do it in one shot, pad both with zeros to the > first power of two greater than or equal to M+L-1. > > If you want to do it more efficiently, use overlap add or > overlap save with the small one, M, padded to the next power > of two and the longer, L, padded to the next power of two > greater than M+L-1. For OA/S you may need to do a bit of > studying. While not terribly difficult, it's a bit more > involved than should be described here. > > As to choosing among the two ways of doing overlap, add is a > bit easier to understand and save is a bit more efficient. > > > Bob > -- > > "Things should be described as simply as possible, but no > simpler." > > A. Einstein

Michel Rouzic wrote:
> ok thank you now that's really what i wanted to know from the start of > this thread. and as for overlap add or save, well, overlap-add is the > only one i understood, so it is the one i'm going to implement. thank > you very much for help
Actually, on rereading it I see that I was quite wrong about the lengths to be used in OA/S; don't know what I was thinking about. The short argument, M, is zero extended to the first power of two greater than or equal to twice it's length and the long argument, L, is processed in blocks of half that length zero padded to that length. You seem to understand the processing that is then applied. The extra half block at the end, catenated to the output rather than added to the next buffer, will accomodate the M+L-1 result and then some. Hope I haven't utterly confused you. Bob -- "Things should be described as simply as possible, but no simpler." A. Einstein
it's ok, i'm used to be confused. but to make it sure i got it right,
i'll take the example i gave before to see if i got it right.

so I got my short signal, of length 3, that I have to zero pad to 8,
right?

and my long signal, of length 2,103,154, has to be cut into 525,789
blocks of length 4, each zero-padded to 8, right?

and as for you were saying the the last half of the last block, i think
you meant I didn't have to add it to anything and just have to use it
as the last samples of my output signal which will thus have a length
of 2,103,156, which is not different in anyway to way i was thinkin
about.

i dont think I need to ask about the rest of what has to be done cuz i
think i understood the rest of what has to be done.

Bob Cain wrote:
> Michel Rouzic wrote: > > ok thank you now that's really what i wanted to know from the start of > > this thread. and as for overlap add or save, well, overlap-add is the > > only one i understood, so it is the one i'm going to implement. thank > > you very much for help > > Actually, on rereading it I see that I was quite wrong about > the lengths to be used in OA/S; don't know what I was > thinking about. > > The short argument, M, is zero extended to the first power > of two greater than or equal to twice it's length and the > long argument, L, is processed in blocks of half that length > zero padded to that length. You seem to understand the > processing that is then applied. The extra half block at > the end, catenated to the output rather than added to the > next buffer, will accomodate the M+L-1 result and then some. > > Hope I haven't utterly confused you. > > > Bob > -- > > "Things should be described as simply as possible, but no > simpler." > > A. Einstein