DSPRelated.com
Forums

Fast Convolution: does it matter whether I normalize to block length (N) in the FFT versus in the iFFT?

Started by Ron Gerhards August 31, 2005
Hi all,

I have recently implemented a fast convolution algorithm using the
FFT/iFFT to convert input signal and filter into the frequency domain,
multiply them out, and iFFT them back to the time domain to get my result.
It all seems to work fine, however, I recently got a little confused on one
point. 

I need to determine the gain that my filter will be applying to the
signal. I've done this by importing my coefficients into Matlab and
calling freqz (sig proc toolbox, I think). It works fine and I get gains
going from about 50 dB at DC to around 0 dB at Nyquist (fs/2). I get the
same result if I call fft and plot the magnitude of the coefficients.

That all seemed to make perfect sense. Then I thought I'd have a look at
the spectrum of my filter in an Audio Editor like Sound Forge or GoldWave.
When I look at it there, the gains are down by about 95 dB. Now my filter
is very long (impulse response reverb); in this case around 62,000
samples. I can account for the difference by assuming that these audio
editors, when they calculate the spectrum, do the divide by N (FFT/iFFT
length) in the FFT, not the iFFT, since 20 * log10(62000) is around the 95
dB (unless this is a weird coincidence). I know that Matlab's FFT/iFFT
combo does the divide by N in the iFFT (which, I believe, is where it's
normally done, e.g., Oppenheim/Schafer, and where I do it in my
implementation). Is that a reasonable conclusion?

In any case, what is the real gain of the filter? If I did my divide
by N in the FFT instead, or perhaps divided by sqrt(N) in both FFT and
iFFT, I would see a different magnitude response for this filter. My
understanding is that for the FFT to conserve energy, you need to
normalize by sqrt(N), yet I don't see that happening in the Matlab case. I
might have thought that doing so would give a magnitude response for which
you could conclude that those values are the gain of the filter.

More importantly, wouldn't the result you get in doing fast convolution
differ depending on where the FFT/IFFT 1/N normalization was done? Is it
perhaps a given in the definition of it that the iFFT will do the 1/N
normalization? I can say that when I do it this way, the result is correct
(i.e., if I do a time domain convolution I get the same result as with the
fast convolution in the freq domain).
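The check described here can be sketched in Python/NumPy (a minimal illustration under the usual convention, where the inverse transform carries the 1/N, as in Matlab and NumPy; not the poster's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)   # a block of input signal
h = rng.standard_normal(16)   # a short stand-in for the filter

# Zero-pad both to the full linear-convolution length to avoid circular wrap.
n = len(x) + len(h) - 1
X = np.fft.fft(x, n)          # forward transforms are unnormalized
H = np.fft.fft(h, n)
y_fast = np.fft.ifft(X * H).real   # the single 1/n lives in the inverse

y_direct = np.convolve(x, h)       # time-domain reference
print(np.allclose(y_fast, y_direct))  # True
```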

Can anyone clarify my thinking here?

Thanks very much,

Ron




This message was sent using the Comp.DSP web interface on
www.DSPRelated.com
Ron Gerhards wrote:
> [snip]
> I know that Matlab's FFT/iFFT combo does the divide by N in the iFFT
> (which, I believe, is where it's normally done, e.g., Oppenheim/Schafer,
> and where I do it in my implementation). Is that a reasonable conclusion?
You are right as far as Matlab is concerned; the rest I have no experience
with. Your reasoning seems to make sense. Formally, you ought to divide by
sqrt(N) in both the FFT and the IFFT. People tend to do the division in
the step that is used least, to save flops.

You don't say what number formats you use. If you have recorded data in
some fixed-point format, it could be that some confusion between number
formats has occurred.
> In any case, what is the real gain of the filter? If I did my divide
> by N in the FFT instead, or perhaps divided by sqrt(N) in both FFT and
> iFFT, I would see a different magnitude response for this filter. My
> understanding is that for the FFT to conserve energy, you need to
> normalize by sqrt(N),
Yes.
> yet I don't see that happening in the Matlab case. I
I find that I use the FFT some 10000 times for each time I use the IFFT. Most of the time I don't need to mess with calibrated or normalized data, so on average it saves flops to divide by sqrt(N) only when explicitly called for.
> might have thought that doing so would give a magnitude response for which
> you could conclude that those values are the gain of the filter.
>
> More importantly, wouldn't the result you get in doing fast convolution
> differ depending on where the FFT/IFFT 1/N normalization was done?
Formally, no, it does not matter where the division is made. Both the FFT and IFFT are linear operators, so any constant scaling factor can be factored out from the operation, ending in an overall 1/N factor for the FFT/IFFT combo. In practice there may be concerns about overflow etc that might cause one way of doing things to be preferred over another. Most of the time, one tries to avoid spending flops on unnecessary computations of square roots or divisions.
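For a single FFT/IFFT round trip this factoring-out is easy to demonstrate numerically (a Python/NumPy sketch; the unnormalized building blocks are emulated from NumPy's transforms):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(32)
n = len(x)

fwd = lambda v: np.fft.fft(v)        # unnormalized forward transform
inv = lambda V: np.fft.ifft(V) * n   # NumPy's built-in 1/n undone

# The overall 1/n can be applied at either end of the round trip:
a = inv(fwd(x) / n)                  # scale the spectrum going in
b = inv(fwd(x)) / n                  # scale the result coming out
print(np.allclose(a, x) and np.allclose(b, x))  # True
```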
> Is it perhaps a given in the definition of it that the iFFT will do the
> 1/N normalization?
Nope. It is purely a question of convenience.
> I can say that when I do it this way, the result is correct > (i.e., if I do a time domain convolution I get the same result as with the > fast convolution in the freq domain).
That's what matters, isn't it?

Rune
in article 1125528388.263453.151170@z14g2000cwz.googlegroups.com, Rune
Allnor at allnor@tele.ntnu.no wrote on 08/31/2005 18:46:

> [snip]
> Formally, you ought to divide by sqrt(N) in both the FFT and the IFFT.
> People tend to do the division in the step that is used least, to save
> flops.
> [snip]
>> My understanding is that for the FFT to conserve energy, you need to
>> normalize by sqrt(N),
>
> Yes.
what difference does this make in a floating-point context? and i haven't
read anything from the OP that it isn't floating point. normally when i see
the word "FFT" i am assuming floating-point (or at least block
floating-point) because the FFT (of any decent size) is such a bitch in
fixed-point.

--

r b-j rbj@audioimagination.com

"Imagination is more important than knowledge."
Thanks very much for the quick response Rune. That helps. 

Supplementary questions interspersed.

Ron


>Ron Gerhards wrote:
>> [snip]
>> I know that Matlab's FFT/iFFT combo does the divide by N in the iFFT
>> (which, I believe, is where it's normally done, e.g., Oppenheim/Schafer,
>> and where I do it in my implementation). Is that a reasonable conclusion?
>
>You are right as far as Matlab is concerned; the rest I have no
>experience with. Your reasoning seems to make sense. Formally, you ought
>to divide by sqrt(N) in both the FFT and the IFFT. People tend to do the
>division in the step that is used least, to save flops.
Yes, which is also why we're doing it in the iFFT.
>You don't say what number formats you use. If you have recorded data
>in some fixed-point format, it could be that some confusion between
>number formats has occurred.
Yeah, I wondered about that too, but I don't think that's it. It's a wav
file (shorts), and both within Matlab and the audio editors the
time-domain waveforms look to have the same levels.
>[snip]
>> More importantly, wouldn't the result you get in doing fast convolution
>> differ depending on where the FFT/IFFT 1/N normalization was done?
>
>Formally, no, it does not matter where the division is made. Both the
>FFT and IFFT are linear operators, so any constant scaling factor
>can be factored out from the operation, ending in an overall 1/N factor
>for the FFT/IFFT combo.
Right, but when I'm doing fast convolution, I'm multiplying the result of
two FFT'd signals (input and filter). If my 1/N normalization were in the
FFT, I would get a factor of (1/N) x (1/N) in the result (and the iFFT in
this case would do no normalizing). If my 1/N normalization were in the
iFFT, I would get only a single factor of (1/N) due to the iFFT. Does that
make sense? I realize it would only be a scale factor, but if true, the
end result of the fast convolution would be a much smaller signal in the
case where my normalization is in the FFT. I think.
>In practice there may be concerns about overflow etc. that might
>cause one way of doing things to be preferred over another. Most
>of the time, one tries to avoid spending flops on unnecessary
>computations of square roots or divisions.
>
>> Is it perhaps a given in the definition of it that the iFFT will do the
>> 1/N normalization?
>
>Nope. It is purely a question of convenience.
To clarify, I meant "in the definition of fast convolution", though I suspect your answer is the same.
>> I can say that when I do it this way, the result is correct
>> (i.e., if I do a time domain convolution I get the same result as with
>> the fast convolution in the freq domain).
>
>That's what matters, isn't it?
Mostly, for sure! However, it would be nice to be able to predict
concretely what the gain of a given filter is (say, by looking at its
frequency response), so that a suitable gain (attenuation) can be applied
to the result (or input signal) to avoid clipping distortion. I feel it's
a bit hard to do that with certainty given this point of confusion. Any
other thoughts?

Thanks again,

Ron
robert bristow-johnson wrote:
> [snip]
> what difference does this make in a floating-point context? and i haven't
> read anything from the OP that it isn't floating point. normally when i
> see the word "FFT" i am assuming floating-point (or at least block
> floating-point) because the FFT (of any decent size) is such a bitch in
> fixed-point.
I know, I assume that too, but when sound systems are used you never know.
My experience is with seismic data, where one of the most widely used data
formats was originally (1974) defined in terms of EBCDIC text symbols and
IBM proprietary floating-point formats. Only recently (2002) was the
standard revised to (formally) allow ASCII or IEEE 754, even though those
had been used "unofficially" at least since the late '80s. It's an utter
nightmare when one gets a data set and has to search through all those
variations to get workable data out of it.

Rune
Ron Gerhards wrote:

> Right, but when I'm doing fast convolution, I'm multiplying the result of
> two FFT'd signals (input and filter). If my 1/N normalization were in the
> FFT, I would get a factor of (1/N) x (1/N) in the result (and the iFFT in
> this case would do no normalizing). If my 1/N normalization were in the
> iFFT, I would get only a single factor of (1/N) due to the iFFT.
> [snip]
What you FFT is a convolution of x[n] and h[n]:

  Y[k] = FFT{x[n] (*) h[n]} = 1/sqrt(N) FFT'{x[n] (*) h[n]}

where FFT'{} is the "improperly scaled" version of the FFT. The 1/sqrt(N)
factor appears only once. I think...

Rune
>[snip]
>what difference does this make in a floating-point context? and i haven't
>read anything from the OP that it isn't floating point. normally when i
>see the word "FFT" i am assuming floating-point (or at least block
>floating-point) because the FFT (of any decent size) is such a bitch in
>fixed-point.
Yes, my FFTs are floating point. The shorts in the wav files are converted
to floats prior to processing.

Ron
>> [snip]
>
>What you FFT is a convolution of x[n] and h[n]:
>
>  Y[k] = FFT{x[n] (*) h[n]} = 1/sqrt(N) FFT'{x[n] (*) h[n]}
>
>where FFT'{} is the "improperly scaled" version of the FFT. The 1/sqrt(N)
>factor appears only once. I think...
I'm not sure what you mean by the "improperly scaled" version of the FFT.
Perhaps let me put the question this way:

Let FFT' be the FFT without ANY normalization to N.
Let iFFT' be the iFFT without ANY normalization to N.

In the case where normalization to N occurs in the iFFT, I have:

  y1[n] = i[n] * h[n] = (1/N) x iFFT'(FFT'(i) x FFT'(h))

In the case where normalization to N occurs in the FFT, I have:

  y2[n] = i[n] * h[n] = iFFT'((1/N) x FFT'(i) x (1/N) x FFT'(h))
                      = (1/N) x (1/N) x iFFT'(FFT'(i) x FFT'(h))

and therefore y2[n] is not equal to y1[n]. Have I done something wrong
here, or are they really not equal??

Thanks,

Ron
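The discrepancy between the two cases above is easy to confirm numerically (a Python/NumPy sketch; FFT' and iFFT' are emulated from NumPy's transforms and are not the poster's code):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
i_sig = rng.standard_normal(n)   # input signal i[n]
h = rng.standard_normal(n)       # filter h[n] (circular convolution here)

F = lambda v: np.fft.fft(v)          # FFT'  : no normalization at all
iF = lambda V: np.fft.ifft(V) * n    # iFFT' : NumPy's built-in 1/n undone

# Normalization carried by the iFFT (the usual convention):
y1 = (1 / n) * iF(F(i_sig) * F(h))
# Normalization carried by each FFT instead:
y2 = iF((1 / n) * F(i_sig) * (1 / n) * F(h))

print(np.allclose(y2, y1 / n))   # True: the results differ by a factor of N
```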
in article VYydnS3rs43wxIreRVn-1g@giganews.com, Ron Gerhards at
rgerhards@ea.com wrote on 09/01/2005 15:41:

>> [snip]
>> what difference does this make in a floating-point context? and i haven't
>> read anything from the OP that it isn't floating point.
>
> Yes, my FFTs are floating point. The shorts in the wav files are
> converted to floats prior to processing.
then, i think it's pretty safe to say that it does *not* matter where you
normalize with the 1/N factor. you don't have to do it at all if you
adjust the transfer function H[k] = DFT{ h[n] }, where h[n] is your FIR,
by multiplying it by 1/N. and there is no need to divide anywhere.
(division and "fast" are nearly mutually exclusive.) if you are getting
nasty numbers that manage to overflow even your floating-point format,
then there has to be something else wrong.

--

r b-j rbj@audioimagination.com

"Imagination is more important than knowledge."
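The pre-scaling idea can be sketched like this (Python/NumPy; the raw, unnormalized inverse transform is emulated via a conjugation identity, which is an assumption about implementation, not the poster's code):

```python
import numpy as np

rng = np.random.default_rng(2)
h = rng.standard_normal(16)   # FIR impulse response
x = rng.standard_normal(64)   # input block
n = len(x) + len(h) - 1       # full linear-convolution length

# Bake the 1/n into the stored transfer function once, up front...
H_scaled = np.fft.fft(h, n) / n

# ...so the per-block processing never divides. A raw (unnormalized)
# inverse transform is emulated here as conj(FFT(conj(X))).
def ifft_raw(X):
    return np.conj(np.fft.fft(np.conj(X)))

y = ifft_raw(np.fft.fft(x, n) * H_scaled).real
print(np.allclose(y, np.convolve(x, h)))  # True
```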
Hi Ron,

As you've discovered, for FFT convolution, it makes a big difference where 
the normalization goes, because the multiplication in the frequency domain 
squares any normalization that has been applied up to that point.

I always remember the right place for normalization like this: convolution 
with the unit impulse is a no-op, so if my FFT and iFFT are inverses, then 
multiplication by the FFT of the unit impulse must be a no-op, i.e., 
FFT([1,0,0,0,...]) = [1,1,1,1,...].  That is an unnormalized FFT.
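That identity is easy to verify (a Python/NumPy sketch; NumPy's forward fft follows the same unnormalized convention):

```python
import numpy as np

# The unnormalized FFT of the unit impulse is all ones...
d = np.zeros(8)
d[0] = 1.0
print(np.allclose(np.fft.fft(d), np.ones(8)))  # True

# ...so multiplying by it in the frequency domain is a no-op, and the
# FFT/iFFT round trip of any signal comes back unchanged.
x = np.arange(8.0)
y = np.fft.ifft(np.fft.fft(x) * np.fft.fft(d)).real
print(np.allclose(y, x))  # True
```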

--
Matt