Reply by Rune Allnor September 8, 2005
James Van Buskirk wrote:
> "Rune Allnor" <allnor@tele.ntnu.no> wrote in message > news:1126089385.367928.304160@g14g2000cwa.googlegroups.com... > > > I don't know what you mean by comp.dsp being "orthodox" > > Well, start off with the fact that in a thread about fast > convolution, your post refers almost exclusively to radix- > 2 FFTs.
Hmmm.... OK, I could agree it might be due to orthodoxy. But then, Occam's Razor is a very good principle in explanation and teaching: don't use a more complex example than necessary to communicate the essence of whatever question is under debate. Most of the time the radix-2 FFT is discussed because that's what people have heard of, and that's what they read about in the textbooks. Most of the time the elaborate variations obfuscate the issue rather than clarify it. And sometimes those elaborate details are exactly what is crucial to understand in order to get on.

Rune
Reply by James Van Buskirk September 7, 2005
"Rune Allnor" <allnor@tele.ntnu.no> wrote in message
news:1126089385.367928.304160@g14g2000cwa.googlegroups.com...

> I don't know what you mean by comp.dsp being "orthodox"
Well, start off with the fact that in a thread about fast
convolution, your post refers almost exclusively to radix-2 FFTs.

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
Reply by Rune Allnor September 7, 2005
stevenj@alum.mit.edu wrote:
> Rune Allnor wrote:
> > The radix-2 FFT is probably the fastest FFT [where] run-time is
> > concerned.
>
> Um, this is not even close to true, in my experience. Unless by
> "radix-2" you just mean "any FFT for 2^n", which is not the standard
> terminology.
>
> > There are prime-factor FFTs around that are more
> > general, but probably not quite as fast as the radix-2 version.
>
> I think you mean mixed-radix FFTs (which don't require prime factors,
> by the way). The Prime-Factor Algorithm is *less* general, because it
> requires its factors to be relatively prime (e.g. it doesn't work for
> 2^n.)
>
> Keeping your data within factors of 2/3/5 is generally faster, given a
> good mixed-radix code, than restricting yourself to 2^n, if the latter
> would require you to significantly increase the length of your array
> beyond what you need (e.g. you may need to almost double the length).
> The situation is even more dramatic in 2d and 3d, where adjacent
> power-of-two sizes differ by factors of 4 and 8, respectively.
>
> (Note that the comparison is muddied by the fact that there is a cycle:
> most people are ingrained to use 2^n, so 2^n FFTs get most of the
> optimization effort, so most people use 2^n, so...)
>
> Steven
OK, I got the details wrong. Still, most users of an FFT package would not be concerned about the possibility of the FFT going a few milliseconds faster if the FFT lengths comprised certain factors. Only the really demanding users would care, and I don't think there are very many such users.

Rune
Reply by September 7, 2005
Rune Allnor wrote:
> The radix-2 FFT is probably the fastest FFT [where] run-time is
> concerned.
Um, this is not even close to true, in my experience. Unless by "radix-2" you just mean "any FFT for 2^n", which is not the standard terminology.
> There are prime-factor FFTs around that are more
> general, but probably not quite as fast as the radix-2 version.
I think you mean mixed-radix FFTs (which don't require prime factors, by the way). The Prime-Factor Algorithm is *less* general, because it requires its factors to be relatively prime (e.g. it doesn't work for 2^n.)

Keeping your data within factors of 2/3/5 is generally faster, given a good mixed-radix code, than restricting yourself to 2^n, if the latter would require you to significantly increase the length of your array beyond what you need (e.g. you may need to almost double the length). The situation is even more dramatic in 2d and 3d, where adjacent power-of-two sizes differ by factors of 4 and 8, respectively.

(Note that the comparison is muddied by the fact that there is a cycle: most people are ingrained to use 2^n, so 2^n FFTs get most of the optimization effort, so most people use 2^n, so...)

Steven
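To make the 2/3/5 point concrete: with a modern library you can simply ask for the next "fast" composite length instead of the next power of two when padding for fast convolution. A small, purely illustrative Python sketch follows; scipy and the lengths used here are my own assumptions, not anything from this thread.

  import numpy as np
  from scipy import fft                       # scipy.fft provides next_fast_len

  x = np.random.randn(70000)                  # hypothetical signal
  h = np.random.randn(4097)                   # hypothetical FIR

  n_lin = len(x) + len(h) - 1                 # length needed for linear convolution
  n_smooth = fft.next_fast_len(n_lin)         # next small-prime-factor length >= n_lin
  n_pow2 = 1 << (n_lin - 1).bit_length()      # next power of two >= n_lin

  # n_smooth is usually much closer to n_lin than n_pow2 is, so the padded
  # transforms do noticeably less work for the same result:
  y = fft.irfft(fft.rfft(x, n_smooth) * fft.rfft(h, n_smooth), n_smooth)[:n_lin]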
Reply by Rune Allnor September 7, 2005
James Van Buskirk wrote:
> <stevenj@alum.mit.edu> wrote in message
> news:1125943989.790065.231790@o13g2000cwo.googlegroups.com...
>
> > Of course. This is why, for computational purposes, I've always felt
> > that the best normalization for FFTs is no normalization at all---any
> > overall scale factor can, in most practical applications, be absorbed
> > elsewhere at little or no cost (e.g. in the convolution kernel).
>
> But continuing along these lines is dangerous because it can lead
> to computing convolutions in fewer operations than would be the case
> if the FFT and its inverse (or transpose) were used, as in
>
> http://home.comcast.net/~kmbtib/conv2c.f90
>
> and this would violate the orthodoxy constraints of the current
> forum...
I don't know what you mean by comp.dsp being "orthodox", but as a general rule, an improvement of a general algorithm takes advantage of some particular property of some subset of the measurements, and the improvement comes at the expense of the general usefulness of the algorithm.

The radix-2 FFT is probably the fastest FFT as far as run-time is concerned. There are prime-factor FFTs around that are more general, but probably not quite as fast as the radix-2 version. Does that mean that the radix-2 FFT is a better tool than the general FFT? It depends on the cost of restricting the data to be of length 2^N for some integer N. Sometimes that restriction is more expensive, in some way other than run-time, than using a general-length FFT.

Are people who prefer the general FFT "orthodox" because they use a tool with more flexibility? You tell me.

Rune
Reply by robert bristow-johnson September 6, 2005
stev...@alum.mit.edu wrote:
> James Van Buskirk wrote:
> > But continuing along these lines is dangerous because it can lead
> > to computing convolutions in fewer operations than would be the case
> > if the FFT and its inverse (or transpose) were used [...]
> > and this would violate the orthodoxy constraints of the current
> > forum...
>
> James, you know that I respect your work on reducing op counts, but the
> general idea that a specialized convolution algorithm can be more
> efficient (either in arithmetic counts or by other measures) than a
> general-purpose FFT/IFFT pair has been known for decades and is not
> especially controversial. However, this kind of specialization usually
> comes at the cost of generality.
>
> The classic example, of course, is the idea that one can eliminate
> bit-reversal passes if you are just doing a convolution (or, even if
> your FFT doesn't have a separate bit-reversal "pass", a lack of
> re-ordering can have other benefits). In principle, such a permuted
> FFT can be used for just about any other FFT application as well...you
> just have to absorb the permutation into other parts of the
> computation. Requiring the user to deal with a permuted ordering in
> general, however, imposes nontrivial additional coding complexity as
> well as (depending on how the user implements it) a possibly
> non-negligible time as well (changing memory access patterns is not
> free).
i dunno what James's issue is (you might have had some discussion that i wasn't watching), but one idea i used (in the olden days when i actually did some work in fast convolution) was to have two different routines for the FFT and iFFT, one based on decimation-in-frequency (that required normal ordering in and yielded bit-reversed ordering going out) for the FFT and decimation-in-time (bit-reversed input, normal output) for the iFFT. (the transfer function had to be pre-bit-reversed.) the extra code was very, very small and they could use the same twiddle factor table. that saved in the bit reverse process (by avoiding it) since memory accesses were pretty slow. the only other trick was that it was written so that multiplication by 1 or by j was avoided. other than that, it was a pretty vanilla flavored radix-2 FFT.

--
r b-j
rbj@audioimagination.com

"Imagination is more important than knowledge."
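For anyone who hasn't seen the trick r b-j describes, here is a rough reconstruction of the idea, purely for illustration (my own sketch in Python/numpy, assuming power-of-two lengths; it is not r b-j's code): a radix-2 decimation-in-frequency forward pass (natural order in, bit-reversed order out) paired with a decimation-in-time inverse pass (bit-reversed in, natural out), with the pre-bit-reversed H[k] carrying all of the scaling, so no bit-reversal pass and no separate 1/N ever appears.

  import numpy as np

  def fft_dif(x):
      # radix-2 decimation-in-frequency: natural-order in, bit-reversed-order out, unscaled
      x = np.asarray(x, dtype=complex).copy()
      N = len(x)                              # N must be a power of two
      span = N
      while span > 1:
          half = span // 2
          w = np.exp(-2j * np.pi * np.arange(half) / span)
          for s in range(0, N, span):
              a = x[s:s + half].copy()
              b = x[s + half:s + span]
              x[s:s + half] = a + b
              x[s + half:s + span] = (a - b) * w
          span //= 2
      return x

  def ifft_dit(X):
      # radix-2 decimation-in-time: bit-reversed-order in, natural-order out, no 1/N
      X = np.asarray(X, dtype=complex).copy()
      N = len(X)
      span = 2
      while span <= N:
          half = span // 2
          w = np.exp(+2j * np.pi * np.arange(half) / span)
          for s in range(0, N, span):
              a = X[s:s + half].copy()
              b = X[s + half:s + span] * w
              X[s:s + half] = a + b
              X[s + half:s + span] = a - b
          span *= 2
      return X

  N = 16
  h = np.random.randn(N)                      # hypothetical FIR block
  x = np.random.randn(N)                      # hypothetical input block
  H = fft_dif(h) / N                          # pre-bit-reversed transfer function, 1/N folded in
  y = ifft_dit(fft_dif(x) * H).real           # circular convolution, no bit-reversal pass anywhere
  assert np.allclose(y, np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real)

Because both spectra come out in the same bit-reversed order, the pointwise product is consistent, and the DIT inverse undoes the permutation for free; both passes can also share one twiddle table, as r b-j notes.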
Reply by September 6, 2005
James Van Buskirk wrote:
> But continuing along these lines is dangerous because it can lead
> to computing convolutions in fewer operations than would be the case
> if the FFT and its inverse (or transpose) were used [...]
> and this would violate the orthodoxy constraints of the current
> forum...
James, you know that I respect your work on reducing op counts, but the general idea that a specialized convolution algorithm can be more efficient (either in arithmetic counts or by other measures) than a general-purpose FFT/IFFT pair has been known for decades and is not especially controversial. However, this kind of specialization usually comes at the cost of generality.

The classic example, of course, is the idea that one can eliminate bit-reversal passes if you are just doing a convolution (or, even if your FFT doesn't have a separate bit-reversal "pass", a lack of re-ordering can have other benefits). In principle, such a permuted FFT can be used for just about any other FFT application as well...you just have to absorb the permutation into other parts of the computation. Requiring the user to deal with a permuted ordering in general, however, imposes nontrivial additional coding complexity as well as (depending on how the user implements it) a possibly non-negligible time as well (changing memory access patterns is not free).

For a *general-purpose* FFT routine, the most I feel justified in pushing onto the user is a trivial overall scale factor, largely because (unlike the output ordering or a per-output scaling) there is no general agreement on what the "right" scale factor should be. (Even so, for things like Matlab a normalized fft/ifft pair is more appropriate.)

Steven
Reply by James Van Buskirk September 6, 2005
<stevenj@alum.mit.edu> wrote in message
news:1125943989.790065.231790@o13g2000cwo.googlegroups.com...

> Of course. This is why, for computational purposes, I've always felt
> that the best normalization for FFTs is no normalization at all---any
> overall scale factor can, in most practical applications, be absorbed
> elsewhere at little or no cost (e.g. in the convolution kernel).
But continuing along these lines is dangerous because it can lead to computing convolutions in fewer operations than would be the case if the FFT and its inverse (or transpose) were used, as in

http://home.comcast.net/~kmbtib/conv2c.f90

and this would violate the orthodoxy constraints of the current forum...

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
Reply by September 5, 2005
> strictly speaking, i guess you're right, Steven, but since
> the transfer function, H[k] = DFT{ h[n] } (where h[n] is the
> FIR) is calculated in advance and is constant for fast
> convolution, you can deal with all of the scaling in H[k]
> and you don't need any 1/N anywhere, in either the
> forward or inverse FFT.
Of course. This is why, for computational purposes, I've always felt that the best normalization for FFTs is no normalization at all---any overall scale factor can, in most practical applications, be absorbed elsewhere at little or no cost (e.g. in the convolution kernel).

Steven
Reply by robert bristow-johnson September 5, 2005
in article 1125910850.954658.133700@g14g2000cwa.googlegroups.com, Rune
Allnor at allnor@tele.ntnu.no wrote on 09/05/2005 05:00:

> stevenj@alum.mit.edu wrote:
>> Using 1/sqrt(N) will *not* give you the
>> correct normalization for a convolution a * b = IFFT(FFT(a) FFT(b)),
>> and it *does* matter whether you put the 1/N in the FFT or the IFFT (it
>> should go in the IFFT).
strictly speaking, i guess you're right, Steven, but since the transfer function, H[k] = DFT{ h[n] } (where h[n] is the FIR) is calculated in advance and is constant for fast convolution, you can deal with all of the scaling in H[k] and you don't need any 1/N anywhere, in either the forward or inverse FFT.
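A tiny numpy sketch of that bookkeeping, purely for illustration (the sizes are made up): fold the 1/N into the precomputed H[k] and use unnormalized transforms both ways; since np.fft.ifft would quietly reintroduce a 1/N, the unnormalized inverse below is obtained by conjugation.

  import numpy as np

  N = 8
  h = np.random.randn(N)                      # hypothetical FIR
  x = np.random.randn(N)                      # hypothetical input block

  H = np.fft.fft(h) / N                       # all of the scaling lives in H[k]
  X = np.fft.fft(x)                           # unnormalized forward FFT
  y = np.conj(np.fft.fft(np.conj(X * H)))     # unnormalized inverse DFT via conjugation

  # same answer as the conventionally scaled FFT/IFFT pair
  assert np.allclose(y, np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))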
>> The reason is that the normalization factor of
>> the FFT is squared in this expression while that of the IFFT is not.
>
> I don't understand what you mean by "squared FFT"?
>
>> This is a good example of why the idea that there is only one "proper"
>> normalization of the DFT is misguided, I think.
>
> Agreed. I have checked my books, and all DSP books define the
> N-point DFT/IDFT pair as
>
>         N-1
>  X[k] = sum x[n] exp(-j2*pi*n*k/N)                 [1.a]
>         n=0
>
>         1 N-1
>  x[n] = - sum X[k] exp( j2*pi*n*k/N)               [1.b]
>         N k=0
as some others have recently pointed out, this seemed bass-ackwards to me, too. at least at first. ostensibly, the most natural definition of the DFT would start with:

         N-1
  x[n] = sum X[k] exp(+j*2*pi*n*k/N)
         k=0

resulting in:

         1 N-1
  X[k] = - sum x[n] exp(-j*2*pi*n*k/N)
         N n=0

this definition is nice to compare the DFT to Fourier Series but the problem with it is that, for circular convolution (in the "time domain"), it results in:

  DFT{ h[n] (*) x[n] } = N * H[k] * X[k]

(*) means "discrete convolution"

so you gotta remember dragging that N into this (usually by scaling H[k]) whereas for the "conventional" definition

  DFT{ h[n] (*) x[n] } = H[k] * X[k]

no "N" factor, at least for "time domain" circular convolution. you'll get it for

  iDFT{ H[k] (*) X[k] } = N * h[n] * x[n]

when using the "conventional" definition of DFT and iDFT. all this is just housekeeping.
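A quick numerical check of the bookkeeping above, as a purely illustrative numpy sketch (nothing here beyond the two conventions just described):

  import numpy as np

  N = 8
  h = np.random.randn(N)
  x = np.random.randn(N)

  # circular ("time domain") convolution, computed directly from the definition
  circ = np.array([sum(h[m] * x[(n - m) % N] for m in range(N)) for n in range(N)])

  # conventional definition (no 1/N in the forward DFT): no N in the convolution theorem
  assert np.allclose(np.fft.fft(circ), np.fft.fft(h) * np.fft.fft(x))

  # "natural"-looking definition with the 1/N on the forward transform: the N shows up
  F = lambda v: np.fft.fft(v) / N
  assert np.allclose(F(circ), N * F(h) * F(x))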
> With this convention, Parseval's identity is just about the only
> formula that needs to take the number of points into account,
>
>  N-1              1 N-1
>  sum |x[n]|^2  =  - sum |X[k]|^2                   [2]
>  n=0              N k=0
>
> I saw somewhere that Gauss himself had developed something similar
> to the FFT some 200 years ago, so it makes sense to minimize the
> use of scale factors etc, in these formulas.
the only way to do that for the DFT/iDFT is to use the 1/sqrt(N) factor as you say so below.
> If you look at the Fourier transform, however, the use of different
> scaling factors is more easily observed. There is a classic book
> on the FT,
>
>   Papoulis: "The Fourier Integral and its Applications"
>             McGraw-Hill, 1962
>
> where the FT/IFT pair is defined as
>
>         +inf
>  X(w) = integral x(t) exp(-jwt) dt                 [2.a]
>         -inf
>
>         1    +inf
>  x(t) = ---- integral X(w) exp(jwt) dw             [2.b]
>         2*pi -inf
with a slight mod of this definition:

          +inf
  X(jw) = integral{ x(t) exp(-jwt) dt}
          -inf

          1    +inf
   x(t) = ---- integral{ X(jw) exp(jwt) dw}
          2*pi -inf

you get one that is identical to the two-sided Laplace Transform with the substitution of s -> jw. sometimes that's nice.
> In a mathematics textbook,
>
>   Kreyzig: "Advanced Engineering Mathematics", 6th ed.,
>            Wiley, 1988, p 632
i gotta earlier version of that one. 1975 i think.
> the definitions are
>
>         1         +inf
>  X(w) = --------- integral x(t) exp(-jwt) dt       [3.a]
>         sqrt(2pi) -inf
>
>         1         +inf
>  x(t) = --------- integral X(w) exp(jwt) dw        [3.b]
>         sqrt(2pi) -inf
here is a nice, virtually isometric definition that you see commonly in EE communications textbooks (e.g. Carlson):

         +inf
  X(f) = integral{ x(t) * exp(-j*2*pi*f*t) dt}
         -inf

         +inf
  x(t) = integral{ X(f) * exp(+j*2*pi*f*t) df}
         -inf

like the Kreyzig def, no funky 2*pi or sqrt(2*pi) factors to worry about in either use of Parseval's Theorem or the Duality Theorem, but this has better normalization properties to get the DC component or power of the x(t) (the Kreyzig def has ugly sqrt(2*pi) factors). in addition it's Hz and not rad/sec. this is IMO the cleanest definition.
> Other texts still, switch the signs in the exp() terms of the
> FT and IFT.
i've only seen that in the context of the so-called "Characteristic Function" of a p.d.f. of a random variable. i wish we could get everyone to agree on the definition of both the continuous and discrete Fourier Transform, and i want to dictate the universal definition. :-)
> As for why I prefer to think in terms of the 1/sqrt(N)
> normalization, I like to think of the DFT/IDFT as a unitary
> transform pair, defined as
>
>  X = Wx                                            [4.a]
>  x = W^H X                                         [4.b]
>
> where x and X are Nx1 vectors and W is an NxN unitary matrix,
>
>  W[n,k] = 1/sqrt(N) exp(-j2*pi*n*k/N)              [5]
>
> With this convention, we find
>
>  W^H W = I                                         [6.a]
>  W W^H = I                                         [6.b]
>
> and so the DFT preserves the norm and all that. The problem with
> the definition [5] is that one needs to drag that 1/sqrt(N)
> around everywhere, which is cumbersome analytically, as well as
> numerically expensive.
depends on what you're doing. in a MATLAB-like environment, perhaps you wanna drag it around and pay for the scaling expense and not have to think about it. in a real-time fast convolution application, i would leave the 1/N out of *both* forward and inverse FFT and put all of the necessary scaling into H[k]. why not?
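For what it's worth, the unitary convention quoted above is easy to sanity-check numerically; a tiny, purely illustrative numpy sketch of [5], [6.a] and [6.b]:

  import numpy as np

  N = 8
  n = np.arange(N)
  W = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)   # W[n,k] from [5]

  assert np.allclose(W.conj().T @ W, np.eye(N))               # [6.a]
  assert np.allclose(W @ W.conj().T, np.eye(N))               # [6.b]

  x = np.random.randn(N)
  assert np.isclose(np.linalg.norm(W @ x), np.linalg.norm(x)) # norm is preserved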
> So there are several very good practical reasons for doing things
> the way they are done now, and have been for at least four decades,
namely to reduce confusion. but i don't agree with that for everything in general. Mathworks still should fix their indexing deficit in MATLAB. even if MATLAB has been around for a couple of decades and people are getting used to finding their DC component in X(1). there is still no excuse for it.

--
r b-j
rbj@audioimagination.com

"Imagination is more important than knowledge."