Convolving with Long SignalsWe saw that we can perform efficient convolution of two finite-length sequences using a Fast Fourier Transform (FFT). There are some situations, however, in which it is impractical to use a single FFT for each convolution operand:
- One or both of the signals being convolved is very long.
- The filter must operate in real time. (We can't wait until the input signal ends before providing an output signal.)
This is the constant-overlap-add (COLA)9.6 constraint for the FFT analysis window . It has also been called the partition of unity property. 8.9 illustrates the appearance of 50% overlap-add for the Bartlett (triangular) window. The Bartlett window is clearly COLA for a wide variety of hop sizes, such as , , and so on, provided is an integer (otherwise the underlying continuous triangular window must be resampled). However, when using windows defined in a library, the COLA condition should be carefully checked. For example, the following Matlab/Octave script shows that there is a problem with the standard Hamming window:
M = 33; % window length R = (M-1)/2; % hop size N = 3*M; % overlap-add span w = hamming(M); % window z = zeros(N,1); plot(z,'-k'); hold on; s = z; for so=0:R:N-M ndx = so+1:so+M; % current window location s(ndx) = s(ndx) + w; % window overlap-add wzp = z; wzp(ndx) = w; % for plot only plot(wzp,'--ok'); % plot just this window end plot(s,'ok'); hold off; % plot window overlap-addThe figure produced by this matlab code is shown in Fig.8.10. As can be seen, the equal end-points sum to form an impulse in each frame of the overlap-add. periodic'. The periodic case is equivalent to
w = hamming(M+1); % symmetric case w = w(1:M); % delete last sample for periodic caseThe periodic variant solves the non-constant overlap-add problem for even and , but not for odd . The problem can be solved for odd and while retaining symmetry as follows:
w = hamming(M); % symmetric case w(1) = w(1)/2; % repair constant-overlap-add for R=(M-1)/2 w(M) = w(M)/2;Since different window types may add or subtract 1 to/from internally, it is best to check the result using test code as above to make sure the window is COLA at the desired hop size. E.g., in Matlab:
.54 - .46*cos(2*pi*(0:M-1)'/(M-1));
gives constant overlap-add for , , etc., when endpoints are divided by 2 or one endpoint is zeroed
.5*(1 - cos(2*pi*(1:M)'/(M+1)));
does not give constant overlap-add for , but does for
.42 - .5*cos(2*pi*m)' + .08*cos(4*pi*m)';
where m = (0:M-1)/(M-1), gives constant overlap-add for when is odd and is an integer, and when is even and is integer.
- Rectangular window at 0% overlap (hop size = window size )
- Bartlett window at 50% overlap ( ) (Since normally is odd, `` '' means ``R=(M-1)/2,'' etc.)
- Hamming window at 50% overlap ( )
- Rectangular window at 50% overlap ( )
- Hamming window at 75% overlap ( % hop size)
- Any member of the Generalized Hamming family at 50% overlap
- Any member of the Blackman family at 2/3 overlap (1/3 hop size); e.g., blackman(33,'periodic'),
- Any member of the -term Blackman-Harris family with .
- Any window with R=1 (``sliding FFT'')
FFT implementations, it is preferable to shift the frame back to the time origin:
This is summarized in Fig.8.11. Zero-based frames are needed because the leftmost input sample is assigned to time zero by FFT algorithms. In other words, a hopping FFT effectively redefines time zero on each hop. Thus, a practical STFT is a sequence of FFTs of the zero-based frames . On the other hand, papers in the literature (such as [7,9]) work with the fixed time-origin case ( ). Since they differ only by a time shift, it is not hard to translate back and forth.
Note that we may sample the DTFT of both and , because both are time-limited to nonzero samples. The minimum information-preserving sampling interval along the unit circle in both cases is . In practice, we often oversample to some extent, using with instead. For , we get
where . For we have
acyclic convolution, we may write it as
or less along the unit circle. This is the dual of the usual sampling theorem. We conclude that practical FFT acyclic convolution may be carried out using an FFT of any length satisfying
where is the frame size and is the filter length. Our final expression for is
Example of Overlap-Add ConvolutionLet's look now at a specific example of FFT convolution:
- Impulse-train test signal, 4000 Hz sampling-rate
- Length causal lowpass filter, 600 Hz cut-off
- Length rectangular window
- Hop size (no overlap)
L = 31; % FIR filter length in taps fc = 600; % lowpass cutoff frequency in Hz fs = 4000; % sampling rate in Hz Nsig = 150; % signal length in samples period = round(L/3); % signal period in samplesFFT processing parameters:
M = L; % nominal window length Nfft = 2^(ceil(log2(M+L-1))); % FFT Length M = Nfft-L+1 % efficient window length R = M; % hop size for rectangular window Nframes = 1+floor((Nsig-M)/R); % no. complete framesGenerate the impulse-train test signal:
sig = zeros(1,Nsig); sig(1:period:Nsig) = ones(size(1:period:Nsig));Design the lowpass filter using the window method:
epsilon = .0001; % avoids 0 / 0 nfilt = (-(L-1)/2:(L-1)/2) + epsilon; hideal = sin(2*pi*fc*nfilt/fs) ./ (pi*nfilt); w = hamming(L); % FIR filter design by window method h = w' .* hideal; % window the ideal impulse response hzp = [h zeros(1,Nfft-L)]; % zero-pad h to FFT size H = fft(hzp); % filter frequency responseCarry out the overlap-add FFT processing:
y = zeros(1,Nsig + Nfft); % allocate output+'ringing' vector for m = 0:(Nframes-1) index = m*R+1:min(m*R+M,Nsig); % indices for the mth frame xm = sig(index); % windowed mth frame (rectangular window) xmzp = [xm zeros(1,Nfft-length(xm))]; % zero pad the signal Xm = fft(xmzp); Ym = Xm .* H; % freq domain multiplication ym = real(ifft(Ym)) % inverse transform outindex = m*R+1:(m*R+Nfft); y(outindex) = y(outindex) + ym; % overlap add endThe time waveforms for the first three frames ( ) are shown in Figures 8.12 through 8.14. Notice how the causal linear-phase filtering results in an overall signal delay by half the filter length. Also, note how frames 0 and 2 contain four impulses, while frame 1 only happens to catch three; this causes no difficulty, and the filtered result remains correct by superposition.
FFT processors provide efficient implementations for FIR filters longer than 100 or so taps on single CPUs. Specifically, we ended up with:
where is acyclic in this context. Stated as a procedure, we have the following steps in an overlap-add FFT processor:
- Extract the th length frame of data at time .
- Shift it to the base time interval (or ).
- Optionally apply a length analysis window (causal or zero phase, as preferred). For simple LTI filtering, the rectangular window is fine.
- Zero-pad the windowed data out to the FFT size (a power of 2), such that , where is the FIR filter length.
- Take the -point FFT.
- Apply the filter frequency-response as a windowing operation in the frequency domain.
- Take the -point inverse FFT.
- Shift the origin of the -point result out to sample where it belongs.
- Sum into the output buffer containing the results from prior frames (OLA step).
Dual of Constant Overlap-Add
Convolution of Short Signals