OverlapAdd (OLA) STFT Processing
This chapter discusses use of the ShortTime Fourier Transform (STFT) to implement linear filtering in the frequency domain. Due to the speed of FFT convolution, the STFT provides the most efficient singleCPU implementation engine for most FIR filters encountered in audio signal processing.Recall from §7.1 the STFT:
where
(9.2) 
then the sum of the successive DTFTs over time equals the DTFT of the whole signal :
(9.3) 
Consequently, the inverseSTFT is simply the inverseDTFT of this sum:
(9.4) 
Note that can be different for each frame, giving a certain class of timevarying filters. The filtered output signal spectrum is then
(9.5) 
so that
(9.6) 
where
(9.7) 
This chapter discusses practical implementation of the above relations using a Fast Fourier Transform (FFT). In particular, we use an FFT to compute efficiently what may be regarded as a sampled DTFT. We will look at how sampling density must be increased along the unit circle when spectral modifications are to be performed, and we will discuss further the COLA condition on the analysis window and hopsize. In the end, our practical FFTconvolution engine will look as follows:
(9.8) 
The sum over may be interpreted as adding separately filtered frames . Due to this filtering, the frames must overlap, even when the rectangular window is used. As a result, the overall system is often called an overlapadd FFT processor, or ``OLA processor'' for short. It is regarded as a sequence of FFTs which may be modified, inversetransformed, and summed. This ``hopping transform'' view of the STFT is the Fourier dual of the ``filterbank'' interpretation to be discussed in Chapter 9.
Convolution of Short Signals

(9.9) 
or,
(9.10) 
where and are arbitrary real or complex sequences, and and are the DTFTs of and , respectively. The convolution of and is defined by
(9.11) 
In practice, we always use the DFT (preferably an FFT) in place of the DTFT, in which case we may write
(9.12) 
where now (length complex sequences). It is important to remember that the specific form of convolution implied in the DFT case is cyclic (also called circular) convolution [264]:
where means `` modulo .'' Another way to look at convolution is as the inner product of , and , where , i.e.,
(9.14) 
This form describes graphical convolution in which the output sample at time is computed as an inner product of the impulse response after flipping it about time 0 and shifting time 0 to time . See [264, p. 105] for an illustration of graphical convolution.
Cyclic FFT Convolution
Thanks to the convolution theorem, we have two alternate ways to perform cyclic convolution in practice: Direct calculation in the time domain using (8.13)
 Frequencydomain convolution:
 Fourier Transform both signals
 Perform term by term multiplication of the transformed signals
 Inverse transform the result to get back to the time domain
Acyclic FFT Convolution
If we add enough trailing zeros to the signals being convolved, we can obtain acyclic convolution embedded within a cyclic convolution. How many zeros do we need to add? Suppose the signal consists of contiguous nonzero samples at times 0 to , preceded and followed by zeros, and suppose is nonzero only over a block of samples starting at time 0. Then the acyclic convolution of with reduces to(9.15) 
which is zero for and . Thus,
The number is easily checked for signals of length 1 since , where is 1 at time zero and 0 at all other times. Similarly,
(9.16) 
and so on. When or is infinity, the convolution result can be as small as 1. For example, consider , with , and . Then . This is an example of what is called deconvolution. In the frequency domain, deconvolution always involves a polezero cancellation. Therefore, it is only possible when or is infinite. In practice, deconvolution can sometimes be accomplished approximately, particularly within narrow frequency bands [119]. We thus conclude that, to embed acyclic convolution within a cyclic convolution (as provided by an FFT), we need to zeropad both operands out to length , where is at least the sum of the operand lengths (minus one).
Acyclic Convolution in Matlab
In Matlab or Octave, the conv function implements acyclic convolution:octave:1> conv([1 2],[3 4]) ans = 3 10 8Note that it returns an output vector which is long enough to accommodate the entire result of the convolution, unlike the filter primitive, which always returns an output signal equal in length to the input signal:
octave:2> filter([1 2],1,[3 4]) ans = 3 10 octave:3> filter([1 2],1,[3 4 0]) ans = 3 10 8
Pictorial View of Acyclic Convolution
Figure 8.2 shows schematically the result of convolving two zeropadded signals and . In this case, the signal starts some time after , say at . Since begins at time 0 , the output starts promptly at time , but it takes some time to ``ramp up'' to full amplitude. (This is the transient response of the FIR filter .) If the length of is , then the transient response is finished at time . Next, when the input signal goes to zero at time , the output reaches zero samples later (after the filter ``decay time''), or time . Thus, the total number of nonzero output samples is . If we don't add enough zeros, some of our convolution terms ``wrap around'' and add back upon others (due to modulo indexing). This can be called timedomain aliasing. Zeropadding in the time domain results in more samples (closer spacing) in the frequency domain, i.e., a higher `sampling rate' in the frequency domain. If we have a high enough spectral sampling rate, we can avoid time aliasing. The motivation for implementing acyclic convolution using a zeropadded cyclic convolution is that we can use a CooleyTukey Fast Fourier Transform (FFT) to implement cyclic convolution when its length is a power of 2.Acyclic FFT Convolution in Matlab
The following example illustrates the implementation of acyclic convolution using a CooleyTukey FFT in matlab:x = [1 2 3 4]; h = [1 1 1]; nx = length(x); nh = length(h); nfft = 2^nextpow2(nx+nh1) xzp = [x, zeros(1,nfftnx)]; hzp = [h, zeros(1,nfftnh)]; X = fft(xzp); H = fft(hzp); Y = H .* X; format bank; y = real(ifft(Y)) % zeropadded result yt = y(1:nx+nh1) % trim and print yc = conv(x,h) % for comparisonProgram output:
nfft = 8 y = 1.00 3.00 6.00 9.00 7.00 4.00 0.00 0.00 yt = 1.00 3.00 6.00 9.00 7.00 4.00 yc = 1 3 6 9 7 4
FFT versus Direct Convolution
Using the Matlab test program in [264],^{9.1}FFT convolution was found to be faster than direct convolution starting at length (looking only at powers of 2 for the length ).^{9.2} FFT convolution was also never significantly slower at shorter lengths for which ``calling overhead'' dominates. Running the same test program in 2011,^{9.3} FFT convolution using the fft function was found to be faster than conv for all (powerof2) lengths. The speed of FFT convolution divided by that of direct convolution started out at 14 for , fell to a minimum of at , above which it started to climb as expected, reaching at . Note that this comparison is unfair because the Octave fft function is a dynamically linked, separately compiled module, while conv is written in the matlab language and thus suffers more overhead from the matlab interpreter. An analysis reported in Strum and Kirk [279, p. 521], based on the number of real multiplies, predicts that the fft is faster starting at length , and that direct convolution is significantly faster for very short convolutions (e.g., 16 operations for a direct length4 convolution, versus 176 for the fft function). See [264]^{9.4}for further discussion of FFT algorithms and their applications. In digital audio, FIR filters are often hundreds of taps long. For such filters, the FFT method is much faster than direct convolution in the time domain on single CPUs. On GPUs, FFT convolution is faster than direct convolution only for much longer FIRfilter lengths (in the thousands of taps [242]); this is because massively parallel hardware can perform an algorithm (direct convolution) faster than a single CPU can perform an algorithm (FFT convolution).Audio FIR Filters
FIR filters shorter than the ear's ``integration time'' can generally be characterized by their magnitude frequency response (no perceivable ``delay effects''). The nominal ``integration time'' of the ear can be defined as the reciprocal of a critical bandwidth of hearing. Using Zwicker's definition of critical bandwidth [305], the smallest critical bandwidth of hearing is approximately 100 Hz (below 500 Hz). Thus, the nominal integration time of the ear is 10ms below 500 Hz. (Using the equivalentrectangularbandwidth (ERB) definition of critical bandwidth [179,269], longer values are obtained). At a 50 kHz sampling rate, this is 500 samples. Therefore, FIR filters shorter than the ear's ``integration time,'' i.e., perceptually ``instantaneous,'' can easily be hundreds of taps long (as discussed in the next section). FFT convolution is consequently an important implementation tool for FIR filters in digital audio applications.Example 1: LowPass Filtering by FFT Convolution
In this example, we design and implement a length FIR lowpass filter having a cutoff frequency at Hz. The filter is tested on an input signal consisting of a sum of sinusoidal components at frequencies Hz. We'll filter a single input frame of length , which allows the FFT to be samples (no wasted zeropadding).% Signal parameters: f = [ 440 880 1000 2000 ]; % frequencies M = 256; % signal length Fs = 5000; % sampling rate % Generate a signal by adding up sinusoids: x = zeros(1,M); % preallocate 'accumulator' n = 0:(M1); % discretetime grid for fk = f; x = x + sin(2*pi*n*fk/Fs); endNext we design the lowpass filter using the window method:
% Filter parameters: L = 257; % filter length fc = 600; % cutoff frequency % Design the filter using the window method: hsupp = ((L1)/2:(L1)/2); hideal = (2*fc/Fs)*sinc(2*fc*hsupp/Fs); h = hamming(L)' .* hideal; % h is our filterFigure 8.3 plots the impulse response and amplitude response of our FIR filter designed by the window method. Next, the signal frame and filter impulse response are zeropadded out to the FFT size and transformed:
% Choose the next power of 2 greater than L+M1 Nfft = 2^(ceil(log2(L+M1))); % or 2^nextpow2(L+M1) % Zero pad the signal and impulse response: xzp = [ x zeros(1,NfftM) ]; hzp = [ h zeros(1,NfftL) ]; X = fft(xzp); % signal H = fft(hzp); % filterFigure 8.4 shows the input signal spectrum and the filter amplitude response overlaid. We see that only one sinusoidal component falls within the passband.
Y = X .* H;The modified spectrum is shown in Fig.8.5. The final acyclic convolution is the inverse transform of the pointwise product in the frequency domain. The imaginary part is not quite zero as it should be due to finite numerical precision:
y = ifft(Y); relrmserr = norm(imag(y))/norm(y) % check... should be zero y = real(y);
Example 2: Time Domain Aliasing
Figure 8.7 shows the effect of insufficient zero padding, which can be thought of as undersampling in the frequency domain. We will see aliasing in the time domain results. The lowpass filter length is and the input signal consists of an impulse at times and , where the data frame length is . To avoid time aliasing (i.e., to implement acyclic convolution using an FFT), we must use an FFT size at least as large as . In the figure, the FFT sizes , , and are used. Thus, the first case is heavily time aliased, the second only slightly time aliased (involving only some of the filter's ``ringing'' after the second pulse), and the third is free of time aliasing altogether.Convolving with Long Signals
We saw that we can perform efficient convolution of two finitelength sequences using a Fast Fourier Transform (FFT). There are some situations, however, in which it is impractical to use a single FFT for each convolution operand: One or both of the signals being convolved is very long.
 The filter must operate in real time. (We can't wait until the input signal ends before providing an output signal.)
OverlapAdd Decomposition
Consider breaking an input signal into frames using a finite, zerophase, length window . Then we may express the th windowed data frame as(9.17) 
or
(9.18) 
where
(9.19) 
This is the constantoverlapadd (COLA)^{9.6} constraint for the FFT analysis window . It has also been called the partition of unity property. Figure 8.9 illustrates the appearance of 50% overlapadd for the Bartlett (triangular) window. The Bartlett window is clearly COLA for a wide variety of hop sizes, such as , , and so on, provided is an integer (otherwise the underlying continuous triangular window must be resampled). However, when using windows defined in a library, the COLA condition should be carefully checked. For example, the following Matlab/Octave script shows that there is a problem with the standard Hamming window:
M = 33; % window length R = (M1)/2; % hop size N = 3*M; % overlapadd span w = hamming(M); % window z = zeros(N,1); plot(z,'k'); hold on; s = z; for so=0:R:NM ndx = so+1:so+M; % current window location s(ndx) = s(ndx) + w; % window overlapadd wzp = z; wzp(ndx) = w; % for plot only plot(wzp,'ok'); % plot just this window end plot(s,'ok'); hold off; % plot window overlapaddThe figure produced by this matlab code is shown in Fig.8.10. As can be seen, the equal endpoints sum to form an impulse in each frame of the overlapadd. The Matlab window functions (such as hamming) have an optional second argument which can be either 'symmetric' (the default), or 'periodic'. The periodic case is equivalent to
w = hamming(M+1); % symmetric case w = w(1:M); % delete last sample for periodic caseThe periodic variant solves the nonconstant overlapadd problem for even and , but not for odd . The problem can be solved for odd and while retaining symmetry as follows:
w = hamming(M); % symmetric case w(1) = w(1)/2; % repair constantoverlapadd for R=(M1)/2 w(M) = w(M)/2;Since different window types may add or subtract 1 to/from internally, it is best to check the result using test code as above to make sure the window is COLA at the desired hop size. E.g., in Matlab:
 hamming(M)
.54  .46*cos(2*pi*(0:M1)'/(M1));
gives constant overlapadd for , , etc., when endpoints are divided by 2 or one endpoint is zeroed  hanning(M)
.5*(1  cos(2*pi*(1:M)'/(M+1)));
does not give constant overlapadd for , but does for  blackman(M)
.42  .5*cos(2*pi*m)' + .08*cos(4*pi*m)';
where m = (0:M1)/(M1), gives constant overlapadd for when is odd and is an integer, and when is even and is integer.
COLA Examples
So far we've seen the following constantoverlapadd examples: Rectangular window at 0% overlap (hop size = window size )
 Bartlett window at 50% overlap ( ) (Since normally is odd, `` '' means ``R=(M1)/2,'' etc.)
 Hamming window at 50% overlap ( )
 Rectangular window at 50% overlap ( )
 Hamming window at 75% overlap ( % hop size)
 Any member of the Generalized Hamming family at 50% overlap
 Any member of the Blackman family at 2/3 overlap (1/3 hop size); e.g., blackman(33,'periodic'),
 Any member of the term BlackmanHarris family with .
 Any window with R=1 (``sliding FFT'')
STFT of COLA Decomposition
To represent practical FFT implementations, it is preferable to shift the frame back to the time origin:(9.20) 
This is summarized in Fig.8.11. Zerobased frames are needed because the leftmost input sample is assigned to time zero by FFT algorithms. In other words, a hopping FFT effectively redefines time zero on each hop. Thus, a practical STFT is a sequence of FFTs of the zerobased frames . On the other hand, papers in the literature (such as [7,9]) work with the fixed timeorigin case ( ). Since they differ only by a time shift, it is not hard to translate back and forth.
Note that we may sample the DTFT of both and , because both are timelimited to nonzero samples. The minimum informationpreserving sampling interval along the unit circle in both cases is . In practice, we often oversample to some extent, using with instead. For , we get
where . For we have
Acyclic Convolution
Getting back to acyclic convolution, we may write it as(9.22) 
or less along the unit circle. This is the dual of the usual sampling theorem. We conclude that practical FFT acyclic convolution may be carried out using an FFT of any length satisfying
(9.23) 
where is the frame size and is the filter length. Our final expression for is
Example of OverlapAdd Convolution
Let's look now at a specific example of FFT convolution: Impulsetrain test signal, 4000 Hz samplingrate
 Length causal lowpass filter, 600 Hz cutoff
 Length rectangular window
 Hop size (no overlap)
L = 31; % FIR filter length in taps fc = 600; % lowpass cutoff frequency in Hz fs = 4000; % sampling rate in Hz Nsig = 150; % signal length in samples period = round(L/3); % signal period in samplesFFT processing parameters:
M = L; % nominal window length Nfft = 2^(ceil(log2(M+L1))); % FFT Length M = NfftL+1 % efficient window length R = M; % hop size for rectangular window Nframes = 1+floor((NsigM)/R); % no. complete framesGenerate the impulsetrain test signal:
sig = zeros(1,Nsig); sig(1:period:Nsig) = ones(size(1:period:Nsig));Design the lowpass filter using the window method:
epsilon = .0001; % avoids 0 / 0 nfilt = ((L1)/2:(L1)/2) + epsilon; hideal = sin(2*pi*fc*nfilt/fs) ./ (pi*nfilt); w = hamming(L); % FIR filter design by window method h = w' .* hideal; % window the ideal impulse response hzp = [h zeros(1,NfftL)]; % zeropad h to FFT size H = fft(hzp); % filter frequency responseCarry out the overlapadd FFT processing:
y = zeros(1,Nsig + Nfft); % allocate output+'ringing' vector for m = 0:(Nframes1) index = m*R+1:min(m*R+M,Nsig); % indices for the mth frame xm = sig(index); % windowed mth frame (rectangular window) xmzp = [xm zeros(1,Nfftlength(xm))]; % zero pad the signal Xm = fft(xmzp); Ym = Xm .* H; % freq domain multiplication ym = real(ifft(Ym)) % inverse transform outindex = m*R+1:(m*R+Nfft); y(outindex) = y(outindex) + ym; % overlap add endThe time waveforms for the first three frames ( ) are shown in Figures 8.12 through 8.14. Notice how the causal linearphase filtering results in an overall signal delay by half the filter length. Also, note how frames 0 and 2 contain four impulses, while frame 1 only happens to catch three; this causes no difficulty, and the filtered result remains correct by superposition.
Summary of OverlapAdd FFT Processing
Overlapadd FFT processors provide efficient implementations for FIR filters longer than 100 or so taps on single CPUs. Specifically, we ended up with:(9.24) 
where is acyclic in this context. Stated as a procedure, we have the following steps in an overlapadd FFT processor:
 (1)
 Extract the th length frame of data at time .
 (2)
 Shift it to the base time interval (or ).
 (3)
 Optionally apply a length analysis window (causal or zero phase, as preferred). For simple LTI filtering, the rectangular window is fine.
 (4)
 Zeropad the windowed data out to the FFT size (a power of 2), such that , where is the FIR filter length.
 (5)
 Take the point FFT.
 (6)
 Apply the filter frequencyresponse as a windowing operation in the frequency domain.
 (7)
 Take the point inverse FFT.
 (8)
 Shift the origin of the point result out to sample where it belongs.
 (9)
 Sum into the output buffer containing the results from prior frames (OLA step).
(9.25) 
Dual of Constant OverlapAdd
In this section, we will derive the Fourier dual of the Constant OverLapAdd (COLA) condition for STFT analysis windows (discussed in §7.1). Recall that for perfect reconstruction using a hopsize of samples, the window must be . We will find that the equivalent frequencydomain condition is that the window transform must have spectral zeros at all frequencies which are a nonzero multiple of . Following established nomenclature for filter banks, we will say that such a window transform is .Poisson Summation Formula
Consider the summation of N complex sinusoids having frequencies uniformly spaced around the unit circle [264]:(9.26) 
where (harmonics of the frame rate). Let us now consider these equivalent signals as inputs to an LTI system, with an impulse response given by , and frequency response equal to . Looking across the top of Fig.8.16, for the case of input signal we have
(9.27) 
Looking across the bottom of the figure, for the case of input signal
(9.28) 
we have the output signal
(9.29) 
This second form follows from the fact that complex sinusoids are eigenfunctions of linear systemsa basic result from linear systems theory [264,263]. Since the inputs were equal, the corresponding outputs must be equal too. This derives the Poisson Summation Formula (PSF):
Note that the PSF is the Fourier dual of the sampling theorem [270], [264, Appendix G]. The continuoustime PSF is derived in §B.15.
FrequencyDomain COLA Constraints
Recall that for errorfree OLA processing, we required the constantoverlapadd (COLA) window constraint:(9.31) 
Thanks to the PSF, we may now express the COLA constraint in the frequency domain:
(9.32) 
In other terms,
Notation:
The ``Nyquist( )'' property for a function simply means that is zero at all nonzero multiples of (all harmonics of the frame rate here). We may also refer to (8.33) as the ``weak COLA constraint'' in the frequency domain. It gives necessary and sufficient conditions for perfect reconstruction in overlapadd FFT processors. However, when the shorttime spectrum is being modified, these conditions no longer apply, and a stronger COLA constraint is preferable.
Strong COLA
An overly strong (but sufficient) condition is to require that the window transform be bandlimited consistent with downsampling by :(9.34) 
That is, the overlapadd of the window at hopsize is equal numerically to the dc gain of the window divided by .
PSF Dual and Graphical Equalizers
Above, we used the Poisson Summation Formula to show that the constantoverlapadd of a window in the time domain is equivalent to the condition that the window transform have zerocrossings at all harmonics of the frame rate. In this section, we look briefly at the dual case: If the window transform is COLA in the frequency domain, what is the corresponding property of the window in the time domain? As one should expect, being COLA in the frequency domain corresponds to having specific uniform zerocrossings in the time domain. Bandpass filters that sum to a constant provides an ideal basis for a graphic equalizer. In such a filter bank, when all the ``sliders'' of the equalizer are set to the same level, the filter bank reduces to no filtering at all, as desired. Let denote the number of (complex) filters in our filter bank, with passbands uniformly distributed around the unit circle. (We will be using an FFT to implement such a filter bank.) Denote the frequency response of the ``dc channel'' by . Then the constant overlapadd property of the channel filter bank can be expressed as(9.35) 
which means
(9.36) 
where as usual. By the dual of the Poisson summation formula, we have
where means that is zero at all nonzero integer multiples of , i.e.,
(9.38) 
Thus, using the dual of the PSF, we have found that a good channel equalizer filter bank can be made using bandpass filters which have zerocrossings at multiples of samples, because that property guarantees that the filter bank sums to a constant frequency response when all channel gains are equal. The duality introduced in this section is the basis of the FilterBank Summation (FBS) interpretation of the shorttime Fourier transform, and it is precisely the Fourier dual of the OverLapAdd (OLA) interpretation [9]. The FBS interpretation of the STFT is the subject of Chapter 9.
PSF and Weighted Overlap Add
Using ``squareroot windows'' in the WOLA context, the valid hop sizes are identical to those for in the OLA case. More generally, given any window for use in a WOLA system, it is of interest to determine the hop sizes which yield perfect reconstruction. Recall that, by the Poisson Summation Formula (PSF),(9.39) 
For WOLA, this is easily modified to become
(9.40) 
where is the analysis window and is the synthesis window. When , this becomes
(9.41) 
Example COLA Windows for WOLA
In a weighted overlapadd system, the following windows can be used to satisfy the constantoverlapadd condition: For the rectangular window, , and (since is a sinc function which reduces to when , and .
 For the Hamming window, the critically sampled window transform has three nonzero samples (where the rectangularwindow transform has one). Therefore, has nonzero samples at critical sampling. Measuring mainlobe width from zerocrossing to zerocrossing as usual, we get radians per sample, or ``6 side lobes'', for the width of .
 The squaredBlackman window transform width is .
 The square of a length term BlackmanHarrisfamily window (where rect is , Hann is , etc.) has a main lobe of width , measured from zerocrossing to zerocrossing in ``sidelobe units'' ( ). This is up from for the original term window.
 The width of the main lobe can be used to determine the hop size in the STFT, as will be discussed further in Chapter 9.
OverlapSave Method
The classical overlapsave method [198,277], unlike OLA, uses no zero padding to prevent time aliasing. Instead, it (1)
 discards output samples corrupted by time aliasing each frame, and
 (2)
 overlaps the input frames by the same amount.
 If the input frame size is and the filter length is , then a length FFT and IFFT are used.
 As a result, samples of the output are invalid due to time aliasing.
 The overlapsave method writes out the good samples and uses a hop size of , thus recomputing the timealiased output samples in the previous frame.
Time Varying OLA Modifications
In the preceding sections, we assumed that the spectral modification did not vary over time. We will now examine the implications of timevarying spectral modifications. The derivation below follows [9], except that we'll keep our previous notation:(9.42) 
Let's examine the term in more detail:
 describes the time variation of the tap.
 is a filtered version of the tap . It is lowpassfiltered by w and delayed by samples.
 Denote the th timevarying, lowpassfiltered, delayedby filter tap by . This can be interpreted as the weighting in the output at time of an impulse entering the timevarying filter at time .
Block Diagram Interpretation of TimeVarying STFT Modifications
Assuming is causal givesThe term can be interpreted as the FIR filter tap at time . Note how each tap is lowpass filtered by the FFT window . The window thus enforces bandlimiting each filter tap to the bandwidth of the window's main lobe. For an term length BlackmanHarris window, for example, the mainlobe reaches zero at frequency (see Table 5.2 in §5.5.2 for other examples). This bandlimiting places a limit on the bandwidth expansion caused by timevariation of the filter coefficients, which in turn places a limit on the maximum STFT hopsize that can be used without frequencydomain aliasing. See Allen and Rabiner 1977 [9] for further details on the bandlimiting property.
Length L FIR Frame Filters
To avoid time aliasing, we restrict the filter length to a maximum of samples. Since is an arbitrary multiplicative weighting of the th spectral frame, the frame filter need not be causal. For odd , the filter impulse response indices may run from to , where(9.43) 
This gives
Weighted Overlap Add
In the weighted overlap add (WOLA) method, we apply a second window after the inverse DFT [49] and prior to the final overlapadd to create the output signal. Such a window can be called a ``synthesis window,'' ``postwindow,'' or simply ``output window.'' Output windows are important in audio compression applications for minimizing ``blocking effects.'' The synthesis window ``fades out'' any spectral coding error at the frame boundaries, thereby suppressing audible discontinuities. Output windows are not used in simple FFT convolution processors because the input frames are supposed to be expanded by the convolution, and a synthesis window would ``pinch off'' the ``filter ringing'' from each block, yielding incorrect results. Output windows can always be used in conjunction with spectral modifications made by means of the ``filter bank summation'' (FBS) method, which is the subject of the next chapter. The WOLA method is most useful for nonlinear ``instantaneous'' FFT processors such as perceptual audio coders,
 timescale modification, or
 pitchshifters.
WOLA Processing Steps
The sequence of operations in a WOLA processor can be expressed as follows: Extract the th windowed frame of data , (assuming a length causal window and hop size ).
 Take an FFT of the
th frame translated to time zero,
, to produce the th spectral frame
, .  Process as desired to produce .
 Inverse FFT to produce , .
 Apply a synthesis window to to yield a weighted output frame , .
 Translate the th output frame to time as and add to the accumulated output signal .
(9.44) 
Choice of WOLA Window
The synthesis (output) window in weighted overlapadd is typically chosen to be the same as the analysis (input) window, in which case the COLA constraint becomes(9.45) 
We can say that shifts of the window in the time domain are power complementary, whereas for OLA they were amplitude complementary. A trivial way to construct useful windows for WOLA is to take the square root of any good OLA window. This works for all nonnegative OLA windows (which covers essentially all windows in Chapter 3 other than Portnoff windows). For example, the ``rootHann window'' can be defined for odd by
Review of Zero Padding
Expanding on the discussion in §2.5, zeropadding is used for Spectral interpolation
 
 Clearer spectral magnitude/phase plots
 
 Sinusoidal peak tracking (e.g., to help quadratic interpolation)
 To extend to the next highest power of 2 (FFT)
 To make room for ``filter ringing'' in overlapadd convolution using an FFT
Next Section:
The Filter Bank Summation (FBS) Interpretation of the Short Time Fourier Transform (STFT)
Previous Section:
TimeFrequency Displays