Dual of Constant Overlap-Add

In this section, we will derive the Fourier dual of the Constant OverLap-Add (COLA) condition for STFT analysis windows (discussed in §7.1). Recall that for perfect reconstruction using a hop-size of $ R$ samples, the window must be $ \hbox{\sc Cola}(R)$ . We will find that the equivalent frequency-domain condition is that the window transform must have spectral zeros at all frequencies which are a nonzero multiple of $ 2\pi/R$ . Following established nomenclature for filter banks, we will say that such a window transform is $ \hbox{\sc Nyquist}(2\pi/R)$ .

Poisson Summation Formula

Consider the summation of N complex sinusoids having frequencies uniformly spaced around the unit circle [264]:

\begin{eqnarray*}
x(n) &\mathrel{\stackrel{\Delta}{=}}& \frac{1}{N} \sum_{k=0}^{N-1}e^{j\omega_kn} =
\left\{
\begin{array}{ll}
1 & n=0 \quad (\hbox{\sc mod}\ N) \\
0 & \mbox{elsewhere} \\
\end{array} \right. \\ [5pt]
&=& \hbox{\sc IDFT}_n(1 \cdots 1)
\end{eqnarray*}

where $ \omega_k \mathrel{\stackrel{\Delta}{=}}2\pi k / N$ .


\begin{psfrags}
% latex2html id marker 22700\psfrag{d(n)}{\normalsize $\displaystyle\normalsize \sum_l\delta(n-lN)$\ }\psfrag{\makebox[0pt][l]{2}N}{\normalsize $-2N$}\psfrag{\makebox[0pt][l]{N}}{\normalsize $-N$}\psfrag{0}{\normalsize $0$}\psfrag{N}{\normalsize $N$}\psfrag{n}{\normalsize $n$}\psfrag{2N}{\normalsize $2N$}\psfrag{1}{\normalsize $1$}\begin{figure}[htbp]
\includegraphics[width=3.5in]{eps/delta}
\caption{Discrete-time impulse train created
by a sum of sampled complex sinusoids.}
\end{figure}
\end{psfrags}

Setting $ N=R$ (the FFT hop size) gives

$\displaystyle \zbox {\sum_m \delta(n-mR) = \frac{1}{R} \sum_{k=0}^{R-1}e^{j\omega_kn}}$ (9.26)

where $ \omega_k \mathrel{\stackrel{\Delta}{=}}2\pi k/R$ (harmonics of the frame rate).

Let us now consider these equivalent signals as inputs to an LTI system, with an impulse response given by $ w(n)$ , and frequency response equal to $ W(\omega)$ .


\begin{psfrags}
% latex2html id marker 22729\psfrag{w(n)}{\normalsize $w(n)$\ }\psfrag{W(w)}{\normalsize $W(\omega)$\ }\psfrag{timesum}{\normalsize $\displaystyle\sum_l\delta(n-lR)$\ }\psfrag{freqsum}{\normalsize $\frac{1}{R}\displaystyle\sum_k e^{j\omega_kn}$\ }\psfrag{t}{\normalsize $\displaystyle\sum_l w(n-lR)$\ }\psfrag{f}{\normalsize $\frac{1}{R}\displaystyle\sum_k W(\omega_k)e^{j\omega_kn}$\ }\psfrag{time}{\normalsize Time}\psfrag{freq}{\normalsize Frequency}\begin{figure}[htbp]
\includegraphics[width=4in]{eps/poisson}
\caption{Linear systems theory proof of the Poisson summation formula.}
\end{figure}
\end{psfrags}

Looking across the top of Fig.8.16, for the case of input signal $ \sum_m \delta(n-mR)$ we have

$\displaystyle y(n) = \sum_m w(n-mR).$ (9.27)

Looking across the bottom of the figure, for the case of input signal

$\displaystyle x(n) = \frac{1}{R} \sum_{k=0}^{R-1}e^{j\omega_kn},$ (9.28)

we have the output signal

$\displaystyle y(n) = \frac{1}{R} \sum_{k=0}^{R-1} W(\omega_k)e^{j\omega_kn}.$ (9.29)

This second form follows from the fact that complex sinusoids $ e^{j\omega_kn}$ are eigenfunctions of linear systems--a basic result from linear systems theory [264,263].

Since the inputs were equal, the corresponding outputs must be equal too. This derives the Poisson Summation Formula (PSF):

$\displaystyle \zbox {\underbrace{\sum_m w(n-mR)}_{\hbox{\sc Alias}_R(w)} = \underbrace{\frac{1}{R}\sum_{k=0}^{R-1} W(\omega_k)e^{j\omega_k n}}_{\hbox{\sc DFT}_R^{-1} \left[\hbox{\sc Sample}_{\frac{2\pi}{R}}(W)\right]}} \quad \omega_k \isdef \frac{2\pi k}{R} \protect$ (9.30)

Note that the PSF is the Fourier dual of the sampling theorem [270], [264, Appendix G].

The continuous-time PSF is derived in §B.15.


Frequency-Domain COLA Constraints

Recall that for error-free OLA processing, we required the constant-overlap-add (COLA) window constraint:

$\displaystyle \sum_m w(n-mR) = \hbox{constant}$ (9.31)

Thanks to the PSF, we may now express the COLA constraint in the frequency domain:

$\displaystyle \zbox {w\in\hbox{\sc Cola}(R) \;\Leftrightarrow\; W(\omega_k) = 0, \quad \vert k\vert = 1,2, \dots, R-1}$ (9.32)

In other terms,
$\textstyle \parbox{0.8\textwidth}{A window $w$\ gives constant overlap-add at
hop-size $R$\ if and only if the window transform $W$\ is \emph{zero at
all harmonics of the frame rate} $2\pi/R$.}$

Notation:

$\displaystyle \zbox {w \in \hbox{\sc Cola}(R) \quad \Leftrightarrow \quad W \in \hbox{\sc Nyquist}(2\pi/R)} \protect$ (9.33)

The ``Nyquist($ \Omega_R$ )'' property for a function $ W$ simply means that $ W$ is zero at all nonzero multiples of $ \Omega_R$ (all harmonics of the frame rate here).

We may also refer to (8.33) as the ``weak COLA constraint'' in the frequency domain. It gives necessary and sufficient conditions for perfect reconstruction in overlap-add FFT processors. However, when the short-time spectrum is being modified, these conditions no longer apply, and a stronger COLA constraint is preferable.

Strong COLA

An overly strong (but sufficient) condition is to require that the window transform $ W(\omega)$ be bandlimited consistent with downsampling by $ R$ :

$\displaystyle W(\omega) = 0, \quad \vert\omega\vert \geq \pi/R
\quad\quad\hbox{(sufficient for COLA)}
$

This condition is sufficient, but not necessary, for perfect OLA reconstruction. Strong COLA implies weak COLA, but it cannot be achieved exactly by finite-duration window functions.

When either of the strong or weak COLA conditions are met, we have

$\displaystyle \zbox {\sum_m w(n-mR) = \frac{1}{R} W(0)}$ (9.34)

That is, the overlap-add of the window $ w$ at hop-size $ R$ is equal numerically to the dc gain of the window divided by $ R$ .


PSF Dual and Graphical Equalizers

Above, we used the Poisson Summation Formula to show that the constant-overlap-add of a window in the time domain is equivalent to the condition that the window transform have zero-crossings at all harmonics of the frame rate. In this section, we look briefly at the dual case: If the window transform is COLA in the frequency domain, what is the corresponding property of the window in the time domain? As one should expect, being COLA in the frequency domain corresponds to having specific uniform zero-crossings in the time domain.

Bandpass filters that sum to a constant provides an ideal basis for a graphic equalizer. In such a filter bank, when all the ``sliders'' of the equalizer are set to the same level, the filter bank reduces to no filtering at all, as desired.

Let $ N$ denote the number of (complex) filters in our filter bank, with pass-bands uniformly distributed around the unit circle. (We will be using an FFT to implement such a filter bank.) Denote the frequency response of the ``dc channel'' by $ W(e^{j\omega T})$ . Then the constant overlap-add property of the $ N$ -channel filter bank can be expressed as

$\displaystyle W \in \hbox{\sc Cola}(2\pi/N)$ (9.35)

which means

$\displaystyle S(\omega) = \sum_{k=0}^{N-1} W\left(e^{j(\omega-\omega_k)T}\right) = \hbox{constant}$ (9.36)

where $ \omega_k T\isdef k\cdot 2\pi/N$ as usual. By the dual of the Poisson summation formula, we have

$\displaystyle \zbox {W \in \hbox{\sc Cola}(2\pi/N) \quad \Leftrightarrow \quad w \in \hbox{\sc Nyquist}(N)} \protect$ (9.37)

where $ w\in \hbox{\sc Nyquist}(N)$ means that $ w$ is zero at all nonzero integer multiples of $ N$ , i.e.,

$\displaystyle w(n)=0, \quad n=\pm N,\pm 2N, \pm 3N, \ldots\,.$ (9.38)

Thus, using the dual of the PSF, we have found that a good $ N$ -channel equalizer filter bank can be made using bandpass filters which have zero-crossings at multiples of $ N$ samples, because that property guarantees that the filter bank sums to a constant frequency response when all channel gains are equal.

The duality introduced in this section is the basis of the Filter-Bank Summation (FBS) interpretation of the short-time Fourier transform, and it is precisely the Fourier dual of the OverLap-Add (OLA) interpretation [9]. The FBS interpretation of the STFT is the subject of Chapter 9.


PSF and Weighted Overlap Add

Using ``square-root windows'' $ \sqrt{w}$ in the WOLA context, the valid hop sizes $ R$ are identical to those for $ w$ in the OLA case. More generally, given any window $ w(n)$ for use in a WOLA system, it is of interest to determine the hop sizes which yield perfect reconstruction.

Recall that, by the Poisson Summation Formula (PSF),

$\displaystyle \zbox {\underbrace{\sum_m w(n-mR)}_{\hbox{\sc Alias}_R(w)} = \underbrace{\frac{1}{R}\sum_{k=0}^{R-1} W(\omega_k)e^{j\omega_k n}}_{\hbox{\sc DFT}_R^{-1} \left[\hbox{\sc Sample}_{\frac{2\pi}{R}}(W)\right]}} \quad \omega_k \isdef \frac{2\pi k}{R} \protect$ (9.39)

For WOLA, this is easily modified to become

$\displaystyle \zbox {\underbrace{\sum_m w(n-mR)f(n-mR)}_{\hbox{\sc Alias}_R(w\cdot f)} = \underbrace{\frac{1}{R}\sum_{k=0}^{R-1} (W\ast F)(\omega_k)e^{j\omega_k n}}_{\hbox{\sc DFT}_R^{-1} \left[\hbox{\sc Sample}_{\frac{2\pi}{R}}(W\ast F)\right]}} \quad \omega_k \isdef \frac{2\pi k}{R}$ (9.40)

where $ w(n)$ is the analysis window and $ f(n)$ is the synthesis window.

When $ w=f$ , this becomes

$\displaystyle \underbrace{\sum_m w^2(n-mR)}_{\hbox{\sc Alias}_R(w^2)} = \underbrace{\frac{1}{R}\sum_{k=0}^{R-1} (W\ast W)(\omega_k)e^{j\omega_k n}}_{\hbox{\sc DFT}_R^{-1} \left[\hbox{\sc Sample}_{\frac{2\pi}{R}}(W\ast W)\right]}, \quad \omega_k \isdef \frac{2\pi k}{R}$ (9.41)


Example COLA Windows for WOLA

In a weighted overlap-add system, the following windows can be used to satisfy the constant-overlap-add condition:

  • For the rectangular window, $ w^2(n)=w(n)$ , and $ W\ast W = W$ (since $ W(\omega_k)$ is a sinc function which reduces to $ \delta(\omega_k)$ when $ \omega_k = 2\pi k / M$ , and $ \delta\ast \delta = \delta$ .

  • For the Hamming window, the critically sampled window transform has three nonzero samples (where the rectangular-window transform has one). Therefore, $ W\ast W$ has $ 3+3-1=5$ nonzero samples at critical sampling. Measuring main-lobe width from zero-crossing to zero-crossing as usual, we get $ 6\cdot 2\pi/M$ radians per sample, or ``6 side lobes'', for the width of $ W\ast W$ .

  • The squared-Blackman window transform width is $ (5+5-1)+1=10$ .

  • The square of a length $ M$ $ L$ -term Blackman-Harris-family window (where rect is $ L=1$ , Hann is $ L=2$ , etc.) has a main lobe of width $ (2(2L-1)-1)+1=4L-2$ , measured from zero-crossing to zero-crossing in ``side-lobe units'' ($ 2\pi/M$ ). This is up from $ (2L-1)+1=2L$ for the original $ L$ -term window.

  • The width of the main lobe can be used to determine the hop size in the STFT, as will be discussed further in Chapter 9.

Note that we need only find the first zero-crossing in the window transform for any member of the Blackman-Harris window family (Chapter 3), since nulls at all harmonics of that frequency will always be present (at multiples of $ 2\pi/M$ ).


Next Section:
Overlap-Save Method
Previous Section:
Convolving with Long Signals