Uniform Running-Sum Filter Banks

Using a length $N$ running-sum filter, let's make $N$ bandpass filters tuned to center frequencies

$\displaystyle \omega_k\isdef k\frac{2\pi}{N}, \quad k=0,1,2,\ldots,N-1.$

(10.11)

Since the bandwidths, as defined, are $4\pi/N$ , the filter pass-bands overlap by 50%. A superposition of the bandpass frequency responses is shown in Fig.9.14. Also shown is the frequency-response sum, which we will show to be exactly constant and equal to $N$ . This gives our filter bank the perfect reconstruction property: we can simply add the outputs of the filters in the filter bank to recreate our input signal exactly. This is the source of the name Filter-Bank Summation (FBS).

**Figure 9.14:** Example filter-bank channel frequency responses.
$\includegraphics[width=3in]{eps/sincbank}$
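The constant-sum claim can be checked numerically. The following is a minimal sketch, assuming NumPy is available and taking the $k$th bandpass response to be the running-sum response shifted to $\omega_k$; the function names and the value $N=5$ are illustrative choices, not taken from the text.

```python
import numpy as np

def running_sum_response(omega, N):
    """DTFT of the length-N running-sum filter h(n) = 1, n = 0..N-1."""
    n = np.arange(N)
    # Direct sum of exp(-j*omega*n); avoids the 0/0 of the closed form at omega = 0.
    return np.exp(-1j * np.outer(omega, n)).sum(axis=1)

def channel_response_sum(N, num_points=512):
    """Sum over k of H(omega - omega_k), where omega_k = 2*pi*k/N."""
    omega = np.linspace(-np.pi, np.pi, num_points, endpoint=False)
    total = np.zeros(num_points, dtype=complex)
    for k in range(N):
        total += running_sum_response(omega - 2 * np.pi * k / N, N)
    return omega, total

N = 5  # arbitrary example value
omega, total = channel_response_sum(N)
print(np.max(np.abs(total - N)))  # deviation of the sum from the constant N
```

The printed deviation should be at rounding-error level, since summing $e^{j\omega_k n}$ over $k$ cancels every term of the running sum except $n=0$.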

System Diagram of the Running-Sum Filter Bank

**Figure 9.15:** DFT Filter Bank.
$\includegraphics{eps/BPFB}$

Figure 9.15 shows the system diagram of the complete $N$ -channel filter bank constructed using length $N$ FIR running-sum lowpass filters. The $k$ th channel computes:

$\displaystyle \begin{aligned}
y_k(n) &= (h\ast x_k)(n) = \sum_{m=0}^{N-1}h(m)x_k(n-m) \\
&= (x_k\ast h)(n) = \sum_{m=n-(N-1)}^{n}x_k(m)h(n-m) \\
&= \sum_{m=n-(N-1)}^{n}x(m)e^{-j\omega_k m }\hbox{\sc Shift}_{n,m}(\hbox{\sc Flip}(h)) \\
&= \sum_{m=n-(N-1)}^{n}x(m)e^{-j\omega_k m }
\end{aligned}$

(10.12)

DFT Filter Bank

Recall that the Length $N$ Discrete Fourier Transform (DFT) is defined as

$\displaystyle X(k) \isdef \sum_{n=0}^{N-1} x(n) e^{-j2\pi nk/N}$

(10.13)

Comparing this to (10.12), we see that the filter-bank output $y_k(n)$ , $k=0,1,\ldots,N-1$ , is precisely the DFT of the input signal when $n=N-1$ , i.e.,

$\displaystyle \zbox {X(k) = y_k(N-1)}.$

(10.14)

In other words, the filter-bank output at time $n=N-1$ (the set of $N$ samples $y_k(N-1)$ for $k=0,1,2,\ldots,N-1$ ), equals the DFT of the first $N$ samples of $x$ ( $x(n)$ , $n=0,\ldots,N-1$ ). That is, taking a snapshot of all filter-bank channels at time $N-1$ yields the DFT of the input data from time 0 through $N-1$ .
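The snapshot property can be illustrated with a direct implementation of (10.12). This is a minimal sketch, assuming NumPy, a causal input that is zero before time 0, and the hypothetical helper name `dft_filter_bank`; each channel demodulates by $e^{-j\omega_k m}$ and then applies the length $N$ running sum.

```python
import numpy as np

def dft_filter_bank(x, N):
    """Return y[k, n] = sum_{m=n-N+1}^{n} x(m) exp(-j*2*pi*k*m/N), with x(m) = 0 for m < 0."""
    x = np.asarray(x, dtype=complex)
    m = np.arange(len(x))
    h = np.ones(N)  # running-sum lowpass filter
    y = np.empty((N, len(x)), dtype=complex)
    for k in range(N):
        x_k = x * np.exp(-2j * np.pi * k * m / N)  # demodulated channel signal x_k(m)
        y[k] = np.convolve(x_k, h)[:len(x)]        # (h * x_k)(n), truncated to the input length
    return y

rng = np.random.default_rng(0)
N = 8  # arbitrary example value
x = rng.standard_normal(3 * N)
y = dft_filter_bank(x, N)

# Snapshot of all channels at time N-1, compared with the DFT of x(0..N-1)
print(np.allclose(y[:, N - 1], np.fft.fft(x[:N])))
```

The comparison uses `numpy.fft.fft`, which follows the same sign and normalization convention as (10.13).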

More generally, for all $n$ , we will call Fig.9.15 the DFT filter bank. The DFT filter bank is the special case of the STFT for which a rectangular window and hop size $R=1$ are used.

The sliding DFT is obtained by advancing successive DFTs by one sample:

$\displaystyle X_n(k) \isdef \sum_{m=0}^{N-1} x(n+m) e^{-j2\pi mk/N}$

(10.15)

When $n=LN$ for any integer $L$ , the Sliding DFT coincides with the DFT filter bank. At other times, they differ by a linear phase term. (Exercise: find the linear phase term.) The Sliding DFT redefines the time origin every sampling period (each modulation term within the DFT starts at time 0 for each transform), while the DFT Filter Bank does not redefine the time origin (modulation terms are ``free running'' as they would be in an analog filter bank). Since ``DFT time'' repeats every $N$ samples, the two treatments coincide every $N$ samples (i.e., $e^{j\omega_k(n+LN)}=e^{j\omega_kn}$ for every integer $L$ ).
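A small numerical experiment shows when the two transforms agree. This sketch assumes NumPy and compares the Sliding DFT of the block starting at time $n$ with the filter-bank snapshot taken at the end of that same block, time $n+N-1$ ; that alignment and the function names are assumptions made here for illustration.

```python
import numpy as np

def sliding_dft(x, N, n):
    """X_n(k) per (10.15): DFT of the N samples x(n), ..., x(n+N-1)."""
    return np.fft.fft(x[n:n + N])

def filter_bank_snapshot(x, N, n):
    """y_k(n) per (10.12) for all k, assuming n >= N-1."""
    m = np.arange(n - N + 1, n + 1)
    k = np.arange(N)[:, None]
    # Free-running modulation: the exponent uses absolute time m
    return (x[m] * np.exp(-2j * np.pi * k * m / N)).sum(axis=1)

rng = np.random.default_rng(0)
N = 8  # arbitrary example value
x = rng.standard_normal(4 * N)
for n in (0, N, 2 * N, 3):
    same = np.allclose(sliding_dft(x, N, n), filter_bank_snapshot(x, N, n + N - 1))
    print(n, same)
```

For a generic input, agreement is expected only at block starts that are multiples of $N$ ; at other starts the channels differ in phase.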

When $N$ is a power of 2, the DFT can be implemented using a Cooley-Tukey Fast Fourier Transform (FFT), which requires only ${\cal O}(N\log_2(N))$ operations per transform. By keeping track of the linear phase term (an ${\cal O}(N)$ modification), a DFT Filter Bank can be implemented efficiently using an FFT. Uniform FIR filter banks are very often implemented in practice using FFT software such as FFTW.
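One way to do the phase bookkeeping is sketched below (it works out the exercise above, so skip it if you prefer to solve that yourself). The sketch assumes NumPy, uses `numpy.fft.fft` as a stand-in for any FFT library, and the function names are hypothetical. Substituting $m = (n-N+1)+p$ in (10.12) factors out $e^{-j\omega_k(n-N+1)}$ and leaves an ordinary DFT of the most recent $N$ samples.

```python
import numpy as np

def direct_snapshot(x, N, n):
    """y_k(n) per (10.12), evaluated directly in O(N^2) operations; assumes n >= N-1."""
    m = np.arange(n - N + 1, n + 1)
    k = np.arange(N)[:, None]
    return (x[m] * np.exp(-2j * np.pi * k * m / N)).sum(axis=1)

def fft_snapshot(x, N, n):
    """Same output via one FFT plus an O(N) linear phase correction; assumes n >= N-1."""
    start = n - N + 1                      # time index of the oldest sample in the window
    k = np.arange(N)
    phase = np.exp(-2j * np.pi * k * start / N)
    return phase * np.fft.fft(x[start:n + 1])

rng = np.random.default_rng(1)
N = 16  # arbitrary power of 2
x = rng.standard_normal(5 * N)
n = 37  # arbitrary time index with n >= N-1
print(np.allclose(direct_snapshot(x, N, n), fft_snapshot(x, N, n)))
```

Computing every output sample this way costs one FFT per input sample, which is why the channel signals are usually downsampled in practice.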

Note that the channel bandwidths are narrow compared with half the sampling rate (especially for large $N$ ), so that the filter bank output signals are oversampled, in general. We will later look at downsampling the channel signals to obtain a ``hopping FFT'' filter bank. ``Sliding'' and ``hopping'' FFTs are special cases of the discrete-time Short Time Fourier Transform (STFT). The STFT normally also uses a window function other than the rectangular window used in this development (the running-sum lowpass filter).

Inverse DFT and the DFT Filter Bank Sum

The Length $N$ inverse DFT is given by [264]

$\displaystyle x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k) e^{j2\pi nk/N}, \quad n=0,1,2,\ldots,N-1.$

(10.16)

This suggests that the DFT Filter Bank can be inverted by simply remodulating the baseband filter-bank signals $y_k(n)$ , summing over $k$ , and dividing by $N$ for proper normalization. That is, we are led to conjecture that

$\displaystyle x(n) = \frac{1}{N}\sum_{k=0}^{N-1} y_k(n) e^{j2\pi nk/N}, \quad n=0,1,2,\ldots\,.$

(10.17)

This is in fact true, as we will later see. (It is straightforward to show as an exercise.)
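The remodulate-and-sum idea can be tried numerically. Substituting (10.12) into the sum over $k$ gives $\sum_m x(m)\sum_k e^{j\omega_k(n-m)}$ , and for $m$ in the window $n-(N-1),\ldots,n$ the inner sum equals $N$ when $m=n$ and zero otherwise, so the sketch compares the result against the input sample at the same time index $n$ . It assumes NumPy and a causal input; the helper names are hypothetical.

```python
import numpy as np

def dft_filter_bank(x, N):
    """y[k, n] per (10.12), with x(m) = 0 for m < 0."""
    x = np.asarray(x, dtype=complex)
    m = np.arange(len(x))
    y = np.empty((N, len(x)), dtype=complex)
    for k in range(N):
        x_k = x * np.exp(-2j * np.pi * k * m / N)       # demodulate to baseband
        y[k] = np.convolve(x_k, np.ones(N))[:len(x)]    # length-N running sum
    return y

def remodulate_and_sum(y):
    """(1/N) * sum_k y_k(n) exp(+j*2*pi*k*n/N) for every time index n."""
    N, length = y.shape
    n = np.arange(length)
    k = np.arange(N)[:, None]
    return (y * np.exp(2j * np.pi * k * n / N)).sum(axis=0) / N

rng = np.random.default_rng(2)
N = 8  # arbitrary example value
x = rng.standard_normal(4 * N)
x_hat = remodulate_and_sum(dft_filter_bank(x, N))
print(np.allclose(x_hat, x))
```

No downsampling is involved here; every channel is kept at the full input rate, which is what makes the plain sum sufficient.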

Specific Windows

Recall that the rectangular window transform is $\hbox{\sc Nyquist}(2\pi/M)$ , implying the rectangular window itself is $\hbox{\sc Cola}(M)$ , which is obvious.
The window transform for the Hamming family is $\hbox{\sc Nyquist}(4\pi/M)$ , implying that Hamming windows are $\hbox{\sc Cola}(M/2)$ , which we also knew.
The rectangular window transform is also $\hbox{\sc Nyquist}(K2\pi/M)$ for any integer $1\leq K\leq M/2$ , implying that all hop sizes given by $R=M/K$ for $K=1,2,\ldots,M/2$ are COLA.
Because its side lobes are the same width as the sinc side lobes, the Hamming window transform is also $\hbox{\sc Nyquist}(K2\pi/M)$ , for any integer $2\leq K\leq M/2$ , implying hop sizes $R=M/K$ are good, for $K=2,\ldots,M/2$ . Thus, the available hop sizes for the Hamming window family include all of those for the rectangular window except one ( $R=M$ ).
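These hop-size claims can be spot-checked by overlap-adding the window with itself. The sketch below assumes NumPy, a window length $M=16$ , hop sizes that divide $M$ , and the periodic form of the Hamming window, $0.54-0.46\cos(2\pi n/M)$ for $n=0,\ldots,M-1$ (the symmetric form with matching endpoints would need separate treatment). The function name `is_cola` is hypothetical.

```python
import numpy as np

def is_cola(w, R, tol=1e-12):
    """True if copies of w shifted by multiples of R sum to a constant in steady state."""
    M = len(w)
    if M % R != 0:
        raise ValueError("this sketch assumes the hop size R divides the window length")
    # Row l holds w[l*R:(l+1)*R]; summing rows gives sum_l w(n + l*R) for n = 0..R-1
    ola = w.reshape(M // R, R).sum(axis=0)
    return bool(np.ptp(ola) < tol * max(1.0, np.abs(ola).max()))

M = 16  # arbitrary example length
n = np.arange(M)
rect = np.ones(M)
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * n / M)  # periodic Hamming (assumption)

for K in (1, 2, 4, 8):
    R = M // K
    print(K, R, is_cola(rect, R), is_cola(hamming, R))
```

With these assumptions, the rectangular window should pass for every listed $K$ , while the Hamming window should fail only for $K=1$ , i.e., $R=M$ .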

The Nyquist Property on the Unit Circle

As a degenerate case, note that $R=1$ is COLA for any window, while no window transform is $\hbox{\sc Nyquist}(2\pi)$ except the zero window (since it would have to be zero at dc, and we do not consider such windows). Did the theory break down for $R=1$ ?

Intuitively, the $\hbox{\sc Nyquist}(2\pi/R)$ condition on the window transform $W(\omega)$ ensures that all nonzero multiples of the time-domain-frame-rate $2\pi/R$ will be zeroed out over the interval $[-\pi,\pi)$ along the frequency axis. When the frame-rate equals the sampling rate ( $R=1$ ), there are no frame-rate multiples in the range $[-\pi,\pi)$ . (The range $[0,2\pi)$ gives the same result.) When $R=2$ , there is exactly one frame-rate multiple at $-\pi$ . When $R=3$ , there are two at $\pm 2\pi/3$ . When $R=4$ , they are at $-\pi$ and $\pm\pi/2$ , and so on.
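The enumeration above can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and the integer index range is chosen so that $-\pi$ is included and $+\pi$ is excluded.

```python
import math

def frame_rate_multiples(R):
    """Nonzero multiples of the frame rate 2*pi/R that lie in [-pi, pi)."""
    # r runs over -floor(R/2), ..., ceil(R/2) - 1, skipping r = 0
    return [2 * math.pi * r / R for r in range(-(R // 2), (R + 1) // 2) if r != 0]

for R in (1, 2, 3, 4):
    # Print each multiple as a fraction of pi for easy comparison with the text
    print(R, [round(w / math.pi, 4) for w in frame_rate_multiples(R)])
```

For $R=1$ the list is empty, which is the degenerate case under discussion.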

We can cleanly handle the special case of $R=1$ by defining all functions over the unit circle as being $\hbox{\sc Nyquist}(2\pi)$ when there are no frame-rate multiples in the range $[-\pi,\pi)$ . Thus, a discrete-time spectrum $W(\omega), \omega\in[-\pi,\pi)$ is said to be $\hbox{\sc Nyquist}(2\pi/K)$ if $W(r 2\pi/K)=0$ , for all $\vert r\vert=1,2,\ldots,\left\lfloor K/2\right\rfloor$ , where $\left\lfloor x\right\rfloor$ (the ``floor function'') denotes the greatest integer less than or equal to $x$ .
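This definition translates directly into a numerical test. The sketch below assumes NumPy, evaluates the window transform by a direct DTFT sum, and treats values below a small relative tolerance as zero; the function names, the tolerance, and the periodic Hamming window used in the example are assumptions made for illustration.

```python
import numpy as np

def dtft(w, omega):
    """Window transform W(omega) = sum_n w(n) exp(-j*omega*n) at a single frequency."""
    n = np.arange(len(w))
    return np.sum(w * np.exp(-1j * omega * n))

def is_nyquist(w, K, tol=1e-9):
    """True if W(r*2*pi/K) = 0 for |r| = 1, ..., floor(K/2), up to a relative tolerance."""
    scale = np.sum(np.abs(w))
    for r in range(1, K // 2 + 1):   # empty when K = 1, so the test passes vacuously
        for sign in (1, -1):
            if abs(dtft(w, sign * r * 2 * np.pi / K)) > tol * scale:
                return False
    return True

M = 8  # arbitrary example length
rect = np.ones(M)
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * np.arange(M) / M)  # periodic Hamming (assumption)

print(is_nyquist(rect, 1), is_nyquist(hamming, 1))    # K = 1: no frequencies to test
print(is_nyquist(rect, M), is_nyquist(hamming, M))    # corresponds to hop size M
print(is_nyquist(rect, M // 2), is_nyquist(hamming, M // 2))
```

Because the loop is empty for $K=1$ , every window passes in that case, matching the convention adopted above.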