
Filter-Bank Summation (FBS) Interpretation of the STFT

We can group the terms in the STFT definition differently to obtain the filter-bank interpretation:

$\displaystyle X_m(\omega_k) = \sum_{n=-\infty}^\infty \underbrace{[ x(n)e^{-j\omega_k n}]}_{x_k(n)} w(n-m) = \left[x_k \ast \hbox{\sc Flip}(w)\right](m)$ (9.2)

As explained below (and illustrated in Figures 8.3, 8.4, and 8.5), under the filter-bank interpretation the spectrum of $ x$ is first rotated along the unit circle in the $ z$ plane so as to shift frequency $ \omega_k$ down to 0 (via modulation by $ e^{-j\omega_k n}$ in the time domain), thus forming the heterodyned signal $ x_k(n)\mathrel{\stackrel{\Delta}{=}}x(n)\exp(-j\omega_k n)$. Next, the heterodyned signal $ x_k(n)$ is lowpass-filtered to a narrow band about frequency 0 (via convolution with the time-reversed window $ \hbox{\sc Flip}(w)$). The STFT is thus interpreted as a frequency-ordered collection of narrow-band time-domain signals, as depicted in Fig. 8.2. In other words, the STFT can be seen as a uniform filter bank in which the input signal $ x(n)$ is converted to a set of $ N$ time-domain output signals $ X_m(\omega_k)$, $ k=0,1,\ldots,N-1$, one for each channel of the $ N$-channel filter bank, each indexed by time $ m$.

[Figure 8.2 (eps/fbs): the STFT as a frequency-ordered collection of narrow-band time-domain signals.]

Expanding on the previous paragraph, the STFT (8.2) is computed by the following operations:

  • Frequency-shift $ x(n)$ by $ -\omega_k$ to get $ x_k(n) \mathrel{\stackrel{\Delta}{=}}e^{-j\omega_k n}x(n)$.
  • Convolve $ x_k(n)$ with $ {\tilde w}\mathrel{\stackrel{\Delta}{=}}\hbox{\sc Flip}(w)$ to get $ X_m(\omega_k)$:

    $\displaystyle X_m(\omega_k) = \sum_{n=-\infty}^\infty x_k(n){\tilde w}(m-n) = (x_k * {\tilde w})(m)$
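The two steps above can be checked numerically. The following is a minimal sketch, not the author's implementation: it assumes a finite window supported on $ n=0,\ldots,M-1$, keeps only frames that lie fully inside the signal, and uses hypothetical helper names (`stft_direct`, `stft_fbs_channel`). The window is made deliberately asymmetric so that the flip actually matters.

```python
import numpy as np

def stft_direct(x, w, k, N):
    """Brute-force X_m(w_k) = sum_n x(n) w(n-m) exp(-j w_k n).

    Assumes w is supported on n = 0..M-1; returns frames m = 0..len(x)-M.
    """
    M = len(w)
    omega_k = 2 * np.pi * k / N
    n = np.arange(len(x))
    out = np.empty(len(x) - M + 1, dtype=complex)
    for m in range(len(out)):
        seg = slice(m, m + M)
        out[m] = np.sum(x[seg] * w * np.exp(-1j * omega_k * n[seg]))
    return out

def stft_fbs_channel(x, w, k, N):
    """Same quantity via the two filter-bank steps."""
    omega_k = 2 * np.pi * k / N
    n = np.arange(len(x))
    x_k = x * np.exp(-1j * omega_k * n)            # step 1: frequency-shift by -w_k
    w_flip = w[::-1]                               # Flip(w)
    return np.convolve(x_k, w_flip, mode="valid")  # step 2: convolve with Flip(w)

rng = np.random.default_rng(0)
x = rng.standard_normal(256)                       # arbitrary test signal
w = np.hamming(32) * np.linspace(0.5, 1.5, 32)     # asymmetric on purpose
N, k = 32, 5

direct = stft_direct(x, w, k, N)
fbs = stft_fbs_channel(x, w, k, N)
print(np.allclose(direct, fbs))
```

With `mode="valid"`, output sample `m` of the convolution lines up with frame time $ m$, so no extra index shift is needed in this sketch.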

The STFT output signal $ X_m(\omega_k)$ is regarded as a time-domain signal (time index $ m$) coming out of the $ k$th channel of an $ N$-channel filter bank. The center frequency of the $ k$th channel filter is $ \omega_k = 2\pi k/N$, $ k=0,1,\ldots,N-1$. Each channel output signal is a baseband signal; that is, it is centered about dc, with the "carrier term" $ e^{j\omega_k m}$ taken out by "demodulation" (frequency-shifting). In particular, the $ k$th channel signal is constant whenever the input signal happens to be a sinusoid tuned to frequency $ \omega_k$ exactly.
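The baseband behavior can be illustrated with a short sketch. It assumes a complex sinusoid (a real sinusoid would add a second component near $ -2\omega_k$ that the window only attenuates), a Hann window, and illustrative values for `A`, `phi`, and `delta` that do not come from the text. The tuned input yields a constant channel signal, while a slightly detuned input yields a slowly rotating one.

```python
import numpy as np

N = M = 64                         # channels and window length (assumed equal here)
k = 7
omega_k = 2 * np.pi * k / N
w = np.hanning(M)
n = np.arange(512)

def channel(x, omega):
    """Heterodyne by omega, then lowpass-filter with the flipped window."""
    x_k = x * np.exp(-1j * omega * np.arange(len(x)))
    return np.convolve(x_k, w[::-1], mode="valid")

# Complex sinusoid tuned exactly to the channel center frequency.
A, phi = 0.8, 0.3
tuned = A * np.exp(1j * (omega_k * n + phi))
y_tuned = channel(tuned, omega_k)

# Same sinusoid detuned by a small offset delta (radians per sample).
delta = 0.01
detuned = A * np.exp(1j * ((omega_k + delta) * n + phi))
y_detuned = channel(detuned, omega_k)

# Tuned: every output sample equals A * exp(j*phi) * sum(w).
print(np.allclose(y_tuned, A * np.exp(1j * phi) * w.sum()))
# Detuned: constant magnitude, phase advancing by delta per sample.
print(np.allclose(np.abs(y_detuned), np.abs(y_detuned[0])))
print(np.allclose(np.angle(y_detuned[1:] / y_detuned[:-1]), delta))
```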

Note that the STFT analysis window $ w$ is now interpreted as (the flip of) a lowpass-filter impulse response. Since the analysis window $ w$ in the STFT is typically symmetric, we usually have $ \hbox{\sc Flip}(w)=w$. This filter is effectively frequency-shifted to provide each channel bandpass filter. If the cut-off frequency of the window transform is $ \omega_c$ (typically half a main-lobe width), then each channel signal is approximately bandlimited to $ [-\omega_c,\omega_c]$ and can be downsampled by a factor of up to about $ \pi/\omega_c$. This downsampling factor is the FBS counterpart of the hop size $ R$ in the OLA context.
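The link between downsampling and hop size can be sketched as follows. The values are assumptions chosen for illustration: a Hann window of length $ M=64$ and $ R=M/4=16$, which matches the common rule of thumb that a Hann main lobe is about four bins wide. Downsampling the full-rate channel signal by $ R$ gives the same samples as evaluating the STFT only at frame times $ m=0,R,2R,\ldots$

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1024)      # arbitrary test signal
M = N = 64                         # window length and number of channels (assumed)
R = 16                             # illustrative hop size / downsampling factor
k = 3
omega_k = 2 * np.pi * k / N
w = np.hanning(M)

# A symmetric window equals its own flip.
print(np.allclose(w[::-1], w))

# Full-rate channel signal (R = 1): heterodyne, then lowpass-filter.
n = np.arange(len(x))
x_k = x * np.exp(-1j * omega_k * n)
full_rate = np.convolve(x_k, w[::-1], mode="valid")

# FBS view: downsample the channel signal by R.
decimated = full_rate[::R]

# OLA view: evaluate the STFT only at frame times m = 0, R, 2R, ...
frames = np.array([
    np.sum(x_k[m:m + M] * w) for m in range(0, len(x) - M + 1, R)
])
print(np.allclose(decimated, frames))
```

In practice one would compute only the retained samples rather than filter at full rate and discard the rest; the full-rate version is kept here only to make the equivalence explicit.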

Figure 8.3 illustrates the filter-bank interpretation for $ R=1$ (the "sliding STFT"). The input signal $ x(n)$ is frequency-shifted by a different amount for each channel and lowpass-filtered by the (flipped) window.

[Figure 8.3: ... where $ x_k(n)\mathrel{\stackrel{\Delta}{=}}x(n)\exp(-j\omega_k n)$.]
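The sliding STFT ($ R=1$) can be sketched as an explicit bank of $ N$ heterodyne-and-filter branches and compared with the more familiar frame-by-frame FFT. This is an illustrative sketch that assumes the window length equals $ N$ and a window supported on $ n=0,\ldots,N-1$. The two views differ only by the phase factor $ e^{-j\omega_k m}$, because the FFT measures phase relative to the start of each frame while the filter bank measures it relative to $ n=0$.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(200)       # arbitrary test signal
N = 16                             # number of channels; window length M = N here
w = np.hanning(N)
n = np.arange(len(x))
num_frames = len(x) - N + 1

# Filter-bank view: one heterodyne + lowpass branch per channel k.
bank = np.empty((N, num_frames), dtype=complex)
for k in range(N):
    x_k = x * np.exp(-2j * np.pi * k * n / N)
    bank[k] = np.convolve(x_k, w[::-1], mode="valid")

# Frame view: FFT of each windowed frame, advancing one sample at a time.
frames = np.array([x[m:m + N] * w for m in range(num_frames)])  # (num_frames, N)
fft_frames = np.fft.fft(frames, axis=1).T                       # (N, num_frames)

# Phase factor exp(-j w_k m) relating the two time references.
m = np.arange(num_frames)
phase = np.exp(-2j * np.pi * np.outer(np.arange(N), m) / N)
print(np.allclose(bank, phase * fft_frames))
```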


About the Author: Julius Orion Smith III
Julius Smith's background is in electrical engineering (BS Rice 1975, PhD Stanford 1983). He is presently Professor of Music and (by courtesy) of Electrical Engineering at Stanford's Center for Computer Research in Music and Acoustics (CCRMA), teaching courses and pursuing research related to signal processing applied to music and audio systems.

