Fourier Transforms for Continuous/Discrete Time/Frequency

The Fourier transform can be defined for signals which are

  • discrete or continuous in time, and
  • finite or infinite in duration.
This results in four cases. As you might expect, the frequency domain has the same cases:
  • discrete or continuous in frequency, and
  • finite or infinite in bandwidth.
When time is discrete, the frequency axis is finite, and vice versa.

Reference [264] develops the DFT in detail--the discrete-time, discrete-frequency case. In the DFT, both the time and frequency axes are finite in length.

Table 2.1 (next page) summarizes the four Fourier-transform cases corresponding to discrete or continuous time and/or frequency.


Table 2.1: Four cases of sampled/continuous finite/infinite time and frequency. (Often the FS coefficients are divided by the period $ P$ .)
\begin{table}\begin{center}
\begin{displaymath}
\begin{array}{\vert c\vert c\vert c\vert}
\hline
\multicolumn{3}{\vert c\vert}{\hbox{Time Duration}} \\
\hline
\hbox{Finite} & \hbox{Infinite} & \\
\hline
\hbox{Discrete FT (DFT)} & \hbox{Discrete Time FT (DTFT)}
& \hbox{discr.}
\\
X(k)=\displaystyle\sum_{n=0}^{N-1} x(n)e^{-j\omega_k n}
& \displaystyle
X(\omega)=\displaystyle\sum_{n=-\infty}^{+\infty} x(n)e^{-j\omega n}
& \hbox{time}
\\
k=0,1, \dots, N-1
& \omega \in ( - \pi, +\pi )
& \hbox{$n$}
\\
\hline
\hbox{Fourier Series (FS)} & \hbox{Fourier Transform (FT)}
& \hbox{cont.}
\\
X(k)=
\displaystyle\int_0^Px(t)e^{-j\omega_kt}dt
& X(\omega)= \displaystyle\int_{-\infty}^{+\infty}x(t)e^{-j\omega t} dt
& \hbox{time}
\\
k = - \infty, \ldots, +\infty
& \omega \in ( - \infty, +\infty)
& \hbox{$t$}
\\
\hline
\hbox{discrete freq. } k & \hbox{continuous freq. } \omega & \\
\hline
\end{array}\end{displaymath}
\end{center}
\end{table}


In all four cases, the Fourier transform can be interpreted as the inner product of the signal $ x$ with a complex sinusoid at radian frequency $ \omega$ [264]:

$\displaystyle X(\omega) = \left<x,s_\omega\right>$ (3.1)

where $ s_\omega$ is appropriately adapted, e.g.,

\begin{eqnarray*}
s_\omega(t) &=& e^{j\omega t}\qquad\qquad\qquad\quad\;\mbox{(Fourier Transform)}\\
s_{\omega_k}(t_n) &=& e^{j\omega_k t_n} \eqsp e^{j2\pi nk/N} \quad\mbox{(DFT)} \\
s_\omega(t_n) &=& e^{j\omega t_n} \eqsp \, e^{j\omega n} \quad\qquad\;\!\mbox{(DTFT)}.
\end{eqnarray*}

In spectral modeling of audio, we usually deal with indefinitely long signals. Fourier analysis of an indefinitely long discrete-time signal is carried out using the Discrete Time Fourier Transform (DTFT).3.1Below, the DTFT is defined, and selected Fourier theorems are stated and proved for the DTFT case. Additionally, for completeness, the Fourier Transform (FT) is defined, and selected FT theorems are stated and proved as well. Theorems for the DFT case are detailed in [264].3.2

Discrete Time Fourier Transform (DTFT)

The Discrete Time Fourier Transform (DTFT) can be viewed as the limiting form of the DFT when its length $ N$ is allowed to approach infinity:

$\displaystyle X(\omega) \isdefs \sum_{n=-\infty}^\infty x(n) e^{-j\omega n},$ (3.2)

where $ \omega\in[-\pi,\pi)$ denotes the continuous radian frequency variable,3.3 and $ x(n)$ is the signal amplitude at sample number $ n$ .

The inverse DTFT is

$\displaystyle x(n) \eqsp \frac{1}{2\pi}\int_{-\pi}^\pi X(\omega) e^{j\omega n} d\omega,$ (3.3)

which can be derived in a manner analogous to the derivation of the inverse DFT [264].

Instead of operating on sampled signals of length $ N$ (like the DFT), the DTFT operates on sampled signals $ x(n)$ defined over all integers $ n\in{\bf Z}$ .

Unlike the DFT, the DTFT frequencies form a continuum. That is, the DTFT is a function of continuous frequency $ \omega\in[-\pi,\pi)$ , while the DFT is a function of discrete frequency $ \omega_k$ , $ k\in[0,N-1]$ . The DFT frequencies $ \omega_k =
2\pi k/N$ , $ k=0,1,2,\ldots,N-1$ , are given by the angles of $ N$ points uniformly distributed along the unit circle in the complex plane. Thus, as $ N\to\infty$ , a continuous frequency axis must result in the limit along the unit circle. The axis is still finite in length, however, because the time domain remains sampled.


Fourier Transform (FT) and Inverse

The Fourier transform of a signal $ x(t)\in{\bf C}$ , $ t\in(-\infty,\infty)$ , is defined as

$\displaystyle X(\omega) \isdefs \int_{-\infty}^\infty x(t) e^{-j\omega t} dt \protect$ (3.4)

and its inverse is given by

$\displaystyle x(t) \eqsp \frac{1}{2\pi}\int_{-\infty}^\infty X(\omega) e^{j\omega t} d\omega \protect$ (3.5)

Thus, the Fourier transform is defined for continuous time and continuous frequency, both unbounded. As a result, mathematical questions such as existence and invertibility are most difficult for this case. In fact, such questions fueled decades of confusion in the history of harmonic analysis (see Appendix G).

Existence of the Fourier Transform

Conditions for the existence of the Fourier transform are complicated to state in general [36], but it is sufficient for $ x(t)$ to be absolutely integrable, i.e.,

$\displaystyle \left\Vert\,x\,\right\Vert _1 \isdefs \int_{-\infty}^\infty \left\vert x(t)\right\vert dt \;<\; \infty .$ (3.6)

This requirement can be stated as $ x\in L1$ , meaning that $ x$ belongs to the set of all signals having a finite $ L1$ norm ( $ \left\Vert\,x\,\right\Vert _1<\infty$ ). It is similarly sufficient for $ x(t)$ to be square integrable, i.e.,

$\displaystyle \left\Vert\,x\,\right\Vert _2^2\isdefs \int_{-\infty}^\infty \left\vert x(t)\right\vert^2 dt \;<\; \infty,$ (3.7)

or, $ x\in L2$ . More generally, it suffices to show $ x\in Lp$ for $ 1\leq p\leq 2$ [36, p. 47].

There is never a question of existence, of course, for Fourier transforms of real-world signals encountered in practice. However, idealized signals, such as sinusoids that go on forever in time, do pose normalization difficulties. In practical engineering analysis, these difficulties are resolved using Dirac's ``generalized functions'' such as the impulse (also loosely called the delta function), discussed in §B.10.


Fourier Theorems for the DTFT

This section states and proves selected Fourier theorems for the DTFT. A more complete list for the DFT case is given in [264].3.4Since this material was originally part of an appendix, it is relatively dry reading. Feel free to skip to the next chapter and refer back as desired when a theorem is invoked.

As introduced in §2.1 above, the Discrete-Time Fourier Transform (DTFT) may be defined as

$\displaystyle X(\omega) \isdef \sum_{n=-\infty}^\infty x(n) e^{-j\omega n}.$ (3.8)

We say that $ X$ is the spectrum of $ x$ .

Linearity of the DTFT

$\displaystyle \zbox {\alpha x_1 + \beta x_2 \;\longleftrightarrow\;\alpha X_1 + \beta X_2}$ (3.9)

or

$\displaystyle \hbox{\sc DTFT}(\alpha x_1 + \beta x_2) = \alpha\cdot \hbox{\sc DTFT}(x_1) + \beta \cdot\hbox{\sc DTFT}(x_2)$ (3.10)

where $ \alpha, \beta$ are any scalars (real or complex numbers), $ x_1$ and $ x_2$ are any two discrete-time signals (real- or complex-valued functions of the integers), and $ X_1, X_2$ are their corresponding continuous-frequency spectra defined over the unit circle in the complex plane.


Proof: We have

\begin{eqnarray*}
\hbox{\sc DTFT}_\omega(\alpha x_1 + \beta x_2)
& \isdef & \sum_{n=-\infty}^{\infty}[\alpha x_1(n) + \beta x_2(n)]e^{-j\omega n}\\
&=& \alpha\sum_{n=-\infty}^{\infty}x_1(n)e^{-j\omega n} + \beta \sum_{n=-\infty}^{\infty}x_2(n)e^{-j\omega n}\\
&\isdef & \alpha X_1(\omega) + \beta X_2(\omega)
\end{eqnarray*}

One way to describe the linearity property is to observe that the Fourier transform ``commutes with mixing.''


Time Reversal

For any complex signal $ x(n)$ , $ n\in(-\infty,\infty)$ , we have

$\displaystyle \zbox {\hbox{\sc Flip}(x) \;\longleftrightarrow\;\hbox{\sc Flip}(X)}$ (3.11)

where $ \hbox{\sc Flip}_n(x)\isdef x(-n)$ .


Proof:

\begin{eqnarray*}
\hbox{\sc DTFT}_\omega(\hbox{\sc Flip}(x))
&\isdef & \sum_{n=-\infty}^{\infty} x(-n)e^{-j\omega n}
\eqsp \sum_{m=\infty}^{-\infty} x(m)e^{-j(-\omega) m}
\eqsp X(-\omega) \\ [5pt]
&\isdef & \hbox{\sc Flip}_\omega(X)
\end{eqnarray*}

Arguably, $ \hbox{\sc Flip}(x)$ should include complex conjugation. Let

$\displaystyle \hbox{\sc Flip}_n'(x)\isdefs \overline{\hbox{\sc Flip}_n(x)}\,\mathrel{\mathop=}\,\overline{x(-n)}$ (3.12)

denote such a definition. Then in this case we have

$\displaystyle \zbox {\hbox{\sc Flip}'(x) \;\longleftrightarrow\;\overline{X}}$ (3.13)


Proof:

$\displaystyle \hbox{\sc DTFT}_\omega(\hbox{\sc Flip}'(x)) \isdefs \sum_{n=-\infty}^{\infty} \overline{x(-n)}e^{-j\omega n} \eqsp \sum_{m=\infty}^{-\infty} \overline{x(m)e^{-j\omega m}} \isdefs \overline{X(\omega)}$ (3.14)

In the typical special case of real signals ( $ x(n)\in{\bf R}$ ), we have $ \hbox{\sc Flip}(x)=\hbox{\sc Flip}'(x)$ so that

$\displaystyle \zbox {\hbox{\sc Flip}(x) \;\longleftrightarrow\;\overline{X}.}$ (3.15)

That is, time-reversing a real signal conjugates its spectrum.


Symmetry of the DTFT for Real Signals

Most (if not all) of the signals we deal with in practice are real signals. Here we note some spectral symmetries associated with real signals.

DTFT of Real Signals

The previous section established that the spectrum $ X$ of every real signal $ x$ satisfies

$\displaystyle \hbox{\sc Flip}(X)\eqsp \overline{X}.$ (3.16)

I.e.,

$\displaystyle \zbox {x(n)\in{\bf R}\;\longleftrightarrow\;X(-\omega) = \overline{X(\omega)}.}$ (3.17)

In other terms, if a signal $ x(n)$ is real, then its spectrum is Hermitian (``conjugate symmetric''). Hermitian spectra have the following equivalent characterizations:
  • The real part is even, while the imaginary part is odd:

    \begin{eqnarray*}
\mbox{re}\left\{X(-\omega)\right\} &=& \mbox{re}\left\{X(\omega)\right\}\\
\mbox{im}\left\{X(-\omega)\right\} &=& -\mbox{im}\left\{X(\omega)\right\}
\end{eqnarray*}

  • The magnitude is even, while the phase is odd:

    \begin{eqnarray*}
\left\vert X(-\omega)\right\vert &=& \left\vert X(\omega)\right\vert\\
\angle{X(-\omega)} &=& -\angle{X(\omega)}
\end{eqnarray*}

Note that an even function is symmetric about argument zero while an odd function is antisymmetric about argument zero.


Real Even (or Odd) Signals

If a signal is even in addition to being real, then its DTFT is also real and even. This follows immediately from the Hermitian symmetry of real signals, and the fact that the DTFT of any even signal is real:

\begin{eqnarray*}
\hbox{\sc DTFT}_\omega(x)
& \isdef & \sum_{n=-\infty}^{\infty}x(n) e^{-j\omega n}\\
& = & \sum_{n=-\infty}^{\infty}x(n) \left[\cos(\omega n) + j\sin(\omega n)\right]\\
& = & \sum_{n=-\infty}^{\infty}x(n) \cos(\omega n) + j\sum_{n=-\infty}^{\infty}x(n)\sin(\omega n)\\
& = & \sum_{n=-\infty}^{\infty}x(n) \cos(\omega n)\\
& = & \hbox{real and even}
\end{eqnarray*}

This is true since cosine is even, sine is odd, even times even is even, even times odd is odd, and the sum over all samples of an odd signal is zero. I.e.,

\begin{eqnarray*}
\sum_{n=-\infty}^{\infty}x(n)\cos(\omega n)
&=& \sum_{n=-\infty}^{\infty}\hbox{(even in $n$)}\;\cdot\;\hbox{(doubly even)}\\
&=& \sum_{n=-\infty}^{\infty}\hbox{(doubly even)} = \hbox{(even in $\omega$)}
\end{eqnarray*}

and

\begin{eqnarray*}
\sum_{n=-\infty}^{\infty}x(n)\sin(\omega n)
&=& \sum_{n=-\infty}^{\infty}\hbox{(even in $n$)}\;\cdot\;\hbox{(doubly odd)}\\
&=& \sum_{n=-\infty}^{\infty}\hbox{(doubly odd)} = 0.
\end{eqnarray*}

If $ x$ is real and even, the following are true:

\begin{eqnarray*}
\hbox{\sc Flip}(x) & = & x \qquad \hbox{($x(-n)=x(n)$)}\\
\overline{x} & = & x\\ [5pt]
\hbox{\sc Flip}(X) & = & X\\
\overline{X} & = & X\\
\angle X(\omega) & =& 0 \hbox{ or } \pi
\end{eqnarray*}

Similarly, if a signal is odd and real, then its DTFT is odd and purely imaginary. This follows from Hermitian symmetry for real signals, and the fact that the DTFT of any odd signal is imaginary.

\begin{eqnarray*}
\hbox{\sc DTFT}_\omega(x)
& \isdef & \sum_{n=-\infty}^{\infty}x(n) e^{-j\omega n}\\
& = & \sum_{n=-\infty}^{\infty}x(n) \cos(\omega n) + j\sum_{n=-\infty}^{\infty}x(n)\sin(\omega n)\\
& = & j\sum_{n=-\infty}^{\infty}x(n) \sin(\omega n)\\
& = & \hbox{imaginary and odd}
\end{eqnarray*}

where we used the fact that

\begin{eqnarray*}
\sum_{n=-\infty}^{\infty}x(n)\cos(\omega n)
&=& \sum_{n=-\infty}^{\infty}\hbox{(odd in $n$)}\;\cdot\;\hbox{(doubly even)}\\
&=& \sum_{n=-\infty}^{\infty}\hbox{(odd in $n$, even in $\omega$)} = 0
\end{eqnarray*}

and

\begin{eqnarray*}
\sum_{n=-\infty}^{\infty}x(n)\sin(\omega n)
&=& \sum_{n=-\infty}^{\infty}\hbox{(odd in $n$)}\;\cdot\;\hbox{(doubly odd)}\\
&=& \sum_{n=-\infty}^{\infty}\hbox{(even in $n$, odd in $\omega$)} = \hbox{(odd in $\omega$)}.
\end{eqnarray*}


Shift Theorem for the DTFT

We define the shift operator for sampled signals $ x(n)$ by

$\displaystyle \hbox{\sc Shift}_{l,n}(x) \isdefs x(n-l)$ (3.18)

where $ l$ is any integer ( $ l\in{\bf Z}$ ). Thus, $ \hbox{\sc Shift}_l(x)$ is a right-shift or delay by $ l$ samples.

The shift theorem states3.5

$\displaystyle \zbox {\hbox{\sc Shift}_l(x) \;\longleftrightarrow\;e^{-j(\cdot)l}X},$ (3.19)

or, in operator notation,

$\displaystyle \hbox{\sc DTFT}_\omega[\hbox{\sc Shift}_l(x)] \eqsp \left( e^{-j\omega l} \right) X(\omega)$ (3.20)


Proof:

\begin{eqnarray*}
\hbox{\sc DTFT}_\omega[\hbox{\sc Shift}_l(x)] &\isdef & \sum_{n=-\infty}^{\infty}x(n-l) e^{-j \omega n} \\
&=& \sum_{m=-\infty}^{\infty} x(m) e^{-j \omega (m+l)}
\qquad(m\isdef n-l) \\
&=& \sum_{m=-\infty}^{\infty}x(m) e^{-j \omega m} e^{-j \omega l} \\
&=& e^{-j \omega l} \sum_{m=-\infty}^{\infty}x(m) e^{-j \omega m} \\
&\isdef & e^{-j \omega l} X(\omega)
\end{eqnarray*}

Note that $ e^{-j\omega l}$ is a linear phase term, so called because it is a linear function of frequency with slope equal to $ -l$ :

$\displaystyle \angle \left(e^{-j \omega l}\right) \eqsp -\omega l$ (3.21)

The shift theorem gives us that multiplying a spectrum $ X(\omega)$ by a linear phase term $ e^{-j\omega l}$ corresponds to a delay in the time domain by $ l$ samples. If $ l<0$ , it is called a time advance by $ \vert l\vert$ samples.


Convolution Theorem for the DTFT

The convolution of discrete-time signals $ x$ and $ y$ is defined as

$\displaystyle (x \ast y)(n) \isdefs \sum_{m=-\infty}^\infty x(m)y(n-m).$ (3.22)

This is sometimes called acyclic convolution to distinguish it from the cyclic convolution used for length $ N$ sequences in the context of the DFT [264]. Convolution is cyclic in the time domain for the DFT and FS cases (i.e., whenever the time domain has a finite length), and acyclic for the DTFT and FT cases.3.6

The convolution theorem is then

$\displaystyle \zbox {(x\ast y) \;\longleftrightarrow\;X \cdot Y}$ (3.23)

That is, convolution in the time domain corresponds to pointwise multiplication in the frequency domain.


Proof: The result follows immediately from interchanging the order of summations associated with the convolution and DTFT:

\begin{eqnarray*}
\hbox{\sc DTFT}_\omega(x\ast y) &\isdef & \sum_{n=-\infty}^{\infty}(x\ast y)_n e^{-j\omega n} \\
&\isdef & \sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}x(m) y(n-m) e^{-j\omega n} \\
&=& \sum_{m=-\infty}^{\infty}x(m) \sum_{n=-\infty}^{\infty}\underbrace{y(n-m) e^{-j\omega n}}_{e^{-j\omega m}Y(k)} \\
&=& \left(\sum_{m=-\infty}^{\infty}x(m) e^{-j\omega m}\right)Y(\omega)\quad\mbox{(by the shift theorem)}\\
&\isdef & X(\omega)Y(\omega)
\end{eqnarray*}


Correlation Theorem for the DTFT

We define the correlation of discrete-time signals $ x$ and $ y$ by

$\displaystyle \zbox {(x\star y)_n \isdefs \sum_m \overline{x(m)} y(m+n)}
$

The correlation theorem for DTFTs is then

$\displaystyle \zbox {x\star y \;\longleftrightarrow\;\overline{X}\cdot Y}
$


Proof:

\begin{eqnarray*}
(x\star y)_n
&\isdef & \sum_m \overline{x(m)}y(n+m) \\
&=& \sum_m \overline{x(-m)}y(n-m) \qquad (m\leftarrow -m)\\
&=& \left(\hbox{\sc Flip}(\overline{x})\ast y\right)_n \\
&\;\longleftrightarrow\;& \overline{X} \cdot Y
\end{eqnarray*}

where the last step follows from the convolution theorem of §2.3.5 and the symmetry result $ \hbox{\sc Flip}(\overline{x}) \;\longleftrightarrow\;
\overline{X}$ of §2.3.2.


Autocorrelation

The autocorrelation of a signal $ x$ is simply the cross-correlation of $ x$ with itself:

$\displaystyle (x \star x)(n) \isdefs \sum_m\overline{x(m)}x(m+n).$ (3.24)

From the correlation theorem, we have

$\displaystyle \zbox {(x \star x) \;\longleftrightarrow\;\vert X\vert^2}
$

Note that this definition of autocorrelation is appropriate for signals having finite support (nonzero over a finite number of samples). For infinite-energy (but finite-power) signals, such as stationary noise processes, we define the sample autocorrelation to include a normalization suitable for this case (see Chapter 6 and Appendix C).

From the autocorrelation theorem we have that a digital-filter impulse-response $ h(n)$ is that of a lossless allpass filter [263] if and only if $ h\star h = \delta \,\leftrightarrow\, 1$ . In other words, the autocorrelation of the impulse-response of every allpass filter is impulsive.


Power Theorem for the DTFT

The inner product of two signals may be defined in the time domain by [264]

$\displaystyle \left<x,y\right> \isdefs \sum_{n=-\infty}^{\infty} x_n\overline{y}_n.$ (3.25)

The inner product of two spectra may be defined as

$\displaystyle \left<X,Y\right> \isdefs \frac{1}{2\pi}\int_{-\pi}^\pi X(\omega)\overline{Y(\omega)} d\omega.$ (3.26)

Note that the frequency-domain inner product includes a normalization factor while the time-domain definition does not.

Using inner-product notation, the power theorem (or Parseval's theorem [202]) for DTFTs can be stated as follows:

$\displaystyle \zbox {\left<x,y\right> \eqsp \left<X,Y\right>}$ (3.27)

That is, the inner product of two signals in the time domain equals the inner product of their respective spectra (a complex scalar in general).

When we consider the inner product of a signal with itself, we have the special case known as the energy theorem (or Rayleigh's energy theorem):

$\displaystyle \Vert x \Vert^2 \eqsp \left<x,x\right> \eqsp \left<X,X\right> \eqsp \Vert X \Vert^2$ (3.28)

where $ \Vert\,\cdot\,\Vert$ denotes the $ L2$ norm induced by the inner product. It is always real.


Proof:

\begin{eqnarray*}
\left<x,y\right> &\isdef & \sum_{n=-\infty}^{\infty} x(n)\overline{y(n)}
\eqsp (y\star x)_0
\eqsp \hbox{\sc DTFT}_0^{-1}(\overline{Y}\cdot X) \\
&=& \frac{1}{2\pi} \int_{-\pi}^{\pi} X(\omega)\overline{Y(\omega)} d\omega
\isdefs \left<X,Y\right>
\end{eqnarray*}


Stretch Operator

We define the stretch operator in the time domain by

$\displaystyle \hbox{\sc Stretch}_{L,n}(x) \isdefs \left\{\begin{array}{ll} x\left(\frac{n}{L}\right), & n = 0\;(\hbox{\sc mod}\ L) \\ [5pt] 0, & \hbox{otherwise} \\ \end{array} \right..$ (3.29)

In other terms, we stretch a sampled signal by the factor $ L$ by inserting $ L-1$ zeros in between each pair of samples of the signal.

Figure 2.1: Illustration of the stretch operator.
\includegraphics[width=4in]{eps/stretch2}

In the literature on multirate filter banks (see Chapter 11), the stretch operator is typically called instead the upsampling operator. That is, stretching a signal by the factor of $ K$ is called upsampling the signal by the factor $ K$ . (See §11.1.1 for the graphical symbol ( $ \uparrow K$ ) and associated discussion.) The term ``stretch'' is preferred in this book because ``upsampling'' is easily confused with ``increasing the sampling rate''; resampling a signal to a higher sampling rate is conceptually implemented by a stretch operation followed by an ideal lowpass filter which moves the inserted zeros to their properly interpolated values.

Note that we could also call the stretch operator the scaling operator, to unify the terminology in the discrete-time case with that of the continuous-time case (§2.4.1 below).


Repeat (Scaling) Operator

We define the repeat operator in the frequency domain as a scaling of frequency axis by some integer factor $ L>0$ :

$\displaystyle \hbox{\sc Repeat}_{L,\nu}(X) \isdefs X(L\omega), \quad \omega\in\left[-\frac{\pi}{L},\frac{\pi}{L}\right),$ (3.30)

where $ \nu=L\omega\in[-\pi,\pi)$ denotes the radian frequency variable after applying the repeat operator.

The repeat operator maps the entire unit circle (taken as $ -\pi$ to $ \pi$ ) to a segment of itself $ [-\pi/L,\pi/L)$ , centered about $ \omega
= 0$ , and repeated $ L$ times. This is illustrated in Fig.2.2 for $ L=3$ .

\begin{psfrags}
% latex2html id marker 7156\psfrag{w}{\Large $\omega$}\begin{figure}[htbp]
\includegraphics[width=\twidth]{eps/repeat2}
\caption{Illustration of the repeat operator.}
\end{figure}
\end{psfrags}

Since the frequency axis is continuous and $ 2\pi$ -periodic for DTFTs, the repeat operator is precisely equivalent to a scaling operator for the Fourier transform case (§B.4). We call it ``repeat'' rather than ``scale'' because we are restricting the scale factor to positive integers, and because the name ``repeat'' describes more vividly what happens to a periodic spectrum that is compressively frequency-scaled over the unit circle by an integer factor.


Stretch/Repeat (Scaling) Theorem

Using these definitions, we can compactly state the stretch theorem:

$\displaystyle \zbox {\hbox{\sc Stretch}_L(x) \;\longleftrightarrow\;\hbox{\sc Repeat}_L(X)}$ (3.31)


Proof:

\begin{eqnarray*}
\hbox{\sc DTFT}_\omega[\hbox{\sc Stretch}_L(x)]
&\isdef & \sum_{n=-\infty}^{\infty}\hbox{\sc Stretch}_{L,n}(x)e^{-j\omega n}\\
&=& \sum_{m=-\infty}^{\infty}x(m)e^{-j\omega m L}\qquad \hbox{($m\isdef n/L$)}\\
&\isdef & X(\omega L)
\end{eqnarray*}

As $ \omega$ traverses the interval $ [-\pi,\pi)$ , $ X(\omega L)$ traverses the unit circle $ L$ times, thus implementing the repeat operation on the unit circle. Note also that when $ \omega
= 0$ , we have $ \omega L = 0$ , so that dc always maps to dc. At half the sampling rate $ \omega=\pm\pi$ , on the other hand, after the mapping, we may have either $ Y(\pi)=X(-\pi)$ ($ L$ odd), or $ X(0)$ ($ L$ even), where $ Y(\omega)
\isdeftext X(\omega L)$ .

The stretch theorem makes it clear how to do ideal sampling-rate conversion for integer upsampling ratios $ L$ : We first stretch the signal by the factor $ L$ (introducing $ L-1$ zeros between each pair of samples), followed by an ideal lowpass filter cutting off at $ \pi/L$ . That is, the filter has a gain of 1 for $ \left\vert\omega\right\vert <\pi/L$ , and a gain of 0 for $ \pi/L < \left\vert\omega\right\vert
\leq \pi$ . Such a system (if it were realizable) implements ideal bandlimited interpolation of the original signal by the factor $ L$ .

The stretch theorem is analogous to the scaling theorem for continuous Fourier transforms (introduced in §2.4.1 below).


Downsampling and Aliasing

The downsampling operator $ \hbox{\sc Downsample}_M$ selects every $ M^{th}$ sample of a signal:

$\displaystyle \zbox {\hbox{\sc Downsample}_{M,n}(x) \isdefs x(Mn)}$ (3.32)

The aliasing theorem states that downsampling in time corresponds to aliasing in the frequency domain:

$\displaystyle \zbox {\hbox{\sc Downsample}_M(x) \;\longleftrightarrow\;\frac{1}{M} \hbox{\sc Alias}_M(X)}$ (3.33)

where the $ \hbox{\sc Alias}$ operator is defined as

$\displaystyle \zbox {\hbox{\sc Alias}_{M,\omega}(X) \isdefs \sum_{k=0}^{M-1} X\left(\omega+k\frac{2\pi}{M}\right)}$ (3.34)

for $ \omega\in[-\pi,\pi)$ . The summation terms for $ k\neq 0$ are called aliasing components.

In z transform notation, the $ \hbox{\sc Alias}$ operator can be expressed as [287]

$\displaystyle \hbox{\sc Alias}_{M,z}(X) \eqsp \sum_{k=0}^{M-1} X\left(W_M^k z^\frac{1}{M}\right)$ (3.35)

where $ W_M\isdeftext e^{j2\pi/M}$ is a common notation for the primitive $ M$ th root of unity. On the unit circle of the $ z$ plane, this becomes

$\displaystyle \hbox{\sc Alias}_{M,\omega}(X) \eqsp \sum_{k=0}^{M-1} X\left[e^{j\left(\frac{\omega}{M} + k\frac{2\pi}{M}\right)}\right], \quad -\pi\leq \omega < \pi.$ (3.36)

The frequency scaling corresponds to having a sampling interval of $ T=1$ after downsampling, which corresponds to the interval $ T=1/M$ prior to downsampling.

The aliasing theorem makes it clear that, in order to downsample by factor $ M$ without aliasing, we must first lowpass-filter the spectrum to $ (-\pi / M, \pi / M)$ . This filtering (when ideal) zeroes out the spectral regions which alias upon downsampling.

Note that any rational sampling-rate conversion factor $ \rho = L/M$ may be implemented as an upsampling by the factor $ L$ followed by downsampling by the factor $ M$ [50,287]. Conceptually, a stretch-by-$ L$ is followed by a lowpass filter cutting off at $ \omega_c \isdeftext \pi/(L\;\max\;M)$ , followed by downsample-by-$ L$ , i.e.,

$\displaystyle x^\prime \eqsp \hbox{\sc Downsample}_M\{\hbox{\sc Lowpass}_{\omega_c}[\hbox{\sc Stretch}_L(x)]\}$ (3.37)

In practice, there are more efficient algorithms for sampling-rate conversion [270,135,78] based on a more direct approach to bandlimited interpolation.

Proof of Aliasing Theorem

To show:

$\displaystyle \zbox {\hbox{\sc Downsample}_N(x) \;\longleftrightarrow\;\frac{1}{N} \hbox{\sc Alias}_N(X)}
$

or
\fbox{$x(nN) \;\longleftrightarrow\;\dfrac{1}{N} \displaystyle\sum_{m=0}^{N-1} X\left(e^{j2\pi m/N} z^{1/N}\right)$}

From the DFT case [264], we know this is true when $ x$ and $ X$ are each complex sequences of length $ N_s$ , in which case $ y$ and $ Y$ are length $ N_s/N$ . Thus,

$\displaystyle x(nN) \;\longleftrightarrow\; Y(\omega_k N) \eqsp \frac{1}{N} \sum_{m=0}^{N-1} X\left(\omega_k + \frac{2\pi}{N} m \right), \; k\in [0,N_s/N)$ (3.38)

where we have chosen to keep frequency samples $ \omega_k$ in terms of the original frequency axis prior to downsampling, i.e., $ \omega_k =
2\pi k/ N_s$ for both $ X$ and $ Y$ . This choice allows us to easily take the limit as $ N_s\to\infty$ by simply replacing $ \omega_k$ by $ \omega$ :

$\displaystyle x(nN) \;\longleftrightarrow\; Y(\omega N) \eqsp \frac{1}{N} \sum_{m=0}^{N-1} X\left(\omega + \frac{2\pi}{N} m \right), \; \omega\in[0,2\pi/N)$ (3.39)

Replacing $ \omega$ by $ \omega^\prime =\omega N$ and converting to $ z$ -transform notation $ X(z)$ instead of Fourier transform notation $ X(\omega)$ , with $ z=e^{j\omega^\prime }$ , yields the final result.


Differentiation Theorem Dual


Theorem: Let $ x(n)$ denote a signal with DTFT $ X(e^{j\omega})$ , and let

$\displaystyle X^\prime(e^{j\omega}) \isdefs \frac{d}{d\omega} X(e^{j\omega})$ (3.40)

denote the derivative of $ X$ with respect to $ \omega$ . Then we have

$\displaystyle \zbox {-jn x(n) \;\longleftrightarrow\;\frac{d}{d\omega}X(e^{j\omega})}
$

where $ X(e^{j\omega})$ denotes the DTFT of $ x(n)$ .


Proof: Using integration by parts, we obtain

\begin{eqnarray*}
\hbox{\sc IDTFT}_{n}(X^\prime)
&\isdef & \frac{1}{2\pi}\int_{-\pi}^\pi X^\prime(e^{j\omega}) e^{j\omega n} d\omega\\
&=& \left. \frac{1}{2\pi}X(e^{j\omega})e^{j\omega t}\right\vert _{-\pi}^{\pi} -
\frac{1}{2\pi}\int_{-\pi}^\pi X(e^{j\omega}) (jn)e^{j\omega n} d\omega\\
&=& -jn x(n).
\end{eqnarray*}

An alternate method of proof is given in §B.3.

Corollary: Perhaps a cleaner statement is as follows:

$\displaystyle \zbox {- n x(n) \;\longleftrightarrow\;\frac{d}{d(j\omega)}X(e^{j\omega})}
$

This completes our coverage of selected DTFT theorems. The next section adds some especially useful FT theorems having no precise counterpart in the DTFT (discrete-time) case.


Continuous-Time Fourier Theorems

Selected Fourier theorems for the continuous-time case are stated and proved in Appendix B. However, two are sufficiently important that we state them here.


Scaling Theorem

The scaling theorem (or similarity theorem) says that if you horizontally ``stretch'' a signal by the factor $ \alpha $ in the time domain, you ``squeeze'' and amplify its Fourier transform by the same factor in the frequency domain. This is an important general Fourier duality relationship:


Theorem: For all continuous-time functions $ x(t)$ possessing a Fourier transform,

$\displaystyle \zbox {\hbox{\sc Stretch}_\alpha(x) \;\longleftrightarrow\;\left\vert\alpha\right\vert\hbox{\sc Stretch}_{(1/\alpha)}(X)}
$

where

$\displaystyle \hbox{\sc Stretch}_{\alpha,t}(x) \isdefs x\left(\frac{t}{\alpha}\right)
$

and $ \alpha $ is any nonzero real number (the abscissa stretch factor). A more commonly used notation is the following:

$\displaystyle \zbox {x\left(\frac{t}{\alpha}\right) \;\longleftrightarrow\; \left\vert\alpha\right\vert\cdot X(\alpha\omega)}$ (3.41)


Proof: See §B.4.

The scaling theorem is fundamentally restricted to the continuous-time, continuous-frequency (Fourier transform) case. The closest we come to the scaling theorem among the DTFT theorems (§2.3) is the stretch (repeat) theorem (page [*]). For this and other continuous-time Fourier theorems, see Appendix B.


Spectral Roll-Off


Definition: A function $ W(\omega)$ is said to be of order $ 1/\omega^{n+1}$ if there exist $ \omega_0$ and some positive constant $ M<\infty$ such that $ \left\vert W(\omega)\right\vert<M/w^{n+1}$ for all $ \omega > \omega_0$ .


Theorem: (Riemann Lemma): If the derivatives up to order $ n$ of a function $ w(t)$ exist and are of bounded variation, then its Fourier Transform $ W(\omega)$ is asymptotically of order $ 1/\omega^{n+1}$ , i.e.,

$\displaystyle W(\omega) = {\cal O}\left(\frac{1}{\omega^{n+1}}\right), \quad(\hbox{as }\omega\to\infty)$ (3.42)


Proof: See §B.18.


Spectral Interpolation

The need for spectral interpolation comes up in many situations. For example, we always use the DFT in practice, while conceptually we often prefer the DTFT. For time-limited signals, that is, signals which are zero outside some finite range, the DTFT can be computed from the DFT via spectral interpolation. Conversely, the DTFT of a time-limited signal can be sampled to obtain its DFT.3.7Another application of DFT interpolation is spectral peak estimation, which we take up in Chapter 5; in this situation, we start with a sampled spectral peak from a DFT, and we use interpolation to estimate the frequency of the peak more accurately than what we get by rounding to the nearest DFT bin frequency.

The following sections describe the theoretical and practical details of ideal spectral interpolation.

Ideal Spectral Interpolation

Ideally, the spectrum of any signal $ x(n)$ at any frequency $ \omega =
2\pi f$ is obtained by projecting the signal $ x$ onto the zero-phase, unit-amplitude, complex sinusoid at frequency $ \omega$ [264]:

$\displaystyle X(\omega) \isdef \left<x,s_\omega\right>,$ (3.43)

where

\begin{eqnarray*}
s_\omega(t) &\isdef & e^{j\omega t}\qquad\qquad\qquad\quad\;\,\mbox{(Fourier Transform)}\\
s_\omega(t_n) &\isdef & e^{j\omega_k t_n} \isdefs e^{j2\pi nk/N} \quad\mbox{(DFT)} \\
s_\omega(t_n) &\isdef & e^{j\omega t_n} \isdefs e^{j\omega n} \quad\qquad\;\mbox{(DTFT)}.
\end{eqnarray*}

Thus, for signals in the DTFT domain which are time limited to $ n\in[-N/2,N/2-1]$ , we obtain

$\displaystyle X(\omega) \isdefs \left<x,s_\omega\right> = \sum_{n=-\infty}^\infty x(n) e^{-j\omega n} = \sum_{n=-N/2}^{N/2-1} x(n) e^{-j\omega n}.$ (3.44)

This can be thought of as a zero-centered DFT evaluated at $ \omega\in[-\pi,\pi)$ instead of $ \omega_k =
2\pi k/N$ for some $ k\in[0,N-1]$ . It arises naturally from taking the DTFT of a finite-length signal. Such time-limited signals may be said to have ``finite support'' [175].


Interpolating a DFT

Starting with a sampled spectrum $ X(\omega_k)$ , $ k=0,1,\ldots,N-1$ , typically obtained from a DFT, we can interpolate by taking the DTFT of the IDFT which is not periodically extended, but instead zero-padded [264]:3.8

\begin{eqnarray*}
X(\omega) &=& \hbox{\sc DTFT}(\hbox{\sc ZeroPad}_{\infty}(\hbox{\sc IDFT}_N(X)))\\
&\isdef & \sum_{n=-N/2}^{N/2-1}\left[\frac{1}{N}\sum_{k=0}^{N-1}X(\omega_k)
e^{j\omega_k n}\right]e^{-j\omega n}\\
&=& \sum_{k=0}^{N-1}X(\omega_k)
\left[\frac{1}{N}\sum_{n=-N/2}^{N/2-1} e^{j(\omega_k-\omega) n}\right]\\
&=& \sum_{k=0}^{N-1}X(\omega_k)\,\hbox{asinc}_N(\omega-\omega_k)\\
&=& \left<X,\hbox{\sc Sample}_N\{\hbox{\sc Shift}_{\omega}(\hbox{asinc}_N)\}\right>
\end{eqnarray*}

(The aliased sinc function, $ \hbox{asinc}_N(\omega)$ , is derived in §3.1.) Thus, zero-padding in the time domain interpolates a spectrum consisting of $ N$ samples around the unit circle by means of `` $ \hbox{asinc}_N$ interpolation.'' This is ideal, time-limited interpolation in the frequency domain using the aliased sinc function as an interpolation kernel. We can almost rewrite the last line above as $ X(\omega)=(X\ast \hbox{asinc}_N)_\omega$ , but such an expression would normally be defined only for $ \omega=\omega_l=2\pi l/N$ , where $ l$ is some integer, since $ X$ is discrete while $ \hbox{asinc}_N$ is continuous.

Figure F.1 lists a matlab function for performing ideal spectral interpolation directly in the frequency domain. Such an approach is normally only used when non-uniform sampling of the frequency axis is needed. For uniform spectral upsampling, it is more typical to take an inverse FFT, zero pad, then a longer FFT, as discussed further in the next section.


Zero Padding in the Time Domain

Unlike time-domain interpolation [270], ideal spectral interpolation is very easy to implement in practice by means of zero padding in the time domain. That is,

$\textstyle \parbox{0.8\textwidth}{\emph{Zero padding} in the time domain corresponds to \emph{ideal
interpolation} in the frequency domain.}$
Since the frequency axis (the unit circle in the $ z$ plane) is finite in length, ideal interpolation can be implemented exactly to within numerical round-off error. This is quite different from ideal (band-limited) time-domain interpolation, in which the interpolation kernel is sinc$ (\pi f_st)$ ; the sinc function extends to plus and minus infinity in time, so it can never be implemented exactly in practice.3.9

Practical Zero Padding

To interpolate a uniformly sampled spectrum $ X(\omega_k)$ , $ k=0,1,2,\ldots,N-1$ by the factor $ L$ , we may take the length $ N$ inverse DFT, append $ N\cdot(L-1)$ zeros to the time-domain data, and take a length $ NL$ DFT. If $ NL$ is a power of two, then so is $ N$ and we can use a Cooley-Tukey FFT for both steps (which is very fast):

$\displaystyle X(\omega_l) = \hbox{\sc FFT}_{NL,l}(\hbox{\sc ZeroPad}_{LN}(\hbox{\sc IFFT}_N(X))),\quad l=0,\ldots,LN-1$ (3.45)

This operation creates $ L-1$ new bins between each pair of original bins in $ X$ , thus increasing the number of spectral samples around the unit circle from $ N$ to $ LN$ . An example for $ L=2$ is shown in Fig.2.4 (compare to Fig.2.3).

In matlab, we can specify zero-padding by simply providing the optional FFT-size argument:

X = fft(x,N);  % FFT size N > length(x)


Zero-Padding to the Next Higher Power of 2

Another reason we zero-pad is to be able to use a Cooley-Tukey FFT with any window length $ M$ . When $ M$ is not a power of $ 2$ , we append enough zeros to make the FFT size $ N>M$ be a power of $ 2$ . In Matlab and Octave, the function nextpow2 returns the next higher power of 2 greater than or equal to its argument:

N = 2^nextpow2(M); % smallest M-compatible FFT size


Zero-Padding for Interpolating Spectral Displays

Suppose we perform spectrum analysis on some sinusoid using a length $ M$ window. Without zero padding, the DFT length is $ N=M$ . We may regard the DFT as a critically sampled DTFT (sampled in frequency). Since the bin separation in a length-$ N$ DFT is $ 2\pi/N$ , and the zero-crossing interval for Blackman-Harris side lobes is $ 2\pi/M$ , we see that there is one bin per side lobe in the sampled window transform. These spectral samples are illustrated for a Hamming window transform in Fig.2.3b. Since $ K=4$ in Table 5.2, the main lobe is 4 samples wide when critically sampled. The side lobes are one sample wide, and the samples happen to hit near some of the side-lobe zero-crossings, which could be misleading to the untrained eye if only the samples were shown. (Note that the plot is clipped at -60 dB.)

Figure 2.3: (a) Hamming window. (b) Critically sampled sinusoidal peak = frequency-shifted Hamming-window transform.
\includegraphics[width=\twidth]{eps/spectsamps}

If we now zero pad the Hamming-window by a factor of 2 (append 21 zeros to the length $ M=21$ window and take an $ N=42$ point DFT), we obtain the result shown in Fig.2.4. In this case, the main lobe is 8 samples wide, and there are two samples per side lobe. This is significantly better for display even though there is no new information in the spectrum relative to Fig.2.3.3.10

Incidentally, the solid lines in Fig.2.3b and 2.4b indicating the ``true'' DTFT were computed using a zero-padding factor of $ L=N/M=1000$ , and they were virtually indistinguishable visually from $ L=100$ . ($ L=10$ is not enough.)

Figure 2.4: (a) Hamming window zero-padded by a factor of 2. (b) $ 2\times $ -oversampled sinusoidal peak = frequency-shifted Hamming-window transform.
\includegraphics[width=\twidth]{eps/spectsamps2}


Zero-Padding for Interpolating Spectral Peaks

For sinusoidal peak-finding, spectral interpolation via zero-padding gets us closer to the true maximum of the main lobe when we simply take the maximum-magnitude FFT-bin as our estimate.

The examples in Fig.2.5 show how zero-padding helps in clarifying the true peak of the sampled window transform. With enough zero-padding, even very simple interpolation methods, such as quadratic polynomial interpolation, will give accurate peak estimates.

Figure 2.5: Illustration of ideal interpolation in the frequency domain as a result of zero padding in the time domain.
\includegraphics[width=0.7\twidth]{eps/spectsamps3}

Another illustration of zero-padding appears in Section 8.1.3 of [264].


Zero-Phase Zero Padding

The previous zero-padding example used the causal Hamming window, and the appended zeros all went to the right of the window in the FFT input buffer (see Fig.2.4a). When using zero-phase FFT windows (usually the best choice), the zero-padding goes in the middle of the FFT buffer, as we now illustrate.

We look at zero-phase zero-padding using a Blackman window3.3.1) which has good, though suboptimal, characteristics for audio work.3.11

Figure 2.6a shows a windowed segment of some sinusoidal data, with the window also shown as an envelope. Figure 2.6b shows the same data loaded into an FFT input buffer with a factor of 2 zero-phase zero padding. Note that all time is ``modulo $ N$ '' for a length $ N$ FFT. As a result, negative times $ -n$ map to $ N-n$ in the FFT input buffer.

Figure 2.6: (a) Blackman window overlaid with windowed data. b) Zero-padded windowed data loaded into the FFT input buffer.
\includegraphics[width=\twidth]{eps/zpblackmanT}

Figure 2.7a shows the result of performing an FFT on the data of Fig.2.6b. Since frequency indices are also modulo $ N$ , the negative-frequency bins appear in the right half of the buffer. Figure 2.6b shows the same data ``rotated'' so that bin number is in order of physical frequency from $ -f_s/2$ to $ f_s/2$ . If $ k$ is the bin number, then the frequency in Hz is given by $ k
f_s/N$ , where $ f_s$ denotes the sampling rate and $ N$ is the FFT size.

Figure 2.7: (a) FFT magnitude data, as returned by the FFT. (b) FFT magnitude spectrum ``rotated'' to a more ``physical'' frequency axis in bin numbers.
\includegraphics[width=\twidth]{eps/zpblackmanF}

The Matlab script for creating Figures 2.6 and 2.7 is listed in in §F.1.1.

Matlab/Octave fftshift utility

Matlab and Octave have a simple utility called fftshift that performs this bin rotation. Consider the following example:

octave:4>
fftshift([1 2 3 4])
ans =
  3  4  1  2
octave:5>
If the vector [1 2 3 4] is the output of a length 4 FFT, then the first element (1) is the dc term, and the third element (3) is the point at half the sampling rate ($ f_s/2$ ), which can be taken to be either plus or minus $ f_s/2$ since they are the same point on the unit circle in the $ z$ plane. Elements 2 and 4 are plus and minus $ f_s/4$ , respectively. After fftshift, element (3) is first, which indicates that both Matlab and Octave regard the spectral sample at half the sampling rate as a negative frequency. The next element is 4, corresponding to frequency $ -f_s/4$ , followed by dc and $ f_s/4$ .

Another reasonable result would be fftshift([1 2 3 4]) == [4 1 2 3], which defines half the sampling rate as a positive frequency. However, giving $ f_s/2$ to the negative frequencies balances giving dc to the positive frequencies, and the number of samples on both sides is then the same. For an odd-length DFT, there is no point at $ \pm f_s/2$ , so the result

octave:4>
fftshift([1 2 3])
ans =
  3  1  2
octave:5>
is the only reasonable answer, corresponding to frequencies $ -f_s/3,
0, f_s/3$ , respectively.


Index Ranges for Zero-Phase Zero-Padding

Having looked at zero-phase zero-padding ``pictorially'' in matlab buffers, let's now specify the index-ranges mathematically. Denote the window length by $ M$ (an odd integer) and the FFT length by $ N>M$ (a power of 2). Then the windowed data will occupy indices 0 to $ (M-1)/2$ (positive-time segment), and $ N-(M-1)/2$ to $ N-1$ (negative-time segment). Here we are assuming a 0-based indexing scheme as used in C or C++. We add 1 to all indices for matlab indexing to obtain 1:(M-1)/2+1 and N-(M-1)/2+1:N, respectively. The zero-padding zeros go in between these ranges, i.e., from $ (M-1)/2 + 1$ to $ N-(M-1)/2 - 1$ .


Summary

To summarize, zero-padding is used for

  • padding out to the next higher power of 2 so a Cooley-Tukey FFT can be used with any window length,
  • improving the quality of spectral displays, and
  • oversampling spectral peaks so that some simple final interpolation will be accurate.
In addition, we will learn in Chapter 8 that zero-padding is also necessary to accommodate spectral modifications in the short-time Fourier transform (STFT). This is because spectral modifications cause the time-domain signal to lengthen in time, and without sufficient zero-padding to accommodate it, there will be time aliasing in the reconstruction of the signal from the modified FFTs.

Some examples of interpolated spectral display by means of zero-padding may be seen in §3.4.


Next Section:
Spectrum Analysis Windows
Previous Section:
Introduction and Overview