Spectral Interpolation

The need for spectral interpolation comes up in many situations. For example, we always use the DFT in practice, while conceptually we often prefer the DTFT. For time-limited signals, that is, signals which are zero outside some finite range, the DTFT can be computed from the DFT via spectral interpolation. Conversely, the DTFT of a time-limited signal can be sampled to obtain its DFT.3.7Another application of DFT interpolation is spectral peak estimation, which we take up in Chapter 5; in this situation, we start with a sampled spectral peak from a DFT, and we use interpolation to estimate the frequency of the peak more accurately than what we get by rounding to the nearest DFT bin frequency.

The following sections describe the theoretical and practical details of ideal spectral interpolation.

Ideal Spectral Interpolation

Ideally, the spectrum of any signal $ x(n)$ at any frequency $ \omega =
2\pi f$ is obtained by projecting the signal $ x$ onto the zero-phase, unit-amplitude, complex sinusoid at frequency $ \omega$ [264]:

$\displaystyle X(\omega) \isdef \left<x,s_\omega\right>,$ (3.43)

where

\begin{eqnarray*}
s_\omega(t) &\isdef & e^{j\omega t}\qquad\qquad\qquad\quad\;\,\mbox{(Fourier Transform)}\\
s_\omega(t_n) &\isdef & e^{j\omega_k t_n} \isdefs e^{j2\pi nk/N} \quad\mbox{(DFT)} \\
s_\omega(t_n) &\isdef & e^{j\omega t_n} \isdefs e^{j\omega n} \quad\qquad\;\mbox{(DTFT)}.
\end{eqnarray*}

Thus, for signals in the DTFT domain which are time limited to $ n\in[-N/2,N/2-1]$ , we obtain

$\displaystyle X(\omega) \isdefs \left<x,s_\omega\right> = \sum_{n=-\infty}^\infty x(n) e^{-j\omega n} = \sum_{n=-N/2}^{N/2-1} x(n) e^{-j\omega n}.$ (3.44)

This can be thought of as a zero-centered DFT evaluated at $ \omega\in[-\pi,\pi)$ instead of $ \omega_k =
2\pi k/N$ for some $ k\in[0,N-1]$ . It arises naturally from taking the DTFT of a finite-length signal. Such time-limited signals may be said to have ``finite support'' [175].


Interpolating a DFT

Starting with a sampled spectrum $ X(\omega_k)$ , $ k=0,1,\ldots,N-1$ , typically obtained from a DFT, we can interpolate by taking the DTFT of the IDFT which is not periodically extended, but instead zero-padded [264]:3.8

\begin{eqnarray*}
X(\omega) &=& \hbox{\sc DTFT}(\hbox{\sc ZeroPad}_{\infty}(\hbox{\sc IDFT}_N(X)))\\
&\isdef & \sum_{n=-N/2}^{N/2-1}\left[\frac{1}{N}\sum_{k=0}^{N-1}X(\omega_k)
e^{j\omega_k n}\right]e^{-j\omega n}\\
&=& \sum_{k=0}^{N-1}X(\omega_k)
\left[\frac{1}{N}\sum_{n=-N/2}^{N/2-1} e^{j(\omega_k-\omega) n}\right]\\
&=& \sum_{k=0}^{N-1}X(\omega_k)\,\hbox{asinc}_N(\omega-\omega_k)\\
&=& \left<X,\hbox{\sc Sample}_N\{\hbox{\sc Shift}_{\omega}(\hbox{asinc}_N)\}\right>
\end{eqnarray*}

(The aliased sinc function, $ \hbox{asinc}_N(\omega)$ , is derived in §3.1.) Thus, zero-padding in the time domain interpolates a spectrum consisting of $ N$ samples around the unit circle by means of `` $ \hbox{asinc}_N$ interpolation.'' This is ideal, time-limited interpolation in the frequency domain using the aliased sinc function as an interpolation kernel. We can almost rewrite the last line above as $ X(\omega)=(X\ast \hbox{asinc}_N)_\omega$ , but such an expression would normally be defined only for $ \omega=\omega_l=2\pi l/N$ , where $ l$ is some integer, since $ X$ is discrete while $ \hbox{asinc}_N$ is continuous.

Figure F.1 lists a matlab function for performing ideal spectral interpolation directly in the frequency domain. Such an approach is normally only used when non-uniform sampling of the frequency axis is needed. For uniform spectral upsampling, it is more typical to take an inverse FFT, zero pad, then a longer FFT, as discussed further in the next section.


Zero Padding in the Time Domain

Unlike time-domain interpolation [270], ideal spectral interpolation is very easy to implement in practice by means of zero padding in the time domain. That is,

$\textstyle \parbox{0.8\textwidth}{\emph{Zero padding} in the time domain corresponds to \emph{ideal
interpolation} in the frequency domain.}$
Since the frequency axis (the unit circle in the $ z$ plane) is finite in length, ideal interpolation can be implemented exactly to within numerical round-off error. This is quite different from ideal (band-limited) time-domain interpolation, in which the interpolation kernel is sinc$ (\pi f_st)$ ; the sinc function extends to plus and minus infinity in time, so it can never be implemented exactly in practice.3.9

Practical Zero Padding

To interpolate a uniformly sampled spectrum $ X(\omega_k)$ , $ k=0,1,2,\ldots,N-1$ by the factor $ L$ , we may take the length $ N$ inverse DFT, append $ N\cdot(L-1)$ zeros to the time-domain data, and take a length $ NL$ DFT. If $ NL$ is a power of two, then so is $ N$ and we can use a Cooley-Tukey FFT for both steps (which is very fast):

$\displaystyle X(\omega_l) = \hbox{\sc FFT}_{NL,l}(\hbox{\sc ZeroPad}_{LN}(\hbox{\sc IFFT}_N(X))),\quad l=0,\ldots,LN-1$ (3.45)

This operation creates $ L-1$ new bins between each pair of original bins in $ X$ , thus increasing the number of spectral samples around the unit circle from $ N$ to $ LN$ . An example for $ L=2$ is shown in Fig.2.4 (compare to Fig.2.3).

In matlab, we can specify zero-padding by simply providing the optional FFT-size argument:

X = fft(x,N);  % FFT size N > length(x)


Zero-Padding to the Next Higher Power of 2

Another reason we zero-pad is to be able to use a Cooley-Tukey FFT with any window length $ M$ . When $ M$ is not a power of $ 2$ , we append enough zeros to make the FFT size $ N>M$ be a power of $ 2$ . In Matlab and Octave, the function nextpow2 returns the next higher power of 2 greater than or equal to its argument:

N = 2^nextpow2(M); % smallest M-compatible FFT size


Zero-Padding for Interpolating Spectral Displays

Suppose we perform spectrum analysis on some sinusoid using a length $ M$ window. Without zero padding, the DFT length is $ N=M$ . We may regard the DFT as a critically sampled DTFT (sampled in frequency). Since the bin separation in a length-$ N$ DFT is $ 2\pi/N$ , and the zero-crossing interval for Blackman-Harris side lobes is $ 2\pi/M$ , we see that there is one bin per side lobe in the sampled window transform. These spectral samples are illustrated for a Hamming window transform in Fig.2.3b. Since $ K=4$ in Table 5.2, the main lobe is 4 samples wide when critically sampled. The side lobes are one sample wide, and the samples happen to hit near some of the side-lobe zero-crossings, which could be misleading to the untrained eye if only the samples were shown. (Note that the plot is clipped at -60 dB.)

Figure 2.3: (a) Hamming window. (b) Critically sampled sinusoidal peak = frequency-shifted Hamming-window transform.
\includegraphics[width=\twidth]{eps/spectsamps}

If we now zero pad the Hamming-window by a factor of 2 (append 21 zeros to the length $ M=21$ window and take an $ N=42$ point DFT), we obtain the result shown in Fig.2.4. In this case, the main lobe is 8 samples wide, and there are two samples per side lobe. This is significantly better for display even though there is no new information in the spectrum relative to Fig.2.3.3.10

Incidentally, the solid lines in Fig.2.3b and 2.4b indicating the ``true'' DTFT were computed using a zero-padding factor of $ L=N/M=1000$ , and they were virtually indistinguishable visually from $ L=100$ . ($ L=10$ is not enough.)

Figure 2.4: (a) Hamming window zero-padded by a factor of 2. (b) $ 2\times $ -oversampled sinusoidal peak = frequency-shifted Hamming-window transform.
\includegraphics[width=\twidth]{eps/spectsamps2}


Zero-Padding for Interpolating Spectral Peaks

For sinusoidal peak-finding, spectral interpolation via zero-padding gets us closer to the true maximum of the main lobe when we simply take the maximum-magnitude FFT-bin as our estimate.

The examples in Fig.2.5 show how zero-padding helps in clarifying the true peak of the sampled window transform. With enough zero-padding, even very simple interpolation methods, such as quadratic polynomial interpolation, will give accurate peak estimates.

Figure 2.5: Illustration of ideal interpolation in the frequency domain as a result of zero padding in the time domain.
\includegraphics[width=0.7\twidth]{eps/spectsamps3}

Another illustration of zero-padding appears in Section 8.1.3 of [264].


Zero-Phase Zero Padding

The previous zero-padding example used the causal Hamming window, and the appended zeros all went to the right of the window in the FFT input buffer (see Fig.2.4a). When using zero-phase FFT windows (usually the best choice), the zero-padding goes in the middle of the FFT buffer, as we now illustrate.

We look at zero-phase zero-padding using a Blackman window3.3.1) which has good, though suboptimal, characteristics for audio work.3.11

Figure 2.6a shows a windowed segment of some sinusoidal data, with the window also shown as an envelope. Figure 2.6b shows the same data loaded into an FFT input buffer with a factor of 2 zero-phase zero padding. Note that all time is ``modulo $ N$ '' for a length $ N$ FFT. As a result, negative times $ -n$ map to $ N-n$ in the FFT input buffer.

Figure 2.6: (a) Blackman window overlaid with windowed data. b) Zero-padded windowed data loaded into the FFT input buffer.
\includegraphics[width=\twidth]{eps/zpblackmanT}

Figure 2.7a shows the result of performing an FFT on the data of Fig.2.6b. Since frequency indices are also modulo $ N$ , the negative-frequency bins appear in the right half of the buffer. Figure 2.6b shows the same data ``rotated'' so that bin number is in order of physical frequency from $ -f_s/2$ to $ f_s/2$ . If $ k$ is the bin number, then the frequency in Hz is given by $ k
f_s/N$ , where $ f_s$ denotes the sampling rate and $ N$ is the FFT size.

Figure 2.7: (a) FFT magnitude data, as returned by the FFT. (b) FFT magnitude spectrum ``rotated'' to a more ``physical'' frequency axis in bin numbers.
\includegraphics[width=\twidth]{eps/zpblackmanF}

The Matlab script for creating Figures 2.6 and 2.7 is listed in in §F.1.1.

Matlab/Octave fftshift utility

Matlab and Octave have a simple utility called fftshift that performs this bin rotation. Consider the following example:

octave:4>
fftshift([1 2 3 4])
ans =
  3  4  1  2
octave:5>
If the vector [1 2 3 4] is the output of a length 4 FFT, then the first element (1) is the dc term, and the third element (3) is the point at half the sampling rate ($ f_s/2$ ), which can be taken to be either plus or minus $ f_s/2$ since they are the same point on the unit circle in the $ z$ plane. Elements 2 and 4 are plus and minus $ f_s/4$ , respectively. After fftshift, element (3) is first, which indicates that both Matlab and Octave regard the spectral sample at half the sampling rate as a negative frequency. The next element is 4, corresponding to frequency $ -f_s/4$ , followed by dc and $ f_s/4$ .

Another reasonable result would be fftshift([1 2 3 4]) == [4 1 2 3], which defines half the sampling rate as a positive frequency. However, giving $ f_s/2$ to the negative frequencies balances giving dc to the positive frequencies, and the number of samples on both sides is then the same. For an odd-length DFT, there is no point at $ \pm f_s/2$ , so the result

octave:4>
fftshift([1 2 3])
ans =
  3  1  2
octave:5>
is the only reasonable answer, corresponding to frequencies $ -f_s/3,
0, f_s/3$ , respectively.


Index Ranges for Zero-Phase Zero-Padding

Having looked at zero-phase zero-padding ``pictorially'' in matlab buffers, let's now specify the index-ranges mathematically. Denote the window length by $ M$ (an odd integer) and the FFT length by $ N>M$ (a power of 2). Then the windowed data will occupy indices 0 to $ (M-1)/2$ (positive-time segment), and $ N-(M-1)/2$ to $ N-1$ (negative-time segment). Here we are assuming a 0-based indexing scheme as used in C or C++. We add 1 to all indices for matlab indexing to obtain 1:(M-1)/2+1 and N-(M-1)/2+1:N, respectively. The zero-padding zeros go in between these ranges, i.e., from $ (M-1)/2 + 1$ to $ N-(M-1)/2 - 1$ .


Summary

To summarize, zero-padding is used for

  • padding out to the next higher power of 2 so a Cooley-Tukey FFT can be used with any window length,
  • improving the quality of spectral displays, and
  • oversampling spectral peaks so that some simple final interpolation will be accurate.
In addition, we will learn in Chapter 8 that zero-padding is also necessary to accommodate spectral modifications in the short-time Fourier transform (STFT). This is because spectral modifications cause the time-domain signal to lengthen in time, and without sufficient zero-padding to accommodate it, there will be time aliasing in the reconstruction of the signal from the modified FFTs.

Some examples of interpolated spectral display by means of zero-padding may be seen in §3.4.


Next Section:
The Rectangular Window
Previous Section:
Continuous-Time Fourier Theorems