- discrete or continuous in time, and
- finite or infinite in duration.
- discrete or continuous in frequency, and
- finite or infinite in bandwidth.
Table 2.1 (next page) summarizes the four Fourier-transform cases corresponding to discrete or continuous time and/or frequency.
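In each of the four cases, the Fourier transform can be interpreted as the inner product of the signal $x$ with a complex sinusoid $s_\omega$ at radian frequency $\omega$, i.e., $X(\omega) = \langle x, s_\omega\rangle$, where $s_\omega$ is appropriately adapted, e.g., $s_\omega(n) = e^{j\omega n}$ for discrete time and $s_\omega(t) = e^{j\omega t}$ for continuous time.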
In spectral modeling of audio, we usually deal with indefinitely long signals. Fourier analysis of an indefinitely long discrete-time signal is carried out using the Discrete Time Fourier Transform (DTFT). Below, the DTFT is defined, and selected Fourier theorems are stated and proved for the DTFT case. Additionally, for completeness, the Fourier Transform (FT) is defined, and selected FT theorems are stated and proved as well. Theorems for the DFT case are detailed elsewhere.
Discrete Time Fourier Transform (DTFT)
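The Discrete Time Fourier Transform (DTFT) can be viewed as the limiting form of the DFT when its length $N$ is allowed to approach infinity:
$$X(\omega) \triangleq \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n}$$
where $\omega\in[-\pi,\pi)$ denotes the continuous radian frequency variable, and $x(n)$ is the signal amplitude at sample number $n$.
The inverse DTFT is
$$x(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} X(\omega)\, e^{j\omega n}\, d\omega$$
which can be derived in a manner analogous to the derivation of the inverse DFT.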
Instead of operating on sampled signals of length $N$ (like the DFT), the DTFT operates on sampled signals $x(n)$ defined over all integers $n\in\mathbb{Z}$.
Unlike the DFT, the DTFT frequencies form a continuum. That is, the DTFT is a function of continuous frequency $\omega\in[-\pi,\pi)$, while the DFT is a function of discrete frequency $\omega_k$, $k\in[0,N-1]$. The DFT frequencies $\omega_k = 2\pi k/N$, $k=0,1,2,\ldots,N-1$, are given by the angles of $N$ points uniformly distributed along the unit circle in the complex plane. Thus, as $N\to\infty$, a continuous frequency axis must result in the limit along the unit circle. The axis is still finite in length, however, because the time domain remains sampled.
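The relationship between the two transforms can be checked numerically for a time-limited signal. The following Matlab/Octave lines are a minimal sketch, assuming an arbitrary example signal that is zero outside $n=0,\ldots,N-1$ (the variable names are illustrative only). The DTFT sum is evaluated directly at the DFT bin frequencies $\omega_k = 2\pi k/N$ and compared with the output of fft:

```matlab
% Sketch: sample the DTFT of a short signal at w_k = 2*pi*k/N and
% compare with the length-N DFT computed by fft.
x  = [1 2 3 4 3 2 1];           % example signal, nonzero for n = 0..6 only
N  = length(x);
n  = 0:N-1;                     % sample indices
wk = 2*pi*(0:N-1)/N;            % DFT bin frequencies on the unit circle

% Row k of the product below is X(w_k) = sum_n x(n) exp(-j*w_k*n)
Xs = x * exp(-1j * n' * wk);    % DTFT evaluated at the bin frequencies
Xk = fft(x);                    % length-N DFT

err = max(abs(Xs - Xk));        % expected to be at round-off level
```

Replacing wk by any denser grid of frequencies in $[-\pi,\pi)$ evaluates the same DTFT between the DFT bins.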
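The Fourier transform of a signal $x(t)$, $t\in(-\infty,\infty)$, is defined as
$$X(\omega) \triangleq \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$$
and its inverse is given by
$$x(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} X(\omega)\, e^{j\omega t}\, d\omega$$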
Thus, the Fourier transform is defined for continuous time and continuous frequency, both unbounded. As a result, mathematical questions such as existence and invertibility are most difficult for this case. In fact, such questions fueled decades of confusion in the history of harmonic analysis (see Appendix G).
Existence of the Fourier Transform
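Conditions for the existence of the Fourier transform are complicated to state in general, but it is sufficient for $x(t)$ to be absolutely integrable, i.e.,
$$\|x\|_1 \triangleq \int_{-\infty}^{\infty} |x(t)|\, dt < \infty.$$
This requirement can be stated as $x\in L^1$, meaning that $x$ belongs to the set of all signals having a finite $L^1$ norm ($\|x\|_1<\infty$). It is similarly sufficient for $x(t)$ to be square integrable, i.e.,
$$\|x\|_2^2 \triangleq \int_{-\infty}^{\infty} |x(t)|^2\, dt < \infty,$$
or, $x\in L^2$. More generally, it suffices to show $x\in L^p$ for $1\le p\le 2$ [36, p. 47].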
There is never a question of existence, of course, for Fourier transforms of real-world signals encountered in practice. However, idealized signals, such as sinusoids that go on forever in time, do pose normalization difficulties. In practical engineering analysis, these difficulties are resolved using Dirac's ``generalized functions'' such as the impulse (also loosely called the delta function), discussed in §B.10.
Fourier Theorems for the DTFT
This section states and proves selected Fourier theorems for the DTFT. A more complete list for the DFT case is given elsewhere. Since this material was originally part of an appendix, it is relatively dry reading. Feel free to skip to the next chapter and refer back as desired when a theorem is invoked.
As defined above, the DTFT of a signal $x$ is
$$X(\omega) \triangleq \mathrm{DTFT}_\omega(x) \triangleq \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n}.$$
We say that $X$ is the spectrum of $x$.
Linearity of the DTFT
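$$\alpha x_1 + \beta x_2 \;\leftrightarrow\; \alpha X_1 + \beta X_2$$
where $\alpha,\beta$ are any scalars (real or complex numbers), $x_1$ and $x_2$ are any two discrete-time signals (real- or complex-valued functions of the integers), and $X_1$ and $X_2$ are their corresponding continuous-frequency spectra defined over the unit circle in the complex plane.
Proof: We have
$$\mathrm{DTFT}_\omega(\alpha x_1 + \beta x_2) \triangleq \sum_{n=-\infty}^{\infty} [\alpha x_1(n) + \beta x_2(n)]\, e^{-j\omega n} = \alpha\sum_{n=-\infty}^{\infty} x_1(n)\, e^{-j\omega n} + \beta\sum_{n=-\infty}^{\infty} x_2(n)\, e^{-j\omega n} \triangleq \alpha X_1(\omega) + \beta X_2(\omega).$$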
One way to describe the linearity property is to observe that the Fourier transform ``commutes with mixing.''
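For any complex signal $x(n)$, $n\in(-\infty,\infty)$, we have
$$\mathrm{Flip}(x) \;\leftrightarrow\; \mathrm{Flip}(X)$$
where $\mathrm{Flip}_n(x) \triangleq x(-n)$.
Arguably, $\mathrm{Flip}(x)$ should include complex conjugation. Let
$$\mathrm{Flip}'_n(x) \triangleq \overline{x(-n)}$$
denote such a definition. Then in this case we have
$$\mathrm{Flip}'(x) \;\leftrightarrow\; \overline{X}.$$
In the typical special case of real signals ($x(n)\in\mathbb{R}$), we have $\mathrm{Flip}(x) = \mathrm{Flip}'(x)$ so that
$$\mathrm{Flip}(x) \;\leftrightarrow\; \overline{X}.$$
That is, time-reversing a real signal conjugates its spectrum.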
Most (if not all) of the signals we deal with in practice are real signals. Here we note some spectral symmetries associated with real signals.
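The previous section established that the spectrum $X$ of every real signal $x$ satisfies
$$\mathrm{Flip}(X) = \overline{X}, \quad\text{i.e.,}\quad X(-\omega) = \overline{X(\omega)}.$$
In other terms, if a signal $x(n)$ is real, then its spectrum is Hermitian (``conjugate symmetric''). Hermitian spectra have the following equivalent characterizations:
- The real part is even, while the imaginary part is odd: $\mathrm{re}\{X(-\omega)\} = \mathrm{re}\{X(\omega)\}$ and $\mathrm{im}\{X(-\omega)\} = -\mathrm{im}\{X(\omega)\}$.
- The magnitude is even, while the phase is odd: $|X(-\omega)| = |X(\omega)|$ and $\angle X(-\omega) = -\angle X(\omega)$.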
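The time-reversal and Hermitian-symmetry properties of real signals can be illustrated numerically. The Matlab/Octave sketch below assumes an arbitrary real test signal defined on the symmetric index range $n=-3,\ldots,3$ (so that reversing the vector implements $x(-n)$); all names and values are illustrative:

```matlab
% Sketch: for a real signal, X(-w) = conj(X(w)), and time reversal
% conjugates the spectrum.
n  = -3:3;                          % symmetric index range
x  = [0.5 -1 2 3 1 0 -2];           % real example signal x(n)
w  = linspace(-pi, pi, 65);         % frequency grid

X    = x * exp(-1j * n' * w);       % X(w)
Xneg = x * exp(-1j * n' * (-w));    % X(-w)
xr   = fliplr(x);                   % xr(n) = x(-n)
Xr   = xr * exp(-1j * n' * w);      % spectrum of the time-reversed signal

errHerm = max(abs(Xneg - conj(X)));           % Hermitian symmetry
errFlip = max(abs(Xr - conj(X)));             % Flip(x) <-> conj(X)
errRe   = max(abs(real(Xneg) - real(X)));     % real part even
errIm   = max(abs(imag(Xneg) + imag(X)));     % imaginary part odd
```

All four error measures are expected to be at round-off level.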
Real Even (or Odd) Signals
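If a signal is even in addition to being real, then its DTFT is also real and even. This follows immediately from the Hermitian symmetry of real signals, and the fact that the DTFT of any real even signal is real:
$$\mathrm{DTFT}_\omega(x) \triangleq \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n} = \sum_{n=-\infty}^{\infty} x(n)\cos(\omega n) - j\sum_{n=-\infty}^{\infty} x(n)\sin(\omega n) = \sum_{n=-\infty}^{\infty} x(n)\cos(\omega n).$$
This is true since cosine is even, sine is odd, even times even is even, even times odd is odd, and the sum over all samples of an odd signal is zero. I.e.,
$$\sum_{n=-\infty}^{\infty} x(n)\sin(\omega n) = 0 \quad (x \text{ even}).$$
If $x$ is real and even, the following are true:
- $\mathrm{Flip}(x) = x$ ($x$ is even)
- $\overline{x} = x$ ($x$ is real)
- $\mathrm{Flip}(X) = \overline{X}$ (since $x$ is real)
- $\mathrm{Flip}(X) = X$ (since $x$ is even)
- $\overline{X} = X$ ($X$ is real)
Similarly, if a signal is odd and real, then its DTFT is odd and purely imaginary. This follows from Hermitian symmetry for real signals, and the fact that the DTFT of any real odd signal is imaginary:
$$\mathrm{DTFT}_\omega(x) = \sum_{n=-\infty}^{\infty} x(n)\cos(\omega n) - j\sum_{n=-\infty}^{\infty} x(n)\sin(\omega n) = -j\sum_{n=-\infty}^{\infty} x(n)\sin(\omega n),$$
where we used the fact that
$$\sum_{n=-\infty}^{\infty} x(n)\cos(\omega n) = 0 \quad (x \text{ odd}).$$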
Shift Theorem for the DTFT
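We define the shift operator for sampled signals $x(n)$ by
$$\mathrm{Shift}_{l,n}(x) \triangleq x(n-l),$$
where $l$ is any integer ($l\in\mathbb{Z}$). Thus, $\mathrm{Shift}_l(x)$ is a right-shift or delay by $l$ samples.
The shift theorem states
$$\mathrm{Shift}_l(x) \;\leftrightarrow\; e^{-j(\cdot)l}X,$$
or, in operator notation,
$$\mathrm{DTFT}_\omega[\mathrm{Shift}_l(x)] = e^{-j\omega l}X(\omega).$$
Note that $e^{-j\omega l}$ is a linear phase term, so called because it is a linear function of frequency with slope equal to $-l$:
$$\angle e^{-j\omega l} = -l\cdot\omega.$$
The shift theorem gives us that multiplying a spectrum $X(\omega)$ by a linear phase term $e^{-j\omega l}$ corresponds to a delay in the time domain by $l$ samples. If $l<0$, it is called a time advance by $|l|$ samples.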
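As a numerical illustration of the shift theorem, the following Matlab/Octave sketch delays a short example signal by $l=5$ samples and compares the spectrum of the delayed signal with the original spectrum multiplied by the linear phase term. The signal values, buffer length, and variable names are assumptions made for the example:

```matlab
% Sketch: a delay by l samples multiplies the spectrum by exp(-j*w*l).
n = 0:15;                           % sample indices of the buffer
x = [1 3 -2 4 zeros(1,12)];         % x(n), nonzero only for n = 0..3
l = 5;                              % delay in samples
y = [zeros(1,l) x(1:end-l)];        % y(n) = x(n-l); no nonzero sample is lost

w = linspace(-pi, pi, 101);         % frequency grid
E = exp(-1j * n' * w);              % DTFT kernel, one column per frequency
X = x * E;                          % X(w)
Y = y * E;                          % spectrum of the delayed signal

err = max(abs(Y - exp(-1j*w*l) .* X));   % expected at round-off level
```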
Convolution Theorem for the DTFT
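The convolution of discrete-time signals $x$ and $y$ is defined as
$$(x * y)(n) \triangleq \sum_{m=-\infty}^{\infty} x(m)\, y(n-m).$$
This is sometimes called acyclic convolution to distinguish it from the cyclic convolution used for length $N$ sequences in the context of the DFT. Convolution is cyclic in the time domain for the DFT and FS cases (i.e., whenever the time domain has a finite length), and acyclic for the DTFT and FT cases.
The convolution theorem is then
$$x * y \;\leftrightarrow\; X\cdot Y.$$
That is, convolution in the time domain corresponds to pointwise multiplication in the frequency domain.
Proof: The result follows immediately from interchanging the order of summations associated with the convolution and DTFT:
$$\mathrm{DTFT}_\omega(x*y) \triangleq \sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty} x(m)\, y(n-m)\, e^{-j\omega n} = \sum_{m=-\infty}^{\infty} x(m)\sum_{n=-\infty}^{\infty} y(n-m)\, e^{-j\omega n} = \sum_{m=-\infty}^{\infty} x(m)\, e^{-j\omega m}\, Y(\omega) = X(\omega)\, Y(\omega),$$
where the next-to-last step uses the shift theorem.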
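In practice, the convolution theorem is typically applied using FFTs with enough zero padding that the cyclic convolution computed by the DFT equals the acyclic one. The Matlab/Octave lines below are a minimal sketch with made-up example signals; the FFT size N is chosen here as an assumption satisfying N >= length(x)+length(y)-1:

```matlab
% Sketch: acyclic convolution via pointwise multiplication of spectra.
x = [1 2 3 4];
y = [1 -1 2];
z = conv(x, y);                     % direct acyclic convolution, length 6

N  = 8;                             % FFT size >= length(z) avoids time aliasing
Z  = fft(x, N) .* fft(y, N);        % pointwise product of zero-padded DFTs
zf = real(ifft(Z));                 % back to the time domain (real input)

err  = max(abs(zf(1:length(z)) - z));    % matches conv to round-off
tail = max(abs(zf(length(z)+1:end)));    % remaining samples are ~0
```

If N were smaller than length(z), the tail of the convolution would wrap around and add to the beginning of the buffer, which is the cyclic convolution of the DFT case.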
Correlation Theorem for the DTFT
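We define the correlation of discrete-time signals $x$ and $y$ by
$$(x\star y)(n) \triangleq \sum_{m=-\infty}^{\infty} \overline{x(m)}\, y(m+n).$$
The correlation theorem for DTFTs is then
$$x\star y \;\leftrightarrow\; \overline{X}\cdot Y.$$
The autocorrelation of a signal $x$ is its correlation with itself, $(x\star x)(n) = \sum_m \overline{x(m)}\, x(m+n)$. From the correlation theorem, we have
$$x\star x \;\leftrightarrow\; |X|^2.$$
Note that this definition of autocorrelation is appropriate for signals having finite support (nonzero over a finite number of samples). For infinite-energy (but finite-power) signals, such as stationary noise processes, we define the sample autocorrelation to include a normalization suitable for this case (see Chapter 6 and Appendix C).
From the autocorrelation theorem we have that a digital-filter impulse-response $h(n)$ is that of a lossless allpass filter if and only if $(h\star h) = \delta$. In other words, the autocorrelation of the impulse-response of every allpass filter is impulsive.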
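The autocorrelation theorem can be checked numerically for a finite-support signal. In the Matlab/Octave sketch below, the autocorrelation $\sum_m \overline{x(m)}\,x(m+n)$ is computed with conv applied to the conjugated, time-reversed signal (a standard identity), and its DTFT is compared with $|X(\omega)|^2$. The complex test signal and variable names are illustrative assumptions:

```matlab
% Sketch: the DTFT of the autocorrelation equals |X(w)|^2.
x    = [1, 2-1j, -3, 0.5j];          % example finite-support signal, n = 0..3
L    = length(x);
r    = conv(conj(fliplr(x)), x);     % r(lag), lag = -(L-1)..(L-1)
lags = -(L-1):(L-1);

w = linspace(-pi, pi, 50);           % frequency grid
X = x * exp(-1j * (0:L-1)' * w);     % X(w)
R = r * exp(-1j * lags' * w);        % DTFT of the autocorrelation

err = max(abs(R - abs(X).^2));       % expected at round-off level
r0  = r(L);                          % lag 0 holds the signal energy
```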
Power Theorem for the DTFT
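The inner product of two signals may be defined in the time domain by
$$\langle x, y\rangle \triangleq \sum_{n=-\infty}^{\infty} x(n)\,\overline{y(n)}.$$
The inner product of two spectra may be defined as
$$\langle X, Y\rangle \triangleq \frac{1}{2\pi}\int_{-\pi}^{\pi} X(\omega)\,\overline{Y(\omega)}\, d\omega.$$
Note that the frequency-domain inner product includes a normalization factor $1/(2\pi)$ while the time-domain definition does not.
Using inner-product notation, the power theorem (or Parseval's theorem) for DTFTs can be stated as
$$\langle x, y\rangle = \langle X, Y\rangle.$$
That is, the inner product of two signals in the time domain equals the inner product of their respective spectra (a complex scalar in general).
When we consider the inner product of a signal with itself, we have the special case known as the energy theorem (or Rayleigh's energy theorem):
$$\|x\|^2 = \langle x, x\rangle = \langle X, X\rangle = \|X\|^2,$$
where $\|x\|$ denotes the $L^2$ norm induced by the inner product. It is always real.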
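For time-limited signals, the power and energy theorems can be verified with an FFT, because the frequency-domain integral normalized by $1/(2\pi)$ equals the average over $N$ uniformly spaced spectral samples whenever $N$ is at least the signal length (this is the DFT form of the theorem). The following Matlab/Octave sketch uses arbitrary example data and an assumed FFT size:

```matlab
% Sketch: time-domain and frequency-domain inner products agree.
x = [3 -1 4 1 -5 9 2 -6];           % example signal
y = [1 0 -2 1j 0 2 1 -1];           % second example signal (complex)
N = 32;                             % any FFT size >= length(x) works here

X = fft(x, N);
Y = fft(y, N);

Et  = sum(abs(x).^2);               % time-domain energy
Ef  = mean(abs(X).^2);              % (1/N) * sum_k |X(w_k)|^2
ipt = sum(x .* conj(y));            % <x,y>
ipf = mean(X .* conj(Y));           % <X,Y> sampled on N bins
```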
We define the stretch operator such that
$$\mathrm{Stretch}_{L,m}(x) \triangleq \begin{cases} x(m/L), & m/L \text{ an integer} \\ 0, & \text{otherwise.} \end{cases}$$
In other terms, we stretch a sampled signal by the factor $L$ by inserting $L-1$ zeros in between each pair of samples of the signal.
In the literature on multirate filter banks (see Chapter 11), the stretch operator is typically called instead the upsampling operator. That is, stretching a signal by the factor of $L$ is called upsampling the signal by the factor $L$. (See §11.1.1 for the graphical symbol ($\uparrow L$) and associated discussion.) The term ``stretch'' is preferred in this book because ``upsampling'' is easily confused with ``increasing the sampling rate''; resampling a signal to a higher sampling rate is conceptually implemented by a stretch operation followed by an ideal lowpass filter which moves the inserted zeros to their properly interpolated values.
Note that we could also call the stretch operator the scaling operator, to unify the terminology in the discrete-time case with that of the continuous-time case (§2.4.1 below).
We define the repeat operator in the frequency domain as a scaling of the frequency axis by some integer factor $L>0$:
$$\mathrm{Repeat}_{L,\omega}(X) \triangleq X(L\omega), \quad \omega\in[-\pi,\pi),$$
where $\omega$ denotes the radian frequency variable after applying the repeat operator.
The repeat operator maps the entire unit circle (taken as $-\pi$ to $\pi$) to a segment of itself $[-\pi/L, \pi/L)$, centered about $\omega=0$, and repeated $L$ times. This is illustrated in Fig.2.2.
Since the frequency axis is continuous and $2\pi$-periodic for DTFTs, the repeat operator is precisely equivalent to a scaling operator for the Fourier transform case (§B.4). We call it ``repeat'' rather than ``scale'' because we are restricting the scale factor to positive integers, and because the name ``repeat'' describes more vividly what happens to a periodic spectrum that is compressively frequency-scaled over the unit circle by an integer factor.
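Using these definitions, we can compactly state the stretch theorem:
$$\mathrm{Stretch}_L(x) \;\leftrightarrow\; \mathrm{Repeat}_L(X).$$
Proof: Let $y = \mathrm{Stretch}_L(x)$. Since $y(n)$ is nonzero only for $n = Lm$, we have $Y(\omega) = \sum_n y(n)\, e^{-j\omega n} = \sum_m x(m)\, e^{-j\omega L m} = X(L\omega)$.
As $\omega$ traverses the interval $[-\pi,\pi)$, $\omega' = L\omega$ traverses the unit circle $L$ times, thus implementing the repeat operation on the unit circle. Note also that when $\omega=0$, we have $\omega'=0$, so that dc always maps to dc. At half the sampling rate ($\omega=\pi$), on the other hand, after the mapping, we may have either $\omega'\equiv\pi$ ($L$ odd), or $\omega'\equiv 0$ ($L$ even), where $\omega' = L\omega$ is taken modulo $2\pi$.
The stretch theorem makes it clear how to do ideal sampling-rate conversion for integer upsampling ratios $L$: We first stretch the signal by the factor $L$ (introducing $L-1$ zeros between each pair of samples), followed by an ideal lowpass filter cutting off at $\pi/L$. That is, the filter has a gain of 1 for $|\omega|<\pi/L$, and a gain of 0 for $\pi/L<|\omega|\le\pi$. Such a system (if it were realizable) implements ideal bandlimited interpolation of the original signal by the factor $L$.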
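The stretch theorem is easy to observe with DFTs, since a length-$LN$ DFT of the stretched signal samples the repeated spectrum. The following Matlab/Octave sketch assumes a short example signal and $L=3$; zero insertion is done with plain indexing so that no toolbox function is needed:

```matlab
% Sketch: inserting L-1 zeros between samples repeats the spectrum L times.
x = [1 2 3 2 1];                    % example signal, length N = 5
L = 3;                              % stretch factor
y = zeros(1, L*length(x));
y(1:L:end) = x;                     % y(L*m) = x(m), zeros elsewhere

X = fft(x);                         % N spectral samples
Y = fft(y);                         % L*N spectral samples

err = max(abs(Y - repmat(X, 1, L)));   % Y is X repeated L times
```

An ideal lowpass filter with cut-off $\pi/L$ would keep only the copy of the spectrum centered at dc, which is the bandlimited-interpolation step described above.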
Downsampling and Aliasing
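The downsampling operator $\mathrm{Downsample}_M$ selects every $M$th sample of a signal:
$$\mathrm{Downsample}_{M,n}(x) \triangleq x(Mn).$$
The aliasing theorem states that downsampling in time corresponds to aliasing in the frequency domain:
$$\mathrm{Downsample}_M(x) \;\leftrightarrow\; \frac{1}{M}\,\mathrm{Alias}_M(X),$$
where the $\mathrm{Alias}$ operator is defined as
$$\mathrm{Alias}_{M,\omega}(X) \triangleq \sum_{m=0}^{M-1} X\!\left(\frac{\omega}{M} + m\frac{2\pi}{M}\right)$$
for $\omega\in[-\pi,\pi)$. The summation terms for $m\neq 0$ are called aliasing components.
In $z$ transform notation, the $\mathrm{Alias}$ operator can be expressed as
$$\mathrm{Alias}_{M,z}(X) = \sum_{m=0}^{M-1} X\!\left(W_M^m\, z^{1/M}\right),$$
where $W_M \triangleq e^{-j2\pi/M}$ is a common notation for the primitive $M$th root of unity. On the unit circle of the $z$ plane, this becomes
$$\mathrm{Alias}_{M,\omega}(X) = \sum_{m=0}^{M-1} X\!\left(e^{-jm\frac{2\pi}{M}}\, e^{j\frac{\omega}{M}}\right), \quad -\pi\le\omega<\pi,$$
which contains the same terms as the definition above because $m$ ranges over a full set of residues modulo $M$.
The frequency scaling corresponds to having a sampling interval of $T=1$ after downsampling, which corresponds to the interval $T=1/M$ prior to downsampling.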
The aliasing theorem makes it clear that, in order to downsample by factor $M$ without aliasing, we must first lowpass-filter the spectrum to $[-\pi/M, \pi/M)$. This filtering (when ideal) zeroes out the spectral regions which alias upon downsampling.
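The DFT form of the aliasing theorem gives a simple numerical check: for a length-$N$ signal with $N$ a multiple of $M$, the DFT of the downsampled signal equals $1/M$ times the sum of $M$ equally spaced blocks of the original DFT. The Matlab/Octave sketch below uses arbitrary example data with $N=12$ and $M=3$:

```matlab
% Sketch: downsampling by M aliases the spectrum (DFT version).
x = [4 -1 2 7 0 3 -5 1 6 2 -3 8];   % example signal, N = 12
M = 3;                              % downsampling factor (divides N)
N = length(x);

y = x(1:M:end);                     % y(n) = x(M*n), length N/M
X = fft(x);
Y = fft(y);

% Row k of the reshaped matrix holds X(k), X(k+N/M), ..., X(k+(M-1)*N/M),
% so summing along rows adds the M aliasing components for each bin.
Ya = sum(reshape(X, N/M, M), 2).' / M;

err = max(abs(Y - Ya));             % expected at round-off level
```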
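Note that any rational sampling-rate conversion factor $\rho = L/M$ may be implemented as an upsampling by the factor $L$ followed by downsampling by the factor $M$ [50,287]. Conceptually, a stretch-by-$L$ is followed by a lowpass filter cutting off at $\omega_c = \pi/\max(L,M)$, followed by downsample-by-$M$, i.e.,
$$y = \mathrm{Downsample}_M\!\left(\mathrm{Lowpass}_{\pi/\max(L,M)}\!\left(\mathrm{Stretch}_L(x)\right)\right).$$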
In practice, there are more efficient algorithms for sampling-rate conversion [270,135,78] based on a more direct approach to bandlimited interpolation.
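Proof sketch: Let $y = \mathrm{Downsample}_M(x)$. Applying the aliasing theorem for the length-$N$ DFT (with $N$ a multiple of $M$) gives
$$Y(\omega_k M) = \frac{1}{M}\sum_{m=0}^{M-1} X\!\left(\omega_k + \frac{2\pi m}{M}\right),$$
where we have chosen to keep frequency samples $\omega_k$ in terms of the original frequency axis prior to downsampling, i.e., $\omega_k = 2\pi k/N$ for both $X$ and $Y$. This choice allows us to easily take the limit as $N\to\infty$ by simply replacing $\omega_k$ by $\omega$:
$$Y(\omega M) = \frac{1}{M}\sum_{m=0}^{M-1} X\!\left(\omega + \frac{2\pi m}{M}\right).$$
Replacing $\omega$ by $\omega' = \omega M$ and converting to $z$-transform notation $X(z)$ instead of Fourier transform notation $X(\omega)$, with $z = e^{j\omega'}$, yields the final result.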
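Theorem: Let $x(n)$ denote a signal with DTFT $X(\omega)$, and let
$$X'(\omega) \triangleq \frac{d}{d\omega}X(\omega)$$
denote the derivative of $X$ with respect to $\omega$. Then we have
$$-jn\,x(n) \;\leftrightarrow\; X'(\omega),$$
where $X'(\omega)$ denotes the DTFT of $-jn\,x(n)$.
Proof: Using integration by parts, we obtain
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} X'(\omega)\, e^{j\omega n}\, d\omega = \frac{1}{2\pi}\Big[X(\omega)\, e^{j\omega n}\Big]_{-\pi}^{\pi} - \frac{jn}{2\pi}\int_{-\pi}^{\pi} X(\omega)\, e^{j\omega n}\, d\omega = -jn\,x(n),$$
where the boundary term vanishes because $X(\omega)\,e^{j\omega n}$ is $2\pi$-periodic in $\omega$ for integer $n$.
An alternate method of proof is given in §B.3.
Corollary: Perhaps a cleaner statement is as follows:
$$n\,x(n) \;\leftrightarrow\; j\frac{d}{d\omega}X(\omega).$$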
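The corollary can be sanity-checked numerically by comparing the DTFT of $n\,x(n)$ with $j$ times a finite-difference approximation of $dX/d\omega$. The Matlab/Octave sketch below is only an approximate check: the central-difference step h and the example signal are assumptions, and the agreement is limited by the finite-difference error rather than by round-off:

```matlab
% Sketch: n*x(n) <-> j * dX/dw, checked with a central difference.
n = 0:7;                                  % sample indices
x = [1 -2 3 0.5 4 -1 2 1];                % example signal
dtft = @(w) x * exp(-1j * n' * w);        % X(w) for a row vector w

w  = linspace(-pi, pi, 33);               % frequency grid
h  = 1e-4;                                % finite-difference step (assumed)
dX = (dtft(w + h) - dtft(w - h)) / (2*h); % approximates dX/dw

lhs = (n .* x) * exp(-1j * n' * w);       % DTFT of n*x(n)
err = max(abs(lhs - 1j*dX));              % small, limited by O(h^2) error
```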
This completes our coverage of selected DTFT theorems. The next section adds some especially useful FT theorems having no precise counterpart in the DTFT (discrete-time) case.
Continuous-Time Fourier Theorems
Selected Fourier theorems for the continuous-time case are stated and proved in Appendix B. However, two are sufficiently important that we state them here.
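The scaling theorem (or similarity theorem) says that if you horizontally ``stretch'' a signal by the factor $\alpha$ in the time domain, you ``squeeze'' and amplify its Fourier transform by the same factor in the frequency domain. This is an important general Fourier duality relationship:
Theorem: For all continuous-time functions $x(t)$ possessing a Fourier transform,
$$\mathrm{Scale}_\alpha(x) \;\leftrightarrow\; |\alpha|\,\mathrm{Scale}_{1/\alpha}(X),$$
where
$$\mathrm{Scale}_{\alpha,t}(x) \triangleq x\!\left(\frac{t}{\alpha}\right)$$
and $\alpha$ is any nonzero real number (the abscissa stretch factor). A more commonly used notation is the following:
$$x\!\left(\frac{t}{\alpha}\right) \;\leftrightarrow\; |\alpha|\, X(\alpha\omega).$$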
Proof: See §B.4.
The scaling theorem is fundamentally restricted to the continuous-time, continuous-frequency (Fourier transform) case. The closest we come to the scaling theorem among the DTFT theorems (§2.3) is the stretch (repeat) theorem. For this and other continuous-time Fourier theorems, see Appendix B.
Definition: A function $f(t)$ is said to be of order $\mathcal{O}(g(t))$ if there exist $t_0$ and some positive constant $M<\infty$ such that $|f(t)| < M\,|g(t)|$ for all $t>t_0$.
Proof: See §B.18.
The need for spectral interpolation comes up in many situations. For example, we always use the DFT in practice, while conceptually we often prefer the DTFT. For time-limited signals, that is, signals which are zero outside some finite range, the DTFT can be computed from the DFT via spectral interpolation. Conversely, the DTFT of a time-limited signal can be sampled to obtain its DFT. Another application of DFT interpolation is spectral peak estimation, which we take up in Chapter 5; in this situation, we start with a sampled spectral peak from a DFT, and we use interpolation to estimate the frequency of the peak more accurately than what we get by rounding to the nearest DFT bin frequency.
The following sections describe the theoretical and practical details of ideal spectral interpolation.
Thus, for signals in the DTFT domain which are time limited to $n\in[-N/2, N/2-1]$ (with $N$ even), we obtain
$$X(\omega) = \sum_{n=-N/2}^{N/2-1} x(n)\, e^{-j\omega n}.$$
This can be thought of as a zero-centered DFT evaluated at $\omega\in[-\pi,\pi)$ instead of $\omega_k = 2\pi k/N$ for some $k$. It arises naturally from taking the DTFT of a finite-length signal. Such time-limited signals may be said to have ``finite support''.
Interpolating a DFT
(The aliased sinc function, $\mathrm{asinc}_N$, is derived in §3.1.) Thus, zero-padding in the time domain interpolates a spectrum consisting of $N$ samples around the unit circle by means of ``$\mathrm{asinc}_N$ interpolation.'' This is ideal, time-limited interpolation in the frequency domain using the aliased sinc function as an interpolation kernel. We can almost rewrite the last line above as a convolution of the sampled spectrum with $\mathrm{asinc}_N$, but such an expression would normally be defined only for $\omega = \omega_k = 2\pi k/N$, where $k$ is some integer, since $X(\omega_k)$ is discrete while $X(\omega)$ is continuous.
Figure F.1 lists a matlab function for performing ideal spectral interpolation directly in the frequency domain. Such an approach is normally only used when non-uniform sampling of the frequency axis is needed. For uniform spectral upsampling, it is more typical to take an inverse FFT, zero pad, then a longer FFT, as discussed further in the next section.
Zero Padding in the Time Domain
Unlike time-domain interpolation, ideal spectral interpolation is very easy to implement in practice by means of zero padding in the time domain. That is, zero padding a time-limited signal in the time domain corresponds to ideal interpolation of its spectrum in the frequency domain.
Since the frequency axis (the unit circle in the $z$ plane) is finite in length, ideal interpolation can be implemented exactly to within numerical round-off error. This is quite different from ideal (band-limited) time-domain interpolation, in which the interpolation kernel is sinc; the sinc function extends to plus and minus infinity in time, so it can never be implemented exactly in practice.
To interpolate a uniformly sampled spectrum $X(\omega_k)$, $k=0,1,\ldots,N-1$, by the factor $L$, we may take the length $N$ inverse DFT, append $N(L-1)$ zeros to the time-domain data, and take a length $LN$ DFT. If $N$ and $L$ are both powers of two, then so is $LN$ and we can use a Cooley-Tukey FFT for both steps (which is very fast):
$$\mathrm{Interp}_L(X) = \mathrm{FFT}_{LN}\!\left(\mathrm{ZeroPad}_{LN}\!\left(\mathrm{IFFT}_N(X)\right)\right).$$
This operation creates $L-1$ new bins between each pair of original bins in $X$, thus increasing the number of spectral samples around the unit circle from $N$ to $LN$. An example for $L=2$ is shown in Fig.2.4 (compare to Fig.2.3).
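The inverse-FFT, zero-pad, FFT recipe can be sketched in a few lines of Matlab/Octave. The example below assumes a short causal test signal with $N=8$ and $L=4$ (both illustrative choices), and compares the interpolated spectrum with a direct evaluation of the DTFT sum on the denser frequency grid:

```matlab
% Sketch: interpolate a sampled spectrum by the factor L via zero padding.
x = [1 2 3 4 3 2 1 0];              % example time-limited signal, N = 8
N = length(x);
L = 4;                              % interpolation factor

X  = fft(x);                        % sampled spectrum, N bins
xt = ifft(X);                       % length-N inverse DFT
Xi = fft([xt zeros(1, N*(L-1))]);   % append N*(L-1) zeros, length L*N DFT

errOrig = max(abs(Xi(1:L:end) - X));    % every Lth bin is an original bin

wi = 2*pi*(0:N*L-1)/(N*L);          % the L*N interpolated bin frequencies
Xd = x * exp(-1j * (0:N-1)' * wi);  % DTFT evaluated directly
err = max(abs(Xi - Xd));            % expected at round-off level
```

Here the zeros are appended to the right of the data, which corresponds to a causal signal; the zero-phase arrangement is discussed separately in the text.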
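In Matlab and Octave, zero padding can be specified by passing the desired FFT size as the optional second argument of fft:
X = fft(x,N); % FFT size N > length(x)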
Another reason we zero-pad is to be able to use a Cooley-Tukey FFT with any window length $M$. When $M$ is not a power of $2$, we append enough zeros to make the FFT size $N>M$ be a power of $2$. In Matlab and Octave, the function nextpow2 returns the exponent of the smallest power of 2 greater than or equal to its argument, so that the FFT size is obtained as follows:
N = 2^nextpow2(M); % smallest M-compatible FFT size
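As a brief usage sketch (the window length, the extra zero-padding factor zpf, and the rectangular window below are illustrative assumptions), the FFT size can be chosen with nextpow2 either for the window alone or for a zero-padded multiple of it:

```matlab
% Sketch: choosing a power-of-2 FFT size for an arbitrary window length.
M   = 21;                      % window length (not a power of 2)
N   = 2^nextpow2(M);           % nextpow2(21) is 5, so N = 32
zpf = 4;                       % additional zero-padding factor (assumed)
Nz  = 2^nextpow2(zpf*M);       % smallest power of 2 >= 84, i.e., 128

w = ones(1, M);                % example (rectangular) window
W = fft(w, Nz);                % fft appends Nz - M zeros before transforming
```

Note that nextpow2 leaves exact powers of 2 unchanged, e.g., 2^nextpow2(32) is 32.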
Suppose we perform spectrum analysis on some sinusoid using a length $M=21$ window. Without zero padding, the DFT length is $N=M=21$. We may regard the DFT as a critically sampled DTFT (sampled in frequency). Since the bin separation in a length-$N$ DFT is $2\pi/N$, and the zero-crossing interval for Blackman-Harris side lobes is $2\pi/M$, we see that there is one bin per side lobe in the sampled window transform. These spectral samples are illustrated for a Hamming window transform in Fig.2.3b. Since the Hamming main-lobe width listed in Table 5.2 is four side-lobe widths, the main lobe is 4 samples wide when critically sampled. The side lobes are one sample wide, and the samples happen to hit near some of the side-lobe zero-crossings, which could be misleading to the untrained eye if only the samples were shown. (Note that the plot is clipped at -60 dB.)
If we now zero pad the Hamming window by a factor of 2 (append 21 zeros to the length $M=21$ window and take an $N=42$ point DFT), we obtain the result shown in Fig.2.4. In this case, the main lobe is 8 samples wide, and there are two samples per side lobe. This is significantly better for display even though there is no new information in the spectrum relative to Fig.2.3.
Incidentally, the solid lines in Fig.2.3b and 2.4b indicating the ``true'' DTFT were computed using a zero-padding factor of , and they were virtually indistinguishable visually from . ( is not enough.)
The examples in Fig.2.5 show how zero-padding helps in clarifying the true peak of the sampled window transform. With enough zero-padding, even very simple interpolation methods, such as quadratic polynomial interpolation, will give accurate peak estimates.
The previous zero-padding example used the causal Hamming window, and the appended zeros all went to the right of the window in the FFT input buffer (see Fig.2.4a). When using zero-phase FFT windows (usually the best choice), the zero-padding goes in the middle of the FFT buffer, as we now illustrate.
Figure 2.6a shows a windowed segment of some sinusoidal data, with the window also shown as an envelope. Figure 2.6b shows the same data loaded into an FFT input buffer with a factor of 2 zero-phase zero padding. Note that all time is ``modulo $N$'' for a length $N$ FFT. As a result, negative times $n<0$ map to $N+n$ in the FFT input buffer.
Figure 2.7a shows the result of performing an FFT on the data of Fig.2.6b. Since frequency indices are also modulo $N$, the negative-frequency bins appear in the right half of the buffer. Figure 2.7b shows the same data ``rotated'' so that bin number is in order of physical frequency from $-f_s/2$ to $f_s/2$. If $k$ is the bin number, then the frequency in Hz is given by $f_k = k f_s/N$, where $f_s$ denotes the sampling rate and $N$ is the FFT size.
Matlab/Octave fftshift utility
Matlab and Octave have a simple utility called fftshift that performs this bin rotation. Consider the following example:
octave:4> fftshift([1 2 3 4])
ans =
  3  4  1  2
octave:5>
If the vector [1 2 3 4] is the output of a length 4 FFT, then the first element (1) is the dc term, and the third element (3) is the point at half the sampling rate ($f_s/2$), which can be taken to be either plus or minus $f_s/2$ since they are the same point on the unit circle in the $z$ plane. Elements 2 and 4 are plus and minus $f_s/4$, respectively. After fftshift, element (3) is first, which indicates that both Matlab and Octave regard the spectral sample at half the sampling rate as a negative frequency. The next element is 4, corresponding to frequency $-f_s/4$, followed by dc and $f_s/4$.
Another reasonable result would be fftshift([1 2 3 4]) == [4 1 2 3], which defines half the sampling rate as a positive frequency. However, giving $f_s/2$ to the negative frequencies balances giving dc to the positive frequencies, and the number of samples on both sides is then the same. For an odd-length DFT, there is no point at $f_s/2$, so the result
octave:4> fftshift([1 2 3])
ans =
  3  1  2
octave:5>
is the only reasonable answer, corresponding to frequencies $-f_s/3, 0, f_s/3$, respectively.
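A frequency axis matching the fftshift ordering can be built by applying fftshift to the bin numbers themselves and then mapping the upper bins to negative frequencies. The following Matlab/Octave lines are a sketch under assumed values (the sampling rate fs is an arbitrary example); the same lines work for even and odd FFT sizes:

```matlab
% Sketch: frequency axis in Hz for fftshift-ordered spectral samples.
fs = 8000;                               % sampling rate in Hz (example value)
N  = 4;                                  % FFT size
k  = 0:N-1;                              % bin numbers as returned by fft
ks = fftshift(k);                        % bin numbers in fftshift order
ks(ks >= N/2) = ks(ks >= N/2) - N;       % bins at or above N/2 become negative
f  = ks * fs / N;                        % frequencies in Hz, increasing order

% Odd-length case: there is no bin at fs/2
k3 = fftshift(0:2);
k3(k3 >= 3/2) = k3(k3 >= 3/2) - 3;       % bin numbers -1, 0, 1
```

For N = 4 this places the bin at half the sampling rate at $-f_s/2$, consistent with the fftshift convention described above.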
Having looked at zero-phase zero-padding ``pictorially'' in matlab buffers, let's now specify the index-ranges mathematically. Denote the window length by $M$ (an odd integer) and the FFT length by $N>M$ (a power of 2). Then the windowed data will occupy indices 0 to $(M-1)/2$ (positive-time segment), and $N-(M-1)/2$ to $N-1$ (negative-time segment). Here we are assuming a 0-based indexing scheme as used in C or C++. We add 1 to all indices for matlab indexing to obtain 1:(M-1)/2+1 and N-(M-1)/2+1:N, respectively. The zero-padding zeros go in between these ranges, i.e., from $(M-1)/2+1$ to $N-(M-1)/2-1$.
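The index ranges above translate directly into buffer-loading code. The Matlab/Octave sketch below assumes $M=21$ and $N=64$ and uses a zero-centered Hamming window written out explicitly (so no toolbox function is required); because the loaded buffer is real and even modulo $N$, its FFT should be real to within round-off:

```matlab
% Sketch: load a zero-phase window into an FFT buffer with zero padding
% in the middle.
M   = 21;                               % odd window length
N   = 64;                               % FFT size (power of 2, N > M)
Mo2 = (M-1)/2;                          % half-length, 10 here
n   = -Mo2:Mo2;                         % zero-centered time indices
w   = 0.54 + 0.46*cos(2*pi*n/(M-1));    % zero-phase (even) Hamming window

buf = zeros(1, N);
buf(1:Mo2+1)     = w(Mo2+1:M);          % times n = 0 .. (M-1)/2
buf(N-Mo2+1:N)   = w(1:Mo2);            % times n = -(M-1)/2 .. -1

W = fft(buf);
imagMax = max(abs(imag(W)));            % ~0: the window transform is real
```

In matlab indexing, the zeros occupy buf(Mo2+2:N-Mo2), which is the 1-based form of the range stated above.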
To summarize, zero-padding is used for
- padding out to the next higher power of 2 so a Cooley-Tukey FFT can be used with any window length,
- improving the quality of spectral displays, and
- oversampling spectral peaks so that some simple final interpolation will be accurate.
Some examples of interpolated spectral display by means of zero-padding may be seen in §3.4.
Spectrum Analysis Windows
Introduction and Overview