FIR Digital Filter Design

FIR filters are basic in spectral audio signal processing. In fact, the fastest way to implement long FIR filters in conventional CPUs5.1 is by means of FFT convolution. The convolution theorem for Fourier transforms2.3.5) states that the convolution of an input signal $ x$ with a filter impulse-response $ h$ is given by the inverse DTFT of the product of the signal's spectrum $ X(\omega)$ times the filter's frequency response $ H(\omega)$ , i.e.,

$\displaystyle y(n) \eqsp \hbox{\sc DTFT}^{-1}_n(H\cdot X) \isdefs \frac{1}{2\pi}\int_{-\pi}^{\pi} H(\omega)X(\omega)e^{j\omega n} d\omega,$ (5.1)


H(\omega) &=& \hbox{\sc DTFT}_\omega(h) \eqsp \mbox{filter frequency response},\;
X(\omega) &=& \hbox{\sc DTFT}_\omega(x) \eqsp \mbox{input-signal spectrum}\\
y(n) &=& \mbox{output signal at time $n\in{\bf Z}$\ (any integer),}

and the DTFT is defined as (§2.1)

$\displaystyle X(\omega) \isdefs \hbox{\sc DTFT}_\omega(x) \isdefs \sum_{n\in{\bf Z}} x(n)\, e^{-j\omega n}.$ (5.2)

As usual with the DTFT, the sampling rate is assumed to be $ f_s=1$ . In practice, the DTFT is used in sampled form, $ \omega \to
\omega_k = 2\pi k/N$ , replacing the DTFT with a (zero-padded) FFT, as will be discussed in Chapter 8. To make the most of FFT processors for FIR filter implementation, we will need flexible ways to design many kinds of FIR filters.

This chapter provides a starting point in the area of FIR digital filter design. The so-called ``window method'' for FIR filter design, also based on the convolution theorem for Fourier transforms, is discussed in some detail, and compared with an optimal Chebyshev method. Other methods, such as least-squares, are discussed briefly to provide further perspective. Tools for FIR filter design in both Octave and the Matlab Signal Processing Toolbox are listed where applicable. For more information on digital filter design, see, e.g., the documentation for the Matlab Signal Processing Toolbox and/or [263,283,32,204,275,224,198,258].

The Ideal Lowpass Filter

Consider the ideal lowpass filter, depicted in Fig.4.1.

Figure 4.1: Desired amplitude response (gain versus frequency) for an ideal lowpass filter.

An ideal lowpass may be characterized by a gain of 1 for all frequencies below some cut-off frequency $ f_c$ in Hz, and a gain of 0 for all higher frequencies.5.2 The impulse response of the ideal lowpass filter is easy to calculate:

$\displaystyle h_{\hbox{ideal}}(n)$ $\displaystyle \isdef$ DTFT$\displaystyle ^{-1}_n\left(H_{\mbox{ideal}}\right)$  
  $\displaystyle \isdef$ $\displaystyle \frac{1}{2\pi}\int_{-\pi}^{\pi} d\omega e^{j\omega n }
1, & \left\vert\omega\right\vert\leq\omega_c \\ [5pt]
0, & \mbox{otherwise} \\
\end{array} \right.$  
  $\displaystyle =$ $\displaystyle \frac{1}{2\pi}\int_{-\omega_c}^{\omega_c} e^{j\omega n }d\omega$  
  $\displaystyle =$ $\displaystyle \frac{1}{2\pi jn}\left[e^{j\omega_c n} - e^{-j\omega_c n}\right]$  
  $\displaystyle =$ $\displaystyle \frac{\sin(\omega_c n)}{\pi n}$  
  $\displaystyle \isdef$ $\displaystyle 2f_c$   sinc$\displaystyle (2f_c n ), \quad n \in {\bf Z}
\protect$ (5.3)

where $ \omega_c\isdef 2\pi f_c$ denotes the normalized cut-off frequency in radians per sample. Thus, the impulse response of an ideal lowpass filter is a sinc function.

Unfortunately, we cannot implement the ideal lowpass filter in practice because its impulse response is infinitely long in time. It is also noncausal; it cannot be shifted to make it causal because the impulse response extends all the way to time $ n=-\infty$ . It is clear we will have to accept some sort of compromise in the design of any practical lowpass filter.

The subject of digital filter design is generally concerned with finding an optimal approximation to the desired frequency response by minimizing some norm of a prescribed error criterion with respect to a set of practical filter coefficients, perhaps subject also to some constraints (usually linear equality or inequality constraints) on the filter coefficients, as we saw for optimal window design in § audio applications, optimality is difficult to define precisely because perception is involved. It is therefore valuable to consider also suboptimal methods that are ``close enough'' to optimal, and which may have other advantages such as extreme simplicity and/or speed. We will examine some specific cases below.

Lowpass Filter Design Specifications

Typical design parameters for a lowpass filter are shown in Fig.4.2.

Figure 4.2: Illustration of typical lowpass filter specifications in the frequency domain.

The design parameters are defined as follows:

  • $ \delta_s = $ stop-band ripple ( $ \leq 0.001 = -60 $ dB is common)
  • $ \delta_p = $ pass-band ripple ($ \leq 0.1$ dB typical)
  • $ \omega_s = $ stop-band edge frequency
  • $ \omega_p = $ pass-band edge frequency
  • TW: transition width $ = \omega_s - \omega_p $
  • SBA: stop-band attenuation $ = -20 \log( \delta_s ) $
The pass-band ripple is typically larger than the stop-band ripple because it is a deviation about 1 instead of 0. For example, a pass-band ripple of $ \pm 0.1$ dB translates to $ 10^{0.1/20}-1 \approx 0.01$ on a linear scale. A stop-band ripple of $ -60$ dB, on the other hand, equals $ 10^{-60/20} = 0.001$ on a linear scale. Thus, a typical pass-band ripple specification may be 10 times larger than a typical stop-band ripple specification, on a linear scale, though less audible.5.4 For a stop-band gain down around $ -80$ dB, keeping the pass-band ripple at $ \pm 0.1$ dB, the pass-band ripple becomes around 100 times larger than the stop-band ripple, on a linear scale, but again the stop-band ripple is more likely to yield audible error in typical situations. In summary, the pass-band ripple is an allowed gain deviation, while the stop-band ripple is an allowed ``leakage'' level.

In terms of these specifications, we may define an optimal FIR lowpass filter of a given length to be one which minimizes the stop-band and pass-band ripple (weighted relatively as desired) for given stop-band and pass-band edge frequencies. Such optimal filters are often designed in practice by Chebyshev methods, as we encountered already in the study of windows for spectrum analysis3.103.13). Optimal Chebyshev FIR filters will be discussed further below (in §4.5.2), but first we look at simpler FIR design methods and compare to optimal Chebyshev designs for reference. An advantage of the simpler methods is that they are more suitable for interactive, real-time, and/or signal-adaptive FIR filter design.

Ideal Lowpass Filter Revisited

The ideal lowpass filter of Fig.4.1 can now be described by the following specifications:

  • The transition width TW is zero ( $ \omega_s=\omega_p$ in Fig.4.2).
  • The pass-band and stop-band ripples are both zero
    ( $ \delta_p = \delta_s = 0$ in Fig.4.2, and $ \hbox{SBA} = \infty$ ).
As discussed above, these specifications are not achievable by any FIR filter. It is instructive, however, to see how an optimal least-squares design comes out in this case.

Optimal (but poor if unweighted)
Least-Squares Impulse Response Design

Perhaps the most commonly employed error criterion in signal processing is the least-squares error criterion.

Let $ h(n)$ denote some ideal filter impulse response, possibly infinitely long, and let $ {\hat h}(n)$ denote the impulse response of a length $ L$ causal FIR filter that we wish to design. The sum of squared errors is given by

$\displaystyle J_2({\hat h}) \isdefs \sum_{n=-\infty}^\infty\left\vert h(n)-{\hat h}(n)\right\vert^2 \eqsp \sum_{n=0}^{L-1}\left\vert h(n)-{\hat h}(n)\right\vert^2 + c_2$ (5.4)

where $ c_2$ does not depend on $ {\hat h}$ . Note that $ J({\hat h})\geq c_2$ . We can minimize the error by simply matching the first $ L$ terms in the desired impulse response. That is, the optimal least-squares FIR filter has the following ``tap'' coefficients:

$\displaystyle {\hat h}(n) \isdef \left\{\begin{array}{ll} h(n), & 0\leq n \leq L-1 \\ [5pt] 0, & \hbox{otherwise} \\ \end{array} \right. \protect$ (5.5)

The same solution works also for any $ Lp$ norm4.10.1). That is, the error

$\displaystyle J_p({\hat h}) \isdefs \sum_{n=-\infty}^\infty\left\vert h(n)-{\hat h}(n)\right\vert^p \eqsp \sum_{n=0}^{L-1}\left\vert h(n)-{\hat h}(n)\right\vert^p + c_p \;\geq\; c_p$ (5.6)

is also minimized by matching the leading $ L$ terms of the desired impulse response.

In the $ L2$ (least-squares) case, we have, by the Fourier energy theorem (§2.3.8),

$\displaystyle J_2({\hat h}) \isdefs \sum_{n=-\infty}^\infty\left\vert h(n)-{\hat h}(n)\right\vert^2 \eqsp \frac{1}{2\pi}\int_{-\pi}^{\pi}\left\vert H(\omega)-{\hat H}(\omega)\right\vert^2 d\omega.$ (5.7)

Therefore, $ {\hat H}(\omega)=\hbox{\sc DTFT}({\hat h})$ is an optimal least-squares approximation to $ H(\omega)$ when $ {\hat h}$ is given by (4.5). In other words, the frequency response of the filter $ {\hat H}$ is optimal in the $ L2$ (least-squares) sense.


Figure 4.3 shows the amplitude response of a length $ 30$ optimal least-squares FIR lowpass filter, for the case in which the cut-off frequency is one-fourth the sampling rate ($ f_c=1/4$ ).

Figure: Amplitude response of a length $ 30$ FIR lowpass-filter obtained by truncating the ideal impulse response.

We see that, although the impulse response is optimal in the least-squares sense (in fact optimal under any $ Lp$ norm with any error-weighting), the filter is quite poor from an audio perspective. In particular, the stop-band gain, in which zero is desired, is only about 10 dB down. Furthermore, increasing the length of the filter does not help, as evidenced by the length 71 result in Fig.4.4.

Figure: Amplitude response of a length $ 71$ FIR lowpass-filter obtained by truncating the ideal impulse response.

It is not the case that a length $ 71$ FIR filter is too short for implementing a reasonable audio lowpass filter, as can be seen in Fig.4.5. The optimal Chebyshev lowpass filter in this figure was designed by the Matlab statement

hh = firpm(L-1,[0 0.5 0.6 1],[1 1 0 0]);
where, in terms of the lowpass design specs defined in §4.2 above, we are asking for
  • $ \omega_p = 0.5 \pi$ (pass-band edge frequency)5.5
  • $ \omega_s = 0.6\pi$ (stop-band edge frequency)
In this case, the pass-band and stop-band ripple are equally weighted and thus are minimized equally for the given FIR length $ L$ .5.6

Figure: Amplitude response of a length $ 71$ FIR lowpass-filter obtained by the Remez Exchange Algorithm (function firpm in the Matlab Signal Processing Toolbox).

We see that the Chebyshev design has a stop-band attenuation better than 60 dB, no corner-frequency resonance, and the error is equiripple in both stop-band (visible) and pass-band (not visible). Note also that there is a transition band between the pass-band and stop-band (specified in the call to firpm as being between normalized frequencies 0.5 and 0.6).

The main problem with the least-squares design examples above is the absence of a transition band specification. That is, the filter specification calls for an infinite roll-off rate from the pass-band to the stop-band, and this cannot be accomplished by any FIR filter. (Review Fig.4.2 for an illustration of more practical lowpass-filter design specifications.) With a transition band and a weighting function, least-squares FIR filter design can perform very well in practice. As a rule of thumb, the transition bandwidth should be at least $ 4\pi/L$ , where $ L$ is the FIR filter length in samples. (Recall that the main-lobe width of a length $ L$ rectangular window is $ 4\pi/L$3.1.2).) Such a rule respects the basic Fourier duality of length in the time domain and ``minimum feature width'' in the frequency domain.

Frequency Sampling Method for
FIR Filter Design

The frequency-sampling method for FIR filter design is perhaps the simplest and most direct technique imaginable when a desired frequency response has been specified. It consists simply of uniformly sampling the desired frequency response, and performing an inverse DFT to obtain the corresponding (finite) impulse response [224, pp. 105-23], [198, pp. 251-55]. The results are not optimal, however, because the response generally deviates from what is desired between the samples. When the desired frequency-response is undersampled, which is typical, the resulting impulse response will be time aliased to some extent. It is important to evaluate the final impulse response via a simulated DTFT (FFT with lots of zero padding), comparing to the originally desired frequency response.

The frequency-sampling method for FIR filter design is illustrated in §4.6.2 below in the context of the window method for FIR filter design, to which we now turn.

Window Method for FIR Filter Design

The window method for digital filter design is fast, convenient, and robust, but generally suboptimal. It is easily understood in terms of the convolution theorem for Fourier transforms, making it instructive to study after the Fourier theorems and windows for spectrum analysis. It can be effectively combined with the frequency sampling method, as we will see in §4.6 below.

The window method consists of simply ``windowing'' a theoretically ideal filter impulse response $ h(n)$ by some suitably chosen window function $ w(n)$ , yielding

$\displaystyle h_w(n) = w(n)\cdot h(n), \quad n\in{\cal Z}.$ (5.8)

For example, as derived in Eq.$ \,$ (4.3), the impulse response of the ideal lowpass filter is the well known sinc function

$\displaystyle h(n) \eqsp B\cdot$sinc$\displaystyle (B n) \isdefs B\frac{\sin(\pi B n)}{\pi B n}, \quad n\in{\bf Z},$ (5.9)

where $ B=2f_c$ is the total normalized bandwidth of the lowpass filter in Hz (counting both negative and positive frequencies), and $ f_c$ denotes the cut-off frequency in Hz. As noted earlier, we cannot implement this filter in practice because it is noncausal and infinitely long.

Since $ h(n) = B\,$sinc$ (BnT)$ decays away from time 0 as $ 1/n$ , we would expect to be able to truncate it to the interval $ [-N,N]$ , for some sufficiently large $ N$ , and obtain a pretty good FIR filter which approximates the ideal filter. This would be an example of using the window method with the rectangular window. We saw in §4.3 that such a choice is optimal in the least-squares sense, but it designs relatively poor audio filters. Choosing other windows corresponds to tapering the ideal impulse response to zero instead of truncating it. Tapering better preserves the shape of the desired frequency response, as we will see. By choosing the window carefully, we can manage various trade-offs so as to maximize the filter-design quality in a given application.

Window functions are always time limited. This means there is always a finite integer $ N_w$ such that $ w(n)=0$ for all $ \vert n\vert>N_w$ . The final windowed impulse response $ h_w(n) =
w(n)\cdot h(n)$ is thus always time-limited, as needed for practical implementation. The window method always designs a finite-impulse-response (FIR) digital filter (as opposed to an infinite-impulse-response (IIR) digital filter).

By the dual of the convolution theorem, pointwise multiplication in the time domain corresponds to convolution in the frequency domain. Thus, the designed filter $ h_w$ has a frequency response given by

$\displaystyle H_w(f) = \left(W\ast H\right)(f), \quad f\in\left[\,-1/2,\,1/2\,\right)$ (5.10)

where $ H=\hbox{\sc DTFT}(h)$ is the ideal frequency response and $ W=\hbox{\sc DTFT}(w)$ is the window transform. For the ideal lowpass filter, $ H$ is a rectangular window in the frequency domain. The frequency response $ H_w(f)$ is thus obtained by convolving the rectangular window with the window transform $ W(f)$ . This implies several points which can be immediately seen in terms of this convolution operation:

  • The pass-band gain is primarily the area under the main lobe of the window transform, provided the main lobe ``fits'' inside the pass-band (i.e., the total lowpass bandwidth $ B\isdeftext 2f_c$ is greater than or equal to the main-lobe width of $ W$ ).

  • The stop-band gain is given by an integral over a portion of the side lobes of the window transform. Since side-lobes oscillate about zero, a finite integral over them is normally much smaller than the side-lobes themselves, due to adjacent side-lobe cancellation under the integral.

  • The best stop-band performance occurs when the cut-off frequency is set so that the stop-band side-lobe integral traverses a whole number of side lobes.

  • The transition bandwidth is equal to the bandwidth of the main lobe of the window transform, again provided that the main lobe ``fits'' inside the pass-band.

  • For very small lowpass bandwidths $ B$ , $ H(\omega)$ approaches an impulse in the frequency domain. Since the impulse is the identity operator under convolution, the resulting lowpass filter $ H_w(\omega)$ approaches the window transform $ W(\omega)$ for small $ B$ . In particular, the stop-band gain approaches the window side-lobe level, and the transition width approaches half the main-lobe width. Thus, for good results, the lowpass cut-off frequency should be set no lower than half the window's main-lobe width.

Matlab Support for the Window Method

Octave and the Matlab Signal Processing Toolbox have two functions implementing the window method for FIR digital filter design:

  • fir1 designs lowpass, highpass, bandpass, and multi-bandpass filters.

  • fir2 takes an arbitrary magnitude frequency response specification.

The default window type is Hamming, but any window can be passed in as an argument. In addition, there is a function kaiserord for estimating the parameters of a Kaiser window which will achieve the desired filter specifications.

Bandpass Filter Design Example

The matlab code below designs a bandpass filter which passes frequencies between 4 kHz and 6 kHz, allowing transition bands from 3-4 kHz and 6-8 kHz (i.e., the stop-bands are 0-3 kHz and 8-10 kHz, when the sampling rate is 20 kHz). The desired stop-band attenuation is 80 dB, and the pass-band ripple is required to be no greater than 0.1 dB. For these specifications, the function kaiserord returns a beta value of $ \beta=7.85726$ and a window length of $ M=101$ . These values are passed to the function kaiser which computes the window function itself. The ideal bandpass-filter impulse response is computed in fir1, and the supplied Kaiser window is applied to shorten it to length $ M=101$ .

fs = 20000;                 % sampling rate
F = [3000 4000 6000 8000];  % band limits
A = [0 1 0];                % band type: 0='stop', 1='pass'
dev = [0.0001 10^(0.1/20)-1 0.0001]; % ripple/attenuation spec
[M,Wn,beta,typ] = kaiserord(F,A,dev,fs);  % window parameters
b = fir1(M,Wn,typ,kaiser(M+1,beta),'noscale'); % filter design

Note the conciseness of the matlab code thanks to the use of kaiserord and fir1 from Octave or the Matlab Signal Processing Toolbox.

Figure 4.6 shows the magnitude frequency response $ \vert B(\omega/2\pi)\vert$ of the resulting FIR filter $ b$ . Note that the upper pass-band edge has been moved to 6500 Hz instead of 6000 Hz, and the stop-band begins at 7500 Hz instead of 8000 Hz as requested. While this may look like a bug at first, it's actually a perfectly fine solution. As discussed earlier (§4.5), all transition-widths in filters designed by the window method must equal the window-transform's main-lobe width. Therefore, the only way to achieve specs when there are multiple transition regions specified is to set the main-lobe width to the minimum transition width. For the others, it makes sense to center the transition within the requested transition region.

Figure 4.6: Amplitude response of the FIR bandpass filter designed by the window method.

Under the Hood of kaiserord

Without kaiserord, we would need to implement Kaiser's formula [115,67] for estimating the Kaiser-window $ \beta $ required to achieve the given filter specs:

$\displaystyle \beta = \left\{\begin{array}{ll} 0.1102(A-8.7), & A > 50 \\ [5pt] 0.5842(A-21)^{0.4} + 0.07886(A-21), & 21< A < 50 \\ [5pt] 0, & A < 21, \\ \end{array} \right. \protect$ (5.11)

where $ A$ is the desired stop-band attenuation in dB (typical values in audio work are $ A=60$ to $ 90$ ). Note that this estimate for $ \beta $ becomes too small when the filter pass-band width approaches zero. In the limit of a zero-width pass-band, the frequency response becomes that of the Kaiser window transform itself. A non-zero pass-band width acts as a ``moving average'' lowpass filter on the side-lobes of the window transform, which brings them down in level. The kaiserord estimate assumes some of this side-lobe smoothing is present.

A similar function from [198] for window design (as opposed to filter design5.7) is

$\displaystyle \beta = \left\{\begin{array}{ll} 0, & A<13.26 \\ [5pt] 0.76609(A-13.26)^{0.4} + 0.09834(A-13.26), & 13.26< A < 60 \\ [5pt] 0.12438*(A+6.3), & 60<A<120, \\ \end{array} \right. \protect$ (5.12)

where now $ A$ is the desired side-lobe attenuation in dB (as opposed to stop-band attenuation). A plot showing Kaiser window side-lobe level for various values of $ \beta $ is given in Fig.3.28.

Similarly, the filter order $ M$ is estimated from stop-band attenuation $ A$ and desired transition width $ \Delta\omega$ using the empirical formula

$\displaystyle M = \frac{A-8}{2.285 \cdot \Delta\omega}$ (5.13)

where $ \Delta\omega$ is in radians between 0 and $ \pi$ .

Without the function fir1, we would have to manually implement the window method of filter design by (1) constructing the impulse response of the ideal bandpass filter $ h(n)$ (a cosine modulated sinc function), (2) computing the Kaiser window $ w(n)$ using the estimated length and $ \beta $ from above, then finally (3) windowing the ideal impulse response with the Kaiser window to obtain the FIR filter coefficients $ h_w(n) = w(n)h(n)$ . A manual design of this nature will be illustrated in the Hilbert transform example of §4.6.

Comparison to the Optimal Chebyshev FIR Bandpass Filter

To provide some perspective on the results, let's compare the window method to the optimal Chebyshev FIR filter4.10) for the same length and design specifications above.

The following Matlab code illustrates two different bandpass filter designs. The first (different transition bands) illustrates a problem we'll look at. The second (equal transition bands, commented out), avoids the problem.

M = 101;
normF = [0 0.3 0.4 0.6 0.8 1.0];  % transition bands different
%normF = [0 0.3 0.4 0.6 0.7 1.0]; % transition bands the same
amp = [0 0 1 1 0 0];              % desired amplitude in each band

[b2,err2] = firpm(M-1,normF,amp); % optimal filter of length M

Figure 4.7 shows the frequency response of the Chebyshev FIR filter designed by firpm, to be compared with the window-method FIR filter in Fig.4.6. Note that the upper transition band ``blows up''. This is a well known failure mode in FIR filter design using the Remez exchange algorithm [176,224]. It can be eliminated by narrowing the transition band, as shown in Fig.4.8. There is no error penalty in the transition region, so it is necessary that each one be ``sufficiently narrow'' to avoid this phenomenon.

Remember the rule of thumb that the narrowest transition-band possible for a length $ L$ FIR filter is on the order of $ 4\pi/L$ , because that's the width of the main-lobe of a length $ L$ rectangular window (measured between zero-crossings) (§3.1.2). Therefore, this value is quite exact for the transition-widths of FIR bandpass filters designed by the window method using the rectangular window (when the main-lobe fits entirely within the adjacent pass-band and stop-band). For a Hamming window, the window-method transition width would instead be $ 8\pi/L$ . Thus, we might expect an optimal Chebyshev design to provide transition widths in the vicinity of $ 8\pi/L$ , but probably not too close to $ 4\pi/L$ or below In the example above, where the sampling rate was $ 20$ kHz, and the filter length was $ L=101$ , we expect to be able to achieve transition bands circa $ (20,000/(2\pi))\cdot (8\pi/101) = 792$ Hz, but not so low as $ (20,000/(2\pi))\cdot (4\pi/101) = 396$ Hz. As we found above, $ 2000$ Hz was under-constrained, while $ 1000$ Hz was ok, being near the ``Hamming transition width.''

Figure 4.7: Amplitude response of the optimal Chebyshev FIR bandpass filter designed by the Remez exchange method.

Figure 4.8: Amplitude response of the optimal Chebyshev FIR bandpass filter as in Fig.4.7 with the upper transition band narrowed from 2 kHz down to 1 kHz in width.

Hilbert Transform Design Example

We will now use the window method to design a complex bandpass filter which passes positive frequencies and rejects negative frequencies.

Since every real signal $ x(n)$ possesses a Hermitian spectrum $ X(\omega)$ , i.e., $ X(-\omega) = \overline{X(\omega)}$ , it follows that, if we filter out the negative frequencies, we will destroy this spectral symmetry, and the output signal will be complex for every nonzero real input signal (excluding dc and half the sampling rate). In other terms, we want a filter which produces a ``single sideband'' (SSB) output signal in response to any real input signal. The Hermitian spectrum of a real signal can be viewed as two sidebands about dc (with one sideband being the ``conjugate flip'' of the other). See §2.3.3 for a review of Fourier symmetry-properties for real signals.

An ``analytic signal'' in signal processing is defined as any signal $ x(n)$ having only positive or only negative frequencies, but not both (typically only positive frequencies). In principle, the imaginary part of an analytic signal is computed from its real part by the Hilbert transform (defined and discussed below). In other words, one can ``filter out'' the negative-frequency components of a signal $ x(n)$ by taking its Hilbert transform $ y(n) = {\cal H}_n\{x\}$ and forming the analytic signal $ x_a(n) = x(n) + j y(n)$ . Thus, an alternative problem specification is to ask for a (real) filter which approximates the Hilbert transform as closely as possible for a given filter order.

Primer on Hilbert Transform Theory

We need a Hilbert-transform filter $ {\cal H}$ to compute the imaginary part $ y(t)$ of the analytic signal $ x_a(t)$ given its real part $ x(t)$ . That is,

$\displaystyle x_a(t) \eqsp x(t) + j\,y(t)$ (5.14)

where $ y = {\cal H}\{x\}$ . In the frequency domain, we have

$\displaystyle X_a(\omega) \eqsp X(\omega) + j\,Y(\omega) \eqsp \left[1+j\,H(\omega)\right]\,X(\omega)$ (5.15)

where $ H(\omega)$ denotes the frequency response of the Hilbert transform $ {\cal H}$ . Since by definition we have $ X_a(\omega)=0$ for $ \omega<0$ , we must have $ j\,H(\omega)=-1$ for $ \omega<0$ , so that $ H(\omega)=j$ for negative frequencies (an allpass response with phase-shift $ +90$ degrees). To pass the positive-frequency components unchanged, we would most naturally define $ H(\omega)=0$ for $ \omega>0$ . However, conventionally, the positive-frequency Hilbert-transform frequency response is defined more symmetrically as $ H(\omega)=-j$ for $ \omega>0$ , which gives $ j\,H(\omega)=+1$ and $\mbox{re\ensuremath{\left\{X_a(\omega)\right\}}}=2X(\omega)$ , i.e., the positive-frequency components of $ X$ are multiplied by $ 2$ .

In view of the foregoing, the frequency response of the ideal Hilbert-transform filter may be defined as follows:

$\displaystyle H(\omega) \isdefs \left\{\begin{array}{ll} \quad\! j, & \omega<0 \\ [5pt] \quad\!0, & \omega=0 \\ [5pt] -j, & \omega>0 \\ \end{array} \right. \protect$ (5.16)

Note that the point at $ \omega
= 0$ can be defined arbitrarily since the inverse-Fourier transform integral is not affected by a single finite point (being a ``set of measure zero'').

The ideal filter impulse response $ h(n)$ is obtained by finding the inverse Fourier transform of (4.16). For discrete time, we may take the inverse DTFT of (4.16) to obtain the ideal discrete-time Hilbert-transform impulse response, as pursued in Problem 10. We will work with the usual continuous-time limit $ h(t)=1/(\pi t)$ in the next section.

Hilbert Transform

The Hilbert transform $ y(t)$ of a real, continuous-time signal $ x(t)$ may be expressed as the convolution of $ x$ with the Hilbert transform kernel:

$\displaystyle h(t) \isdefs \frac{1}{\pi t},\qquad t\in(-\infty,\infty) \protect$ (5.17)

That is, the Hilbert transform of $ x$ is given by

$\displaystyle y = h \ast x. % \qquad \hbox{(Hilbert transform of $x$)}.
$ (5.18)

Thus, the Hilbert transform is a non-causal linear time-invariant filter.

The complex analytic signal $ x_a(t)$ corresponding to the real signal $ x(t)$ is then given by

$\displaystyle x_a(t)$ $\displaystyle \isdef$ $\displaystyle x(t) + j y(t)$  
$\displaystyle %\qquad \hbox{(Analytic signal corresponding to $x$)}\\ [10pt]
$ $\displaystyle =$ $\displaystyle \frac{1}{\pi}\displaystyle\int_0^{\infty} X(\omega)e^{j\omega t}\, d\omega$ (5.19)

To show this last equality (note the lower limit of 0 instead of the usual $ -\infty$ ), it is easiest to apply (4.16) in the frequency domain:

$\displaystyle X_a(\omega)$ $\displaystyle \isdef$ $\displaystyle X(\omega) + j Y(\omega)$  
  $\displaystyle \isdef$ $\displaystyle (X_++X_-) + j (-j X_+ + j X_-)$ (5.20)
  $\displaystyle =$ $\displaystyle 2X_+(\omega)$ (5.21)

Thus, the negative-frequency components of $ X$ are canceled, while the positive-frequency components are doubled. This occurs because, as discussed above, the Hilbert transform is an allpass filter that provides a $ 90$ degree phase shift at all negative frequencies, and a $ -90$ degree phase shift at all positive frequencies, as indicated in (4.16). The use of the Hilbert transform to create an analytic signal from a real signal is one of its main applications. However, as the preceding sections make clear, a Hilbert transform in practice is far from ideal because it must be made finite-duration in some way.

Filtering and Windowing the Ideal
Hilbert-Transform Impulse Response

Let $ h_i(t)$ denote the convolution kernel of the continuous-time Hilbert transform from (4.17) above:

$\displaystyle h_i(t) = \frac{1}{\pi t} \protect$ (5.22)

Convolving a real signal $ x(t)$ with this kernel produces the imaginary part of the corresponding analytic signal. The way the ``window method'' for digital filter design is classically done is to simply sample the ideal impulse response to obtain $ h_i(nT)$ and then window it to give $ \hat{h}_i(nT)=w(n)h_i(nT)$ . However, we know from above (e.g., §4.5.2) that we need to provide transition bands in order to obtain a reasonable design. A single-sideband filter needs a transition band between dc and $ \Omega_w$ , or higher, where $ \Omega_w$ denotes the main-lobe width (in rad/sample) of the window $ w$ we choose, and a second transition band is needed between $ \pi-\Omega_w$ and $ \pi$ .

Note that we cannot allow a time-domain sample at time 0 in (4.22) because it would be infinity. Instead, time 0 should be taken to lie between two samples, thereby introducing a small non-integer advance or delay. We'll choose a half-sample delay. As a result, we'll need to delay the real-part filter by half a sample as well when we make a complete single-sideband filter.

The matlab below illustrates the design of an FIR Hilbert-transform filter by the window method using a Kaiser window. For a more practical illustration, the sampling-rate assumed is set to $ f_s=22,050$ Hz instead of being normalized to 1 as usual. The Kaiser-window $ \beta $ parameter is set to $ 8$ , which normally gives ``pretty good'' audio performance (cf. Fig.3.28). From Fig.3.28, we see that we can expect a stop-band attenuation better than $ 50$ dB. The choice of $ \beta $ , in setting the time-bandwidth product of the Kaiser window, determines both the stop-band rejection and the transition bandwidths required by our FIR frequency response.

  M = 257;    % window length = FIR filter length (Window Method)
  fs = 22050; % sampling rate assumed (Hz)
  f1 = 530;   % lower pass-band limit = transition bandwidth (Hz)
  beta = 8;   % beta for Kaiser window for decent side-lobe rejection
Recall that, for a rectangular window, our minimum transition bandwidth would be $ 2f_s/M \approx 172$ Hz, and for a Hamming window, $ 4f_s/M \approx 343$ Hz. In this example, using a Kaiser window with $ \beta=8$ ( $ \alpha=\beta/\pi\approx 2.5$ ), the main-lobe width is on the order of $ 5f_s/M \approx 430$ Hz, so we expect transition bandwidths of this width. The choice $ f_1=530$ above should therefore be sufficient, but not ``tight''.5.8 For each doubling of the filter length (or each halving of the sampling rate), we may cut $ f_1$ in half.

Matlab, Continued

Given the above design parameters, we compute some derived parameters as follows:

  fn = fs/2;             % Nyquist limit (Hz)
  f2 = fn - f1;          % upper pass-band limit
  N = 2^(nextpow2(8*M)); % large FFT for interpolated display
  k1 = round(N*f1/fs);   % lower band edge in bins
  if k1<2, k1=2; end;    % cannot have dc or fn response
  kn = N/2 + 1;          % bin index at Nyquist limit (1-based)
  k2 = kn-k1+1;          % high-frequency band edge
  f1 = k1*fs/N           % quantized band-edge frequencies
  f2 = k2*fs/N
Setting the upper transition band the same as the low-frequency band ( $ f_2 = f_s/2 - f_1$ ) provides an additional benefit: the symmetry of the desired response about $ f_s/4$ cuts the computational expense of the filter in half, because it forces every other sample in the impulse response to be zero [224, p. 172].5.9

Kaiser Window

With the filter length $ M=257$ and Kaiser window $ \beta=8$ as given above, we may compute the Kaiser window itself in matlab via

  w = kaiser(M,beta)'; % Kaiser window in "linear phase form"
The spectrum of this window (zero-padded by more than a factor of 8) is shown in Fig.4.9 (full magnitude spectrum) and Fig.4.10 (zoom-in on the main lobe).

Figure 4.9: The Kaiser window magnitude spectrum.

Figure 4.10: Kaiser window magnitude spectrum near dc.

Windowing a Desired Impulse Response Computed by the
Frequency Sampling Method

The next step is to apply our Kaiser window to the ``desired'' impulse response, where ``desired'' means a time-shifted (by 1/2 sample) and bandlimited (to introduce transition bands) version of the ``ideal'' impulse response in (4.22). In principle, we are using the frequency-sampling method4.4) to prepare a desired FIR filter of length $ N=4096$ as the inverse FFT of a desired frequency response prepared by direct Fourier intuition. This long FIR filter is then ``windowed'' down to length $ M=257$ to give us our final FIR filter designed by the window method.

If the smallest transition bandwidth is $ f_1$ Hz, then the FFT size $ N$ should satisfy $ N\gg f_s/f_1$ . Otherwise, there may be too much time aliasing in the desired impulse response.5.10 The only non-obvious part in the matlab below is ``.^8'' which smooths the taper to zero and looks better on a log magnitude scale. It would also make sense to do a linear taper on a dB scale which corresponds to an exponential taper to zero.

H = [ ([0:k1-2]/(k1-1)).^8,ones(1,k2-k1+1),...
      ([k1-2:-1:0]/(k1-1)).^8, zeros(1,N/2-1)];
Figure 4.11 shows our desired amplitude response so constructed.

Figure 4.11: Desired single-sideband-filter frequency response.

Now we inverse-FFT the desired frequency response to obtain the desired impulse response:

h = ifft(H); % desired impulse response
hodd = imag(h(1:2:N));     % This should be zero
ierr = norm(hodd)/norm(h); % Look at numerical round-off error
% Typical value: ierr = 4.1958e-15
% Also look at time aliasing:
aerr = norm(h(N/2-N/32:N/2+N/32))/norm(h);
% Typical value: 4.8300e-04
The real part of the desired impulse response is shown in Fig.4.12, and the imaginary part in Fig.4.13.

Figure 4.12: Real part of the desired single-sideband-filter impulse response.

Figure 4.13: Desired Hilbert-transform-filter impulse response.

Now use the Kaiser window to time-limit the desired impulse response:

% put window in zero-phase form:
wzp = [w((M+1)/2:M), zeros(1,N-M), w(1:(M-1)/2)];
hw = wzp .* h; % single-sideband FIR filter, zero-centered
Hw = fft(hw);  % for results display: plot(db(Hw));
hh = [hw(N-(M-1)/2+1:N),hw(1:(M+1)/2)]; % caual FIR
% plot(db(fft([hh,zeros(1,N-M)]))); % freq resp plot

Figure 4.14 and Fig.4.15 show the normalized dB magnitude frequency response of our final FIR filter consisting of the $ M$ nonzero samples of hw.

Figure 4.14: Frequency response of the Kaiser-windowed impulse response.

Figure 4.15: Zoomed frequency response of the Kaiser-windowed impulse response.

More General FIR Filter Design

We have looked at more than just FIR filter design by the window method and frequency-sampling technique. The general steps were

  1. Prepare a desired frequency-response that ``seems achievable''
  2. Inverse FFT
  3. Window the result (time-limit it)
  4. FFT that to see how it looks
In general, the end result will be a somewhat smoothed version of what we originally asked for in the frequency domain. However, this smoothing will be minimal if we asked for a truly ``doable'' desired frequency response. Because the above steps are fast and non-iterative, they can be used effectively in response to an interactive user interface (such as a set of audio graphic-equalizer sliders that are interpolated to form the desired frequency response), or even in a real-time adaptive system.

Comparison to Optimal Chebyshev FIR Filter

Let's now compare the window-method design using the Kaiser window to the optimal equiripple FIR filter design given by the Remez multiple exchange algorithm.

Note, by the way, that many filter-design software functions, such as firpm have special modes for designing Hilbert-transform filters [224].

It turns out that the Remez exchange algorithm has convergence problems for filters larger than a few hundred taps. Therefore, the FIR filter length $ M=257$ was chosen above to be small enough to work out in this comparison. However, keep in mind that for very large filter orders, the Remez exchange method may not be an option. There are more recently developed methods for optimal Chebyshev FIR filter design, using ``convex optimization'' techniques, that may continue to work at very high orders [218,22,153]. The fast nonparametric methods discussed above (frequency sampling, window method) will work fine at extremely high orders.

The following Matlab command will try to design the FIR Hilbert-transform filter of the desired length with the desired transition bands:

 hri = firpm(M-1, [f1,f2]/fn, [1,1], [1], 'Hilbert');
Instead, however, we will use a more robust method [228] which uses the Remez exchange algorithm to design a lowpass filter, followed by modulation of the lowpass impulse-response by a complex sinusoid at frequency $ f_s/4$ in order to frequency-shift the lowpass to the single-sideband filter we seek:
tic; % remember the current time
hrm = firpm(M-1, [0,(f2-fs/4)/fn,0.5,1], [1,1,0,0], [1,10]);
dt = toc; % design time dt can be minutes
hr = hrm .* j .^ [0:M-1]; % modulate lowpass to single-sideband
The weighting [1,10] in the call to firpm above says ``make the pass-band ripple $ 10$ times that of the stop-band.'' For steady-state audio spectra, pass-band ripple can be as high as $ 0.1$ dB or more without audible consequences.5.11 The result is shown in Fig.4.16 (full amplitude response) and Fig.4.17 (zoom-in on the dc transition band). By symmetry the high-frequency transition region is identical (but flipped):

Figure 4.16: Frequency response of the optimal Chebyshev FIR filter designed by the Remez exchange algorithm.

Figure 4.17: Transition region of the optimal Chebyshev frequency response.

In this case we did not normalize the peak amplitude response to 0 dB because it has a ripple peak of about 1 dB in the pass-band. Figure 4.18 shows a zoom-in on the pass-band ripple.

Figure 4.18: Pass-Band ripple for optimal Chebyshev frequency response.


We can note the following points regarding our single-sideband FIR filter design by means of direct Fourier intuition, frequency-sampling, and the window-method:

  • The pass-band ripple is much smaller than 0.1 dB, which is ``over designed'' and therefore wasting of taps.

  • The stop-band response ``droops'' which ``wastes'' filter taps when stop-band attenuation is the only stop-band specification. In other words, the first stop-band ripple drives the spec ($ -80$ dB), while all higher-frequency ripples are over-designed. On the other hand, a high-frequency ``roll-off'' of this nature is quite natural in the frequency domain, and it corresponds to a ``smoother pulse'' in the time domain. Sometimes making the stop-band attenuation uniform will cause small impulses at the beginning and end of the impulse response in the time domain. (The pass-band and stop-band ripple can ``add up'' under the inverse Fourier transform integral.) Recall this impulsive endpoint phenomenon for the Chebyshev window shown in Fig.3.33.

  • The pass-band is degraded by early roll-off. The pass-band edge is not exactly in the desired place.

  • The filter length can be thousands of taps long without running into numerical failure. Filters this long are actually needed for sampling rate conversion [270,218].

We can also note some observations regarding the optimal Chebyshev version designed by the Remez multiple exchange algorithm:

  • The stop-band is ideal, equiripple.

  • The transition bandwidth is close to half that of the window method. (We already knew our chosen transition bandwidth was not ``tight'', but our rule-of-thumb based on the Kaiser-window main-lobe width predicted only about $ 23$ % excess width.)

  • The pass-band is ideal, though over-designed for static audio spectra.

  • The computational design time is orders of magnitude larger than that for window method.

  • The design fails to converge for filters much longer than 256 taps. (Need to increase working precision or use a different method to get longer optimal Chebyshev FIR filters.)

Practical application of Hilbert-transform filters to phase-vocoder analysis for additive synthesis is discussed in §G.10.

Generalized Window Method

Reiterating and expanding on points made in §4.6.3, often we need a filter with a frequency response that is not analytically known. An example is a graphic equalizer in which a user may manipulate sliders in a graphical user interface to control the gain in each of several frequency bands. From the foregoing, the following procedure, based in spirit on the window method (§4.5), can yield good results:

  1. Synthesize the desired frequency response as the smoothest possible interpolation of the desired frequency-response points. For example, in a graphic equalizer, cubic splines [286] could be used to connect the desired band gains.5.12

  2. If the desired frequency response is real (as in simple band gains), either plan for a zero-phase filter in the end, or synthesize a desired phase, such as linear phase or minimum phase (see §4.8 below).

  3. Perform the inverse Fourier transform of the (sampled) desired frequency response to obtain the desired impulse response.

  4. Plot an overlay of the desired impulse response and the window to be applied, ensuring that the great majority of the signal energy in the desired impulse response lies under the window to be used.

  5. Multiply by the window.

  6. Take an FFT (now with zero padding introduced by the window).

  7. Plot an overlay of the original desired response and the response retained after time-domain windowing, and verify that the specifications are within an acceptable range.
In summary, FIR filters can be designed nonparametrically, directly in the frequency domain, followed by a final smoothing (windowing in the time domain) which guarantees that the FIR length will be precisely limited. As we'll discuss in Chapter 8, it is necessary to precisely limit the FIR filter length to avoid time-aliasing in an FFT-convolution implementation.

Minimum-Phase Filter Design

Above, we used the Hilbert transform to find the imaginary part of an analytic signal from its real part. A closely related application of the Hilbert transform is constructing a minimum phase [263] frequency response from an amplitude response.

Let $ H(e^{j\omega})$ denote a desired complex, minimum-phase frequency response in the digital domain ($ z$ plane):

$\displaystyle H(e^{j\omega}) \isdefs G(\omega)e^{j\Theta(\omega)},$ (5.23)

and suppose we have only the amplitude response

$\displaystyle G(\omega) \isdefs \left\vert H(e^{j\omega})\right\vert.$ (5.24)

Then the phase response $ \Theta(\omega)$ can be computed as the Hilbert transform of $ \ln G(\omega)$ . This can be seen by inspecting the log frequency response:

$\displaystyle \ln H(e^{j\omega}) \eqsp \ln G(\omega) + j\Theta(\omega)$ (5.25)

If $ \Theta$ is computed from $ G$ by the Hilbert transform, then $ \ln
H(e^{j\omega})$ is an ``analytic signal'' in the frequency domain. Therefore, it has no ``negative times,'' i.e., it is causal. The time domain signal corresponding to a log spectrum is called the cepstrum [263]. It is reviewed in the next section that a frequency response is minimum phase if and only if the corresponding cepstrum is causal [198, Ch. 10], [263, Ch. 11].

Minimum-Phase and Causal Cepstra

To show that a frequency response is minimum phase if and only if the corresponding cepstrum is causal, we may take the log of the corresponding transfer function, obtaining a sum of terms of the form $ \ln(1-\xi_iz^{-1})$ for the zeros and $ -\ln(1-\rho _iz^{-1})$ for the poles. Since all poles and zeros of a minimum-phase system must be inside the unit circle of the $ z$ plane, the Laurent expansion of all such terms (the cepstrum) must be causal. In practice, as discussed in [263], we may compute an approximate cepstrum as an inverse FFT of the log spectrum, and make it causal by ``flipping'' the negative-time cepstral coefficients around to positive time (adding them to the positive-time coefficients). That is $ c(n) \leftarrow
c(n) + c(-n)$ , for $ n=1,2,\ldots\,,$ and $ c(n)\leftarrow 0$ for $ n<0$ . This effectively inverts all unstable poles and all non-minimum-phase zeros with respect to the unit circle. In other terms, $ p_i
\leftarrow 1/p_i$ (if unstable), and $ \xi_i\leftarrow 1/\xi_i$ (if non-minimum phase).

The Laurent expansion of a differentiable function of a complex variable can be thought of as a two-sided Taylor expansion, i.e., it includes both positive and negative powers of $ z$ , e.g.,

$\displaystyle H(z) = \cdots + h(-2)z^2 + h(-1)z + h(0) + h(1)z^{-1}+ h(2)z^{-2}+ \cdots\,.$ (5.26)

In digital signal processing, a Laurent series is typically expanded about points on the unit circle in the $ z$ plane, because the unit circle--our frequency axis--must lie within the annulus of convergence of the series expansion in most applications. The power-of-$ z$ terms are the noncausal terms, while the power-of-$ z^{-1}$ terms are considered causal. The term $ h(0)$ in the above general example is associated with time 0, and is included with the causal terms.

Optimal FIR Digital Filter Design

We now look briefly at the topic of optimal FIR filter design. We saw examples above of optimal Chebyshev designs (§4.5.2). and an oversimplified optimal least-squares design (§4.3). Here we elaborate a bit on optimality formulations under various error norms.

Lp norms

The $ Lp$ norm of an $ N$ -dimensional vector (signal) $ x$ is defined as

$\displaystyle \left\Vert\,x\,\right\Vert _p \isdefs \left( \sum_{i=0}^{N-1} \vert x_i\vert^p \right)^ \frac{1}{p}.$ (5.27)

Special Cases

  • $ L1$ norm

    $\displaystyle \left\Vert\,x\,\right\Vert _1 \isdefs \sum_{i=0}^{N-1} \vert x_i\vert$ (5.28)

    • Sum of the absolute values of the elements
    • ``City block'' distance
  • $ L2$ norm

    $\displaystyle \left\Vert\,x\,\right\Vert _2 \isdefs \sqrt{ \sum_{i=0}^{N-1} \vert x_i\vert^2 }$ (5.29)

    • ``Euclidean'' distance
    • Minimized by ``Least Squares'' techniques

  • $ L-infinity$ norm

    $\displaystyle \left\Vert\,x\,\right\Vert _\infty \isdefs \lim_{p\to\infty} \left( \sum_{i=0}^{N-1} \vert x_i\vert^p \right)^\frac{1}{p} \protect$ (5.30)

    In the limit as $ p \rightarrow \infty$ , the $ Lp$ norm of $ x$ is dominated by the maximum element of $ x$ . Optimal Chebyshev filters minimize this norm of the frequency-response error.

Filter Design using Lp Norms

Formulated as an $ Lp$ norm minimization, the FIR filter design problem can be stated as follows:

$\displaystyle \min_h \left\Vert W(\omega_k)\left[H(\omega_k) - D(\omega_k)\right]\right\Vert _p$ (5.31)

  • $ h = $ FIR filter coefficients
  • $ \omega_k = $ suitable discrete set of frequencies
  • $ D(\omega_k) = $ desired (complex) frequency response
  • $ H(\omega_k) = $ obtained frequency response (typically fft(h))
  • $ W(\omega_k) = $ (optional) error weighting function
An especially valuable property of FIR filter design under $ Lp$ norms is that the error norm is typically a convex function of the filter coefficients, rendering it amenable to a wide variety of convex-optimization algorithms [22]. The following sections look at some specific cases.

Optimal Chebyshev FIR Filters

As we've seen above, the defining characteristic of FIR filters optimal in the Chebyshev sense is that they minimize the maximum frequency-response error-magnitude over the frequency axis. In other terms, an optimal Chebyshev FIR filter is optimal in the minimax sense: The filter coefficients are chosen to minimize the worst-case error (maximum weighted error-magnitude ripple) over all frequencies. This also means it is optimal in the $ L-infinity$ sense because, as noted above, the $ L-infinity$ norm of a weighted frequency-response error $ E(\omega) = W(\omega) [ H(\omega) -
D(\omega)]$ is the maximum magnitude over all frequencies:

$\displaystyle \left\Vert\,E\,\right\Vert _\infty \eqsp \max_{-\pi \leq \omega < \pi} \left\vert E(\omega)\right\vert$ (5.32)

Thus, we can say that an optimal Chebyshev filter minimizes the $ L-infinity$ norm of the (possibly weighted) frequency-response error. The $ L-infinity$ norm is also called the uniform norm. While the optimal Chebyshev FIR filter is unique, in principle, there is no guarantee that any particular numerical algorithm can find it.

The optimal Chebyshev FIR filter can often be found effectively using the Remez multiple exchange algorithm (typically called the Parks-McClellan algorithm when applied to FIR filter design) [176,224,66]. This was illustrated in §4.6.4 above. The Parks-McClellan/Remez algorithm also appears to be the most efficient known method for designing optimal Chebyshev FIR filters (as compared with, say linear programming methods using matlab's linprog as in §3.13). This algorithm is available in Matlab's Signal Processing Toolbox as firpm() (remez() in (Linux) Octave).5.13There is also a version of the Remez exchange algorithm for complex FIR filters. See §4.10.7 below for a few details.

The Remez multiple exchange algorithm has its limitations, however. In particular, convergence of the FIR filter coefficients is unlikely for FIR filters longer than a few hundred taps or so.

Optimal Chebyshev FIR filters are normally designed to be linear phase [263] so that the desired frequency response $ D(\omega)$ can be taken to be real (i.e., first a zero-phase FIR filter is designed). The design of linear-phase FIR filters in the frequency domain can therefore be characterized as real polynomial approximation on the unit circle [229,258].

In optimal Chebyshev filter designs, the error exhibits an equiripple characteristic--that is, if the desired response is $ D(\omega)$ and the ripple magnitude is $ \epsilon$ , then the frequency response of the optimal FIR filter (in the unweighted case, i.e., $ W(\omega)=1$ for all $ \omega$ ) will oscillate between $ D(\omega)+\epsilon$ and $ D(\omega)-\epsilon$ as $ \omega$ increases. The powerful alternation theorem characterizes optimal Chebyshev solutions in terms of the alternating error peaks. Essentially, if one finds sufficiently many for the given FIR filter order, then you have found the unique optimal Chebyshev solution [224]. Another remarkable result is that the Remez multiple exchange converges monotonically to the unique optimal Chebyshev solution (in the absence of numerical round-off errors).

Fine online introductions to the theory and practice of Chebyshev-optimal FIR filter design are given in [32,283].

The window method4.5) and Remez-exchange method together span many practical FIR filter design needs, from ``quick and dirty'' to essentially ideal FIR filters (in terms of conventional specifications).

Least-Squares Linear-Phase FIR Filter Design

Another versatile, effective, and often-used case is the weighted least squares method, which is implemented in the matlab function firls and others. A good general reference in this area is [204].

Let the FIR filter length be $ L+1$ samples, with $ L$ even, and suppose we'll initially design it to be centered about the time origin (``zero phase''). Then the frequency response is given on our frequency grid $ \omega_k$ by

$\displaystyle H(\omega_k) \eqsp \sum_{n=-L/2}^{L/2} h_n e^{-j\omega_kn}$ (5.33)

Enforcing even symmetry in the impulse response, i.e., $ h_n =
h_{-n}$ , gives a zero-phase FIR filter that we can later right-shift $ L/2$ samples to make a causal, linear phase filter. In this case, the frequency response reduces to a sum of cosines:

$\displaystyle H( \omega_k ) \eqsp h_0 + 2\sum_{n=1}^{L/2} h_n \cos (\omega_k n), \quad k=0,1,2,\ldots, N-1$ (5.34)

or, in matrix form:

$\displaystyle \underbrace{\left[ \begin{array}{c} H(\omega_0) \\ H(\omega_1) \\ \vdots \\ H(\omega_{N-1}) \end{array} \right]}_{{\underline{d}}} = \underbrace{\left[ \begin{array}{ccccc} 1 & 2\cos(\omega_0) & \dots & 2\cos[\omega_0(L/2)] \\ 1 & 2\cos(\omega_1) & \dots & 2\cos[\omega_1(L/2)] \\ \vdots & \vdots & & \vdots \\ 1 & 2\cos(\omega_{N-1}) & \dots & 2\cos[\omega_{N-1}(L/2)] \end{array} \right]}_\mathbf{A} \underbrace{\left[ \begin{array}{c} h_0 \\ h_1 \\ \vdots \\ h_{L/2} \end{array} \right]}_{{\underline{h}}} \protect$ (5.35)

Recall from §3.13.8, that the Remez multiple exchange algorithm is based on this formulation internally. In that case, the left-hand-side includes the alternating error, and the frequency grid $ \omega_k$ iteratively seeks the frequencies of maximum error--the so-called extremal frequencies.

In matrix notation, our filter-design problem can be stated as (cf. §3.13.8)

$\displaystyle \min_{{\underline{h}}} \left\Vert \mathbf{A}{\underline{h}}-{\underline{d}}\right\Vert _2^2$ (5.36)

where these quantities are defined in (4.35). We can denote the optimal least-squares solution by

$\displaystyle {\underline{\hat{h}}}\isdefs \arg \min_{\underline{h}}\left\Vert\,\mathbf{A}{\underline{h}}-{\underline{d}}\,\right\Vert _2 \eqsp \arg \min_{\underline{h}}\left\Vert\,\mathbf{A}{\underline{h}}-{\underline{d}}\,\right\Vert _2^2$ (5.37)

To find $ {\underline{\hat{h}}}$ , we need to minimize
$\displaystyle \left\Vert\,\mathbf{A}{\underline{h}}-{\underline{d}}\,\right\Vert _2^2$ $\displaystyle =$ $\displaystyle (\mathbf{A}{\underline{h}}-{\underline{d}})^T(\mathbf{A}{\underline{h}}-{\underline{d}})$  
  $\displaystyle =$ $\displaystyle ({\underline{h}}^T\mathbf{A}^T-{\underline{d}}^T)(\mathbf{A}{\underline{h}}-{\underline{d}})$  
  $\displaystyle =$ $\displaystyle {\underline{h}}^T\mathbf{A}^T\mathbf{A}{\underline{h}}
\protect$ (5.38)

This is a quadratic form in $ {\underline{h}}$ . Therefore, it has a global minimum which we can find by setting the gradient to zero, and solving for $ {\underline{h}}$ .5.14Assuming all quantities are real, equating the gradient to zero yields the so-called normal equations

$\displaystyle \mathbf{A}^T\mathbf{A}{\underline{h}}\eqsp \mathbf{A}^T{\underline{d}}$ (5.39)

with solution

$\displaystyle \zbox {{\underline{\hat{h}}}\eqsp \left[(\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\right]{\underline{d}}.}$ (5.40)

The matrix

$\displaystyle \mathbf{A}^\dagger \isdefs (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T$ (5.41)

is known as the (Moore-Penrose) pseudo-inverse of the matrix $ \mathbf {A}$ . It can be interpreted as an orthogonal projection matrix, projecting $ {\underline{d}}$ onto the column-space of $ \mathbf {A}$ [264], as we illustrate further in the next section.

Geometric Interpretation of Least Squares

Typically, the number of frequency constraints is much greater than the number of design variables (filter coefficients). In these cases, we have an overdetermined system of equations (more equations than unknowns). Therefore, we cannot generally satisfy all the equations, and are left with minimizing some error criterion to find the ``optimal compromise'' solution.

In the case of least-squares approximation, we are minimizing the Euclidean distance, which suggests the geometrical interpretation shown in Fig.4.19.

% latex2html id marker 14494\psfrag{Ax}{{\Large $\mathbf{A}{\underline{\hat{h}}}$}}\psfrag{b}{{\Large ${\underline{d}}$}}\psfrag{column}{{\Large column-space of $\mathbf{A}$}}\psfrag{space}{}\begin{figure}[htbp]
\caption{Geometric interpretation of orthogonal
projection of the vector ${\underline{d}}$\ onto the column-space of $\mathbf{A}$.}
Thus, the desired vector $ {\underline{d}}$ is the vector sum of its best least-squares approximation $ \mathbf{A}{\underline{\hat{h}}}$ plus an orthogonal error $ e$ :

$\displaystyle {\underline{d}}\eqsp \mathbf{A}{\underline{\hat{h}}}+ {\underline{e}}.$ (5.42)

In practice, the least-squares solution $ {\underline{\hat{h}}}$ can be found by minimizing the sum of squared errors:

$\displaystyle \hbox{Minimize}_{\underline{h}}\Vert{\underline{e}}\Vert _2 \eqsp \Vert{\underline{d}}-\mathbf{A}{\underline{h}}\Vert _2$ (5.43)

Figure 4.19 suggests that the error vector $ {\underline{d}}-\mathbf{A}{\underline{\hat{h}}}$ is orthogonal to the column space of the matrix $ \mathbf {A}$ , hence it must be orthogonal to each column in $ \mathbf {A}$ :

$\displaystyle \mathbf{A}^T({\underline{d}}-\mathbf{A}{\underline{\hat{h}}})\eqsp 0 \quad\Rightarrow\quad \mathbf{A}^T\mathbf{A}{\underline{\hat{h}}}\eqsp \mathbf{A}^T{\underline{d}}$ (5.44)

This is how the orthogonality principle can be used to derive the fact that the best least squares solution is given by

$\displaystyle {\underline{\hat{h}}}\eqsp (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T {\underline{d}}\eqsp \mathbf{A}^\dagger {\underline{d}}$ (5.45)

In matlab, it is numerically superior to use ``h= A $ \backslash$ h'' as opposed to explicitly computing the pseudo-inverse as in ``h = pinv(A) * d''. For a discussion of numerical issues in matrix least-squares problems, see, e.g., [92].

We will return to least-squares optimality in §5.7.1 for the purpose of estimating the parameters of sinusoidal peaks in spectra.

Matlab Support for Least-Squares FIR Filter Design

Some of the available functions are as follows:

For more information, type help firls and/or doc firls, etc., and refer to the ``See Also'' section of the documentation for pointers to more relevant functions.

Chebyshev FIR Design via Linear Programming

We return now to the $ L-infinity$ -norm minimization problem of §4.10.2:

$\displaystyle \min_{\underline{h}}\left\Vert\,\mathbf{A}{\underline{h}}-{\underline{d}}\,\right\Vert _\infty, \protect$ (5.46)

and discuss its formulation as a linear programming problem, very similar to the optimal window formulations in §3.13. We can rewrite (4.46) as

$\displaystyle \min_{\underline{h}}\max_k \vert{\underline{a}}_k^T {\underline{h}}- {\underline{d}}_k\vert$ (5.47)

where $ {\underline{a}}_k^T$ denotes the $ k$ th row of the matrix $ \mathbf {A}$ . This can be expressed as
$\displaystyle \min\; t$      
s.t.   $\displaystyle \vert{\underline{a}}_k^T {\underline{h}}-{\underline{d}}_k\vert<t$ (5.48)

Introducing a new variable

$\displaystyle \tilde{{\underline{h}}} \isdefs \left[ \begin{array}{c} {\underline{h}}\\ t \end{array} \right],$ (5.49)

then we can write

$\displaystyle t \eqsp f^T \tilde{{\underline{h}}} \isdefs \left[\begin{array}{cccc} 0 & \cdots & 0 & 1 \end{array}\right] \tilde{{\underline{h}}},$ (5.50)

and our optimization problem can be written in more standard form:
$\displaystyle \min_{\tilde{{\underline{h}}}}$   $\displaystyle f^T \tilde{{\underline{h}}}$  
s.t.   $\displaystyle \left\vert [\begin{array}{cc} a_k^T \; 0\\ \end{array} ] \cdot \tilde{{\underline{h}}} -{\underline{d}}_k \right\vert
\;<\; \tilde{{\underline{h}}}^T f^T\tilde{{\underline{h}}}$ (5.51)

Thus, we are minimizing a linear objective, subject to a set of linear inequality constraints. This is known as a linear programming problem, as discussed previously in §3.13.1, and it may be solved using the matlab linprog function. As in the case of optimal window design, linprog is not normally as efficient as the Remez multiple exchange algorithm (firpm), but it is more general, allowing for linear equality and inequality constraints to be imposed.

More General Real FIR Filters

So far we have looked at the design of linear phase filters. In this case, $ \mathbf {A}$ , $ {\underline{h}}$ and $ {\underline{d}}$ are all real. In some applications, we need to specify both the magnitude and phase of the frequency response. Examples include

In general, these all involve complex desired frequency-response samples ( $ {\underline{d}}$ complex), and the matrix $ \mathbf {A}$ remains a more general Fourier transform matrix that cannot be reduced to a matrix of cosines.

Nonlinear-Phase FIR Filter Design

Above, we considered only linear-phase (symmetric) FIR filters. The same methods also work for antisymmetric FIR filters having a purely imaginary frequency response, when zero-centered, such as differentiators and Hilbert transformers [224].

We now look at extension to nonlinear-phase FIR filters, managed by treating the real and imaginary parts separately in the frequency domain [218]. In the nonlinear-phase case, the frequency response is complex in general. Therefore, in the formulation Eq.$ \,$ (4.35) both $ {\underline{d}}$ and $ \mathbf {A}$ are complex, but we still desire the FIR filter coefficients $ {\underline{h}}$ to be real. If we try to use ' $ \backslash$ ' or pinv in matlab, we will generally get a complex result for $ {\underline{\hat{h}}}$ .

Problem Formulation

$\displaystyle \min_{\underline{h}}\left\Vert\,\mathbf{A}{\underline{h}}-{\underline{d}}\,\right\Vert _2$ (5.52)

where $ \mathbf{A}\in {\bf C}^{N\times M}$ , $ {\underline{d}}\in {\bf C}^{N\times
1}$ , and $ {\underline{h}}\in {\bf R}^{M\times 1}$ . Hence we have,

$\displaystyle \min_{\underline{h}}\left\Vert \left[{\cal{R}}(\mathbf{A})+j{\cal{I}}(\mathbf{A})\right]{\underline{h}} - \left[ {\cal{R}}({\underline{d}})+j{\cal{I}}({\underline{d}}) \right] \right\Vert _2^2$ (5.53)

which can be written as

$\displaystyle \min_{\underline{h}}\left\Vert\, {\cal{R}}(\mathbf{A}){\underline{h}}- {\cal{R}}({\underline{d}}) +j \left[ {\cal{I}}(\mathbf{A}){\underline{h}}+{\cal{I}}({\underline{d}}) \right] \,\right\Vert _2^2$ (5.54)


$\displaystyle \min_{\underline{h}}\left\vert \left\vert \left[ \begin{array}{c} {\cal{R}}(\mathbf{A}) \\ {\cal{I}}(\mathbf{A}) \end{array} \right] {\underline{h}} - \left[ \begin{array}{c} {\cal{R}}({\underline{d}}) \\ {\cal{I}}({\underline{d}}) \end{array}\right] \right\vert \right\vert _2^2$ (5.55)

which is written in terms of only real variables.

In summary, we can use the standard least-squares solvers in matlab and end up with a real solution for the case of complex desired spectra and nonlinear-phase FIR filters.

Matlab for General FIR Filter Design

The cfirpm function (formerly cremez) [116,117] in the Matlab Signal Processing Toolbox performs complex $ L-infinity$ FIR filter design (``Complex FIR Parks-McClellan''). Convergence is theoretically guaranteed for arbitrary magnitude and phase specifications versus frequency. It reduces to Parks-McClellan algorithm (Remez second algorithm) as a special case.

The firgr function (formerly gremez) in the Matlab Filter Design Toolbox performs ``generalized'' $ L-infinity$ FIR filter design, adding support for minimum-phase FIR filter design, among other features [254].

Finally, the fircband function in the Matlab DSP System Toolbox designs a variety of real FIR filters with various filter-types and constraints supported.

This is of course only a small sampling of what is available. See, e.g., the Matlab documentation on its various toolboxes relevant to filter design (especially the Signal Processing and Filter Design toolboxes) for much more.

Second-Order Cone Problems

In Second-Order Cone Problems (SOCP), a linear function is minimized over the intersection of an affine set and the product of second-order (quadratic) cones [153,22]. Nonlinear, convex problem including linear and (convex) quadratic programs are special cases. SOCP problems are solved by efficient primal-dual interior-point methods. The number of iterations required to solve a problem grows at most as the square root of the problem size. A typical number of iterations ranges between 5 and 50, almost independent of the problem size.


See §3.13 for examples of optimal FFT window design using linprog.

Nonlinear Optimization in Matlab

There are various matlab functions available for nonlinear optimizations as well. These can be utilized in more exotic FIR filter designs, such as designs driven more by perceptual criteria:

  • The fsolve function in Octave, or the Matlab Optimization Toolbox, attempts to solve unconstrained, overdetermined, nonlinear systems of equations.
  • The Octave function sqp handles constrained nonlinear optimization.
  • The Octave optim package includes many additional functions such as leasqr for performing Levenberg-Marquardt nonlinear regression. (Say, e.g., which leasqr and explore its directory.)
  • The Matlab Optimization Toolbox similarly contains many functions for optimization.
Again, this is only a sampling of what is available.

Next Section:
Spectrum Analysis of Sinusoids
Previous Section:
Spectrum Analysis Windows