Delay-Line and Signal Interpolation

It is often necessary for a delay line to vary in length. Consider, for example, simulating a sound ray as in Fig.2.8 when either the source or listener is moving. In this case, separate read and write pointers are normally used (as opposed to a shared read-write pointer in Fig.2.2). Additionally, for good quality audio, it is usually necessary to interpolate the delay-line length rather than ``jumping'' between integer numbers of samples. This is typically accomplished using an interpolating read, but interpolating writes are also used (e.g., for true Doppler simulation, as described in §5.9).

Delay-Line Interpolation

As mentioned above, when an audio delay line needs to vary smoothly over time, some form of interpolation between samples is usually required to avoid ``zipper noise'' in the output signal as the delay length changes. There is a hefty literature on ``fractional delay'' in discrete-time systems, and the survey in [267] is highly recommended.

This section will describe the most commonly used cases. Linear interpolation is perhaps most commonly used because it is very straightforward and inexpensive, and because it sounds very good when the signal bandwidth is small compared with half the sampling rate. For a delay line in a nearly lossless feedback loop, such as in a vibrating string simulation, allpass interpolation is sometimes a better choice since it costs the same as linear interpolation in the first-order case and has no gain distortion. (Feedback loops can be very sensitive to gain distortions.) Finally, in later sections, some higher-order interpolation methods are described.

Linear Interpolation

Linear interpolation works by effectively drawing a straight line between two neighboring samples and returning the appropriate point along that line.

More specifically, let $ \eta$ be a number between 0 and 1 which represents how far we want to interpolate a signal $ y$ between time $ n$ and time $ n+1$. Then we can define the linearly interpolated value $ \hat y(n+\eta)$ as follows:

$\displaystyle \hat y(n+\eta) = (1-\eta) \cdot y(n) + \eta \cdot y(n+1) \protect$ (5.1)

For $ \eta=0$, we get exactly $ \hat y(n)=y(n)$, and for $ \eta=1$, we get exactly $ \hat y(n+1)=y(n+1)$. In between, the interpolation error $ \left\vert\hat y(n+\eta)-y(n+\eta)\right\vert$ is nonzero, except when $ y(t)$ happens to be a linear function between $ y(n)$ and $ y(n+1)$.

One-Multiply Linear Interpolation

Note that by factoring out $ \eta$, we can obtain a one-multiply form,

$\displaystyle \hat y(n+\eta) = y(n) + \eta\cdot\left[y(n+1) - y(n)\right].
$

Thus, the computational complexity of linear interpolation is one multiply and two additions per sample of output.


Fractional Delay Filtering by Linear Interpolation

Figure 4.1: Linearly interpolated delay line.
\includegraphics[width=\twidth]{eps/delayli}

A linearly interpolated delay line is depicted in Fig.4.1. In contrast to Eq.$ \,$(4.1), we interpolate linearly between times $ n-M$ and $ n-M-1$, and $ \eta$ is called the fractional delay in samples. The first-order (linear-interpolating) filter following the delay line in Fig.4.1 may be called a fractional delay filter [267]. Equation (4.1), on the other hand, expresses the more general case of an interpolated table lookup, where $ y(n)$ is regarded as a table of samples and $ \hat y(n+\eta)$ is regarded as an interpolated table-lookup based on the samples stored at indices $ n$ and $ n+1$.

The difference between a fractional delay filter and an interpolated table lookup is that table-lookups can ``jump around,'' while fractional delay filters receive a sequential stream of input samples and produce a corresponding sequential stream of interpolated output values. As a result of this sequential access, fractional delay filters may be recursive IIR digital filters (provided the desired delay does not change too rapidly over time). In contrast, ``random-access'' interpolated table lookups are typically implemented using weighted linear combinations, making them equivalent to nonrecursive FIR filters in the sequential case.5.1

The C++ class implementing a linearly interpolated delay line in the Synthesis Tool Kit (STK) is called DelayL.

The frequency response of linear interpolation for fixed fractional delay ($ \eta$ fixed in Fig.4.1) is shown in Fig.4.2. From inspection of Fig.4.1, we see that linear interpolation is a one-zero FIR filter. When used to provide a fixed fractional delay, the filter is linear and time-invariant (LTI). When the fractional delay $ \eta$ changes over time, it is a linear time-varying filter.

Figure 4.2: Linear interpolation frequency responses for delays between 0 and 1. Note the higher accuracy at low frequencies, reaching zero error at dc for all fractional delays.
\includegraphics[width=\twidth]{eps/linear1}

Linear interpolation sounds best when the signal is oversampled. Since natural audio spectra tend to be relatively concentrated at low frequencies, linear interpolation tends to sound very good at high sampling rates.

When interpolation occurs inside a feedback loop, such as in digital waveguide models for vibrating strings (see Chapter 6), errors in the amplitude response can be highly audible (particularly when the loop gain is close to 1, as it is for steel strings, for example). In these cases, it is possible to eliminate amplitude error (at some cost in delay error) by using an allpass filter for delay-line interpolation.


First-Order Allpass Interpolation

A delay line interpolated by a first-order allpass filter is drawn in Fig.4.3.

Figure 4.3: Allpass-interpolated delay line.
\includegraphics[width=\twidth]{eps/delayai}

Intuitively, ramping the coefficients of the allpass gradually ``grows'' or ``hides'' one sample of delay. This tells us how to handle resets when crossing sample boundaries.

The difference equation is

\begin{eqnarray*}
{\hat x}(n-\Delta) \isdef y(n) &=& \eta \cdot x(n) + x(n-1) - ...
...y(n-1) \\
&=& \eta \cdot \left[ x(n) - y(n-1)\right] + x(n-1).
\end{eqnarray*}

Thus, like linear interpolation, first-order allpass interpolation requires only one multiply and two adds per sample of output.

The transfer function is

$\displaystyle H(z) = \frac{\eta + z^{-1}}{1 + \eta z^{-1}}. \protect$ (5.2)

At low frequencies ($ z\to 1$), the delay becomes

$\displaystyle \Delta \approx \frac{1-\eta}{1+\eta} \protect$ (5.3)

Figure 4.4 shows the phase delay of the first-order digital allpass filter for a variety of desired delays at dc. Since the amplitude response of any allpass is 1 at all frequencies, there is no need to plot it.

Figure 4.4: Allpass-interpolation phase delay for a variety of desired delays (exact at dc).
\includegraphics[width=\twidth]{eps/allpass1}

The first-order allpass interpolator is generally controlled by setting its dc delay to the desired delay. Thus, for a given desired delay $ \Delta$, the allpass coefficient is (from Eq.$ \,$(4.3))

$\displaystyle \eta \approx \frac{1-\Delta}{1+\Delta}
$

From Eq.$ \,$(4.2), we see that the allpass filter's pole is at $ z=-\eta$, and its zero is at $ z=-1/\eta$. A pole-zero diagram for $ \Delta =0.1$ is given in Fig.4.5. Thus, zero delay is provided by means of a pole-zero cancellation! Due to inevitable round-off errors, pole-zero cancellations are to be avoided in practice. For this reason and others (discussed below), allpass interpolation is best used to provide a delay range lying wholly above zero, e.g.,

$\displaystyle \Delta\in[0.1,1.1] \;\longleftrightarrow\; \eta\in[-0.05,0.82]
$

Figure 4.5: Allpass-interpolator pole-zero diagram for $ \Delta =0.1$.
\includegraphics[width=4.8in]{eps/ap1pz}
Note that, unlike linear interpolation, allpass interpolation is not suitable for ``random access'' interpolation in which interpolated values may be requested at any arbitrary time in isolation. This is because the allpass is recursive so that it must run for enough samples to reach steady state. However, when the impulse response is reasonably short, as it is for delays near one sample, it can in fact be used in ``random access mode'' by giving it enough samples with which to work.

The STK class implementing allpass-interpolated delay is DelayA.

Minimizing First-Order Allpass Transient Response

In addition to approaching a pole-zero cancellation at $ z=-1$, another undesirable artifact appears as $ \Delta\to0$: The transient response also becomes long when the pole at $ z=-\eta$ gets close to the unit circle.

A plot of the impulse response for $ \Delta =0.1$ is shown in Fig.4.6. We see a lot of ``ringing'' near half the sampling rate. We actually should expect this from the nonlinear-phase distortion which is clearly evident near half the sampling rate in Fig.4.4. We can interpret this phenomenon as the signal components near half the sampling rate being delayed by different amounts than other frequencies, therefore ``sliding out of alignment'' with them.

Figure 4.6: Impulse response of the first-order allpass interpolator for $ \Delta =0.1$.
\includegraphics[width=\twidth]{eps/ap1ir}

For audio applications, we would like to keep the impulse-response duration short enough to sound ``instantaneous.'' That is, we do not wish to have audible ``ringing'' in the time domain near $ f_s/2$. For high quality sampling rates, such as larger than $ f_s=40$ kHz, there is no issue of direct audibility, since the ringing is above the range of human hearing. However, it is often convenient, especially for research prototyping, to work at lower sampling rates where $ f_s/2$ is audible. Also, many commercial products use such sampling rates to save costs.

Since the time constant of decay, in samples, of the impulse response of a pole of radius $ R$ is approximately

$\displaystyle \frac{\tau}{T} \approx \frac{1}{1-R},
$

and since a 60-dB decay occurs in about 7 time constants (``$ t_{60}$'') [451, p. 38], we can limit the pole of the allpass filter to achieve any prescribed specification on maximum impulse-response duration.

For example, suppose 100 ms is chosen as the maximum $ t_{60}$ allowed at a sampling rate of $ f_s=10,000$. Then applying the above constraints yields $ \eta\leq 0.993$, corresponding to the allowed delay range $ [0.00351,1.00351]$.


Linear Interpolation as Resampling

Convolution Interpretation

Linearly interpolated fractional delay is equivalent to filtering and resampling a weighted impulse train (the input signal samples) with a continuous-time filter having the simple triangular impulse response

$\displaystyle h_l(t) = \left\{\begin{array}{ll} 1-\left\vert t/T\right\vert, & ...
...ght\vert\leq T, \\ [5pt] 0, & \hbox{otherwise}. \\ \end{array} \right. \protect$ (5.4)

Convolution of the weighted impulse train with $ h_l(t)$ produces a continuous-time linearly interpolated signal

$\displaystyle x(t) = \sum_{n=-\infty}^{\infty} x(nT) h_l(t-nT). \protect$ (5.5)

This continuous result can then be resampled at the desired fractional delay.

In discrete time processing, the operation Eq.$ \,$(4.5) can be approximated arbitrarily closely by digital upsampling by a large integer factor $ M$, delaying by $ L$ samples (an integer), then finally downsampling by $ M$, as depicted in Fig.4.7 [96]. The integers $ L$ and $ M$ are chosen so that $ \eta \approx L/M$, where $ \eta$ the desired fractional delay.

Figure 4.7: Linear interpolation as a convolution.
\includegraphics[width=0.8\twidth]{eps/polyphaseli}

The convolution interpretation of linear interpolation, Lagrange interpolation, and others, is discussed in [407].


Frequency Response of Linear Interpolation

Since linear interpolation can be expressed as a convolution of the samples with a triangular pulse, we can derive the frequency response of linear interpolation. Figure 4.7 indicates that the triangular pulse $ h_l(t)$ serves as an anti-aliasing lowpass filter for the subsequent downsampling by $ M$. Therefore, it should ideally ``cut off'' all frequencies higher than $ 0.5f_s/M$.


Triangular Pulse as Convolution of Two Rectangular Pulses

The 2-sample wide triangular pulse $ h_l(t)$ (Eq.$ \,$(4.4)) can be expressed as a convolution of the one-sample rectangular pulse with itself.

Figure 4.8: The width $ T$ rectangular pulse.
\includegraphics{eps/rectpulse}

The one-sample rectangular pulse is shown in Fig.4.8 and may be defined analytically as

$\displaystyle p_T(t) \isdef u\left(t+\frac{T}{2}\right) - u\left(t-\frac{T}{2}\right),
$

where $ u(t)$ is the Heaviside unit step function:

$\displaystyle u(t) \isdef \left\{\begin{array}{ll}
1, & t\geq 0 \\ [5pt]
0, & t<0 \\
\end{array}\right..
$

Convolving $ p_T(t)$ with itself produces the two-sample triangular pulse $ h_l(t)$:

$\displaystyle h_l(t) = (p_T\ast p_T)(t) \isdef \int_{-\infty}^{\infty} p_T(\tau)p_T(t-\tau)d\tau
$

While the result can be verified algebraically by substituting $ u(t+T/2)-u(t-T/2)$ for $ p_T(t)$, it seen more quickly via graphical convolution.


Linear Interpolation Frequency Response

Since linear interpolation is a convolution of the samples with a triangular pulse $ h_l(t) = (p_T\ast p_T)(t)$ (from Eq.$ \,$(4.5)), the frequency response of the interpolation is given by the Fourier transform $ H_l(f)$, which yields a sinc$ ^2$ function. This frequency response applies to linear interpolation from discrete time to continuous time. If the output of the interpolator is also sampled, this can be modeled by sampling the continuous-time interpolation result in Eq.$ \,$(4.5), thereby aliasing the sinc$ ^2$ frequency response, as shown in Fig.4.9.

Figure: (a) $ \protect$sinc$ ^2(fT) =
{\cal F}_f^2\{p_T/T\}$.
(b) $ \protect\alias _{10}\{\protect$sinc$ ^2(fT)\} \propto
\protect\dtft \{\protect\sample _{T/10}\{p_T\ast p_T\}\}$.
\includegraphics[width=3in]{eps/sincsquared}

In slightly more detail, from $ h_l(t) = (p_T\ast p_T)(t)$, and $ {\cal F}_f\{p_T\}= T$sinc$ ( f T)$, we have

$\displaystyle H_l(f) \isdef {\cal F}_f\{h_l\} = {\cal F}_f\{p_T\ast p_T\} = {\cal F}^2_f\{p_T\} = T^2$sinc$\displaystyle ^2(fT)
$

where we used the convolution theorem for Fourier transforms, and the fact that $ P_T(f) = T$sinc$ ( f T) = \sin(\pi f T) / (\pi f)$.

The Fourier transform of $ h_l(nT/L)$ is the same function aliased on a block of size $ f_s=L/T$ Hz. Both $ H_l$ and its alias are plotted in Fig.4.9. The example in this figure pertains to an output sampling rate which is $ L=10$ times that of the input signal. In other words, the input signal is upsampled by a factor of $ L=10$ using linear interpolation. The ``main lobe'' of the interpolation frequency response $ H_l(f)$ contains the original signal bandwidth; note how it is attenuated near half the original sampling rate ($ f T=1$ in Fig.4.9). The ``sidelobes'' of the frequency response contain attenuated copies of the original signal bandwidth (see the DFT stretch theorem), and thus constitute spectral imaging distortion in the final output (sometimes also referred to as a kind of ``aliasing,'' but, for clarity, that term will not be used for imaging distortion in this book). We see that the frequency response of linear interpolation is less than ideal in two ways:

  • The spectrum is ``rolled'' off near half the sampling rate. In fact, it is nowhere flat within the ``passband'' (-1 to 1 in Fig.4.9).
  • Spectral imaging distortion is suppressed by only 26 dB (the level of the first sidelobe in Fig.4.9.
These qualitative remarks apply to all upsampling factors $ L\geq 2$ using linear interpolation. The case $ L=1$ is considered in the next section.


Special Cases

In the limiting case of $ L=1$, the input and output sampling rates are equal, and all sidelobes of the frequency response $ H_l(f)$ (partially shown in Fig.4.9) alias into the main lobe.

If the output is sampled at the same exact time instants as the input signal, the input and output are identical. In terms of the aliasing picture of the previous section, the frequency response aliases to a perfect flat response over $ fT\in[-1,1]$, with all spectral images combining coherently under the flat gain. It is important in this reconstruction that, while the frequency response of the underlying continuous interpolating filter is aliased by sampling, the signal spectrum is only imaged--not aliased; this is true for all positive integers $ L$ and $ M$ in Fig.4.7.

More typically, when linear interpolation is used to provide fractional delay, identity is not obtained. Referring again to Fig.4.7, with $ L$ considered to be so large that it is effectively infinite, fractional-delay by $ \tau<1$ can be modeled as convolving the samples $ x(n)$ with $ h_l(t-\tau)$ followed by sampling at $ t=nT$. In this case, a linear phase term has been introduced in the interpolator frequency response, giving,

$\displaystyle H_\tau(f) \isdef e^{-j\tau 2\pi f} H_l(f)
$

so that now downsampling to the original sampling rate does not yield a perfectly flat amplitude response for $ fT\in[-1,1]$ (when $ \tau $ is non-integer). Moreover, the phase response is nonlinear as well; a sampled symmetric triangular pulse is only linear phase when the samples fall symmetrically about the midpoint. Some example frequency-responses for various delays $ \tau $ are graphed in Fig.4.2.


Large Delay Changes

When implementing large delay length changes (by many samples), a useful implementation is to cross-fade from the initial delay line configuration to the new configuration:

  • Computational requirements are doubled during the cross-fade.
  • The cross-fade should occur over a time interval long enough to yield a smooth result.
  • The new delay interpolation filter, if any, may be initialized in advance of the cross-fade, for maximum smoothness. Thus, if the transient response of the interpolation filter is $ N$ samples, the new delay-line + interpolation filter can be ``warmed up'' (executed) for $ N$ time steps before beginning the cross-fade. If the cross-fade time is long compared with the interpolation filter duration, ``pre-warming'' is not necessary.
  • This is not a true ``morph'' from one delay length to another since we do not pass through the intermediate delay lengths. However, it avoids a potentially undesirable Doppler effect.
  • A single delay line can be shared such that the cross-fade occurs from one read-pointer (plus associated filtering) to another.


Lagrange Interpolation

Lagrange interpolation is a well known, classical technique for interpolation [193]. It is also called Waring-Lagrange interpolation, since Waring actually published it 16 years before Lagrange [309, p. 323]. More generically, the term polynomial interpolation normally refers to Lagrange interpolation. In the first-order case, it reduces to linear interpolation.

Given a set of $ N+1$ known samples $ f(x_k)$, $ k=0,1,2,\ldots,N$, the problem is to find the unique order $ N$ polynomial $ y(x)$ which interpolates the samples.5.2The solution can be expressed as a linear combination of elementary $ N$th order polynomials:

$\displaystyle y(x) = \sum_{k=0}^N l_k(x)f(x_k) \protect$ (5.6)

where

$\displaystyle l_k(x) \isdef \frac{(x - x_0) \cdots (x - x_{k-1}) (x - x_{k+1}) ...
...x_N)
}{(x_k - x_0) \cdots (x_k - x_{k-1}) (x_k - x_{k+1}) \cdots (x_k - x_N)}.
$

From the numerator of the above definition, we see that $ l_k(x)$ is an order $ N$ polynomial having zeros at all of the samples except the $ k$th. The denominator is simply the constant which normalizes to give $ l_k(x_k)=1$. Thus, we have

$\displaystyle l_k(x_j) = \delta_{kj} \isdef \left\{\begin{array}{ll}
1, & j=k, \\ [5pt]
0, & j\neq k. \\
\end{array}\right.
$

The polynomial $ l_k$ can be interpreted as the $ k$th basis polynomial for constructing a polynomial interpolation of order $ N$ over the $ N+1$ sample points $ x_k$. It is an order $ N$ polynomial having zeros at all of the samples except the $ k$th, where it is 1. An example of a set of eight basis functions $ l_k$ for randomly selected interpolation points $ x_k$ is shown in Fig.4.10.

Figure 4.10: Example Lagrange basis functions in the eighth-order case for randomly selected interpolation points (marked by dotted lines). The unit-amplitude points are marked by dashed lines.
\includegraphics[width=\twidth]{eps/lagrangebases}

Interpolation of Uniformly Spaced Samples

In the uniformly sampled case ($ x_k=kT$ for some sampling interval $ T$), a Lagrange interpolator can be viewed as a Finite Impulse Response (FIR) filter [449]. Such filters are often called fractional delay filters [267], since they are filters providing a non-integer time delay, in general. Let $ h(n)$ denote the impulse response of such a fractional-delay filter. That is, assume the interpolation at point $ x$ is given by

\begin{eqnarray*}
y(x) &=& h(0)\,f(x_N) + h(1)\,f(x_{N-1}) + \cdots h(N)\,f(x_0)\\
&=& h(0)\,y(N) + h(1)\,y(N-1) + \cdots h(N)\,y(0).
\end{eqnarray*}

where we have set $ T=1$ for simplicity, and used the fact that $ y(x_k)=f(x_k)$ for $ k=0,1,\ldots,N$ in the case of ``true interpolators'' that pass through the given samples exactly. For best results, $ y(x)$ should be evaluated in a one-sample range centered about $ x=N/2$. For delays outside the central one-sample range, the coefficients can be shifted to translate the desired delay into that range.


Fractional Delay Filters

In fractional-delay filtering applications, the interpolator typically slides forward through time to produce a time series of interpolated values, thereby implementing a non-integer signal delay:

$\displaystyle \hat{y}\left(n-\frac{N}{2}-\eta\right)
= h(0)\,y(n) + h(1)\,y(n-1) + \cdots h(N)\,y(0)
$

where $ \eta\in[-1/2,1/2]$ spans the central one-sample range of the interpolator. Equivalently, the interpolator may be viewed as an FIR filter having a linear phase response corresponding to a delay of $ N/2 +
\eta$ samples. Such filters are often used in series with a delay line in order to implement an interpolated delay line4.1) that effectively provides a continuously variable delay for discrete-time signals.

The frequency response [449] of the fractional-delay FIR filter $ h(n)$ is

$\displaystyle H(e^{j\omega}) \eqsp \sum_{n=0}^N h(n)e^{-j\omega n}.
$

For an ideal fractional-delay filter, the frequency response should be equal to that of an ideal delay

$\displaystyle H^\ast(e^{j\omega}) \eqsp e^{-j\omega\Delta}
$

where $ \Delta\isdeftext N/2 + \eta$ denotes the total desired delay of the filter. Thus, the ideal desired frequency response is a linear phase term corresponding to a delay of $ \Delta$ samples.


Lagrange Interpolation Optimality

As derived in §4.2.14, Lagrange fractional-delay filters are maximally flat in the frequency domain at dc. That is,

$\displaystyle \left.\frac{d^m E(e^{j\omega})}{d\omega^m}\right\vert _{\omega=0} = 0, \quad m=0,1,2,\ldots,N,
$

where $ E(e^{j\omega})$ is the interpolation error expressed in the frequency domain:

$\displaystyle E(e^{j\omega})\isdefs H^\ast(e^{j\omega}) - H(e^{j\omega}),
$

where $ H$ and $ H^\ast$ are defined in §4.2.2 above. This is the same optimality criterion used for the power response of (recursive) Butterworth filters in classical analog filter design [343,449]. It can also be formulated in terms of ``Pade approximation'' [373,374]. To summarize, the basic idea of maximally flat filter design is to match exactly as many leading terms as possible in the Taylor series expansion of the desired frequency response. Equivalently, we zero the maximum number of leading terms in the Taylor expansion of the frequency-response error.

Figure 4.11 compares Lagrange and optimal Chebyshev fractional-delay filter frequency responses. Optimality in the Chebyshev sense means minimizing the worst-case error over a given frequency band (in this case, $ \vert\omega\vert\in
[0,0.8\pi]$). While Chebyshev optimality is often the most desirable choice, we do not have closed-form formulas for such solutions, so they must be laboriously pre-calculated, tabulated, and interpolated to produce variable-delay filtering [358].

Figure 4.11: Comparison of Lagrange and Optimal Chebyshev Fractional-Delay Filter Frequency Responses
\includegraphics[width=3.5in]{eps/lag}


Explicit Lagrange Coefficient Formulas

Given a desired fractional delay of $ \Delta$ samples, the Lagrange fraction-delay impulse response can be written in closed form as

$\displaystyle h_\Delta(n) = \prod_{\stackrel{k=0}{k\neq n}} \frac{\Delta-k}{n-k}, \quad n=0,1,2,\ldots,N. \protect$ (5.7)

The following table gives specific examples for orders 1, 2, and 3:

\begin{displaymath}
{\small
\begin{array}{\vert\vert r\vert\vert c\vert c\vert c...
...
\frac{\Delta(\Delta-1)(\Delta-2)}{6} \\
\hline
\end{array}}
\end{displaymath}

Note that, for $ N=1$, Lagrange interpolation reduces to linear interpolation, i.e., the interpolator impulse response is $ h = [1-\Delta,\Delta]$. Also, remember that, for order $ N$, the desired delay should be in a one-sample range centered about $ \Delta=N/2$.


Lagrange Interpolation Coefficient Symmetry

As shown in [502, §3.3.3], directly substituting into Eq.$ \,$(4.7) derives the following coefficient symmetry property for the interpolation coefficients (impulse response) of a Lagrange fractional delay filter:

$\displaystyle h_\Delta(n) \eqsp h_{N-\Delta}(N-n), \quad n =0,1,\ldots,N, \protect$ (5.8)

where $ N$ is the order of the interpolator. Thus, the interpolation coefficients for delay $ N-\Delta$ are the ``flip'' (time reverse) of the coefficients for delay $ \Delta$.


Matlab Code for Lagrange Interpolation

A simple matlab function for computing the coefficients of a Lagrange fractional-delay FIR filter is as follows:

function h = lagrange( N, delay )

n = 0:N;
h = ones(1,N+1);
for k = 0:N
    index = find(n ~= k);
    h(index) = h(index) *  (delay-k)./ (n(index)-k);
end


Maxima Code for Lagrange Interpolation

The maxima program is free and open-source, like Octave for matlab:5.3

  (%i1) lagrange(N, n) :=
           product(if equal(k,n) then 1
                  else (D-k)/(n-k), k, 0, N);

  (%o1) lagrange(N, n) := product(if equal(k, n) then 1

                                       D - k
                                  else -----, k, 0, N)
                                       n - k
Usage examples in maxima:
  (%i2) lagrange(1,0);
  (%o2)                             1 - D

  (%i3) lagrange(1,1);
  (%o3)                               D

  (%i4) lagrange(4,0);
                           (1 - D) (D - 4) (D - 3) (D - 2)
  (%o4)                 - -------------------------------
                                         24
  (%i5) ratsimp(lagrange(4,0));
                           4       3       2
                          D  - 10 D  + 35 D  - 50 D + 24
  (%o5)                  ------------------------------
                                      24

  (%i6) expand(lagrange(4,0));
                            4      3       2
                           D    5 D    35 D    25 D
  (%o6)                    -- - ---- + ----- - ---- + 1
                           24    12     24      12

  (%i7) expand(lagrange(4,0)), numer;
                           4                     3
  (%o7) 0.041666666666667 D  - 0.41666666666667 D
                           2
      + 1.458333333333333 D  - 2.083333333333333 D + 1.0


Faust Code for Lagrange Interpolation

The Faust programming language for signal processing [453,450] includes support for Lagrange fractional-delay filtering, up to order five, in the library file filter.lib. For example, the fourth-order case is listed below:

// fourth-order (quartic) case, delay d in [1.5,2.5]
fdelay4(n,d,x) = delay(n,id,x)   * fdm1*fdm2*fdm3*fdm4/24
               + delay(n,id+1,x) * (0-fd*fdm2*fdm3*fdm4)/6
               + delay(n,id+2,x) * fd*fdm1*fdm3*fdm4/4
               + delay(n,id+3,x) * (0-fd*fdm1*fdm2*fdm4)/6
               + delay(n,id+4,x) * fd*fdm1*fdm2*fdm3/24
with {
  o = 1.49999;
  dmo = d - o; // assumed nonnegative
  id = int(dmo);
  fd = o + frac(dmo);
  fdm1 = fd-1;
  fdm2 = fd-2;
  fdm3 = fd-3;
  fdm4 = fd-4;
};

An example calling program is shown in Fig.4.12.

Figure: Faust program tlagrange.dsp used to generate Figures 4.13 through 4.16.

 
// tlagrange.dsp - test Lagrange interpolation in Faust

import("filter.lib");

N = 16; % Allocated delay-line length

% Compare various orders:
D = 5.4;
process = 1-1' <: fdelay1(N,D),
                  fdelay2(N,D),
                  fdelay3(N,D),
                  fdelay4(N,D),
                  fdelay5(N,D);
// To see results:
// [in a shell]:
//   faust2octave tlagrange.dsp
// [at the Octave command prompt]:
//   plot(db(fft(faustout,1024)(1:512,:)));

// Alternate example for testing a range of 4th-order cases
// (change name to "process" and rename "process" above):
process2  = 1-1' <: fdelay4(N, 1.5),
                    fdelay4(N, 1.6),
                    fdelay4(N, 1.7),
                    fdelay4(N, 1.8),
                    fdelay4(N, 1.9),
                    fdelay4(N, 2.0),
                    fdelay4(N, 2.1),
                    fdelay4(N, 2.2),
                    fdelay4(N, 2.3),
                    fdelay4(N, 2.4),
                    fdelay4(N, 2.499),
                    fdelay4(N, 2.5);


Lagrange Frequency Response Examples

The following examples were generated using Faust code similar to that in Fig.4.12 and the faust2octave command distributed with Faust.

Orders 1 to 5 on a fractional delay of 0.4 samples

Figure [*] shows the amplitude responses of Lagrange interpolation, orders 1 through 5, for the case of implementing an interpolated delay line of length $ 5.4$ samples. In all cases the interpolator follows a delay line of appropriate length so that the interpolator coefficients operate over their central one-sample interval. Figure [*] shows the corresponding phase delays. As discussed in §4.2.10, the amplitude response of every odd-order case is constrained to be zero at half the sampling rate when the delay is half-way between integers, which this example is near. As a result, the curves for the two even-order interpolators lie above the three odd-order interpolators at high frequencies in Fig.[*]. It is also interesting to note that the 4th-order interpolator, while showing a wider ``pass band,'' exhibits more attenuation near half the sampling rate than the 2nd-order interpolator.

Figure 4.13: Amplitude responses, Lagrange interpolation, orders 1 to 5, for an interpolated delay of $ 5.4$ samples. From the bottom-right corner along the right edge, the curves represent orders 1,3,5,4,2.
\includegraphics[width=0.9\twidth]{eps/tlagrange-1-to-5-ar-c}

Figure 4.14: Phase delays, Lagrange interpolation, orders 1 to 5, for an interpolated delay of $ 5.4$ samples. From bottom to top, the curves represent orders 2,4,1,3,5.
\includegraphics[width=0.9\twidth]{eps/tlagrange-1-to-5-pd-c}

In the phase-delay plots of Fig.[*], all cases are exact at frequency zero. At half the sampling rate they all give 5 samples of delay.

Note that all three odd-order phase delay curves look generally better in Fig.[*] than both of the even-order phase delays. Recall from Fig.[*] that the two even-order amplitude responses outperformed all three odd-order cases. This illustrates a basic trade-off between gain accuracy and delay accuracy. The even-order interpolators show generally less attenuation at high frequencies (because they are not constrained to approach a gain of zero at half the sampling rate for a half-sample delay), but they pay for that with a relatively inferior phase-delay performance at high frequencies.


Order 4 over a range of fractional delays

Figures 4.15 and 4.16 show amplitude response and phase delay, respectively, for 4th-order Lagrange interpolation evaluated over a range of requested delays from $ 1.5$ to $ 2.5$ samples in increments of $ 0.1$ samples. The amplitude response is ideal (flat at 0 dB for all frequencies) when the requested delay is $ 2$ samples (as it is for any integer delay), while there is maximum high-frequency attenuation when the fractional delay is half a sample. In general, the closer the requested delay is to an integer, the flatter the amplitude response of the Lagrange interpolator.

Figure 4.15: Amplitude responses, Lagrange interpolation, order 4, for the range of requested delays $ [1.5 : 0.1 : 2.5]$, with $ 2.499$ thrown in as well (see next plot for why). From bottom to top, ignoring the almost invisible split in the bottom curve, the curves represent requested delays $ 1.5, 1.6, 1.7, 1.8, 1.9$, and $ 2.0$. Then, because the curve for requested delay $ 2+\eta $ is the same as the curve for delay $ 2-\eta $, for $ \vert\eta \vert<1/2$, the same curves, from top to bottom, represent requested delays $ 2.0, 2.1, 2.2, 2.3, 2.4$ and $ 2.5$ (which is nearly indistinguishable from $ 2.499$).
\includegraphics[width=0.9\twidth]{eps/tlagrange-4-ar}

Figure 4.16: Phase delays, Lagrange interpolation, order 4, for the range of requested delays $ [1.5 : 0.1 : 2.5]$, and additionally $ 2.499$.
\includegraphics[width=0.9\twidth]{eps/tlagrange-4-pd}

Note in Fig.4.16 how the phase-delay jumps discontinuously, as a function of delay, when approaching the desired delay of $ 2.5$ samples from below: The top curve in Fig.4.16 corresponds to a requested delay of 2.5 samples, while the next curve below corresponds to 2.499 samples. The two curves roughly coincide at low frequencies (being exact at dc), but diverge to separate integer limits at half the sampling rate. Thus, the ``capture range'' of the integer 2 at half the sampling rate is numerically suggested to be the half-open interval $ [1.5,2.5)$.


Order 5 over a range of fractional delays

Figures 4.17 and 4.18 show amplitude response and phase delay, respectively, for 5th-order Lagrange interpolation, evaluated over a range of requested delays between $ 2$ and $ 3$ samples in steps of $ 0.1$ samples. Note that the vertical scale in Fig.4.17 spans $ 100$ dB while that in Fig.4.15 needed less than $ 9$ dB, again due to the constrained zero at half the sampling rate for odd-order interpolators at the half-sample point.

Figure 4.17: Amplitude responses, Lagrange interpolation, order 5, for the range of requested delays $ [2.0 : 0.1 : 3.0]$, with $ 2.495$ and $ 2.505$ included as well (see next plot for why).
\includegraphics[width=0.9\twidth]{eps/tlagrange-5-ar}

Figure 4.18: Phase delays, Lagrange interpolation, order 5, for the range of requested delays $ [2.0 : 0.1 : 3.0]$, with $ 2.495$ and $ 2.505$ included as well.
\includegraphics[width=0.9\twidth]{eps/tlagrange-5-pd}

Notice in Fig.4.18 how suddenly the phase-delay curves near 2.5 samples delay jump to an integer number of samples as a function of frequency near half the sample rate. The curve for $ 2.495$ samples swings down to 2 samples delay, while the curve for $ 2.505$ samples goes up to 3 samples delay at half the sample rate. Since the gain is zero at half the sample rate when the requested delay is $ 2.5$ samples, the phase delay may be considered to be exactly $ 2.5$ samples at all frequencies in that special case.



Avoiding Discontinuities When Changing Delay

We have seen examples (e.g., Figures 4.16 and 4.18) of the general fact that every Lagrange interpolator provides an integer delay at frequency $ \omega T = \pi$, except when the interpolator gain is zero at $ \omega T = \pi$. This is true for any interpolator implemented as a real FIR filter, i.e., as a linear combination of signal samples using real coefficients.5.4Therefore, to avoid a relatively large discontinuity in phase delay (at high frequencies) when varying the delay over time, the requested interpolation delay should stay within a half-sample range of some fixed integer, irrespective of interpolation order. This provides that the requested delay stays within the ``capture zone'' of a single integer at half the sampling rate. Of course, if the delay varies by more than one sample, there is no way to avoid the high-frequency discontinuity in the phase delay using Lagrange interpolation.

Even-order Lagrange interpolators have an integer at the midpoint of their central one-sample range, so they spontaneously offer a one-sample variable delay free of high-frequency discontinuities.

Odd-order Lagrange interpolators, on the other hand, must be shifted by $ 1/2$ sample in either direction in order to be centered about an integer delay. This can result in stability problems if the interpolator is used in a feedback loop, because the interpolation gain can exceed 1 at some frequency when venturing outside the central one-sample range (see §4.2.11 below).

In summary, discontinuity-free interpolation ranges include

\begin{eqnarray*}
\frac{N}{2}-\frac{1}{2} < D < \frac{N}{2}+\frac{1}{2}&& \mbox{...
...\frac{1}{2} < D < \frac{N+1}{2}+\frac{1}{2}&& \mbox{($N$\ odd).}
\end{eqnarray*}

Wider delay ranges, and delay ranges not centered about an integer delay, will include a phase discontinuity in the delay response (as a function of delay) which is largest at frequency $ \omega T = \pi$, as seen in Figures 4.16 and 4.18.



Lagrange Frequency Response Magnitude Bound

The amplitude response of fractional delay filters based on Lagrange interpolation is observed to be bounded by 1 when the desired delay $ \Delta$ lies within half a sample of the midpoint of the coefficient span [502, p. 92], as was the case in all preceeding examples above. Moreover, even-order interpolators are observed to have this boundedness property over a two-sample range centered on the coefficient-span midpoint [502, §3.3.6]. These assertions are easily proved for orders 1 and 2. For higher orders, a general proof appears not to be known, and the conjecture is based on numerical examples. Unfortunately, it has been observed that the gain of some odd-order Lagrange interpolators do exceed 1 at some frequencies when used outside of their central one-sample range [502, §3.3.6].


Even-Order Lagrange Interpolation Summary

We may summarize some characteristics of even-order Lagrange fractional delay filtering as follows:

  • Two-sample bounded-by-1 delay-range instead of only one-sample
  • No gain zero at half the sampling rate for the middle delay
  • No phase-delay discontinuity when crossing the middle delay
  • Optimal (central) delay range is centered about an integer
To stay within the central, one-sample delay range for even-order Lagrange interpolators, the delay range should be

$\displaystyle \Delta\in\left(\frac{N}{2}-\frac{1}{2},\frac{N}{2}+\frac{1}{2}\right),
$

where $ N$ is the order of the interpolator.


Odd-Order Lagrange Interpolation Summary

In contrast to even-order Lagrange interpolation, the odd-order case has the following properties (in fractional delay filtering applications):

  • Improved phase-delay accuracy at the expense of decreased amplitude-response accuracy (low-order examples in Fig.[*])
  • Optimal (centered) delay range lies between two integers
The usual centered delay range is

$\displaystyle \Delta\in\left(\frac{N}{2}-\frac{1}{2},\frac{N}{2}+\frac{1}{2}\right),
$

which is between integers, and in this range, the amplitude response is observed to be bounded by 1.

To avoid a discontinuous phase-delay jump at high frequencies when crossing the middle delay, the delay range can be shifted to

$\displaystyle \Delta\in\left(\frac{N\pm 1}{2}-\frac{1}{2},\frac{N\pm 1}{2}+\frac{1}{2}\right),
$

but then the gain may exceed 1 at some frequencies.



Proof of Maximum Flatness at DC

The maximumally flat fractional-delay FIR filter is obtained by equating to zero all $ N+1$ leading terms in the Taylor (Maclaurin) expansion of the frequency-response error at dc:

\begin{eqnarray*}
0 &=& \left.\frac{d^k}{d\omega^k} E(e^{j\omega}) \right\vert _...
...ert _{\omega=0}\\
&=& (-j\Delta)^k - \sum_{n=0}^N (-jn)^k h(n)
\end{eqnarray*}

$\displaystyle \,\,\Rightarrow\,\,\zbox {\sum_{n=0}^N n^k h(n) = \Delta^k, \; k=0,1,\ldots,N}
$

This is a linear system of equations of the form $ V\underline{h}=\underline{\Delta}$, where $ V$ is a Vandermonde matrix. The solution can be written as a ratio of Vandermonde determinants using Cramer's rule [329]. As shown by Cauchy (1812), the determinant of a Vandermonde matrix $ [p_i^{j-1}]$, $ i,j=1,\ldots,N$ can be expressed in closed form as

\begin{eqnarray*}
\left\vert\left[p_i^{j-1}\right]\right\vert &=& \prod_{j>i}(p_...
...ts\\
&&(p_{N-1}-p_{N-2})(p_N-p_{N-2})\cdots\\
&&(p_N-p_{N-1}).
\end{eqnarray*}

Making this substitution in the solution obtained by Cramer's rule yields that the impulse response of the order $ N$, maximally flat, fractional-delay FIR filter may be written in closed form as

$\displaystyle h(n) = \prod_{\stackrel{k=0}{k\ne n}}^N \frac{D-k}{n-k}, \quad n=0,1,\ldots N,
$

which is the formula for Lagrange-interpolation coefficients (Eq.$ \,$(4.6)) adapted to this problem (in which abscissae are equally spaced on the integers from 0 to $ N-1$).

Further details regarding the theory of Lagrange interpolation can be found (online) in [502, Ch. 3, Pt. 2, pp. 82-84].


Variable Filter Parametrizations

In practical applications of Lagrange Fractional-Delay Filtering (LFDF), it is typically necessary to compute the FIR interpolation coefficients $ h_\Delta(n)$ as a function of the desired delay $ \Delta$, which is usually time varying. Thus, LFDF is a special case of FIR variable filtering in which the FIR coefficients must be time-varying functions of a single delay parameter $ \Delta(t)$.

Table Look-Up

A general approach to variable filtering is to tabulate the filter coefficients as a function of the desired variables. In the case of fractional delay filters, the impulse response $ h_\Delta(n)$ is tabulated as a function of delay $ \Delta = N/2+\eta$, $ \eta\in[-1/2,1/2)$, $ n=0,1,2,\ldots,N$, where $ N$ is the interpolation-filter order. For each $ n$, $ \Delta$ may be sampled sufficiently densely so that linear interpolation will give a sufficiently accurate ``interpolated table look-up'' of $ h_\Delta(n)$ for each $ n$ and (continuous) $ \Delta$. This method is commonly used in closely related problem of sampling-rate conversion [462].


Polynomials in the Delay

A more parametric approach is to formulate each filter coefficient $ h_\Delta(n)$ as a polynomial in the desired delay $ \Delta$:

$\displaystyle h_\Delta(n) \isdefs \sum_{m=0}^M c_n(m)\Delta^m, \quad n=0,1,2,\ldots,N \protect$ (5.9)

Taking the z transform of this expression leads to the interesting and useful Farrow structure for variable FIR filters [134].


Farrow Structure

Taking the z transform of Eq.$ \,$(4.9) yields

$\displaystyle h_\Delta(n)$ $\displaystyle \isdef$ $\displaystyle \sum_{m=0}^M c_n(m)\Delta^m, \quad n=0,1,2,\ldots,N$  
$\displaystyle \Longleftrightarrow \quad
H_\Delta(z)$ $\displaystyle \isdef$ $\displaystyle \sum_{n=0}^N h_\Delta(n)z^{-n}$  
  $\displaystyle =$ $\displaystyle \sum_{n=0}^N \left[\sum_{m=0}^M c_n(m)\Delta^m\right]z^{-n}$  
  $\displaystyle =$ $\displaystyle \sum_{m=0}^M \left[\sum_{n=0}^N c_n(m) z^{-n}\right]\Delta^m$  
  $\displaystyle \isdef$ $\displaystyle \sum_{m=0}^M C_m(z) \Delta^m
\protect$ (5.10)

Since $ H_\Delta(z)$ is an $ N$th-order FIR filter, at least one of the $ C_m(z)$ must be $ N$th order, so that we need $ M\ge N$. A typical choice is $ M=N$.

Such a parametrization of a variable filter as a polynomial in fixed filters $ C_m(z)$ is called a Farrow structure [134,502]. When the polynomial Eq.$ \,$(4.10) is evaluated using Horner's rule,5.5 the efficient structure of Fig.4.19 is obtained. Derivations of Farrow-structure coefficients for Lagrange fractional-delay filtering are introduced in [502, §3.3.7].

Figure 4.19: Farrow structure for implementing parametrized filters as a fixed-filter polynomial in the varying parameter.
\includegraphics[width=\twidth]{eps/farrow}

As we will see in the next section, Lagrange interpolation can be implemented exactly by the Farrow structure when $ M=N$. For $ M<N$, approximations that do not satisfy the exact interpolation property can be computed [148].


Farrow Structure Coefficients

Beginning with a restatement of Eq.$ \,$(4.9),

$\displaystyle h_\Delta(n) \isdefs \sum_{m=0}^M c_n(m)\Delta^m, \quad n=0,1,2,\ldots,N,
$

we can express each FIR coefficient $ h_\Delta(n)$ as a vector expression:

$\displaystyle h_\Delta(n) \eqsp
\underbrace{%
\left[\begin{array}{ccccc} 1 & \...
...y}{c} C_n(0) \\ [2pt] C_n(1) \\ [2pt] \vdots \\ [2pt] C_n(M)\end{array}\right]
$

Making a row-vector out of the FIR coefficients gives

$\displaystyle \underbrace{\left[\begin{array}{cccc}h_\Delta(0)\!&\!h_\Delta(1)\...
...\vdots \\
C_0(M) & C_1(M) & \cdots & C_N(M)
\end{array}\right]}_{\mathbf{C}}
$

or

$\displaystyle \underline{h}_\Delta \eqsp \underline{V}_\Delta^T \mathbf{C}.
\protect$

We may now choose a set of parameter values $ {\underline{\Delta}}^T=[\Delta_0,\Delta_1,\ldots,\Delta_L]$ over which an optimum approximation is desired, yielding the matrix equation

$\displaystyle \mathbf{H}_{\underline{\Delta}}\eqsp \mathbf{V}_{\underline{\Delta}}\mathbf{C}, \protect$ (5.11)

where

$\displaystyle \mathbf{H}_{\underline{\Delta}}\isdefs \left[\begin{array}{c} \un...
...elta_0}^T \\ [2pt] \vdots \\ [2pt] \underline{h}_{\Delta_L}^T\end{array}\right]$   and$\displaystyle \qquad
\mathbf{V}_{\underline{\Delta}}\isdefs \left[\begin{array}...
...ta_0}^T \\ [2pt] \vdots \\ [2pt] \underline{V}_{\Delta_L}^T\end{array}\right].
$

Equation (4.11) may be solved for the polynomial-coefficient matrix $ \mathbf{C}$ by usual least-squares methods. For example, in the unweighted case, with $ L\ge M$, we have

$\displaystyle \zbox {\mathbf{C}\eqsp \left(\mathbf{V}_{\underline{\Delta}}^T\ma...
...ight)^{-1}
\mathbf{V}_{\underline{\Delta}}^T \mathbf{H}_{\underline{\Delta}}.}
$

Note that this formulation is valid for finding the Farrow coefficients of any $ N$th-order variable FIR filter parametrized by a single variable $ \Delta$. Lagrange interpolation is a special case corresponding to a particular choice of $ \mathbf{H}_{\underline{\Delta}}$.


Differentiator Filter Bank

Since, in the time domain, a Taylor series expansion of $ x(n-\Delta)$ about time $ n$ gives

\begin{eqnarray*}
x(n-\Delta)
&=& x(n) -\Delta\, x^\prime(n)
+ \frac{\Delta^2...
...D^2(z) + \cdots
+ \frac{(-\Delta)^k}{k!}D^k(z) + \cdots \right]
\end{eqnarray*}

where $ D(z)$ denotes the transfer function of the ideal differentiator, we see that the $ m$th filter in Eq.$ \,$(4.10) should approach

$\displaystyle C_m(z) \eqsp \frac{(-1)^m}{m!}D^m(z), \protect$ (5.12)

in the limit, as the number of terms $ M$ goes to infinity. In other terms, the coefficient $ C_m(z)$ of $ \Delta^m$ in the polynomial expansion Eq.$ \,$(4.10) must become proportional to the $ m$th-order differentiator as the polynomial order increases. For any finite $ N$, we expect $ C_m(z)$ to be close to some scaling of the $ m$th-order differentiator. Choosing $ C_m(z)$ as in Eq.$ \,$(4.12) for finite $ N$ gives a truncated Taylor series approximation of the ideal delay operator in the time domain [184, p. 1748]. Such an approximation is ``maximally smooth'' in the time domain, in the sense that the first $ N$ derivatives of the interpolation error are zero at $ x(n)$.5.6 The approximation error in the time domain can be said to be maximally flat.

Farrow structures such as Fig.4.19 may be used to implement any one-parameter filter variation in terms of several constant filters. The same basic idea of polynomial expansion has been applied also to time-varying filters ( $ \Delta\leftarrow t$).


Recent Developments in Lagrange Interpolation

Franck (2008) [148] provides a nice overview of the various structures being used for Lagrange interpolation, along with their computational complexities and depths (relevant to parallel processing). He moreover proposes a novel computation having linear complexity and log depth that is especially well suited for parallel processing architectures.


Relation of Lagrange to Sinc Interpolation

For an infinite number of equally spaced samples, with spacing $ x_{k+1}-x_k = \Delta$, the Lagrangian basis polynomials converge to shifts of the sinc function, i.e.,

$\displaystyle l_k(x) =$   sinc$\displaystyle \left(\frac{x-k\Delta}{\Delta}\right), \quad k=\ldots,-2,-1,0,1,2,\ldots
$

where

   sinc$\displaystyle (x) \isdef \frac{\sin(\pi x)}{\pi x}
$

A simple argument is based on the fact that any analytic function is determined by its zeros and its value at one point. Since sinc$ (x)$ is zero on all the integers except 0, and since sinc$ (0)=1$, it must coincide with the infinite-order Lagrangian basis polynomial for the sample at $ x=0$ which also has its zeros on the nonzero integers and equals $ 1$ at $ x=0$.

The equivalence of sinc interpolation to Lagrange interpolation was apparently first published by the mathematician Borel in 1899, and has been rediscovered many times since [309, p. 325].

A direct proof can be based on the equivalance between Lagrange interpolation and windowed-sinc interpolation using a ``scaled binomial window'' [262,502]. That is, for a fractional sample delay of $ D$ samples, multiply the shifted-by-$ D$, sampled, sinc function

$\displaystyle h_s(n) =$   sinc$\displaystyle (n-D) = \frac{\sin[\pi(n-D)]}{\pi(n-D)}
$

by a binomial window

$\displaystyle w(n) = \left(\begin{array}{c}N\\ n\end{array}\right), \quad n=0,1,2,\ldots N
$

and normalize by [502]

$\displaystyle C(D) = (-1)^N\frac{\pi(N+1)}{\sin(\pi D)}\left(\begin{array}{c}D\\ N+1\end{array}\right),
$

which scales the interpolating filter to have a unit $ L_2$ norm, to obtain the $ N$th-order Lagrange interpolating filter

$\displaystyle h_D(n)=C(D)w(n)h_s(n), \quad n=0,1,2,\ldots,N
$

Since the binomial window converges to the Gaussian window as $ N\to\infty$, and since the window gets wider and wider, approaching a unit constant in the limit, the convergence of Lagrange to sinc interpolation can be seen.

A more recent alternate proof appears in [557].


Thiran Allpass Interpolators

Given a desired delay $ \Delta = N+\delta$ samples, an order $ N$ allpass filter

$\displaystyle H(z) = \frac{z^{-N}A\left(z^{-1}\right)}{A(z)}
= \frac{a_N + a_{N...
...^{-(N-1)} + z^{-N}}{1 + a_1 z^{-1}
+ \cdots + a_{N-1} z^{-(N-1)} + a_N z^{-N}}
$

can be designed having maximally flat group delay equal to $ \Delta$ at dc using the formula

$\displaystyle a_k=(-1)^k\left(\begin{array}{c} N \\ [2pt] k \end{array}\right)\prod_{n=0}^N\frac{\Delta-N+n}{\Delta-N+k+n},
\; k=0,1,2,\ldots,N
$

where

$\displaystyle \left(\begin{array}{c} N \\ [2pt] k \end{array}\right) = \frac{N!}{k!(N-k)!}
$

denotes the $ k$th binomial coefficient. Note, incidentally, that a lowpass filter having maximally flat group-delay at dc is called a Bessel filter [362, pp. 228-230].

  • $ a_0=1$ without further scaling
  • For sufficiently large $ \Delta$, stability is guaranteed
    Rule of thumb: $ \Delta \approx \hbox{order}$
  • It can be shown that the mean group delay of any stable $ N$th-order allpass filter is $ N$ samples [449].5.7
  • Only known closed-form case for allpass interpolators of arbitrary order
  • Effective for delay-line interpolation needed for tuning since pitch perception is most acute at low frequencies.
  • Since Thiran allpass filters have maximally flat group-delay at dc, like Lagrange FIR interpolation filters, they can be considered the recursive extension of Lagrange interpolation.

Thiran Allpass Interpolation in Matlab

function [A,B] = thiran(D,N)
%  [A,B] = thiran(D,N)
%  returns the order N Thiran allpass interpolation filter
%  for delay D (samples).

A = zeros(1,N+1);
for k=0:N
    Ak = 1;
    for n=0:N
        Ak = Ak * (D-N+n)/(D-N+k+n);
    end
    A(k+1) = (-1)^k * nchoosek(N,k) * Ak;
end

B = A(N+1:-1:1);


Group Delays of Thiran Allpass Interpolators

Figure [*] shows a family of group-delay curves for Thiran allpass interpolators, for orders 1, 2, 3, 5, 10, and 20. The desired group delay was equal to the order plus 0.3 samples (which is in the ``easy zone'' for an allpass interpolator).

Figure 4.20:
\includegraphics[width=\twidth]{eps/thirangdC}


Windowed Sinc Interpolation

Bandlimited interpolation of discrete-time signals is a basic tool having extensive application in digital signal processing.5.8In general, the problem is to correctly compute signal values at arbitrary continuous times from a set of discrete-time samples of the signal amplitude. In other words, we must be able to interpolate the signal between samples. Since the original signal is always assumed to be bandlimited to half the sampling rate, (otherwise aliasing distortion would occur upon sampling), Shannon's sampling theorem tells us the signal can be exactly and uniquely reconstructed for all time from its samples by bandlimited interpolation.

Considerable research has been devoted to the problem of interpolating discrete points. A comprehensive survey of ``fractional delay filter design'' is provided in [267]. A comparison between classical (e.g., Lagrange) and bandlimited interpolation is given in [407]. The book Multirate Digital Signal Processing [97] provides a comprehensive summary and review of classical signal processing techniques for sampling-rate conversion. In these techniques, the signal is first interpolated by an integer factor $ L$ and then decimated by an integer factor $ M$. This provides sampling-rate conversion by any rational factor $ L/M$. The conversion requires a digital lowpass filter whose cutoff frequency depends on $ \max\{L,M\}$. While sufficiently general, this formulation is less convenient when it is desired to resample the signal at arbitrary times or change the sampling-rate conversion factor smoothly over time.

In this section, a public-domain resampling algorithm is described which will evaluate a signal at any time specifiable by a fixed-point number. In addition, one lowpass filter is used regardless of the sampling-rate conversion factor. The algorithm effectively implements the ``analog interpretation'' of rate conversion, as discussed in [97], in which a certain lowpass-filter impulse response must be available as a continuous function. Continuity of the impulse response is simulated by linearly interpolating between samples of the impulse response stored in a table. Due to the relatively low cost of memory, the method is quite practical for hardware implementation.

In the next section, the basic theory is presented, followed by sections on implementation details and practical considerations.


Theory of Ideal Bandlimited Interpolation

We review briefly the ``analog interpretation'' of sampling rate conversion [97] on which the present method is based. Suppose we have samples $ x(nT)$ of a continuous absolutely integrable signal $ x(t)$, where $ t$ is time in seconds (real), $ n$ ranges over the integers, and $ T$ is the sampling period. We assume $ x(t)$ is bandlimited to $ \pm f_s/2$, where $ f_s=1/T$ is the sampling rate. If $ X(\omega)$ denotes the Fourier transform of $ x(t)$, i.e., $ X(\omega)=\int_{-\infty}^{\infty} x(t)
e^{-j\omega t} dt$, then we assume $ X(\omega)=0$ for $ \vert\omega\vert\geq\pi f_s$. Consequently, Shannon's sampling theorem gives us that $ x(t)$ can be uniquely reconstructed from the samples $ x(nT)$ via

$\displaystyle {\hat x}(t) \isdef \sum_{n=-\infty}^{\infty} x(nT) h_s(t-nT) \equiv x(t), \protect$ (5.13)

where

$\displaystyle h_s(t) \isdef$   sinc$\displaystyle (f_st) \isdef \frac{\sin(\pi f_st)}{\pi f_st}.
$

To resample $ x(t)$ at a new sampling rate $ f_s^\prime=1/T^\prime$, we need only evaluate Eq.$ \,$(4.13) at integer multiples of $ T^\prime$.

When the new sampling rate $ f_s^\prime$ is less than the original rate $ f_s$, the lowpass cutoff must be placed below half the new lower sampling rate. Thus, in the case of an ideal lowpass, $ h_s(t) = \min\{1,f_s^\prime/f_s\}$   sinc$ (\min\{f_s,f_s^\prime\}t)$, where the scale factor maintains unity gain in the passband.

A plot of the sinc function sinc$ (t) \isdeftext \sin(\pi t)/(\pi t)$ to the left and right of the origin $ t=0$ is shown in Fig.4.21. Note that peak is at amplitude $ 1$, and zero-crossings occur at all nonzero integers. The sinc function can be seen as a hyperbolically weighted sine function with its zero at the origin canceled out. The name sinc function derives from its classical name as the sine cardinal (or cardinal sine) function.

Figure 4.21: The sinc function plotted for seven zero-crossings to the left and right.
\includegraphics[width=3in]{eps/Sinc}

If ``$ \ast $'' denotes the convolution operation for digital signals, then the summation in Eq.$ \,$(4.13) can be written as $ (x\ast h_s)(t)$.

Equation Eq.$ \,$(4.13) can be interpreted as a superpositon of shifted and scaled sinc functions $ h_s$. A sinc function instance is translated to each signal sample and scaled by that sample, and the instances are all added together. Note that zero-crossings of sinc$ (z)$ occur at all integers except $ z=0$. That means at time $ t=nT$, (i.e., on a sample instant), the only contribution to the sum is the single sample $ x(nT)$. All other samples contribute sinc functions which have a zero-crossing at time $ t=nT$. Thus, the interpolation goes precisely through the existing samples, as it should.

A plot indicating how sinc functions sum together to reconstruct bandlimited signals is shown in Fig.4.22. The figure shows a superposition of five sinc functions, each at unit amplitude, and displaced by one-sample intervals. These sinc functions would be used to reconstruct the bandlimited interpolation of the discrete-time signal $ x = [\ldots ,0,1,1,1,1,1,0,\ldots ]$. Note that at each sampling instant $ t=nT$, the solid line passes exactly through the tip of the sinc function for that sample; this is just a restatement of the fact that the interpolation passes through the existing samples. Since the nonzero samples of the digital signal are all $ 1$, we might expect the interpolated signal to be very close to $ 1$ over the nonzero interval; however, this is far from being the case. The deviation from unity between samples can be thought of as ``overshoot'' or ``ringing'' of the lowpass filter which cuts off at half the sampling rate, or it can be considered a ``Gibbs phenomenon'' associated with bandlimiting.

Figure 4.22: Bandlimited reconstruction of the signal $ x(t)$ from its samples $ x = [\ldots ,0,1,1,1,1,1,0,\ldots ]$. The dots show the signal samples, the dashed lines show the component sinc functions, and the solid line shows the unique bandlimited reconstruction from the samples obtained by summing the component sinc functions.
\includegraphics{eps/SincSum}

A second interpretation of Eq.$ \,$(4.13) is as follows: to obtain the interpolation at time $ t$, shift the signal samples under one sinc function so that time $ t$ in the signal is translated under the peak of the sinc function, then create the output as a linear combination of signal samples where the coefficient of each signal sample is given by the value of the sinc function at the location of each sample. That this interpretation is equivalent to the first can be seen as a result of the fact that convolution is commutative; in the first interpretation, all signal samples are used to form a linear combination of shifted sinc functions, while in the second interpretation, samples from one sinc function are used to form a linear combination of samples of the shifted input signal. The practical bandlimited interpolation algorithm presented below is based on the second interpretation.


From Theory to Practice

The summation in Eq.$ \,$(4.13) cannot be implemented in practice because the ``ideal lowpass filter'' impulse response $ h_s(t)$ actually extends from minus infinity to infinity. It is necessary in practice to window the ideal impulse response so as to make it finite. This is the basis of the window method for digital filter design [115,362]. While many other filter design techniques exist, the window method is simple and robust, especially for very long impulse responses. In the case of the algorithm presented below, the filter impulse response is very long because it is heavily oversampled. Another approach is to design optimal decimated ``sub-phases'' of the filter impulse response, which are then interpolated to provide the ``continuous'' impulse response needed for the algorithm [358].

Figure 4.23 shows the frequency response of the ideal lowpass filter. This is just the Fourier transform of $ h_s(t)$.

Figure 4.23: Frequency response of the ideal lowpass filter.
\includegraphics{eps/IdealLowpass}

If we truncate $ h_s(t)$ at the fifth zero-crossing to the left and the right of the origin, we obtain the frequency response shown in Fig.4.24. Note that the stopband exhibits only slightly more than 20 dB rejection.

Figure 4.24: Frequency response of the ideal lowpass filter after rectangularly windowing the ideal (sinc) impulse response at the fifth zero crossing to the left and right of the time origin. The vertical axis is in units of decibels (dB), and the horizontal axis is labeled in units of spectral samples between plus and minus half the sampling rate.
\includegraphics{eps/SincTruncatedF}

If we instead use the Kaiser window [221,438] to taper $ h_s(t)$ to zero by the fifth zero-crossing to the left and the right of the origin, we obtain the frequency response shown in Fig.4.25. Note that now the stopband starts out close to $ -80$ dB. The Kaiser window has a single parameter which can be used to modify the stop-band attenuation, trading it against the transition width from pass-band to stop-band.

Figure 4.25: Frequency response of the ideal lowpass filter Kaiser windowed at the fifth zero crossing to the left and right.
\includegraphics{eps/SincKaiseredF}


Implementation

The implementation below provides signal evaluation at an arbitrary time, where time is specified as an unsigned binary fixed-point number in units of the input sampling period (assumed constant).

Figure 4.26: Time register format.
\includegraphics[scale=0.8]{eps/TimeRegisterFormat}

Figure 4.27: Waveforms and parameters in the interpolator.
\includegraphics[scale=0.8]{eps/Waveforms}

Figure 4.26 shows the time register $ t$, and Figure 4.27 shows an example configuration of the input signal and lowpass filter at a given time. The time register is divided into three fields: The leftmost field gives the number $ n$ of samples into the input signal buffer, the middle field is an initial index $ l$ into the filter coefficient table $ h(l)$, and the rightmost field is interpreted as a number $ \epsilon $ between 0 and $ 1$ for doing linear interpolation between samples $ l$ and $ l+1$ (initially) of the filter table. The concatenation of $ l$ and $ \epsilon $ are called $ P\in[0,1)$ which is interpreted as the position of the current time between samples $ n$ and $ n+1$ of the input signal.

Let the three fields have $ {n_n}$, $ n_l$, and $ n_\eta$ bits, respectively. Then the input signal buffer contains $ N=2^{n_n}$ samples, and the filter table contains $ L=2^n_l$ ``samples per zero-crossing.'' (The term ``zero-crossing'' is precise only for the case of the ideal lowpass; to cover practical cases we generalize ``zero-crossing'' to mean a multiple of time $ t_c=0.5/f_c$, where $ f_c$ is the lowpass cutoff frequency.) For example, to use the ideal lowpass filter, the table would contain $ h(l)=$sinc$ (l/L)$.

Our implementation stores only the ``right wing'' of a symmetric finite-impulse-response (FIR) filter (designed by the window method based on a Kaiser window [362]). Specifically, if $ h(l)$, $ l\in[-LN_z,LN_z]$, denotes a length $ 2LN_z+1$ symmetric finite impulse response, then the right wing of $ h$ is defined as the set of samples $ h(l)$ for $ l\in[0,LN_z]$. By symmetry, the left wing can be reconstructed as $ h(-l)=h(l)$, $ l=1,2,\ldots,LN_z$.

Our implementation also stores a table of differences $ \hbar(l)
= h(l+1) - h(l)$ between successive FIR sample values in order to speed up the linear interpolation. The length of each table is $ {\hat N}=LN_z+1$, including the endpoint definition $ \hbar({\hat N})=0$.

Consider a sampling-rate conversion by the factor $ \rho = f_s^\prime/f_s$. For each output sample, the basic interpolation Eq.$ \,$(4.13) is performed. The filter table is traversed twice--first to apply the left wing of the FIR filter, and second to apply the right wing. After each output sample is computed, the time register is incremented by $ 2^{n_l+n_\eta}/\rho$ (i.e., time is incremented by $ 1/\rho$ in fixed-point format). Suppose the time register $ t$ has just been updated, and an interpolated output $ y(t)$ is desired. For $ \rho\geq1$, output is computed via

\begin{eqnarray*}
v & \gets & \sum_{i=0}^{\mbox{$h$\ end}} x(n-i) \left[h(l+iL) ...
...$\ end}}
x(n+1+i) \left[h(l+iL) + \epsilon \hbar(l+iL)\right],
\end{eqnarray*}

where $ x(n)$ is the current input sample, and $ \epsilon \in[0,1)$ is the interpolation factor. When $ \rho<1$, the initial $ P$ is replaced by $ P^\prime=\rho P$, $ 1-P$ becomes $ \rho-P^\prime = \rho(1-P)$, and the step-size through the filter table is reduced to $ \rho L$ instead of $ L$; this lowers the filter cutoff to avoid aliasing. Note that $ \epsilon $ is fixed throughout the computation of an output sample when $ \rho\geq1$ but changes when $ \rho<1$.

When $ \rho<1$, more input samples are required to reach the end of the filter table, thus preserving the filtering quality. The number of multiply-adds per second is approximately $ (2N_z+1)\max\{f_s,f_s^\prime\}$. Thus the higher sampling rate determines the work rate. Note that for $ \rho<1$ there must be $ \lceil{N_zf_s/f_s^\prime}\rceil $ extra input samples available before the initial conversion time and after the final conversion time in the input buffer. As $ \rho\to 0$, the required extra input data becomes infinite, and some limit must be chosen, thus setting a minimum supported $ \rho$. For $ \rho\geq1$, only $ N_z$ extra input samples are required on the left and right of the data to be resampled, and the upper bound for $ \rho$ is determined only by the fixed-point number format, viz., $ \rho_{\mbox{max}}= 2^{n_l+n_\eta}$.

As shown below, if $ n_c$ denotes the word-length of the stored impulse-response samples, then one may choose $ n_l=n_c/2$, and $ n_\eta=n_c/2$ to obtain $ n_c-1$ effective bits of precision in the interpolated impulse response.

Note that rational conversion factors of the form $ \rho=L/m$, where $ L=2^n_l$ and $ m$ is an arbitrary positive integer, do not use the linear interpolation feature (because $ \epsilon \equiv0$). In this case our method reduces to the normal type of bandlimited interpolator [97]. With the availability of interpolated lookup, however, the range of conversion factors is boosted to the order of $ 2^{n_l+n_\eta}/m$. E.g., for $ \rho\approx1$, $ n_l=9,n_\eta=8$, this is about $ 5.1$ decimal digits of accuracy in the conversion factor $ \rho$. Without interpolation, the number of significant figures in $ \rho$ is only about $ 2.7$.

The number $ N_z$ of zero-crossings stored in the table is an independent design parameter. For a given quality specification in terms of aliasing rejection, a trade-off exists between $ N_z$ and sacrificed bandwidth. The lost bandwidth is due to the so-called ``transition band'' of the lowpass filter [362]. In general, for a given stop-band specification (such as ``80 dB attenuation''), lowpass filters need approximately twice as many multiply-adds per sample for each halving of the transition band width.

As a practical design example, we use $ N_z=13$ in a system designed for high audio quality at $ 20$% oversampling. Thus, the effective FIR filter is $ 27$ zero crossings long. The sampling rate in this case would be $ 50$ kHz.5.9In the most straightforward filter design, the lowpass filter pass-band would stop and the transition-band would begin at $ 20$ kHz, and the stop-band would begin (and end) at $ 25$ kHz. As a further refinement, which reduces the filter design requirements, the transition band is really designed to extend from $ 20$ kHz to $ 30$ kHz, so that the half of it between $ 25$ and $ 30$ kHz aliases on top of the half between $ 20$ and $ 25$ kHz, thereby approximately halving the filter length required. Since the entire transition band lies above the range of human hearing, aliasing within it is not audible.

Using $ 512$ samples per zero-crossing in the filter table for the above example (which is what we use at CCRMA, and which is somewhat over designed) implies desiging a length $ 27\times 512 = 13824$ FIR filter having a cut-off frequency near $ \pi/512$. It turns out that optimal Chebyshev design procedures such as the Remez multiple exchange algorithm used in the Parks-McLellan software [362] can only handle filter lengths up to a couple hundred or so. It is therefore necessary to use an FIR filter design method which works well at such very high orders, and the window method employed here is one such method.

It is worth noting that a given percentage increase in the original sampling rate (``oversampling'') gives a larger percentage savings in filter computation time, for a given quality specification, because the added bandwidth is a larger percentage of the filter transition bandwidth than it is of the original sampling rate. For example, given a cut-off frequency of $ 20$ kHz, (ideal for audio work), the transition band available with a sampling rate of $ 44$ kHz is about $ 2$ kHz, while a $ 48$ kHz sampling rate provides a $ 4$ kHz transition band. Thus, a $ 10$% increase in sampling rate halves the work per sample in the digital lowpass filter.

Choice of Table Size and Word Lengths

It is desirable that the stored filter impulse response be sampled sufficiently densely so that interpolating linearly between samples does not introduce error greater than the quantization error. It is shown in [462] that this condition is satisfied when the filter impulse-response table contains at least $ L=2^{1+n_c/2}$ entries per ``zero-crossing'', where $ n_c$ is the number of bits allocated to each table entry. (A later, sharper, error bound gives that $ L=2^{n_c/2}$ is sufficient.) It is additionally shown in [462] that the number of bits in the interpolation between impulse-response samples should be near $ n_c/2$ or more. With these choices, the linear interpolation error and the error due to quantized interpolation factors are each about equal to the coefficient quantization error. A signal resampler designed according to these rules will typically be limited primarily by the lowpass filter design, rather than by quantization effects.


Summary of Windowed Sinc Interpolation

The digital resampling method described in this section is convenient for bandlimited interpolation of discrete-time signals at arbitrary times and/or for arbitrary changes in sampling rate. The method is well suited for software or hardware implementation, and widely used free software is available.


Delay-Line Interpolation Summary

The most commonly used closed-form methods for delay-line interpolation may be summarized by the following table:

\begin{displaymath}
\begin{array}{c}
\mbox{Order}\\
\begin{array}{\vert\vert r\...
...\mbox{Thiran} & & \mbox{Sinc} \\
\hline
\end{array}\end{array}\end{displaymath}

In the first column, we have linear and first-order allpass interpolation, as discussed above in sections §4.1.1 and §4.1.2, respectively. These are the least-expensive methods computationally, and they find wide use, especially in audio applications with large sampling rates. While linear and first-order allpass interpolation cost the same, only linear interpolation offers ``random access mode''. That is, since the interpolated signal is a (finite) linear combination of known samples, the signal can be evaluated at any arbitrary time within the range of known samples -- the interpolation can ``jump around'' as desired. On the other hand, allpass interpolation has no gain error, so it may preferred inside a feedback loop to provide slowly varying fractional delay filtering.

In the second column -- $ N$th-order interpolation -- we list Lagrange for the FIR case (top) and Thiran for the IIR case (bottom), and these are introduced in §4.2 and §4.3, respectively. Lagrange and Thiran interpolators, properly implemented, enjoy the following advantages:

  • Gain bounded by 1 at all frequencies
  • Coefficients known in closed form as a function of desired delay
  • Maximally flat at low frequencies:
    -
    Lagrange: maximally flat frequency response at dc
    -
    Thiran: maximally flat group delay at dc

In the high-order FIR case, one should also consider ``windowed sinc'' interpolation (introduced in §4.4) as an alternative to Lagrange interpolation. In fact, as discussed in §4.2.16, Lagrange interpolation is a special case of windowed-sinc interpolation in which a scaled binomial window is used. By choosing different windows, optimalities other than ``maximally flat at dc'' can be achieved.

In the most general $ N$th-order case, the interpolation-filter impulse response may be designed to achieve any optimality objective, such as Chebyshev optimality (Fig.4.11). That is, design a digital filter (FIR or IIR) that approximates

$\displaystyle H_\Delta\left(e^{j\omega T}\right) = e^{-j\omega\Delta T},\quad \Delta = \hbox{Desired delay in samples}
$

optimally in some sense, with coefficients tabulated over a range of $ \Delta$ samples (and interpolated on lookup). Tabulated filter-designs of this nature, while generally giving the best interpolation quality, are not included in the above table because the filter-coefficients are not known in closed form.

FIR interpolators have the advantage that they can be used in ``random access'' mode. IIR interpolators, on the other hand, require a sequential stream of input samples and produce a sequential stream of interpolated signal samples (typically implementing a fractional delay). In IIR fractional-delay filters, the fractional delay must change slowly relative to the IIR duration.

Finally, we note in the last column of the above table that if ``order is no object'' ( $ N\to\infty$), then the ideal bandlimited-interpolator impulse-response is simply a sampled sinc function, as discussed in §4.4.


Next Section:
Time-Varying Delay Effects
Previous Section:
Artificial Reverberation