## Delay-Line Interpolation

As mentioned above, when an audio delay line needs to vary smoothly over time, some form of *interpolation* between samples is usually required to avoid "zipper noise" in the output signal as the delay length changes. There is a large literature on "fractional delay" in discrete-time systems, and the survey in [267] is highly recommended.

This section will describe the most commonly used cases.

*Linear interpolation* is perhaps most commonly used because it is very straightforward and inexpensive, and because it sounds very good when the signal bandwidth is small compared with half the sampling rate. For a delay line in a nearly lossless feedback loop, such as in a vibrating string simulation, *allpass interpolation* is sometimes a better choice since it costs the same as linear interpolation in the first-order case and has no gain distortion. (Feedback loops can be very sensitive to gain distortions.) Finally, in later sections, some higher-order interpolation methods are described.

### Linear Interpolation

Linear interpolation works by effectively drawing a straight line between two neighboring samples and returning the appropriate point along that line.

More specifically, let $\eta$ be a number between 0 and 1 which represents how far we want to interpolate a signal $y$ between time $n$ and time $n+1$. Then we can define the linearly interpolated value $\hat{y}(n+\eta)$ as follows:

$$\hat{y}(n+\eta) = (1-\eta)\cdot y(n) + \eta\cdot y(n+1) \qquad (4.1)$$

For $\eta=0$, we get exactly $\hat{y}(n)=y(n)$, and for $\eta=1$, we get exactly $\hat{y}(n+1)=y(n+1)$. In between, the interpolation error $|\hat{y}(n+\eta)-y(n+\eta)|$ is nonzero, except when $y(t)$ happens to be a linear function between $y(n)$ and $y(n+1)$.

#### One-Multiply Linear Interpolation

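Note that by factoring out the interpolation fraction $\eta$, we can obtain a *one-multiply* form,

$$\hat{y}(n+\eta) = y(n) + \eta\cdot\left[y(n+1) - y(n)\right].$$

Thus, linear interpolation costs one multiply and two additions per output sample.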
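As a minimal sketch of the two forms above, the following Python functions perform an interpolated table lookup. The function names and the sample table are hypothetical, chosen only for illustration, and `pos` is assumed to satisfy `0 <= pos < len(table) - 1`.

```python
import math

def lerp_lookup(table, pos):
    """Two-multiply form: (1 - eta) * y(n) + eta * y(n + 1)."""
    n = int(math.floor(pos))   # integer index n
    eta = pos - n              # fractional part, 0 <= eta < 1
    return (1.0 - eta) * table[n] + eta * table[n + 1]

def lerp_lookup_one_mul(table, pos):
    """One-multiply form: y(n) + eta * (y(n + 1) - y(n))."""
    n = int(math.floor(pos))
    eta = pos - n
    return table[n] + eta * (table[n + 1] - table[n])

table = [0.0, 1.0, 4.0, 9.0]              # illustrative samples only
print(lerp_lookup(table, 1.25))           # 1.75
print(lerp_lookup_one_mul(table, 1.25))   # 1.75
```

Both functions return the same value; the second simply trades one multiply for one subtraction.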

#### Fractional Delay Filtering by Linear Interpolation

A linearly interpolated delay line is depicted in Fig.4.1. In contrast to Eq.(4.1), we interpolate linearly between times $n-M$ and $n-M-1$, where $M$ is the integer part of the delay, and $\eta$ is called the *fractional delay* in samples. The first-order (linear-interpolating) filter following the delay line in Fig.4.1 may be called a *fractional delay filter* [267]. Equation (4.1), on the other hand, expresses the more general case of an *interpolated table lookup*, where $y(n)$ is regarded as a table of samples and $\hat{y}(n+\eta)$ is regarded as an interpolated table-lookup based on the samples stored at indices $n$ and $n+1$.

The difference between a fractional delay filter and an interpolated table lookup is that table-lookups can "jump around," while fractional delay filters receive a *sequential* stream of input samples and produce a corresponding sequential stream of interpolated output values. As a result of this sequential access, fractional delay filters may be *recursive* IIR digital filters (provided the desired delay does not change too rapidly over time). In contrast, "random-access" interpolated table lookups are typically implemented using weighted linear combinations, making them equivalent to nonrecursive FIR filters in the sequential case.

The `C++` class implementing a linearly interpolated delay line in the Synthesis Tool Kit (`STK`) is called `DelayL`.

The frequency response of linear interpolation for fixed fractional delay ($\eta$ fixed in Fig.4.1) is shown in Fig.4.2. From inspection of Fig.4.1, we see that linear interpolation is a one-zero FIR filter. When used to provide a fixed fractional delay, the filter is linear and time-invariant (LTI). When the fractional delay changes over time, it is a linear time-varying filter.
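The structure of a linearly interpolated delay line can be sketched in a few lines of Python. This is an illustrative sketch only, not the `STK` `DelayL` source; the class name `LinearInterpDelay`, its methods, and the `max_delay` parameter are hypothetical. It implements a circular buffer followed by the one-zero filter $(1-\eta) + \eta z^{-1}$.

```python
class LinearInterpDelay:
    """Delay of M + eta samples: circular buffer plus a one-zero FIR filter."""

    def __init__(self, delay, max_delay=4096):
        # M + 2 slots are needed to hold x(n) through x(n - M - 1).
        self.buf = [0.0] * (max_delay + 2)
        self.write = 0
        self.set_delay(delay)

    def set_delay(self, delay):
        if not 0.0 <= delay <= len(self.buf) - 2:
            raise ValueError("delay out of range")
        self.M = int(delay)          # integer part of the delay
        self.eta = delay - self.M    # fractional delay, 0 <= eta < 1

    def tick(self, x):
        """Push one input sample and return one interpolated output sample."""
        size = len(self.buf)
        self.buf[self.write] = x
        a = self.buf[(self.write - self.M) % size]       # x(n - M)
        b = self.buf[(self.write - self.M - 1) % size]   # x(n - M - 1)
        self.write = (self.write + 1) % size
        return a + self.eta * (b - a)                    # one-multiply form

# Impulse response for a delay of 2.5 samples
d = LinearInterpDelay(2.5)
h = [d.tick(1.0)] + [d.tick(0.0) for _ in range(4)]
print(h)   # [0.0, 0.0, 0.5, 0.5, 0.0]
```

The impulse response shows the unit impulse split equally between delays of 2 and 3 samples, as expected for $\eta = 0.5$.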

Linear interpolation sounds best when the signal is *oversampled*. Since natural audio spectra tend to be relatively concentrated at low frequencies, linear interpolation tends to sound very good at high sampling rates.

When interpolation occurs inside a *feedback loop*, such as in digital waveguide models for vibrating strings (see Chapter 6), errors in the amplitude response can be highly audible (particularly when the loop gain is close to 1, as it is for steel strings, for example). In these cases, it is possible to eliminate amplitude error (at some cost in delay error) by using an *allpass filter* for delay-line interpolation.

### First-Order Allpass Interpolation

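A delay line interpolated by a first-order allpass filter is drawn in Fig.4.3. Intuitively, ramping the coefficients of the allpass gradually "grows" or "hides" one sample of delay. This tells us how to handle resets when crossing sample boundaries.

The difference equation is

$$\hat{x}(n-\Delta) \triangleq y(n) = \eta\cdot x(n) + x(n-1) - \eta\cdot y(n-1) = \eta\cdot\left[x(n) - y(n-1)\right] + x(n-1),$$

so that, like linear interpolation, first-order allpass interpolation requires only one multiply and two additions per output sample. The transfer function is

$$H(z) = \frac{\eta + z^{-1}}{1 + \eta z^{-1}}. \qquad (4.2)$$

At low frequencies ($z\to 1$), the delay becomes

$$\Delta \approx \frac{1-\eta}{1+\eta}. \qquad (4.3)$$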

Figure 4.4 shows the *phase delay* of the first-order digital allpass filter for a variety of desired delays at dc. Since the amplitude response of any allpass is 1 at all frequencies, there is no need to plot it.

The first-order allpass interpolator is generally controlled by setting its dc delay to the desired delay. Thus, for a given desired delay $\Delta$, the allpass coefficient is (from Eq.(4.3))

$$\eta \approx \frac{1-\Delta}{1+\Delta}.$$

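The allpass filter's pole is at $z=-\eta$ and its zero is at $z=-1/\eta$, so both approach $z=-1$ as the desired delay approaches zero ($\eta\to 1$). Thus, zero delay is provided by means of a *pole-zero cancellation*! Due to inevitable round-off errors, pole-zero cancellations are to be avoided in practice. For this reason and others (discussed below), allpass interpolation is best used to provide a delay range lying wholly above zero, *e.g.*, $\Delta\in[0.1,1.1]$, corresponding to $\eta\in[-0.05,0.82]$.

Note that, unlike linear interpolation, allpass interpolation is not well suited to "random access" interpolation, in which interpolated values may be requested at arbitrary times in isolation. This is because the allpass is *recursive* so that it must run for enough samples to reach steady state. However, when the impulse response is reasonably short, as it is for delays near one sample, it can in fact be used in "random access mode" by giving it enough samples with which to work.

The `STK` class implementing allpass-interpolated delay is `DelayA`.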
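A first-order allpass interpolator can be sketched as follows. This is an illustrative sketch under stated assumptions, not the `STK` `DelayA` source: the names `allpass_coeff`, `AllpassInterp`, and `allpass_response` are hypothetical, and the filter assumed is the standard first-order allpass $H(z) = (\eta + z^{-1})/(1 + \eta z^{-1})$ with $\eta = (1-\Delta)/(1+\Delta)$. Only the fractional part of the delay is modeled; an integer delay line would precede it in practice.

```python
import cmath

def allpass_coeff(delta):
    """Coefficient giving a dc delay of `delta` samples."""
    return (1.0 - delta) / (1.0 + delta)

class AllpassInterp:
    """y(n) = eta * (x(n) - y(n-1)) + x(n-1): one multiply, two additions."""

    def __init__(self, delta):
        self.eta = allpass_coeff(delta)
        self.x1 = 0.0   # x(n-1)
        self.y1 = 0.0   # y(n-1)

    def tick(self, x):
        y = self.eta * (x - self.y1) + self.x1
        self.x1, self.y1 = x, y
        return y

def allpass_response(eta, omega):
    """Frequency response at radian frequency omega (radians per sample)."""
    z1 = cmath.exp(-1j * omega)          # z^-1 on the unit circle
    return (eta + z1) / (1.0 + eta * z1)

eta = allpass_coeff(0.5)                 # desired dc delay: half a sample
H = allpass_response(eta, 1e-3)
print(abs(H))                            # unit gain (allpass)
print(-cmath.phase(H) / 1e-3)            # phase delay near dc, close to 0.5
```

The gain is 1 at every frequency, while the phase delay equals the desired delay only near dc, which is the "cost in delay error" noted earlier.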

#### Minimizing First-Order Allpass Transient Response

In addition to approaching a pole-zero cancellation at $z=-1$, another undesirable artifact appears as $\Delta\to 0$: The *transient response* also becomes long when the pole at $z=-\eta$ gets close to the unit circle.

A plot of the impulse response for a small delay $\Delta$ is shown in Fig.4.6. We see a lot of "ringing" near half the sampling rate. We actually should expect this from the *nonlinear-phase distortion* which is clearly evident near half the sampling rate in Fig.4.4. We can interpret this phenomenon as the signal components near half the sampling rate being delayed by different amounts than other frequencies, therefore "sliding out of alignment" with them.

For audio applications, we would like to keep the impulse-response duration short enough to sound "instantaneous." That is, we do not wish to have audible "ringing" in the time domain near $f_s/2$. For high quality sampling rates, such as larger than $f_s = 40$ kHz, there is no issue of direct audibility, since the ringing is above the range of human hearing. However, it is often convenient, especially for research prototyping, to work at lower sampling rates where $f_s/2$ is audible. Also, many commercial products use such sampling rates to save costs.

Since the time constant of decay, in samples, of the impulse response of a pole of radius $R$ is approximately

$$\frac{\tau}{T} \approx \frac{1}{1-R},$$

and since a 60 dB decay takes about seven time constants ($\ln 1000 \approx 6.91$), we can limit the pole radius $R=|\eta|$, and hence the smallest allowed delay, to meet any prescribed maximum impulse-response duration.
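As a worked example of bounding the transient duration, the sketch below computes the largest allowed pole radius from a 60 dB decay-time limit, using the approximation $\tau/T \approx 1/(1-R)$ and $t_{60} = \tau \ln 1000$. The 100 ms limit and 10 kHz sampling rate are illustrative values assumed here, and the function names are hypothetical.

```python
import math

def max_pole_radius(t60_seconds, fs):
    """Largest pole radius R whose 60 dB decay time does not exceed t60."""
    n60 = t60_seconds * fs               # decay time in samples
    tau = n60 / math.log(1000.0)         # time constant in samples
    return 1.0 - 1.0 / tau               # from tau ~= 1 / (1 - R)

def min_allpass_delay(eta_max):
    """Smallest dc delay reachable with |eta| <= eta_max."""
    return (1.0 - eta_max) / (1.0 + eta_max)

R = max_pole_radius(0.1, 10000.0)        # assumed: 100 ms at fs = 10 kHz
print(round(R, 5))                       # 0.99309
print(round(min_allpass_delay(R), 5))    # 0.00347
```

Under these assumptions the allpass coefficient must stay below about 0.993, so a one-sample delay range would begin near 0.0035 samples rather than at zero.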

### Linear Interpolation as Resampling

#### Convolution Interpretation

Linearly interpolated fractional delay is equivalent to filtering and resampling a weighted impulse train (the input signal samples) with a continuous-time filter having the simple triangular impulse response

$$h_l(t) = \begin{cases} 1-|t/T|, & |t|\le T \\ 0, & \text{otherwise}, \end{cases} \qquad (4.4)$$

where $T$ denotes the sampling interval. Convolution of the weighted impulse train with $h_l(t)$ produces a continuous-time linearly interpolated signal

$$x(t) = \sum_{n=-\infty}^{\infty} x(nT)\, h_l(t-nT). \qquad (4.5)$$

This continuous result can then be resampled at the desired fractional delay.

In discrete time processing, the operation Eq.(4.5) can be approximated arbitrarily closely by digital *upsampling* by a large integer factor $M$, delaying by $L$ samples (an integer), then finally downsampling by $M$, as depicted in Fig.4.7 [96]. The integers $L$ and $M$ are chosen so that $\eta \approx L/M$, where $\eta$ is the desired fractional delay.

The convolution interpretation of linear interpolation, Lagrange interpolation, and others, is discussed in [407].

#### Frequency Response of Linear Interpolation

Since linear interpolation can be expressed as a convolution of the samples with a triangular pulse, we can derive the *frequency response* of linear interpolation. Figure 4.7 indicates that the triangular pulse $h_l(t)$ serves as an *anti-aliasing lowpass filter* for the subsequent downsampling by $M$. Therefore, it should ideally "cut off" all frequencies higher than $0.5 f_s/M$, where $f_s$ denotes the upsampled rate, *i.e.*, all frequencies above half the final sampling rate.

#### Triangular Pulse as Convolution of Two Rectangular Pulses

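The 2-sample wide triangular pulse $h_l(t)$ (Eq.(4.4)) can be expressed as a convolution of the one-sample rectangular pulse with itself.

The one-sample *rectangular pulse* is shown in Fig.4.8 and may be defined analytically as

$$h_s(t) = u\left(t+\frac{T}{2}\right) - u\left(t-\frac{T}{2}\right),$$

where $T$ is the sampling interval and $u(t)$ is the *Heaviside unit step function*:

$$u(t) = \begin{cases} 1, & t\ge 0 \\ 0, & t<0. \end{cases}$$

Convolving $h_s(t)$ with itself, and normalizing by $T$ to obtain unit peak height, produces the two-sample triangular pulse:

$$h_l(t) = \frac{1}{T}\,(h_s * h_s)(t).$$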
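The claim that a one-sample rectangular pulse convolved with itself gives the two-sample triangular pulse can be checked numerically. The sketch below assumes a normalized sampling interval $T=1$ and approximates the convolution integral by a midpoint Riemann sum; the function names and the step count are hypothetical choices for illustration.

```python
def rect(t):
    """One-sample rectangular pulse with T = 1: 1 on [-0.5, 0.5), else 0."""
    return 1.0 if -0.5 <= t < 0.5 else 0.0

def tri(t):
    """Two-sample triangular pulse with T = 1: 1 - |t| for |t| <= 1."""
    return max(0.0, 1.0 - abs(t))

def rect_conv_rect(t, steps=1000):
    """Midpoint Riemann sum of the integral of rect(s) * rect(t - s) ds."""
    ds = 1.0 / steps
    # rect(s) is 1 only for s in [-0.5, 0.5), so integrate over that interval.
    return sum(rect(t - (-0.5 + (k + 0.5) * ds)) for k in range(steps)) * ds

for t in (-0.6, 0.0, 0.25, 1.2):
    print(t, rect_conv_rect(t), tri(t))   # the two columns agree closely
```

The agreement is limited only by the step size of the Riemann sum.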

#### Linear Interpolation Frequency Response

Since linear interpolation is a convolution of the samples with a triangular pulse $h_l(t)$ (from Eq.(4.5)), the frequency response of the interpolation is given by the Fourier transform $H_l(f)$, which yields a $\text{sinc}^2$ function. This frequency response applies to linear interpolation from discrete time to continuous time. If the output of the interpolator is also sampled, this can be modeled by sampling the continuous-time interpolation result in Eq.(4.5), thereby *aliasing* the $\text{sinc}^2$ frequency response, as shown in Fig.4.9.

In slightly more detail, from $h_l = \frac{1}{T}\,(h_s * h_s)$, where $h_s$ is the one-sample rectangular pulse of width $T$, and $\text{sinc}(x) \triangleq \sin(\pi x)/(\pi x)$, we have

$$H_l(f) = \frac{1}{T}\,H_s^2(f) = T\,\text{sinc}^2(fT),$$

where we used the convolution theorem for Fourier transforms, and the fact that $H_s(f) = T\,\text{sinc}(fT)$.

The Fourier transform of the sampled triangular pulse is the same function aliased on a block whose width in Hz equals the output sampling rate. Both $H_l(f)$ and its alias are plotted in Fig.4.9. The example in this figure pertains to an output sampling rate that is an integer multiple of the input sampling rate. In other words, the input signal is upsampled by an integer factor using linear interpolation. The "main lobe" of the interpolation frequency response contains the original signal bandwidth; note how it is attenuated near half the original sampling rate ($\pm 1$ in Fig.4.9). The "sidelobes" of the frequency response contain attenuated *copies* of the original signal bandwidth (see the DFT stretch theorem), and thus constitute *spectral imaging distortion* in the final output (sometimes also referred to as a kind of "aliasing," but, for clarity, that term will not be used for imaging distortion in this book). We see that the frequency response of linear interpolation is less than ideal in two ways:

- The spectrum is "rolled off" near half the sampling rate. In fact, it is nowhere flat within the "passband" (-1 to 1 in Fig.4.9).
- Spectral imaging distortion is suppressed by only 26 dB (the level of the first sidelobe in Fig.4.9).
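Both shortcomings can be quantified with a short calculation, assuming the continuous-time frequency response of linear interpolation is $\text{sinc}^2(f)$ with $f$ measured in units of the input sampling rate (so half the sampling rate is $f=0.5$ here, which differs from the axis scaling of Fig.4.9). The function names are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def lin_interp_gain_db(f):
    """Gain in dB of sinc^2(f); f is in units of the input sampling rate."""
    return 20.0 * math.log10(sinc(f) ** 2)

# Roll-off at half the input sampling rate
rolloff_db = lin_interp_gain_db(0.5)

# Peak of the first sidelobe, searched on a grid between the nulls at f = 1 and f = 2
sidelobe_db = max(lin_interp_gain_db(1.0 + k / 1000.0) for k in range(1, 1000))

print(round(rolloff_db, 2))    # about -7.84
print(round(sidelobe_db, 2))   # about -26.5
```

The first-sidelobe peak of roughly -26.5 dB matches the 26 dB figure quoted above, and the loss of nearly 8 dB at half the sampling rate shows how far the main lobe is from flat.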

#### Special Cases

In the limiting case of $M=1$ (no upsampling in Fig.4.7), the input and output sampling rates are equal, and all sidelobes of the frequency response (partially shown in Fig.4.9) alias into the main lobe.

If the output is sampled at the same exact time instants as the input signal, the input and output are identical. In terms of the aliasing picture of the previous section, the frequency response aliases to a perfect flat response over the passband (-1 to 1 in Fig.4.9), with all spectral images combining coherently under the flat gain. It is important in this reconstruction that, while the frequency response of the underlying continuous interpolating filter is aliased by sampling, the signal spectrum is only imaged, not aliased; this is true for all positive *integers* $M$ and $L$ in Fig.4.7.

More typically, when linear interpolation is used to provide *fractional delay*, identity is not obtained. Referring again to Fig.4.7, with $M$ considered to be so large that it is effectively infinite, fractional delay by $\tau$ can be modeled as convolving the samples $x(n)$ with the shifted triangular pulse $h_l(t-\tau)$ followed by sampling at $t=nT$. In this case, a linear phase term has been introduced in the interpolator frequency response, giving

$$H_\tau(f) = e^{-j2\pi f\tau}\, T\,\text{sinc}^2(fT),$$

so that now downsampling to the original sampling rate does *not* yield a perfectly flat amplitude response over the passband (when $\tau/T$ is non-integer). Moreover, the *phase response is nonlinear* as well; a sampled symmetric triangular pulse is only linear phase when the samples fall symmetrically about the midpoint. Some example frequency responses for various delays are graphed in Fig.4.2.
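The aliasing picture can be checked numerically. The sketch below assumes $T=1$ and sums the images of $e^{-j2\pi f\tau}\,\text{sinc}^2(f)$ at integer frequency offsets, truncating the infinite sum at a hypothetical `n_images` terms on each side. With $\tau=0$ the sum should be flat, and for $0\le\tau\le 1$ it should match the one-zero filter $(1-\tau) + \tau z^{-1}$. Function names are illustrative.

```python
import cmath
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def aliased_response(f, tau, n_images=2000):
    """Truncated sum over k of exp(-j 2 pi (f - k) tau) * sinc^2(f - k)."""
    total = 0j
    for k in range(-n_images, n_images + 1):
        g = f - k
        total += cmath.exp(-2j * math.pi * g * tau) * sinc(g) ** 2
    return total

def fir_response(f, eta):
    """Response of (1 - eta) + eta * z^-1 at z = exp(j 2 pi f)."""
    return (1.0 - eta) + eta * cmath.exp(-2j * math.pi * f)

print(abs(aliased_response(0.3, 0.0)))   # close to 1: flat when tau = 0
print(aliased_response(0.25, 0.5))       # close to 0.5 - 0.5j
print(fir_response(0.25, 0.5))           # 0.5 - 0.5j up to rounding
print(abs(fir_response(0.5, 0.25)))      # 0.5: gain |1 - 2 eta| at half the sampling rate
```

For a half-sample delay the gain falls to zero at half the sampling rate, which is the worst case of the amplitude distortion described above.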

### Large Delay Changes

When implementing large delay length changes (by many samples), a useful approach is to *cross-fade* from the initial delay line configuration to the new configuration:

- Computational requirements are doubled during the cross-fade.
- The cross-fade should occur over a time interval long enough to yield a smooth result.
- The new delay interpolation filter, if any, may be initialized in advance of the cross-fade, for maximum smoothness. Thus, if the transient response of the interpolation filter is $N$ samples, the new delay-line + interpolation filter can be "warmed up" (executed) for $N$ time steps before beginning the cross-fade. If the cross-fade time is long compared with the interpolation filter duration, "pre-warming" is not necessary.
- This is not a true "morph" from one delay length to another since we do not pass through the intermediate delay lengths. However, it avoids a potentially undesirable Doppler effect.
- A single delay line can be *shared* such that the cross-fade occurs from one *read-pointer* (plus associated filtering) to another.
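The shared-buffer cross-fade described in the list above can be sketched as follows. This is a minimal illustration under simplifying assumptions: integer delays only (no interpolation filter, so no pre-warming), a linear fade, and no new delay change requested while a fade is in progress. The class name `CrossfadeDelay` and its methods are hypothetical.

```python
class CrossfadeDelay:
    """Integer-delay line with two read pointers sharing one circular buffer."""

    def __init__(self, delay, max_delay=4096):
        self.buf = [0.0] * (max_delay + 1)
        self.write = 0
        self.old_delay = delay
        self.new_delay = delay
        self.fade_len = 1
        self.fade_pos = 1          # fade_pos == fade_len means no fade is active

    def set_delay(self, delay, fade_len):
        """Begin a linear cross-fade to `delay` lasting `fade_len` samples."""
        self.old_delay = self.new_delay
        self.new_delay = delay
        self.fade_len = fade_len
        self.fade_pos = 0

    def tick(self, x):
        size = len(self.buf)
        self.buf[self.write] = x
        new = self.buf[(self.write - self.new_delay) % size]
        if self.fade_pos < self.fade_len:
            # Both read pointers are evaluated during the fade.
            old = self.buf[(self.write - self.old_delay) % size]
            g = self.fade_pos / self.fade_len      # ramps from 0 toward 1
            y = (1.0 - g) * old + g * new
            self.fade_pos += 1
        else:
            y = new
        self.write = (self.write + 1) % size
        return y

# Ramp input: delay 2 for ten samples, then fade to delay 5 over four samples
d = CrossfadeDelay(2)
before = [d.tick(float(n)) for n in range(10)]
d.set_delay(5, 4)
during = [d.tick(float(n)) for n in range(10, 15)]
print(during)   # [8.0, 8.25, 8.5, 8.75, 9.0]
```

The output moves smoothly from the old tap to the new one without reading any intermediate delay lengths, so no pitch shift is introduced during the change.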
