Mode Extraction Techniques

Free Books Physical Audio Signal Processing

The goal of resonator factoring is to identify and remove the least-damped resonant modes of the impulse response. In principle, this means ascertaining the precise resonance frequencies and bandwidths associated with each of the narrowest ``peaks'' in the resonator frequency response, and dividing them out via inverse filtering, so they can be implemented separately as resonators in series. If in addition the amplitude and phase of a resonance peak are accurately measurable in the complex frequency response, the mode can be removed by complex spectral subtraction (equivalent to subtracting the impulse-response of the resonant mode from the total impulse response); in this case, the parametric modes are implemented in a parallel bank as in [66]. However, in the parallel case, the residual impulse response is not readily commuted with the string.

In the inverse-filtering case, the factored resonator components are in cascade (series) so that the damped modes left behind may be commuted with the string and incorporated in the excitation table by convolving the residual impulse response with the desired string excitation signal. In the parallel case, the damped modes do not commute with the string since doing so would require somehow canceling them in the parallel filter sections. In principle, the string would have to be duplicated so that one instance can be driven by the residual signal with no body resonances at all, while the other is connected to the parallel resonator bank and driven only by the natural string excitation without any commuting of string and resonator. Since duplicating the string is unlikely to be cost-effective, the impulse response of the high-frequency modes can be commuted and convolved with the string excitation as in the series case to obtain qualitative results. The error in doing this is that the high-frequency modes are being multiplied by the parallel resonators rather than being added to them.

Various methods are available for estimating the mode parameters for inverse filtering:

Amplitude response peak measurement
Weighted digital filter design
Linear prediction
Sinusoidal modeling
Late impulse-response analysis

Amplitude response peak measurement

The longest ringing modes are associated with the narrowest bandwidths. When they are important resonances in the frequency response, they also tend to be the tallest peaks in the frequency response magnitude. (If they are not the tallest near the beginning of the impulse response, they will be the tallest near the end.) Therefore, one effective technique for measuring the least-damped resonances is simply to find the precise location and width of the narrowest and tallest spectral peaks in the measured amplitude response of the resonator. The center-frequency and bandwidth of a narrow frequency-response peak determine two poles in the resonator to be factored out. Expressing a filter in terms of its poles and zeros is one type of ``parametric'' filter representation, as opposed to ``nonparametric'' representations such as the impulse response or frequency response. Prony's method [449,297,273] is one well known technique for estimating the frequencies and bandwidths of sums of exponentially decaying sinusoids (two-pole resonator impulse responses).

In the factoring example presented in §8.8.6, the frequency and bandwidth of the main Helmholtz air mode are measured manually using an interactive spectrum analysis tool. However, it is a simple matter to automate peak-finding in FFT magnitude data. (See, for example, the peak finders used in sinusoidal modeling, discussed a bit further in §8.8.1 below.)

Weighted digital filter design

**Figure 8.13:** Illustration of one way to determine the parameters of a least-damped resonant mode.
$\includegraphics[width=0.8\twidth]{eps/fbrUsingInvfreqz}$

Many methods for digital filter design support spectral weighting functions that can be used to focus in on the least-damped modes in the frequency response. One is the weighted equation-error method which is available in the matlab invfreqz() function (§8.6.4). Figure 8.13 illustrates use of it. For simplicity, only one frequency-response peak plus noise is shown in this synthetic example. First, the peak center-frequency is measured using a quadratically interpolating peak finder operating on the dB spectral magnitude. This is used to set the spectral weighting function. Next, invfreqz() is called to design a two-pole filter having a frequency response that approximates the measured data as closely as possible. The weighting function is also shown in Fig.8.13, renormalized to overlay on the scale of the plot. Finally, the amplitude response of the two-pole filter designed by the equation-error method is shown overlaid in the figure. Note that the agreement is quite good near the peak which is what matters most. The interpolated peak frequency measured initially in the nonparametric spectral magnitude data can be used to fine-tune the pole-angles of the designed filter, thus rendering the equation-error method a technique for measuring only the peak bandwidth in this case. There are of course many, many techniques in the signal processing literature for measuring spectral peaks.

Linear prediction

Another well known method for converting the least-damped modes into parametric form is Linear Predictive Coding (LP) followed by polynomial factorization to obtain resonator poles. LP is particularly good at fitting spectral peaks due to the nature of the error criterion it minimizes [428]. The poles closest to the unit circle in the plane can be chosen for the ``ringy'' part of the resonator. It is well known that when using techniques like LP to model spectral peaks for extraction, orders substantially higher than twice the number of spectral peaks should be used. The extra degrees of freedom in the LP fit give more poles for modeling spectral detail other than the peaks, allowing the poles modeling the peaks to fit them with less distraction. On the other hand, if the order chosen is too high, sometimes more than two poles will home in on the same peak. In some cases, this may be appropriate since the body resonances are not necessary resolvable so as to separate the peaks, especially at high frequencies.

Sinusoidal modeling

Another way to find the least-damped mode parameters is by means of an intermediate sinusoidal model of the body impulse response, or, more appropriately, the energy decay relief (EDR) computed from the body impulse response (see §3.2.2). Such sinusoidal models have been used to determine the string loop filter in digital waveguide strings models. In the case of string loop-filter estimation, the sinusoidal model is applied to the impulse response (or ``pluck'' response) of a vibrating string or acoustic tube. In the present application, it is ideally applied to the EDR of the body impulse response (or ``hammer-strike'' response).

Since sinusoidal modeling software (e.g. [424]) typically quadratically interpolates the peak frequencies, the resonance frequencies are generally quite accurately estimated provided the frame size is chosen large enough to span many cycles of the underlying resonance.

The sinusoidal amplitude envelopes yield a particularly robust measurement of resonance bandwidth. Theoretically, the modal decay should be exponential. Therefore, on a dB scale, the amplitude envelope should decay linearly. Linear regression can be used to fit a straight line to the measured log-amplitude envelope of the impulse response of each long-ringing mode. Note that even when amplitude modulation is present due to modal couplings, the ripples tend to average out in the regression and have little effect on the slope measurement. This robustness can be enhanced by starting and ending the linear regression on local maxima in the amplitude envelope. A method for estimating modal decay parameters in the presence of noise is given in [125,234,235].

Below is a section of matlab code which performs linear regression:

    function [slope,offset] = fitline(x,y);
    %FITLINE    fit line 'y = slope * x + offset'
    %           to column vectors x and y.

        phi = [x, ones(length(x),1)];
        p = phi' * y;
        r = phi' * phi;
        t = r\p;
        slope = t(1);
        offset = t(2);

Late impulse-response analysis

All methods useable with inverse filtering can be modified based on the observation that late in the impulse response, the damped modes have died away, and the least-damped modes dominate. Therefore, by discarding initial impulse-response data, the problem in some sense becomes ``easier'' at the price of working closer to the noise floor. This technique is most appropriate in conjunction with the inverse filtering method for mode extraction (discussed below), since for subtraction, the modal impulse response must be extrapolated back to the beginning of the data record. However, methods used to compute the filter numerator in variations on Prony's method can be used to scale and phase-align a mode for subtraction [428,297].

One simple approximate technique based on looking only at the late impulse response is to take a zero-padded FFT of the latest Hanning-windowed data segment. The least-damped modes should give very clearly dominant peaks in the FFT magnitude data. As discussed above, the peak(s) can be interpolated to estimate the mode resonance frequency, and the bandwidth can be measured to determine the time-constant of decay. Alternatively, the time-constant of decay can be measured in the time domain by measuring the decay slope of the log amplitude envelope across the segment (this time using a rectangular window). Since the least-damped mode is assumed to be isolated in the late decay, it is easy to form a pitch-synchronous amplitude envelope.

Next Section:
Inverse Filtering
Previous Section:
Further Reading in Commuted Synthesis