# Virtual Musical Instruments

This chapter discusses aspects of synthesis models for specific musical instruments such as guitars, pianos, woodwinds, bowed strings, and brasses. Initially we look at model components for electric guitars, followed by some special considerations for acoustic guitars. Next we focus on basic models for string excitation, first by a mass (a simplified piano hammer), then by a damped spring (plectrum model). Piano modeling is then considered, using commuted synthesis for the acoustic resonators and a parametric signal model for the force pulse of the hammer. Finally, elementary synthesis of single-reed woodwinds, such as the clarinet, and bowed-string instruments, such as the violin, are summarized, and some pointers are given regarding other wind and percussion instruments.

## Electric Guitars

For electric guitars, which are mainly about the vibrating string and subsequent effects processing, we pick up where §6.7 left off, and work our way up to practical complexity step by step. We begin with practical enhancements for the waveguide string model that are useful for all kinds of guitar models.

### Length Three FIR Loop Filter

The simplest nondegenerate example of the loop filters of §6.8 is the three-tap FIR case.^{10.1} The symmetry constraint leaves two degrees of freedom in the frequency response, one of which acts as a *damping* control. As damping is increased, the duration of free vibration is reduced at all nonzero frequencies, and the decay of higher frequencies is accelerated relative to lower frequencies, provided the coefficients remain within their valid range.

### Length Three FIR Loop Filter Controlled by ``Brightness'' and ``Sustain''

Another convenient parametrization of the second-order symmetric FIR case is obtained when the dc normalization is relaxed so that two degrees of freedom are retained. It is then convenient to control them as *brightness* and *sustain*, where the period (total loop delay) and the desired sustain time are given in seconds, and the brightness parameter lies in the interval [0, 1]. The sustain parameter is defined here as the time to decay by 60 dB (about 6.91 time constants) when brightness is maximum, in which case the loop gain is the same at all frequencies. As the brightness is lowered, the dc gain remains fixed while higher frequencies decay faster. At the minimum brightness, the gain at half the sampling rate reaches zero, and the loop-filter amplitude response takes on a lowpass form with a null at half the sampling rate.
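The coefficient mapping can be sketched in Python (the explicit formulas below are inferred from the boundary conditions stated above, not taken from the text: the dc gain follows from the 60 dB sustain definition, and the gain at half the sampling rate scales linearly with brightness):

```python
def brightness_sustain_coeffs(P, t60, B):
    """Symmetric 3-tap FIR loop-filter taps (h1, h0, h1) from the
    period P (seconds), sustain time t60 (seconds), and brightness
    B in [0, 1].  Inferred so that the dc loop gain per period is
    g = 0.001**(P/t60) (i.e., -60 dB after t60 seconds) and the
    gain at half the sampling rate is g*B."""
    g = 0.001 ** (P / t60)    # dc loop gain per period
    h0 = g * (1.0 + B) / 2.0  # center tap
    h1 = g * (1.0 - B) / 4.0  # outer taps
    return h1, h0, h1

h1, h0, _ = brightness_sustain_coeffs(P=0.005, t60=2.0, B=0.5)
dc_gain = h0 + 2 * h1       # = g, independent of brightness
nyquist_gain = h0 - 2 * h1  # = g*B, vanishing at minimum brightness
```

At maximum brightness the outer taps vanish and the loop filter reduces to a pure gain, matching the flat-loop-gain case described above.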

### One-Zero Loop Filter

If we relax the constraint that the filter length be odd, then the simplest case becomes the *one-zero* digital filter. With equal coefficients, this filter is *linear phase,* and its phase delay and group delay are equal to 1/2 sample [362]. In practice, the half-sample delay must be compensated elsewhere in the filtered delay loop, such as in the delay-line interpolation filter [207]. Normalizing the dc gain to unity removes the last degree of freedom, so that both coefficients equal 1/2. See [454] for related discussion from a software implementation perspective.
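The half-sample phase delay of the symmetric one-zero filter can be verified numerically (a Python sketch; the coefficient value 1/2 corresponds to the dc-normalized symmetric case discussed above):

```python
import cmath

def one_zero_response(b0, w):
    """Frequency response of the one-zero filter
    H(z) = b0 + (1 - b0)*z^-1 (dc gain normalized to 1),
    evaluated at radian frequency w (radians per sample)."""
    return b0 + (1.0 - b0) * cmath.exp(-1j * w)

w = 0.3
H = one_zero_response(0.5, w)      # symmetric case b0 = 1/2
phase_delay = -cmath.phase(H) / w  # 0.5 samples, independent of w
```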

### The Karplus-Strong Algorithm

The simulation diagram for the ideal string with the simplest frequency-dependent loss filter is shown in Fig. 9.1. Readers of the computer music literature will recognize this as the structure of the *Karplus-Strong algorithm* [236,207,489].

The Karplus-Strong algorithm, *per se,* is obtained when the delay-line initial conditions used to ``pluck'' the string consist of random numbers, or ``white noise.'' We know that the initial *shape* of the string is obtained by *adding* the upper and lower delay lines of Fig. 6.11. It is shown in §C.7.4 that the initial *velocity* distribution along the string is determined by the *difference* between the upper and lower delay lines. Thus, in the Karplus-Strong algorithm, the string is ``plucked'' by a *random initial displacement and initial velocity distribution.* This is a very energetic excitation, and in practice the white noise is usually lowpass filtered; the lowpass cut-off frequency gives an effective *dynamic level* control, since natural stringed instruments are typically brighter at louder dynamic levels [428,207]. Karplus-Strong sound examples are available on the Web. An implementation of the Karplus-Strong algorithm in the Faust programming language is described (and provided) in [454].
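The basic algorithm is compact enough to sketch in full (a Python sketch using the two-point average as the loss filter; the period and seed values are arbitrary):

```python
import random

def karplus_strong(period, n_samples, seed=1):
    """Basic Karplus-Strong pluck: a delay line initialized with
    white noise, recirculated through a two-point average (the
    simplest frequency-dependent loss filter)."""
    rng = random.Random(seed)
    buf = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    out = []
    for _ in range(n_samples):
        out.append(buf[0])
        avg = 0.5 * (buf[0] + buf[1])  # loss filter
        buf = buf[1:] + [avg]          # recirculate
    return out

tone = karplus_strong(period=100, n_samples=8000)  # f0 near fs/100
```

Because the averaging filter attenuates high frequencies most strongly, the initially noisy waveform rapidly evolves toward a smooth, decaying quasi-periodic tone.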

### The Extended Karplus-Strong Algorithm

Figure 9.2 shows a block diagram of the *Extended Karplus-Strong* (EKS) algorithm [207]. The EKS adds the following features to the KS algorithm:

- a pick-direction lowpass filter
- a pick-position comb filter
- a dynamic-level lowpass filter
- a string-damping filter
- a string-stiffness allpass filter
- a tuning allpass filter for fine pitch control

### Nonlinear Distortion

In §6.13, *nonlinear elements* were introduced in the context of general digital waveguide synthesis. In this section, we discuss virtual electric guitar distortion specifically, and mention other instances of audible nonlinearity in stringed musical instruments. As discussed in Chapter 6, typical vibrating strings in musical acoustics are well approximated as linear, time-invariant systems, but there are special cases in which *nonlinear* behavior is desired.

#### Tension Modulation

In every freely vibrating string, the fundamental frequency declines over time as the amplitude of vibration decays. This is due to *tension modulation*, which is often audible at the beginning of plucked-string tones, especially for low-tension strings. It happens because higher-amplitude vibrations *stretch* the string to a longer average length, raising the average string tension; higher tension means faster wave propagation and hence a higher fundamental frequency. There are several methods in the literature for simulating tension modulation in a digital waveguide string model [494,233,508,512,513,495,283], as well as in membrane models [298]. The methods can be classified into two categories, *local* and *global*. Local tension-modulation methods modulate the speed of sound locally as a function of amplitude. For example, opposite delay cells in a force-wave digital waveguide string can be summed to obtain the instantaneous vertical force across that string sample, and the length of the adjacent propagation delay can be modulated using a first-order allpass filter. In principle, the string slope is reduced as the local tension increases. (Recall from Chapter 6 or Appendix C that force waves are minus the string tension times slope.) Global tension-modulation methods [495,494] essentially modulate the string delay-line length as a function of the total energy in the string.
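The physical mechanism behind the global methods can be illustrated with a small numerical sketch (Python; the half-sine string shape, amplitudes, and string length are illustrative assumptions, and the small-slope elongation approximation used is standard):

```python
import math

def fractional_elongation(A, L, n=10000):
    """Fractional elongation of a string of length L displaced into a
    half-sine of amplitude A, via the small-slope approximation
    elongation ~ (1/2) * integral of slope^2 dx (midpoint rule)."""
    dx = L / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        slope = A * (math.pi / L) * math.cos(math.pi * x / L)
        total += 0.5 * slope * slope * dx
    return total / L  # fractional length (hence tension) increase

# Larger amplitude -> more stretch -> higher average tension
# -> faster wave propagation -> sharper pitch:
small = fractional_elongation(A=0.001, L=0.65)
large = fractional_elongation(A=0.005, L=0.65)
```

Since the fundamental scales as the square root of tension, the pitch offset decays along with the vibration amplitude, producing the characteristic downward pitch glide.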

#### String Length Modulation

A number of stringed musical instruments have a ``nonlinear sound'' that comes from modulating the physical termination of the string (as opposed to its acoustic length, as in tension modulation). The Finnish Kantele [231,513] has a different effective string length in the vertical and horizontal vibration planes due to a loose knot attaching the string to a metal bar. There is also nonlinear feeding of the second harmonic due to a nonrigid tuning peg. Perhaps a better known example is the Indian *sitar*, in which a curved ``jawari'' (functioning as a nonlinear bridge) serves to shorten the string gradually as it displaces toward the bridge. The Indian *tambura* also employs a *thread* perpendicular to the strings a short distance from the bridge, which serves to shorten the string whenever string displacement toward the bridge exceeds a certain distance. Finally, the *slap bass* playing technique for bass guitars involves hitting the string hard enough to cause it to beat against the neck during vibration [263,366]. In all of these cases, the string length is physically modulated in some manner each period, at least when the amplitude is sufficiently large.

#### Hard Clipping

A widespread class of distortion used with *electric guitars* is *clipping* of the guitar waveform. It is easy to add this effect to any string-simulation algorithm by passing the output signal through a nonlinear *clipping function*. For example, a *hard clipper* has the characteristic (in normalized form) of passing each input sample through unchanged when it lies between -1 and 1, and saturating it at those limits otherwise.
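In code, the normalized hard clipper is a three-way branch (a Python sketch):

```python
def hard_clip(x):
    """Normalized hard clipper: identity on [-1, 1],
    saturating at -1 and +1 outside that range."""
    if x > 1.0:
        return 1.0
    if x < -1.0:
        return -1.0
    return x
```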

#### Soft Clipping

A *soft clipper* is similar to a hard clipper, but with the corners smoothed. A common choice of soft clipper is the *cubic nonlinearity*, *e.g.*, [489]. This particular soft-clipping characteristic is diagrammed in Fig. 9.3. An analysis of its spectral characteristics, with some discussion of the aliasing it may cause, was given in §6.13. An input gain may be used to set the desired degree of distortion.
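A Python sketch of a cubic soft clipper in the style of [489] (the saturation level of 2/3 follows from continuity of f(x) = x - x^3/3 at x = 1):

```python
def soft_clip(x):
    """Cubic soft clipper: f(x) = x - x^3/3 on [-1, 1],
    saturating at +-2/3 outside (continuous, with zero slope
    at the transition points x = +-1)."""
    if x >= 1.0:
        return 2.0 / 3.0
    if x <= -1.0:
        return -2.0 / 3.0
    return x - x ** 3 / 3.0
```

An input gain applied before `soft_clip` controls how far the signal is driven into the curved region, i.e., the degree of distortion.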

#### Enhancing Even Harmonics

A cubic nonlinearity, as well as *any* odd distortion law,^{10.2} generates only odd-numbered harmonics (as in a square wave). For best results, and in particular for *tube distortion* simulation [31,395], it has been argued that some amount of even-numbered harmonics should also be present. Breaking the odd symmetry in any way will add even-numbered harmonics to the output. One simple way to accomplish this is to add an *offset* to the input signal, which introduces some square-law distortion into the resulting third-order polynomial. The square-law is the most gentle nonlinear distortion in existence, adding only some second harmonic to a sinusoidal input signal. A compensating constant can be chosen to zero the output mean, on average; if the input signal is zero-mean with unit variance, this constant cancels the nonzero mean induced by the squaring term. Typically, the output of any audio effect is *mixed* with the original input signal to allow easy control over the amount of effect. The linear term of the polynomial can be used for this, provided the constant gains of the linear and distorted components are adjusted accordingly, or the input is hard-clipped to the desired range.
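The effect of the offset can be verified numerically: driving the cubic with an offset sinusoid produces a measurable second harmonic, while the symmetric (zero-offset) case does not (a Python sketch; the offset 0.5 and amplitude 0.4 are illustrative values, chosen to stay out of the clipping region):

```python
import math

def distort(x, c=0.5):
    """Cubic soft clip of an offset input; the offset c breaks
    odd symmetry and thereby introduces even harmonics."""
    u = max(-1.0, min(1.0, x + c))
    return u - u ** 3 / 3.0

def harmonic_mag(sig, k):
    """Magnitude of the k-th harmonic of one period, by correlation."""
    N = len(sig)
    re = sum(s * math.cos(2 * math.pi * k * n / N) for n, s in enumerate(sig))
    im = sum(s * math.sin(2 * math.pi * k * n / N) for n, s in enumerate(sig))
    return math.hypot(re, im) / N

N = 1024
offset_out = [distort(0.4 * math.sin(2 * math.pi * n / N)) for n in range(N)]
sym_out = [distort(0.4 * math.sin(2 * math.pi * n / N), c=0.0) for n in range(N)]
h2_offset = harmonic_mag(offset_out, 2)  # clearly nonzero
h2_sym = harmonic_mag(sym_out, 2)        # zero, up to rounding error
```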

#### Software for Cubic Nonlinear Distortion

The function `cubicnl` in the Faust software distribution (file `effect.lib`) implements cubic nonlinearity distortion [454]. The Faust programming example `cubic_distortion.dsp` provides a real-time demo with adjustable parameters, including an offset for bringing in even harmonics. The free, open-source `guitarix` application (for Linux platforms) uses this type of distortion effect along with many other guitar effects. In the Synthesis Tool Kit (STK) [91], the class `Cubicnl.cpp` implements a general cubic distortion law.

### Amplifier Feedback

A more extreme effect used with distorted electric guitars is *amplifier feedback*. In this case, the amplified guitar waveforms couple back to the strings with some gain and delay, as depicted schematically in Fig. 9.4 [489]. The Amplifier Feedback Delay in the figure can be adjusted to emphasize certain partial overtones over others. If the loop gain, controllable via the Amplifier Feedback Gain, is greater than 1 at any frequency, a sustained ``feedback howl'' will be produced. Note that in commercial devices, the pre-distortion and post-distortion gains are frequency-dependent, *i.e.*, they are implemented as pre- and post-*equalizers* (typically with only a few bands, such as three). Another simple choice is an integrator for the pre-distortion gain and a differentiator for the post-distortion gain. Faust software implementing electric-guitar amplifier feedback may be found in [454].

### Cabinet Filtering

The distortion output signal is often further filtered by an *amplifier cabinet filter*, representing the enclosure, driver responses, etc. It is straightforward to measure the impulse response (or frequency response) of a speaker cabinet and fit it with a recursive digital filter design function such as `invfreqz` in matlab (§8.6.4). Individual speakers generally have one major resonance--the ``cone resonance''--which is determined by the mass, compliance, and damping of the driver subassembly. The enclosing cabinet introduces many more resonances in the audio range, generally tuned to even out the overall response as much as possible. Faust software implementing an idealized speaker filter may be found in [454].

### Duty-Cycle Modulation

For class A tube amplifier simulation, there should be level-dependent *duty-cycle modulation* as a function of amplitude level:^{10.3}

- 50% duty cycle at low amplitude levels (no duty-cycle modulation)
- 55-65% duty cycle at high amplitude levels

### Vacuum Tube Modeling

The elementary nonlinearities discussed above are aimed at approximating the sound of *amplifier distortion*, as often used by guitar players. Generally speaking, the distortion characteristics of vacuum-tube amplifiers are considered superior to those of solid-state amplifiers. A real-time, numerically solved, nonlinear differential-algebraic model for class A single-ended guitar power amplification is presented in [84]. Real-time simulation of a tube guitar amp (triode preamp, inverter, and push-pull amplifier) based on a precomputed approximate solution of the nonlinear differential equations is presented in [293]. Other references in this area include [338,228,339].

## Acoustic Guitars

This section addresses some modeling considerations specific to *acoustic* guitars. Section 9.2.1 discusses special properties of acoustic guitar *bridges* and gives ways to simulate passive bridges, including a look at some laboratory measurements. For further reading in this area, see, *e.g.*, [281,128].

### Bridge Modeling

In §6.3 we analyzed the effect of *rigid string terminations* on traveling waves. We found that wave variables derived from time derivatives of displacement (displacement, velocity, acceleration, and so on) reflect with a sign inversion, while waves defined in terms of the first spatial derivative of displacement (force, slope) reflect with no sign inversion. We now look at the more realistic case of *yielding terminations* for strings. This analysis can be considered a special case of the loaded string junction analyzed in §C.12. Yielding string terminations (at the bridge) have a large effect on the sound produced by acoustic stringed instruments. Rigid terminations can be considered a reasonable model for the *solid-body electric guitar*, in which *maximum sustain* is desired for played notes. Acoustic guitars, on the other hand, must transduce sound energy from the strings into the body of the instrument, and from there to the surrounding air. All audible sound energy comes from the string vibrational energy, so this transmission reduces the sustain (decay time) of each played note. Furthermore, because the bridge vibrates more easily in one direction than another, a kind of ``chorus effect'' is created from the detuning of the horizontal and vertical planes of string vibration (as discussed further in §6.12.1). A perfectly rigid bridge, in contrast, cannot transmit any sound into the body of the instrument, thereby requiring some other transducer, such as the magnetic pickups used in electric guitars, to extract sound for output.^{10.4}

#### Passive String Terminations

When a traveling wave reflects from the bridge of a real stringed instrument, the bridge moves, transmitting sound energy into the instrument body. How far the bridge moves is determined by the *driving-point impedance* of the bridge: the ratio of the Laplace transform of the force applied to the bridge to that of the resulting bridge velocity. For passive systems (*i.e.*, for all unamplified acoustic musical instruments), the driving-point impedance is *positive real* (a property defined and discussed in §C.11.2). Being positive real has strong implications on the nature of the impedance. In particular, its phase cannot exceed plus or minus 90 degrees at any frequency, and in the lossless case, all poles and zeros must *interlace* along the frequency axis. Another implication is that the *reflectance* of a passive bridge, as seen by traveling waves on the string, is a so-called *Schur function* (defined and discussed in §C.11); a Schur reflectance is a stable filter having gain not exceeding 1 at any frequency. In summary, a guitar bridge is passive if and only if its driving-point impedance is positive real and (equivalently) its reflectance is Schur. See §C.11 for a fuller discussion of this point. At the termination, the force on the bridge is given by the string tension times the *slope* of the string there (§C.7.2), and the corresponding impedance relation follows in the Laplace frequency domain.

#### A Terminating Resonator

Suppose a guitar bridge couples an ideal vibrating string to a single resonance, as depicted schematically in Fig. 9.5. This is often an accurate model of an acoustic bridge impedance in a narrow frequency range, especially at low frequencies where the resonances are well separated. Then, as developed in Chapter 7, the driving-point impedance seen by the string at the bridge is *positive real* (§C.11.2).

#### Bridge Reflectance

The bridge *reflectance* is needed as part of the loop filter in a digital waveguide model (Chapter 6). As derived in §C.11.1, the force-wave reflectance of the bridge, as seen from the string, is determined by the bridge impedance together with the wave impedance of the ideal string and the resonance frequency in radians per second. The velocity reflectance is simply minus the force reflectance (§C.11.1).
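As a numerical check, the standard reflectance relation rho(s) = (Rb(s) - R0)/(Rb(s) + R0) can be evaluated for a single mass-spring-damper resonance (a Python sketch; the parameter values m, mu, k, and R0 are illustrative assumptions):

```python
import math

def force_reflectance_mag(w, m=0.01, mu=0.5, k=1e5, R0=1.0):
    """|rho(j*w)| for rho(s) = (Rb(s) - R0) / (Rb(s) + R0), with a
    single-resonance bridge impedance Rb(s) = m*s + mu + k/s
    (mass-spring-damper; illustrative parameter values)."""
    s = 1j * w
    Rb = m * s + mu + k / s
    return abs((Rb - R0) / (Rb + R0))

w_res = math.sqrt(1e5 / 0.01)  # resonance frequency (rad/s)
mags = [force_reflectance_mag(w) for w in (10.0, 100.0, w_res, 1e5)]
```

Since the resistive part mu is positive, the magnitude never exceeds 1 at any frequency (the Schur property); it dips most deeply at the resonance, where energy is absorbed by the damper.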

#### Bridge Transmittance

The bridge *transmittance* is the filter needed for the signal path from the vibrating string to the resonant acoustic body. Since the bridge velocity equals the string endpoint velocity (a ``series'' connection), the velocity transmittance is simply one plus the velocity reflectance.

#### Digitizing Bridge Reflectance

Converting continuous-time transfer functions such as the bridge reflectance and transmittance to the digital domain is analogous to converting an analog electrical filter to a corresponding digital filter--a problem which has been well studied [343]. For this task, the *bilinear transform* (§7.3.2) is a good choice. In addition to preserving order and being free of aliasing, the bilinear transform preserves the positive-real property of passive impedances (§C.11.2). Digitizing the bridge reflectance via the bilinear transform gives a *Schur* filter that becomes an allpass as the damping approaches zero. The bilinear-transform constant can be chosen to map the peak frequency without error (see Problem 4).
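The peak-matching choice of bilinear-transform constant can be sketched as follows (Python; the sampling rate and peak frequency are illustrative). Under the mapping s = c*(1 - 1/z)/(1 + 1/z), the digital radian frequency wd reaches the analog frequency c*tan(wd*T/2), so choosing c = w1/tan(w1*T/2) makes the mapping exact at w1:

```python
import math

def prewarped_c(w1, fs):
    """Bilinear-transform constant chosen so that analog radian
    frequency w1 maps to the same digital frequency (peak matching)."""
    T = 1.0 / fs
    return w1 / math.tan(w1 * T / 2.0)

def analog_freq(wd, c, fs):
    """Analog frequency reached at digital frequency wd under
    s = c*(1 - 1/z)/(1 + 1/z), with z = exp(1j*wd*T)."""
    T = 1.0 / fs
    return c * math.tan(wd * T / 2.0)

fs = 44100.0
w1 = 2 * math.pi * 1000.0  # resonance peak to be preserved (rad/s)
c = prewarped_c(w1, fs)
# analog_freq(w1, c, fs) recovers w1; other frequencies are warped.
```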

#### A Two-Resonance Guitar Bridge

Now let's consider a two-resonance guitar bridge, as shown in Fig. 9.6. (Its pole-zero diagram provides a good exercise in the *graphical method* for calculating the *phase response* of a linear system [449].) Below the first resonance, we may say that the system is *stiffness controlled* (admittance phase near +90 degrees), while above the first resonance, we say it is *mass controlled* (admittance phase near -90 degrees). This qualitative description is typical of any lightly damped, linear, dynamic system. As we proceed up the frequency axis, we next encounter a near-zero, or ``anti-resonance,'' above which the system again appears ``stiffness controlled,'' or spring-like, and so on in alternation to infinity. The strict alternation of poles and zeros near the frequency axis is required by the *positive real* property of all passive admittances (§C.11.2).

#### Measured Guitar-Bridge Admittance

A *measured* driving-point admittance [269] for a real guitar bridge is shown in Fig. 9.7. Note that at very low frequencies, the phase does not appear to be bounded by plus or minus 90 degrees, as it must be for the admittance to be positive real (passive). This indicates a poor signal-to-noise ratio in the measurements at very low frequencies. This can be verified by computing and plotting the *coherence* function between the bridge input and output signals using multiple physical measurements.^{10.5} Figures 9.8 and 9.9 show overlays of the admittance magnitudes and phases, and also the input-output coherence, for three separate measurements. A coherence of 1 means that all the measurements are identical, while a coherence less than 1 indicates variation from measurement to measurement, implying a low signal-to-noise ratio. As can be seen in the figures, at frequencies for which the coherence is very close to 1, successive measurements agree closely and the data are reliable. Where the coherence drops below 1, successive measurements disagree; this is why the measured admittance fails to be positive real at very low frequencies.

#### Building a Synthetic Guitar Bridge Admittance

To construct a synthetic guitar bridge model, we can first measure empirically the admittance of a real guitar bridge, or we can work from measurements published in the literature, as shown in Fig. 9.7. Each peak in the admittance curve corresponds to a *resonance* in the guitar body that is well coupled to the strings via the bridge. Whether or not the corresponding vibrational mode *radiates efficiently* depends on the geometry of the vibrational mode and how well it couples to the surrounding air. Thus, a complete bridge model requires not only a synthetic bridge admittance, which determines the reflectance ``seen'' on the string, but also a separate filter which models the *transmittance* from the bridge to the outside air; the transmittance filter normally contains the same poles as the reflectance filter, but the zeros are different. Moreover, keep in mind that each string sees a slightly different reflectance and transmittance because it is located at a slightly different point relative to the guitar top plate; this changes the coupling coefficients to the various resonating modes to some extent. (We normally ignore this for simplicity and use the same bridge filters for all the strings.) Finally, also keep in mind that each string excites the bridge in *three dimensions*. The two most important are the *horizontal and vertical planes of vibration*, corresponding to the two planes of transverse vibration on the string. The vertical plane is normal to the guitar top plate, while the horizontal plane is parallel to it. Longitudinal waves also excite the bridge, and they can be important as well, especially in the piano. Since longitudinal waves travel much faster in strings than transverse waves, the corresponding overtones in the sound are normally inharmonically related to the main (nearly harmonic) overtones set up by the transverse string vibrations. The frequency, complex amplitude, and width of each peak in the measured admittance of a guitar bridge can be used to determine the parameters of a second-order digital resonator in a parallel bank of such resonators being used to model the bridge impedance. This is a variation on *modal synthesis* [5,299]. However, for the bridge model to be *passive* when attached to a string, its transfer function must be *positive real*, as discussed previously. Since strings are very nearly lossless, passivity of the bridge model is quite critical in practice. If the bridge model is even slightly active at some frequency, it can either make the whole guitar model unstable, or it might objectionably lengthen the decay time of certain string overtones. We will describe two methods of constructing passive ``bridge filters'' from measured modal parameters. The first is guaranteed to be positive real but has some drawbacks. The second method gives better results, but it has to be checked for passivity and possibly modified to give a positive-real admittance. Both methods illustrate more generally applicable signal processing methods.

#### Passive Reflectance Synthesis--Method 1

The first method is based on constructing a *passive reflectance* having the desired poles, and then converting to an admittance via the fundamental relation between reflectance and admittance, in which an arbitrary real, positive number plays the role of the wave admittance of the string on which waves enter and return from the bridge. As we saw in §C.11.1, every passive impedance corresponds to a passive reflectance which is a Schur function (stable and having gain not exceeding 1 around the unit circle). Since damping is light in a guitar bridge impedance (otherwise the strings would not vibrate very long, and sustain is a highly prized feature of real guitars), we can expect the bridge reflectance to be close to an allpass transfer function. It is well known that every allpass transfer function can be expressed as a delay times the ratio of the ``flip'' (coefficient reversal) of its denominator polynomial to that polynomial itself.^{10.6} We also include a shaping filter, which we require to have a frequency response that is real and positive. Note that being a symmetric FIR filter is not sufficient; there also can be no zero-crossings in the frequency response, which tends to limit how far the impedance magnitudes can be shaped. (For simplicity, feel free to forget about the shaping filter and accept that all the admittance resonances will be constrained to the same height; the second method is easier anyway.) Recall that in every allpass filter with real coefficients, to every pole at radius R there corresponds a zero at radius 1/R.^{10.7} Because the impedance is lightly damped, the poles and zeros of the corresponding reflectance are close to the unit circle. This means that at points along the unit circle between the poles, the poles and zeros tend to cancel. It can easily be seen, using the graphical method for computing the phase of the frequency response, that the pole-zero angles in the allpass filter are very close to the resonance frequencies in the corresponding passive impedance [429]. Furthermore, the distance of the allpass poles to the unit circle controls the bandwidth of the impedance peaks. Therefore, to a first approximation, we can treat the allpass pole angles as equal to the impedance pole angles, and the pole radii in the allpass can be set to give the desired impedance peak bandwidths. The zero-phase shaping filter gives the desired mode heights. From the measured peak frequencies and bandwidths in the guitar bridge admittance, we may approximate the pole locations, as carried out in the matlab below.

#### Passive Reflectance Synthesis--Method 2

The second method is based on constructing a partial fraction expansion of the admittance directly, as a parallel sum of second-order resonators with unit numerators, plus a near-zero at dc. One reason the result *tends* to be positive real is that, by using the same phase for each section (the 1 in all the section numerators), we are sure to get a zero forming at some frequency near the middle between the resonance frequencies, and at a similar distance from the unit circle. This means we are constructing interlacing poles and zeros by simply adding the resonators in parallel. The extra zero near dc is to ensure that the admittance looks like a lightly damped spring at zero frequency. Since half the sampling rate merely ``cuts off'' the frequency response, there is no corresponding requirement to add a pole near it as if it were the point at infinity. However, the phase should be checked not to exceed plus or minus 90 degrees there (or at any other frequency), and a pole or zero added if necessary to pull it into the positive-real range. A simple example of a synthetic bridge constructed using this method is shown in Fig. 9.11.
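A Python sketch of this construction, using the modal frequencies and bandwidths from the matlab example below (unit numerators in each two-pole section, plus a near-zero at dc):

```python
import cmath, math

fs = 8192.0
F = [4.64, 96.52, 189.33, 219.95]  # modal frequencies (Hz)
Bw = [10.0, 10.0, 10.0, 10.0]      # modal bandwidths (Hz)

def admittance(f):
    """Parallel sum of unit-numerator two-pole sections with a
    near-zero at dc, evaluated at frequency f Hz."""
    z = cmath.exp(2j * math.pi * f / fs)
    total = 0j
    for fk, bk in zip(F, Bw):
        R = math.exp(-math.pi * bk / fs)  # pole radius from bandwidth
        th = 2 * math.pi * fk / fs        # pole angle from frequency
        a1 = -2 * R * math.cos(th)
        a2 = R * R
        total += 1.0 / (1 + a1 / z + a2 / (z * z))
    return (1 - 0.995 / z) * total        # near-zero at dc

peak = abs(admittance(96.52))   # on a resonance
valley = abs(admittance(143.0)) # between resonances (interlacing zero)
```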

#### Matlab for Passive Reflectance Synthesis Method 1

```matlab
fs = 8192;   % Sampling rate in Hz (small for display convenience)
fc = 300;    % Upper frequency to look at
nfft = 8192; % FFT size (spectral grid density)
nspec = nfft/2+1;
nc = round(nfft*fc/fs);
f = ([0:nc-1]/nfft)*fs;

% Measured guitar body resonances
F = [4.64 96.52 189.33 219.95]; % frequencies
B = [ 10   10    10     10   ]; % bandwidths
nsec = length(F);

R = exp(-pi*B/fs);     % Pole radii
theta = 2*pi*F/fs;     % Pole angles
poles = R .* exp(j*theta);
A1 = -2*R.*cos(theta); % 2nd-order section coeff
A2 = R.*R;             % 2nd-order section coeff
denoms = [ones(size(A1)); A1; A2]'
A = [1,zeros(1,2*nsec)];
for i=1:nsec, % polynomial multiplication = FIR filtering:
  A = filter(denoms(i,:),1,A);
end;
```

Now `A` contains the (stable) denominator of the desired bridge admittance. We now want to construct a numerator which gives a positive-real result. We'll do this by first creating a passive reflectance and then computing the corresponding PR admittance.

```matlab
g = 0.9;         % Uniform loss factor
B = g*fliplr(A); % Flip(A)/A = desired allpass

Badm = A - B;
Aadm = A + B;
Badm = Badm/Aadm(1); % Renormalize
Aadm = Aadm/Aadm(1);

% Plots
fr = freqz(Badm,Aadm,nfft,'whole');
nc = round(nfft*fc/fs);
spec = fr(1:nc);
f = ([0:nc-1]/nfft)*fs;
dbmag = db(spec);
phase = angle(spec)*180/pi;
subplot(2,1,1);
plot(f,dbmag); grid;
title('Synthetic Guitar Bridge Admittance');
ylabel('Magnitude (dB)');
subplot(2,1,2);
plot(f,phase); grid;
ylabel('Phase (degrees)');
xlabel('Frequency (Hz)');
```

#### Matlab for Passive Reflectance Synthesis Method 2

```matlab
... as in Method 1 for poles ...

% Construct a resonator as a sum of arbitrary modes
% with unit residues, adding a near-zero at dc.
B = zeros(1,2*nsec+1);
impulse = [1,zeros(1,2*nsec)];
for i=1:nsec, % filter() does polynomial division here:
  B = B + filter(A,denoms(i,:),impulse);
end;
% add a near-zero at dc
B = filter([1 -0.995],1,B);

... as in Method 1 for display ...
```

#### Matrix Bridge Impedance

A passive *matrix* bridge admittance (*e.g.*, treating all six strings of the guitar together) can be conveniently synthesized as a vector of positive-real scalar admittances multiplied by a positive-definite matrix [25]. In [25], a dual-polarization guitar-string model is connected to a matrix reflectance computed from the matrix driving-point impedance at the bridge.

### Body Modeling

As introduced in §8.7, the *commuted synthesis* technique involves *commuting* the LTI filter representing the acoustic resonator (guitar body) with the string, so that the string is excited by the *pluck response* of the body rather than by the plectrum directly. The techniques described in §8.8 can be used to build practical computational synthesis models of acoustic guitar bodies and the like [229].

## String Excitation

In §2.4 and §6.10 we looked at the basic architecture of a digital waveguide string excited by some external disturbance. We now consider the specific cases of string excitation by a *hammer* (mass) and by a *plectrum* (spring).

### Ideal String Struck by a Mass

In §6.6, the ideal struck string was modeled as a simple initial velocity distribution along the string, corresponding to an instantaneous transfer of linear momentum from the striking hammer into the transverse motion of a string segment at time zero. (See Fig. 6.10 for a diagram of the initial traveling velocity waves.) In that model, we neglected any effect of the striking hammer after time zero, as if it had bounced away at time 0 in a so-called *elastic collision*. In this section, we consider the more realistic case of an *inelastic collision*, *i.e.*, one in which the mass hits the string and *remains in contact* until something, such as a wave or gravity, causes the mass and string to separate. For simplicity, let the string be infinitely long, with a given wave impedance, and let the colliding mass have a given speed prior to collision. It will turn out in this analysis that we may approximate the size of the mass by zero (a so-called *point mass*). Finally, we neglect the effects of gravity and of drag by the surrounding air. When the mass collides with the string, our model must switch from two separate models (mass-in-flight and ideal string) to that of two ideal strings joined by a mass at the collision point, as depicted in Fig. 9.12. The ``force-velocity port'' connections of the mass and the two semi-infinite string endpoints are formally in *series* because they all move together; that is, the mass velocity equals the velocity of each of the two string endpoints connected to the mass (see §7.2 for a fuller discussion of impedances and their parallel/series connection). The equivalent circuit for the mass-string assembly after time zero is shown in Fig. 9.13. Note that the string wave impedance appears twice, once for each string segment on the left and right. Also note that there is a single common velocity for the two string endpoints and the mass. LTI circuit elements in series can be arranged in any order.

All of the signs are `' in this equation because the ``current'' (velocity ) flows into the `' sign of each element. These

*reference directions*indicated by `' on each element in Fig.9.13 may be chosen arbitrarily, but it is convenient to adopt the convention that flows into the `' sign for each

*reaction force*, and into the `' sign for each

*action force*(or

*driving force*or

*voltage/current source*, etc.). With this convention, Kirchoff's Loop Rule effectively states ``the sum of all action forces equals the sum of all reaction forces,'' where the forces are understood to act on a common point. This is essentially Newton's third law of motion ``for every action there is an equal and opposite reaction'' (§B.1). In our mass-string example, all three forces are defined to be reaction forces because there is no external driving force on the mass, and no incoming waves from the string segments. External driving forces on the mass-string junction will be formulated later below.

^{10.8}We could equally well have defined the mass inertial force as a driving force on the two string segments, or the string segments could comprise a pair of driving forces for the mass. Ultimately, any force polarities may be assigned and consistently handled. Taking the Laplace transform

^{10.9}of Eq.(9.8) yields, by linearity,

where F(s) and V(s) denote the Laplace transforms of f(t) and v(t), respectively. As discussed above, the impedance relation for each string endpoint is given by

^{10.10}Note that the mass is characterized by its *impedance* ms when the initial velocity is zero (§7.1). Substituting these relations into Eq.(9.9) yields

We see that the initial momentum of the mass, m v(0), effectively provides an *impulsive external driving force*, so that the *impedance description* (§7.1) is complete. In other words, the system can be fully described as a series combination of the three impedances ms, R (on the left), and R (on the right), driven by an external force-source m v(0) delta(t). Solving Eq.(9.10) for V(s) yields V(s) = m v(0) / (ms + 2R), so that

^{10.11}the velocity of the contact point is

v(t) = v(0) e^{-(2R/m) t}, for t >= 0,

an exponential decay from the initial speed v(0), consistent with the fact that the change in mass velocity is (1/m times) the *time integral* of applied force.
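As a numerical check of this solution (a sketch, assuming the symbols used above: mass m, initial speed v(0), and a string wave impedance R resisting the motion on each side), the closed-form decay v(t) = v(0) e^{-(2R/m)t} can be compared against a direct Euler integration of m dv/dt = -2Rv. The parameter values below are illustrative only:

```python
import math

def contact_velocity(t, m=0.01, R=0.5, v0=1.0):
    """Closed-form contact-point velocity v(t) = v0 * exp(-(2R/m) t)."""
    return v0 * math.exp(-(2.0 * R / m) * t)

def simulate(T=0.05, dt=1e-6, m=0.01, R=0.5, v0=1.0):
    """Euler-integrate m*dv/dt = -2*R*v (both string halves resist the motion)."""
    v = v0
    for _ in range(int(T / dt)):
        v += dt * (-2.0 * R / m) * v
    return v

v_closed = contact_velocity(0.05)  # exact value at t = 0.05 s
v_euler = simulate(0.05)           # numerical value at t = 0.05 s
```

The two values agree closely for a small enough time step, confirming the exponential form of the decay.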

#### Mass Termination Model

The previous discussion solved for the motion of an ideal mass striking an ideal string of infinite length. We now investigate the same model from the string's point of view. As before, we will be interested in a digital waveguide (sampled traveling-wave) model of the string, for efficiency's sake (Chapter 6), and we therefore will need to know what the mass ``looks like'' at the end of each string segment. For this we will find that the impedance description (§7.1) is especially convenient.

Here f_1 denotes the force applied by string-segment 1 to the mass (defined as positive in the ``up'', or positive-y, direction), f_2 is the force applied by string-segment 2 to the mass (again positive upwards), and f_m denotes the inertial force applied by the mass to both string endpoints (where again, a positive force points up). To derive the traveling-wave relations in a digital waveguide model, we want to use the force-wave variables f^+ and f^- that we defined for vibrating strings in §6.1.5; *i.e.*, we defined f = -K y', where K is the string tension and y' = dy/dx is the string slope.

or, substituting the definitions of these forces,

The inertial force of the mass is because the mass must be accelerated downward in order to produce an upward reaction force. The signs of the two string forces follow from the definition of force-wave variables on the string, as discussed above. The force relations can be checked individually. For string 1,

#### Mass Reflectance from Either String

Let's first consider how the mass looks from the viewpoint of string 1, assuming string 2 is at rest. In this situation (no incoming wave from string 2), string 2 will appear to string 1 as a simple resistor (or dashpot) of R Ohms in series with the mass impedance ms. (This observation will be used as the basis of a rapid solution method in §9.3.1 below.) When a wave from string 1 hits the mass, it will cause the mass to move. This motion carries both string endpoints along with it. Therefore, both the reflected and transmitted waves include this mass motion. We can say that we see a ``dispersive transmitted wave'' on string 2, and a dispersive reflection back onto string 1. Our object in this section is to calculate the transmission and reflection *filters* corresponding to these transmitted and reflected waves. By physical symmetry the

*velocity*reflection and transmission will be the same from string 1 as it is from string 2. We can say the same about force waves, but we will be more careful because the sign of the transverse force flips when the direction of travel is reversed.

^{10.12}Thus, we expect a scattering junction of the form shown in Fig.9.17 (recall the discussion of physically interacting waveguide inputs in §2.4.3). This much invokes the superposition principle (for simultaneous reflection and transmission), and imposes the expected symmetry: equal reflection filters and equal transmission filters (for either force or velocity waves).

The traveling-wave decompositions can be written out as

where a ``+'' superscript means ``right-going'' and a ``-'' superscript means ``left-going'' on either string.

^{10.13}Let's define the mass position to be zero, so that Eq.(9.14) with the substitutions Eq.(9.15) becomes

From this, the reflected velocity is immediate:

*velocity reflection transfer function*(or velocity

*reflectance*) of the mass as seen from string 1:

*i.e.*, the transverse-velocity reflectance is the same on either side of the mass. Thus we have found both velocity reflection filters at the mass-string junction. In summary, the velocity reflectance of the mass is -ms/(ms + 2R) from either string segment. This is a simple first-order filter model of what the mass (and string beyond) looks like dynamically from either string. It is always good to check that our answers make physical sense in limiting cases. For this problem, easy cases to check are m = 0 and m = infinity. When the mass is zero, the reflectance goes to zero (no reflected wave at all). When the mass goes to infinity, the reflectance approaches -1, corresponding to a rigid termination, which also makes sense. The results of this section can be more quickly obtained as a special case of the main result of §C.12, by choosing two waveguides meeting at a load impedance ms. The next section gives another fast-calculation method based on a standard formula.
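The limiting cases can also be checked numerically. The sketch below assumes the velocity reflectance -ms/(ms + 2R) discussed above, with illustrative values for m and R, and evaluates it on the j-omega axis for a very small and a very large mass:

```python
def velocity_reflectance(s, m, R):
    """Velocity reflectance of a mass m joining two string halves of wave impedance R."""
    return -m * s / (m * s + 2.0 * R)

# Evaluate at a test frequency s = j*omega (440 Hz here, chosen arbitrarily)
s = 1j * 2.0 * 3.141592653589793 * 440.0

rho_small = velocity_reflectance(s, m=1e-12, R=0.5)  # mass -> 0: no reflection
rho_large = velocity_reflectance(s, m=1e12, R=0.5)   # mass -> inf: rigid termination
```

The tiny mass yields a reflectance magnitude near zero, while the huge mass yields a reflectance near -1, matching the physical limiting cases described in the text.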

#### Simplified Impedance Analysis

The above results are quickly derived from the general reflection coefficient for force waves (or voltage waves, pressure waves, etc.):

rho = (Z_2 - Z_1) / (Z_2 + Z_1),

where rho is the reflection coefficient of impedance Z_2 as ``seen'' from impedance Z_1. If a force wave f^+ traveling along in impedance Z_1 suddenly hits a new impedance Z_2, the wave will split into a reflected wave rho f^+, and a transmitted wave (1 + rho) f^+. It therefore follows that a velocity wave v^+ will split into a reflected wave -rho v^+ and a transmitted wave (1 - rho) v^+. This rule is derived in §C.8.4 (and implicitly above as well). In the mass-string-collision problem, we can immediately write down the *force reflectance* of the mass as seen from either string:

rho(s) = (ms + R - R) / (ms + R + R) = ms / (ms + 2R),

since, from either string, the remainder of the system looks like the mass impedance ms in series with the other string's wave impedance R. The corresponding *velocity reflectance* is simply -rho(s) = -ms/(ms + 2R).

#### Mass Transmittance from String to String

Referring to Fig.9.15, the *velocity transmittance* from string 1 to string 2 may be defined as the ratio of the transmitted velocity wave on string 2 to the incident velocity wave on string 1, which equals 1 + rho_v(s) = 2R/(ms + 2R).

#### Force Wave Mass-String Model

The velocity transmittance is readily converted to a*force transmittance*using the Ohm's-law relations:

#### Summary of Mass-String Scattering Junction

In summary, we have characterized the mass on the string in terms of its reflectance and transmittance from either string. For force waves, we have outgoing waves given by*dynamic scattering junction*on the string. (If there were no dependency on , such as when a dashpot is affixed to the string, we would simply call it a

*scattering junction*.) The above form of the dynamic scattering junction is analogous to the

*Kelly-Lochbaum scattering junction*(§C.8.4). The general relation can be used to simplify the Kelly-Lochbaum form to a

*one-filter scattering junction*analogous to the one-multiply scattering junction (§C.8.5):

#### Digital Waveguide Mass-String Model

To obtain a force-wave digital waveguide model of the string-mass assembly after the mass has struck the string, it only remains to digitize the model of Fig.9.20. The delays are obviously to be implemented using digital delay lines. For the mass, we must digitize the force reflectance rho(s) = ms/(ms + 2R) appearing in the one-filter model of Fig.9.20.

A common choice of digitization method is the

*bilinear transform*(§7.3.2) because it preserves losslessness and does not alias. This will effectively yield a

*wave digital filter*model for the mass in this context (see Appendix F for a tutorial on wave digital filters). The bilinear transform is typically scaled as

*dc-blocking filter* [449]. Comparing with Eq.(9.18), we see that the behavior at dc is correct, and that the behavior at infinite frequency (s = infinity) is now the behavior at half the sampling rate (z = -1). Physically, the mass reflectance is zero at dc because sufficiently slow waves can freely move a mass of any finite size. The reflectance is 1 at infinite frequency because there is no time for the mass to start moving before it is pushed in the opposite direction. In short, a mass behaves like a rigid termination at infinite frequency, and a free end (no termination) at zero frequency. The reflectance of a mass is therefore a ``dc blocker''.

The final digital waveguide model of the mass-string combination is shown in Fig.9.21. Additional examples of lumped-element modeling, including masses, springs, dashpots, and their various interconnections, are discussed in the Wave Digital Filters (WDF) appendix (Appendix F). A nice feature of WDFs is that they employ traveling-wave input/output signals which are ideal for interfacing to digital waveguides. The main drawback is that WDFs operate over a

*warped frequency axis*(due to the bilinear transform), while digital delay lines have a normal (unwarped) frequency axis. On the plus side, WDFs cannot

*alias*, while digital waveguides do alias in the frequency domain for signals that are not bandlimited to less than half the sampling rate. At low frequencies (or given sufficient oversampling), the WDF frequency warping is minimal, and in such cases, WDF ``lumped element models'' may be connected directly to digital waveguides, which are ``sampled-wave distributed parameter'' models. Even when the bilinear-transform frequency-warping is severe, it is often well tolerated when the frequency response has only one ``important frequency'', such as a second-order resonator, lowpass, or highpass response. In other words, the bilinear transform can be scaled to map any single analog frequency to any desired corresponding digital frequency (see §7.3.2 for details), and the frequency-warped responses above and below the exactly mapped frequency may ``sound as good as'' the unwarped responses for musical purposes. If not, higher order filters can be used to model lumped elements (Chapter 7).
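The dc-blocker behavior survives the bilinear transform, as the following sketch illustrates. It assumes the force reflectance rho(s) = ms/(ms + 2R) and the bilinear substitution s <- c(1 - z^-1)/(1 + z^-1); the values of m, R, and the bilinear constant c are illustrative:

```python
def mass_reflectance_digital(m, R, c):
    """Bilinear-transform digitization of rho(s) = m*s / (m*s + 2*R).

    Substituting s = c*(1 - 1/z)/(1 + 1/z) yields a first-order digital
    filter; coefficients are returned as ([b0, b1], [1, a1]).
    """
    d = m * c + 2.0 * R
    b0 = m * c / d
    return [b0, -b0], [1.0, (2.0 * R - m * c) / d]

def freq_resp(b, a, z):
    """Evaluate the first-order transfer function at the point z."""
    return (b[0] + b[1] / z) / (a[0] + a[1] / z)

b, a = mass_reflectance_digital(m=0.01, R=0.5, c=2.0 * 44100.0)
H_dc = freq_resp(b, a, z=1.0)    # at dc: free end, reflectance 0
H_ny = freq_resp(b, a, z=-1.0)   # at half the sampling rate: rigid, reflectance 1
```

Evaluating the digitized reflectance at z = 1 and z = -1 confirms the two physical limits stated in the text.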

#### Displacement-Wave Simulation

As discussed in [121], displacement waves are often preferred over force or velocity waves for guitar-string simulations, because such strings often hit obstacles such as frets or the neck. To obtain displacement from velocity at a given point, we may time-integrate velocity as above to produce displacement at any spatial sample along the string where a collision might be possible. However, all these integrators can be eliminated by simply going to a displacement-wave simulation, as has been done in nearly all papers to date on plucking models for digital waveguide strings.

To convert our force-wave simulation to a displacement-wave simulation, we may first convert force to velocity using the Ohm's-law relations f^+ = R v^+ and f^- = -R v^-, and then conceptually integrate all signals with respect to time (in advance of the simulation). The wave impedance R is the same on both sides of the junction, which means we can convert from force to velocity by simply negating all left-going signals. (Conceptually, all signals are converted from force to velocity by the Ohm's-law relations and then divided by R, but the common scaling by 1/R can be omitted (or postponed) unless signal values are desired in particular physical units.) An all-velocity-wave simulation can be converted to displacement waves even more easily, by simply reinterpreting each velocity wave as a displacement wave, because velocity and displacement waves scatter identically. In more general situations, we can go to the Laplace domain, replace each right-going force wave by Rs times the corresponding displacement wave and each left-going force wave by -Rs times the corresponding displacement wave, divide all signals by Rs, and push any leftover s around for maximum simplification, perhaps absorbing it into a nearby filter. In an all-velocity-wave simulation, each signal gets multiplied by 1/s in this procedure, which means it cancels out of all definable transfer functions. All filters in the diagram (just the reflectance in this example) can be left alone because their inputs and outputs are still force-valued in principle. (We expressed each force wave in terms of velocity and wave impedance without changing the signal flow diagram, which remains a force-wave simulation until minus signs, scalings, and s operators are moved around and combined.) Of course, one can absorb scalings and sign reversals into the filter(s) to change the physical input/output units as desired. Since we routinely assume zero initial conditions in an impedance description, the integration constants obtained by time-integrating velocities to get displacements are all defined to be zero. Additional considerations regarding the choice of displacement waves over velocity (or force) waves are given in §E.3.3. In particular, their initial conditions can be very different, and traveling-wave components tend not to be as well behaved for displacement waves.

### Piano Hammer Modeling

The previous section treated an ideal point-mass striking an ideal string. This can be considered a simplified piano-hammer model. The model can be improved by adding a damped spring to the point-mass, as shown in Fig.9.22 (*cf.*Fig.9.12).

Thus, the scattering filters in the digital waveguide model are second order (biquads), while for the string struck by a mass (§9.3.1) we had first-order scattering filters. This is expected because we added another energy-storage element (a spring). The impedance formulation of Eq.(9.19) assumes all elements are linear and time-invariant (LTI), but in practice one can normally modulate element values as a function of time and/or state-variables and obtain realistic results for low-order elements. For this we must maintain filter-coefficient formulas that are explicit functions of physical state and/or time. For best results, state variables should be chosen so that any nonlinearities remain

*memoryless*in the digitization [361,348,554,555].

#### Nonlinear Spring Model

In the musical acoustics literature, the piano hammer is classically modeled as a*nonlinear spring*[493,63,178,76,60,486,164].

^{10.14}Specifically, the piano-hammer damping in Fig.9.22 is typically approximated by zero, and the spring is

*nonlinear*and

*memoryless* according to a simple power law:

F(x) = Q_0 x^p,     (10.20)

where x denotes felt compression, Q_0 is a stiffness coefficient, and the exponent p typically lies between 2 and 3 for piano-hammer felt. This force is balanced at all times by the downward string force (string tension times slope difference), exactly as analyzed in §9.3.1 above.
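A minimal sketch of such a memoryless power-law felt model follows; the stiffness coefficient and exponent below are illustrative placeholders, not measured values:

```python
def hammer_force(x, Q0=1e9, p=2.5):
    """Memoryless power-law hammer-felt force F = Q0 * x**p for felt
    compression x > 0; zero force when hammer and string are not in contact.
    Q0 and p are illustrative (p is typically quoted between 2 and 3)."""
    return Q0 * x ** p if x > 0.0 else 0.0
```

Note that for p > 2 the force grows faster than quadratically with compression, which is the essential hardening behavior of felt under strong blows.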

#### Including Hysteresis

Since the compressed hammer-felt (wool) on real piano hammers shows significant*hysteresis memory*, an improved piano-hammer felt model is

where the first constant is a relaxation time in seconds, and k again denotes piano key number [487]. Equation (9.21) is said to be a good approximation under normal playing conditions. A more complete hysteresis model is [487]

#### Piano Hammer Mass

The piano-hammer*mass*may be approximated across the keyboard by [487]

### Pluck Modeling

The piano-hammer model of the previous section can also be configured as a *plectrum* by making the mass and damping small or zero, and by releasing the string when the contact force exceeds some maximum threshold. That is, to a first approximation, a plectrum can be modeled as a

*spring*(linear or nonlinear) that disengages when either it is far from the string or a maximum spring-force is exceeded. To avoid discontinuities when the plectrum and string engage/disengage, it is good to taper both the damping and spring-constant to zero at the point of contact (as shown below). Starting with the piano-hammer impedance of Eq.(9.19) and setting the mass to infinity (the plectrum holder is immovable), we define the

*plectrum impedance*as

The force-wave reflectance of impedance in Eq.(9.22), as seen from the string, may be computed exactly as in §9.3.1:

If the spring damping mu is much greater than twice the string wave impedance (mu >> 2R), then the plectrum looks like a rigid termination to the string (force reflectance approaching 1), which makes physical sense. Again following §9.3.1, the transmittance for force waves is given by

*i.e.*, if the plectrum is to be modeled as a simple linear spring of stiffness k, then the impedance becomes k/s, and the force-wave reflectance becomes [128]

#### Digital Waveguide Plucked-String Model

When plucking a string, it is necessary to detect ``collisions'' between the plectrum and string. Also, more complete plucked-string models will allow the string to ``buzz'' on the frets and ``slap'' against obstacles such as the fingerboard. For these reasons, it is convenient to choose *displacement waves* for the waveguide string model. The reflection and transmission filters for displacement waves are the same as for velocity, namely, -rho(s) and 1 - rho(s), respectively. As in the mass-string collision case, we obtain the one-filter scattering-junction implementation shown in Fig.9.23. The filter may now be digitized using the bilinear transform as previously (§9.3.1).

#### Incorporating Control Motion

Let denote the vertical position of the*mass*in Fig.9.22. (We still assume .) We can think of as the position of the

*control point*on the plectrum,

*e.g.*, the position of the ``pinch-point'' holding the plectrum while plucking the string. In a harpsichord, can be considered the

*jack*position [347]. Also denote by the

*rest length*of the spring in Fig.9.22, and let denote the position of the ``end'' of the spring while not in contact with the string. Then the plectrum makes

*contact*with the string when

*collision detection* equation. Let the subscripts a and b each denote one side of the scattering system, as indicated in Fig.9.23. Then, for example, the side-a displacement is the displacement of the string on the left of the plucking point, and the side-b displacement is on the right side of (but still located at) the plucking point. By continuity of the string, we have

^{10.15}For or the applied force is zero and the entire plucking system disappears to leave and , or equivalently, the force reflectance becomes and the transmittance becomes . During contact, force equilibrium at the plucking point requires (

*cf.*§9.3.1)

where as usual (§6.1), with denoting the string tension. Using Ohm's laws for traveling-wave components (p. ), we have

Substituting and taking the Laplace transform yields

*i.e.*, when the plectrum does not affect the string displacement at the current time. It is therefore exact at the time of collision and also applicable just after release. Similarly, can be used to trigger a release of the string from the plectrum.

#### Successive Pluck Collision Detection

As discussed above, in a simple 1D plucking model, the plectrum comes up and engages the string when , and above some maximum force the plectrum releases the string. At this point, it is ``above'' the string. To pluck again in the same direction, the collision-detection must be disabled until we again have , requiring one bit of state to keep track of that.^{10.16}The harpsichord jack plucks the string only in the ``up'' direction due to its asymmetric behavior in the two directions [143]. If only ``up picks'' are supported, then engagement can be suppressed after a release until comes back down below the envelope of string vibration (

*e.g.*, ). Note that intermittent disengagements as a plucking cycle begins are normal; there is often an audible ``buzzing'' or ``chattering'' when plucking an already vibrating string. When plucking up and down in alternation, as in the

*tremolo*technique (common on mandolins), the collision detection alternates between and , and again a bit of state is needed to keep track of which comparison to use.
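The engagement logic described above can be sketched as a small state machine; all names and thresholds here are hypothetical, and the model is deliberately reduced to the one bit of state discussed in the text (which side of the string the plectrum is on):

```python
class PluckDetector:
    """One-bit-of-state engagement logic for a 1D plectrum model (sketch).

    `up` tracks which side of the string the plectrum currently sits on;
    engagement is only tested from that side, as described in the text.
    """
    def __init__(self):
        self.up = False       # plectrum starts below the string
        self.engaged = False

    def step(self, y_pick, y_string, f_contact, f_max):
        if self.engaged:
            if f_contact > f_max:       # maximum spring force exceeded: release
                self.engaged = False
                self.up = not self.up   # plectrum passes to the other side
        else:
            # Only test for collision from the side the plectrum is on
            crossed = (y_pick <= y_string) if self.up else (y_pick >= y_string)
            if crossed:
                self.engaged = True
        return self.engaged
```

Alternating engagement from below and above, as in tremolo playing, then falls out of the same comparison with the `up` flag toggled at each release.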

#### Plectrum Damping

To include damping in the plectrum model, the load impedance goes back to Eq.(9.22):

#### Digitization of the Damped-Spring Plectrum

Applying the bilinear transformation (§7.3.2) to the reflectance in Eq.(9.23) (including damping) yields the following first-order digital force-reflectance filter:

The transmittance filter again follows from the reflectance, and there is a one-filter form for the scattering junction as usual.

#### Feathering

Since the pluck model is linear, the parameters are not signal-dependent. As a result, when the string and spring separate, there is a discontinuous change in the reflection and transmission coefficients. In practice, it is useful to ``feather'' the switch-over from one model to the next [470]. In this instance, one appealing choice is to introduce a*nonlinear spring*, as is commonly used for piano-hammer models (see §9.3.2 for details). Let the nonlinear spring model take the form

## Piano Synthesis

The main elements of the piano we need to model are the

- hammer (§9.3.2),
- string (Chapter 6, §6.6),
- bridge (§9.2.1), and
- soundboard, enclosure, acoustic space, mic placement, etc.

Since piano strings are *stiff*, we need to include a *dispersion allpass filter* in the string model, as introduced in §6.11.3. The next section provides additional details specific to piano strings. From the bridge (which runs along the soundboard) to each ``virtual microphone'' (or ``virtual ear''), the soundboard, enclosure, and listening space can be well modeled as a high-order LTI filter characterized by its impulse response (Chapter 8). Such long impulse responses can be converted to more efficient recursive digital filters by various means (Chapter 3, §8.6.2). A straightforward piano model along the lines indicated above turns out to be relatively expensive computationally. Therefore, we also discuss, in §9.4.4 below, the more specialized

*commuted piano synthesis*technique [467], which is capable of high quality piano synthesis at low cost relative to other model-based approaches. It can be seen as a

*hybrid*method in which the string is

*physically*modeled, the hammer is represented by a

*signal*model, and the acoustic resonators (soundboard, enclosure, etc.) are handled by ordinary

*sampling*(of their impulse response).

### Stiff Piano Strings

Piano strings are audibly*inharmonic*due to stiffness [211,210]. General stiff-string modeling was introduced in §6.9. In this section, we discuss further details specific to piano strings. The main effect of string stiffness is to

*stretch*the partial overtone series, so that piano tones are not precisely harmonic [144]. As a result, piano tuners typically stretch the tuning of the piano slightly. For example, the total amount of tuning stretch from the lowest to highest note has been measured to be approximately 35 cents on a Kurzweil PC88, 45 cents on a Steinway Model M, and 60 cents on a Steinway Model D.

^{10.17}

#### Piano String Wave Equation

A wave equation suitable for modeling linearized piano strings is given by [77,45,317,517]

where the partial-derivative notations (an overdot for time differentiation and a prime for spatial differentiation) are defined on page , and

*ideal rod*(or

*bar*), this term is dominant [317,261,169]. The final two terms provide

*damping*. The damping associated with the first of these terms is frequency-independent, while the damping due to the second increases with frequency.

#### Damping-Filter Design

In [46], the damping in real piano strings was modeled using a length 17 FIR filter for the lowest strings, and a length 9 FIR filter for the remaining strings. Such FIR filters (or nonparametric measured frequency-response data) can be converted to lower-order IIR filters by the usual methods (§8.6.2). It is convenient to have separate damping and dispersion filters in the string model. The damping filter in piano strings is significantly less demanding than the dispersion filter.

#### Dispersion-Filter Design

In the context of a digital waveguide string model, dispersion associated with stiff strings can be supplied by an*allpass filter*in the basic feedback loop. Methods for designing dispersion allpass filters were summarized in §6.11.3. In this section, we are mainly concerned with how to

*specify*the desired dispersion allpass filter for piano strings. Perceptual studies regarding the audibility of inharmonicity in stringed instrument sounds [211] indicate that the just noticeable coefficient of inharmonicity is given approximately by

where f_1 is the fundamental frequency of the string vibration in hertz, and B--the so-called *coefficient of inharmonicity*--affects the nth partial overtone tuning via

f_n = n f_1 sqrt(1 + B n^2).

For a stiff string with Young's modulus E, radius r, length L, and tension T, the coefficient of inharmonicity is predicted from theory [144, p. 65],[211] to be

B = pi^2 E S kappa^2 / (T L^2),

where is the string cross-sectional area, and is the radius of gyration of the string cross-section (see §B.4.9). In general, when designing dispersion filters for vibrating string models, it is highly cost-effective to obtain an allpass filter which correctly tunes only the lowest-frequency partial overtones, where the number of partials correctly tuned is significantly less than the total number of partials present, as in [384]. Application of the method of [2] to piano-string dispersion-filter design is reported in [1]. A Faust implementation of a closed-form expression [367] for dispersion allpass coefficients as a function of inharmonicity coefficient may be found in the function

`piano_dispersion_filter`within

`effect.lib`.
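Given a coefficient of inharmonicity B, the stretched partial series f_n = n f_1 sqrt(1 + B n^2) is easy to tabulate. The sketch below uses an illustrative B value; real values depend on the particular string:

```python
import math

def stretched_partials(f1, B, n_partials):
    """Partial frequencies of a stiff string: f_n = n * f1 * sqrt(1 + B*n^2)."""
    return [n * f1 * math.sqrt(1.0 + B * n * n) for n in range(1, n_partials + 1)]

# Illustrative inharmonicity coefficient for a midrange string
# (B on the order of 1e-4 is a typical magnitude; the exact value varies).
partials = stretched_partials(f1=261.63, B=3e-4, n_partials=8)
```

Each partial lies slightly above its harmonic position n*f_1, and the sharpening grows with partial number, which is exactly the stretching that piano tuners compensate for.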

### Nonlinear Piano Strings

It turns out that piano strings exhibit audible nonlinear effects, especially in the first three octaves of their pitch range at fortissimo playing levels and beyond [26]. As a result, for highest-quality piano synthesis, we need more than what is obtainable from a linearized wave equation such as Eq.(9.30). As can be seen from a derivation of the wave equation for an ideal string vibrating in 3D space (§B.6), there is fundamentally *nonlinear coupling* between transverse and longitudinal string vibrations. It turns out that the coupling from transverse-to-longitudinal is much stronger than vice versa, so that piano synthesis models can get by with one-way coupling at normal dynamic playing levels [30,163]. As elaborated in §B.6 and the references cited there, the longitudinal displacement is driven by longitudinal changes in the

*squared slope*of the string:

*high-frequency inharmonic overtones*(corresponding to longitudinal modes of vibration) in the sound. Since the nonlinear coupling is

*distributed*along the length of the string, the longitudinal modes are continuously being excited by the transverse vibration across time and position along the string. In addition to the excitation of longitudinal modes, the nonlinear transverse-to-longitudinal coupling results in a powerful

*longitudinal attack pulse*, which is the leading component of the initial ``shock noise'' audible in a piano tone. This longitudinal attack pulse hits the bridge well before the first transverse wave and is therefore quite significant perceptually. A detailed simulation of both longitudinal and transverse waves in an ideal string excited by a Gaussian pulse is given in [391]. Another important (

*i.e.*, audible) effect due to nonlinear transverse-to-longitudinal coupling is so-called

*phantom partials*. Phantom partials are ongoing

*intermodulation products*from the transverse partials as they transduce (nonlinearly) into longitudinal waves. The term ``phantom partial'' was coined by Conklin [85]. The Web version of [18] includes some illuminating sound examples by Conklin.

#### Nonlinear Piano-String Synthesis

When one-way transverse-to-longitudinal coupling is sufficiently accurate, we can model its effects*separately*based on

*observations*of transverse waves. For example [30,28],

- longitudinal modes can be implemented as *second-order resonators* (``modal synthesis''), with driving coefficients given by *orthogonally projecting* [451] the spatial derivative of the squared string slope onto the longitudinal mode shapes (both functions of position along the string).
- If tension variations along the string are neglected (reasonable since longitudinal waves are so much faster than transverse waves), then the longitudinal force on the bridge can be derived from the estimated *instantaneous tension* in the string, and efficient methods for this have been developed for guitar-string synthesis, particularly by Tolonen (§9.1.6).
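A single longitudinal mode implemented as a second-order resonator might be sketched as follows; the mode frequency, decay factor, and impulsive drive (standing in for the projected squared-slope excitation) are all illustrative:

```python
import math

class ModalResonator:
    """Second-order resonator (one longitudinal mode) driven by an input signal.

    Implemented as y[n] = 2*r*cos(w)*y[n-1] - r^2*y[n-2] + x[n], where w is the
    mode frequency in radians per sample and r < 1 sets the mode decay.
    """
    def __init__(self, freq_hz, fs, decay_r=0.999):
        w = 2.0 * math.pi * freq_hz / fs
        self.a1 = 2.0 * decay_r * math.cos(w)
        self.a2 = -decay_r * decay_r
        self.y1 = 0.0
        self.y2 = 0.0

    def tick(self, x):
        y = self.a1 * self.y1 + self.a2 * self.y2 + x
        self.y2, self.y1 = self.y1, y
        return y

# Drive one mode with an impulse (a stand-in for the real driving signal)
mode = ModalResonator(freq_hz=1000.0, fs=44100.0, decay_r=0.999)
out = [mode.tick(1.0 if n == 0 else 0.0) for n in range(2000)]
```

In a full modal-synthesis model, one such resonator runs per longitudinal mode, each fed by its own projection of the transverse-wave excitation.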

#### Regimes of Piano-String Vibration

In general, more complex synthesis models are needed at higher dynamic playing levels. The main three regimes of vibration are as follows [27]:

- The simplest piano-string vibration regime is characterized by *linear superposition*, in which transverse and longitudinal waves *decouple* into separate modes, as implied by Eq.(9.30). In this case, transverse and longitudinal waves can be simulated in separate digital waveguides (Ch. 6). The longitudinal waveguide is of course an order of magnitude shorter than the transverse waveguide(s).
- As dynamic playing level is increased, transverse-to-longitudinal coupling becomes audible [26].
- At very high dynamic levels, the model should also include longitudinal-to-transverse coupling. However, this is usually neglected.

#### Efficient Waveguide Synthesis of Nonlinear Piano Strings

The following nonlinear waveguide piano-synthesis model is under current consideration [427]:

- Initial striking force determines the starting regime (1, 2, or 3 above).
- The string model is simplified as it decays.

When the initial striking force is sufficiently large (*e.g.*, fortissimo for the first three octaves of the piano keyboard [26]), the string model starts in regime 2. Because the energy stored in a piano string decays monotonically after a hammer strike (neglecting coupling from other strings), we may

*switch to progressively simpler models*as the energy of vibration falls below corresponding thresholds. Since the purpose of model-switching is merely to save computation, it need not happen immediately, so it may be triggered by a string-energy estimate based on observing the string at a

*single point*over the past period or so of vibration. Perhaps most simply, the model-regime classifier can be based on the maximum magnitude of the

*bridge force* over at least one period. If the regime 2 model includes an instantaneous string-length (tension) estimate, one may simply compare that to a threshold to determine when the simpler model can be used. If the longitudinal components and/or phantom partials are not completely inaudible when the model switches, then standard cross-fading techniques should be applied so that inharmonic partials are faded out rather than abruptly and audibly cut off.

To obtain a realistic initial shock noise in the tone for regime 1 (and for any model that does not compute it automatically), the appropriate shock pulse, computed as a function of hammer striking force (or velocity, displacement, etc.), can be summed into the longitudinal waveguide during the hammer contact period. The longitudinal bridge force may be generated from the estimated string length (§9.1.6). This force should be exerted on the bridge in the direction coaxial with the string at the bridge (a direction available from the two transverse displacements one sample away from the bridge).

Phantom partials may be generated in the longitudinal waveguide as explicit intermodulation products based on the transverse-wave overtones known to contribute most; for example, the Goertzel algorithm [451] could be used to track relevant partial amplitudes for this purpose. Such explicit synthesis of phantom partials, however, makes modal synthesis more compelling for the longitudinal component [30]; in a modal-synthesis model, the longitudinal attack pulse can be replaced by a one-shot (per hammer strike) table playback, scaled and perhaps filtered as a function of the hammer striking velocity.

On the high end for regime 2 modeling, a full nonlinear coupling may be implemented along the three waveguides (two transverse and one longitudinal). At this level of complexity, a wide variety of finite-difference schemes should also be considered (§7.3.1) [53,555].
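The model-regime classifier based on peak bridge-force magnitude might be sketched as follows; the function name and threshold values are hypothetical, not calibrated:

```python
def choose_regime(bridge_force_window, thresh_2, thresh_3):
    """Pick the simplest adequate vibration regime (1, 2, or 3) from the
    maximum bridge-force magnitude over at least one period of vibration.

    thresh_2 and thresh_3 are illustrative regime-boundary thresholds.
    """
    peak = max(abs(f) for f in bridge_force_window)
    if peak >= thresh_3:
        return 3
    if peak >= thresh_2:
        return 2
    return 1
```

Because string energy decays monotonically after the strike, such a classifier only ever steps downward through the regimes, so occasional classification latency is harmless.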

#### Checking the Approximations

The preceding discussion considered*several*possible approximations for nonlinear piano-string synthesis. Other neglected terms in the stiff-string wave equation were not even discussed, such as terms due to

*shear deformation*and

*rotary inertia*that are included in the (highly accurate)

*Timoshenko beam theory*formulation [261,169]. The following questions naturally arise:

- How do we know for sure our approximations are inaudible?
- We can listen, but could we miss an audible effect?
- Could a difference become audible after more listening?

Answering such questions ultimately requires a *truth reference*--a ``perceptually exact'' model. Note that there are software tools (*e.g.*, from the world of *perceptual audio coding* [62,472]) that can be used to measure the *audible equivalence* of two sounds [456]. For an audio coder, these tools predict the audibility of the difference between original and compressed sounds. For sound synthesis applications, we want to compare our ``exact'' and ``computationally efficient'' synthesis models.

### High-Accuracy Piano-String Modeling

In [265,266], an extension of the mass-spring model of [391] was presented for the purpose of high-accuracy modeling of nonlinear piano strings struck by a hammer model such as that described in §9.3.2. This section provides a brief overview. Figure 9.25 shows a mass-spring model in 3D space. From *Hooke's Law* (§B.1.3), we have the force exerted by each spring, and the *vector equation of motion* for mass 1 is then given by Newton's second law. A *chain* of masses and springs is shown in Fig.9.26. Mass-spring chains--also called *beaded strings*--have been analyzed in numerous textbooks (*e.g.*, [295,318]), and numerical software simulation is described in [391]. The force on the th mass can be expressed as the sum of the forces from its two neighboring springs.
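As a concrete sketch of these spring forces in 3D (illustrative Python with hypothetical names; the stiffness and rest length are chosen arbitrarily, not taken from [265,266]):

```python
import numpy as np

def spring_force(x1, x2, k=1.0, L0=1.0):
    """Force exerted on the mass at x1 by a spring of stiffness k and rest
    length L0 connecting 3D positions x1 and x2 (Hooke's law)."""
    d = x2 - x1                     # vector from mass 1 toward mass 2
    L = np.linalg.norm(d)           # current spring length
    return k * (L - L0) * (d / L)   # pulls mass 1 toward mass 2 when stretched

def chain_forces(x, k=1.0, L0=1.0):
    """Net spring force on each interior mass of a mass-spring chain.
    x is an (N, 3) array of mass positions; the endpoint masses are fixed."""
    f = np.zeros_like(x)
    for n in range(1, len(x) - 1):
        f[n] = spring_force(x[n], x[n - 1], k, L0) + spring_force(x[n], x[n + 1], k, L0)
    return f
```

A chain at its rest configuration yields zero net force on every interior mass, while a stretched spring pulls its endpoint masses toward each other.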

#### A Stiff Mass-Spring String Model

Following the classical derivation of the stiff-string wave equation [317,144], an obvious way to introduce *stiffness* into the mass-spring chain is to use a *bundle* of mass-spring chains to form a kind of ``lumped stranded cable.'' One section of such a model is shown in Fig.9.27. Each mass is now modeled as a 2D *mass disk*. Complicated rotational dynamics can be avoided by assuming *no torsional waves* (no ``twisting'' motion) (§B.4.20). A three-spring-per-mass model is shown in Fig.9.28 [266]. The spring positions alternate between angles , say, on one side of a mass disk and on the other side, in order to provide effectively *six* spring-connection points around the mass disk for only three connecting springs per section. This improves the *isotropy* of the string model with respect to bending direction. A problem with the simple mass-spring-chain bundle is that it offers no resistance whatsoever to *shear deformation*, as is clear from Fig.9.29. To rectify this problem (which does not arise in the classical derivation of the stiff-string wave equation because of its implicit assumptions), diagonal springs can be added to the model, as shown in Fig.. In the simulation results reported in [266], the spring constants of the shear springs were chosen so that their stiffness in the longitudinal direction would equal that of the longitudinal springs.

#### Nonlinear Piano-String Equations of Motion in State-Space Form

For the flexible (non-stiff) mass-spring string, referring to Fig.9.26 and Eq.(9.34), we have the following equations of motion:

#### Finite Difference Implementation

Digitizing via the centered second-order difference [Eq.(7.5)] yields an *explicit finite-difference scheme* (§D.1):
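For the simplest special case--the linear, lossless ideal string--the resulting explicit scheme can be sketched as follows (an illustrative reduction; the full nonlinear mass-spring scheme of [265,266] updates 3D position vectors instead):

```python
import numpy as np

def fd_string_step(y_now, y_prev):
    """One time step of the explicit finite-difference scheme for the ideal
    string equation y_tt = c^2 y_xx, using centered second-order differences
    in both time and space at the Courant limit c*T = X (rigid endpoints)."""
    y_next = np.zeros_like(y_now)
    y_next[1:-1] = y_now[2:] + y_now[:-2] - y_prev[1:-1]
    return y_next
```

At the Courant limit this recursion propagates traveling waves without numerical dispersion; a string of length L samples returns to its initial displacement shape every 2L time steps.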

### Commuted Piano Synthesis

The general technique of *commuted synthesis* was introduced in §8.7. This method is enabled by the *linearity* and *time invariance* of both the vibrating string and its acoustic resonator, allowing them to be interchanged, in principle, without altering the overall transfer function [449]. While the piano hammer is a strongly nonlinear element (§9.3.2), it is nevertheless possible to synthesize a piano using commuted synthesis with high fidelity and low computational cost, given the approximation that the string itself is linear and time-invariant (§9.4.2) [467,519,30]. The key observation is that the hammer-string interaction is essentially *discrete* (after deconvolution), occurring at only one or a few time instants per hammer strike. The deconvolution needed is a function of the hammer-string collision velocity . As a result, *the hammer-string interaction can be modeled as one or a few discrete impulses that are filtered in a -dependent way.* Figure 9.31 illustrates a typical series of interaction force-pulses at the contact point between a piano hammer and string. The vertical lines indicate the locations and amplitudes of three single-sample impulses passed through three single-pulse filters to produce the overlapping pulses shown.

#### Force-Pulse Synthesis

As the hammer-string collision velocity increases, the force pulse grows in amplitude *and decreases in width*, which means the filter is nonlinear. In other words, the force pulse gets ``brighter'' as its amplitude (dynamic level) increases. In a real piano, this brightness increase is caused by the nonlinear felt-compression in the piano hammer. Recall from §9.3.2 that piano-hammer felt is typically modeled as a nonlinear spring described by , where is felt compression. Here, the brightness is increased by shrinking the duration of the filter impulse response as increases. The key property enabling commuted synthesis is that, when is constant, the filter operates as a normal LTI filter. In this way, the entire piano has been ``linearized'' with respect to a given collision velocity .
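A toy sketch of such a velocity-dependent force pulse (the raised-cosine shape and the `t_ref` reference width are illustrative choices, not the fitted filters discussed in the text):

```python
import numpy as np

def hammer_force_pulse(v, fs=44100.0, t_ref=0.002):
    """Hypothetical single force-pulse filter impulse response: a raised-cosine
    pulse whose amplitude grows and whose duration shrinks as the hammer-string
    collision velocity v increases (qualitative sketch only)."""
    width = max(int(t_ref * fs / v), 2)   # pulse duration shrinks as v grows
    n = np.arange(width)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / width)  # raised cosine
    return v * window                     # amplitude scales with v
```

Doubling the velocity here doubles the peak force while halving the pulse width, qualitatively reproducing the ``taller and thinner'' behavior described above.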

#### Multiple Force-Pulse Synthesis

One method of generating multiple force pulses as shown in Fig.9.31 is to sum several single-pulse filters, as shown in Fig.9.33. Furthermore, the three input impulses can be generated using a single impulse into a tapped delay line (§2.5). The sum of the three filter outputs gives the desired superposition of three hammer-string force pulses. As the collision velocity increases, the output pulses become taller and thinner, showing less overlap. The filters LPF1-LPF3 can be given as side information, or the input impulse amplitudes can be set to , or the like.

#### Commuted Piano Synthesis Architecture

Figure 9.34 shows a complete piano synthesis system along the lines discussed. At a fixed dynamic level, we have the critical feature that the model is *linear* and *time invariant*. Therefore we may *commute* the ``Soundboard & Enclosure'' filter not only with the string, but with the hammer filter-bank as well. The result is shown in Fig.9.35. As in the case of commuted guitar synthesis (§8.7), we replace a high-order digital filter (at least thousands of poles and zeros [229]) with a simple *excitation table* containing that filter's impulse response. Thus, the large digital filter required to implement the soundboard, piano enclosure, and surrounding acoustic space has been eliminated. At the same time, we still have explicit models for the hammer and string, so physical variations can be implemented, such as harder or softer hammers, more or less string stiffness, and so on. Even some resonator modifications remain possible, such as changing the virtual mic positions (§2.2.7). However, if we now want to ``open the top'' of our virtual piano, we have to measure the impulse response for that case separately and keep a separate table for it; this is not a problem, but it means we are doing ``sampling'' rather than ``modeling'' of the various resonators. An approximation made in the commuted technique is that detailed reflections generated on the string during hammer-string contact are neglected; that is, the hammer force-pulse depends only on the hammer-string velocity at the time of their initial contact, and the string velocity is treated as remaining constant throughout the contact duration.

#### Progressive Filtering

To save computation in the filtering, we can make use of the observation that, under the assumption of a string initially at rest, each interaction pulse is smoother than the one before it. That suggests applying the force-pulse filtering *progressively*, as was done with Leslie cabinet reflections in §5.7.6. In other words, the second force-pulse is generated as a filtering of the first force-pulse. This arrangement is shown in Fig.9.36.
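A minimal sketch of the progressive arrangement, assuming an illustrative FIR smoothing filter (names and values hypothetical):

```python
import numpy as np

def progressive_pulses(first_pulse, smoother, n_pulses):
    """Generate successive force-pulses 'progressively': each pulse is the
    previous pulse passed through a smoothing (lowpass) FIR filter, so each
    later pulse comes out smoother than the one before it."""
    pulses = [np.asarray(first_pulse, dtype=float)]
    for _ in range(n_pulses - 1):
        pulses.append(np.convolve(pulses[-1], smoother))
    return pulses
```

With a unity-DC-gain smoother, each pulse keeps the same total area while its peak shrinks and its support widens, so only one short filter runs per additional pulse instead of a separate full-length filter per pulse.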

#### Excitation Factoring

As another refinement, it is typically more efficient to implement the highest-Q resonances of the soundboard and piano enclosure using actual digital filters (see §8.8). By factoring these out, the impulse response is shortened, and thus the required excitation-table length is reduced. This provides a classical computation-vs.-memory trade-off which can be optimized as needed in a given implementation. For lack of a better name, let us refer to such a resonator bank as an ``equalizer,'' since it can be conveniently implemented using parametric equalizer sections, one per high-Q resonance. A possible placement of the comb filter and equalizer is shown in Fig.9.37. The resonator/eq/reverb/comb-filter block may include filtering for partially implementing the response of the soundboard and enclosure, equalization sections for piano color variations, reverberation, and comb-filter(s) for flanging, chorus, and simulated hammer-strike echoes on the string. Multiple outputs having different spectral characteristics can be extracted at various points in the processing.

#### Excitation Synthesis

For a further reduction in implementation cost, particularly with respect to memory usage, it is possible to *synthesize* the excitation using a noise signal passed through a time-varying lowpass filter [440]. The synthesis replaces the recorded soundboard/enclosure sound. It turns out that the sound produced by tapping on a piano soundboard is very similar to noise which is lowpass-filtered such that the filter bandwidth contracts over time. This approach is especially effective when applied only to the high-frequency residual excitation after factoring out all long-ringing modes as biquads.
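A qualitative sketch of this idea (all parameter values are illustrative, not fitted to any soundboard measurement):

```python
import numpy as np

def soundboard_tap(n_samples, fs=44100.0, g_max=0.9, rate=30.0, seed=0):
    """Hypothetical excitation synthesis: white noise through a one-pole
    lowpass whose bandwidth contracts over time (the pole moves toward z = 1),
    imitating a recorded soundboard/enclosure tap."""
    noise = np.random.default_rng(seed).standard_normal(n_samples)
    out = np.empty(n_samples)
    y = 0.0
    for n in range(n_samples):
        g = g_max * (1.0 - np.exp(-rate * n / fs))  # pole radius ramps toward g_max
        y = (1.0 - g) * noise[n] + g * y            # one-pole lowpass, unity DC gain
        out[n] = y
    return out
```

Early output is nearly white, while later output is heavily lowpass-filtered, so the high-frequency content dies away as in a real tap response.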

#### Coupled Piano Strings

Perhaps it should be emphasized that for good string-tone synthesis, *multiple* filtered delay loops should be employed for each note, rather than the single loop so often used for simplicity (§6.12; §C.13). For the piano, two or three delay loops can correspond to the two or three strings hit by the same hammer, vibrating in the plane orthogonal to both the string and the bridge. This number of delay loops is *doubled* if the transverse vibration planes *parallel* to the bridge are added; since these are slightly detuned, they are worth considering. Finally, another 2-3 (shorter) delay loops are needed to incorporate longitudinal waves for each string, and all loops should be appropriately coupled. In fact, for full accuracy, longitudinal waves and transverse waves should be *nonlinearly* coupled along the length of the string [391] (see §B.6.2).

#### Commuted Synthesis of String Reverberation

The sound of all strings ringing can be summed with the excitation to simulate the effect of many strings resonating with the played string when the sustain pedal is down [519]. The string loop filters out the unwanted frequencies in this signal and selects only the overtones that would be excited by the played string.

#### High Piano Key Numbers

At very high pitches, the delay-line length used in the string model may become too short for the implementation used, especially when processing signals in vectorized blocks (``chunks'') rather than single samples [355]. In this case, good results can be obtained by replacing the filtered-delay-loop string model with a simpler model consisting only of a few parallel second-order filter sections, or enveloped sinusoidal oscillators, etc. In other words, *modal synthesis* (§8.5) is a good choice for very high key numbers.
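A minimal modal-synthesis sketch along these lines (the frequencies, amplitudes, and decay times are illustrative inputs, not measured piano partials):

```python
import numpy as np

def modal_note(freqs, amps, t60s, dur=0.5, fs=44100.0):
    """Modal-synthesis sketch for a very high piano key: a sum of
    exponentially decaying sinusoids, equivalent to a bank of parallel
    impulse-excited second-order resonators (one per partial)."""
    t = np.arange(int(dur * fs)) / fs
    out = np.zeros_like(t)
    for f, a, t60 in zip(freqs, amps, t60s):
        tau = t60 / np.log(1000.0)   # time constant from the 60-dB decay time
        out += a * np.exp(-t / tau) * np.sin(2.0 * np.pi * f * t)
    return out
```

Each decaying sinusoid could equally be generated by a two-pole resonator driven by an impulse; the closed-form version above is convenient when only a handful of partials remain audible.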

#### Force-Pulse Filter Design

In the commuted piano synthesis technique, it is necessary to fit a filter impulse response to the desired hammer-string force-pulse shape *for each key* and *for each key velocity*, as indicated in Figures 9.32 and 9.33. Such a desired curve can be found in the musical acoustics literature [488], or it can be easily generated from a good piano-hammer model (§9.3.2) striking the chosen piano-string model. For psychoacoustic reasons, it is preferable to optimize the force-pulse filter coefficients in the *frequency domain* (§8.6.2). Some additional suggestions are given in [467,519].

### Literature on Piano Acoustics and Synthesis

There has been much research into the musical acoustics of the piano, including models of piano strings [543,18,75,76,144,42] and the piano hammer [493,63,178,60,486,164,487]. Additionally, there has been considerable work to date on the problem of piano synthesis by digital waveguide methods [467,519,523,56,42]. A careful computational model of the piano, partially based on physical modeling and suitable for high-quality real-time sound synthesis, was developed by Bensa in the context of his thesis work [42]. Related publications include [43,45,46]. An excellent simulation of the *clavichord*, also using the commuted waveguide synthesis technique, is described in [501]. A detailed simulation of the *harpsichord*, along similar lines but with some interesting new variations, may be found in [514]. The 2003 paper of Bank et al. [29] includes good summaries of prior research on elements of piano modeling such as efficient computational string models, loss-filter design, dispersion-filter design, and soundboard models. A more recent paper in this area is [30].

## Woodwinds

We now consider models of woodwind instruments, focusing on the single-reed case. In §9.5.4, tonehole modeling is described in some detail.

### Single-Reed Instruments

A simplified model for a single-reed woodwind instrument is shown in Fig. 9.38 [431]. If the bore is cylindrical, as in the clarinet, it can be modeled quite simply using a bidirectional delay line. If the bore is conical, as in a saxophone, it can still be modeled as a bidirectional delay line, but interfacing to it is slightly more complex, especially at the mouthpiece [37,7,160,436,506,507,502,526,406,528]. Because the main control variable for the instrument is air pressure in the mouth at the reed, it is convenient to choose *pressure wave variables*. To first order, the bell passes high frequencies and reflects low frequencies, where ``high'' and ``low'' are divided at the frequency whose wavelength equals the bell's diameter. Thus, the bell can be regarded as a simple ``cross-over'' network, like the one used to split signal energy between a woofer and tweeter in a loudspeaker cabinet. For a clarinet bore, the nominal ``cross-over frequency'' is around Hz [38]. The flare of the bell lowers the cross-over frequency by decreasing the bore characteristic impedance toward the end in an approximately non-reflecting manner [51]. Bell flare can therefore be considered analogous to a transmission-line *transformer*. Since the length of the clarinet bore is only a quarter wavelength at the fundamental frequency (in the lowest, or ``chalumeau,'' register), and since the bell diameter is much smaller than the bore length, most of the sound energy traveling into the bell reflects back into the bore. The low-frequency energy that makes it out of the bore radiates in a fairly omnidirectional pattern. Very high-frequency traveling waves do not ``see'' the enclosing bell and pass right through it, radiating in a more directional beam. The directionality of the beam is proportional to how many wavelengths fit along the bell diameter; in fact, many wavelengths away from the bell, the radiation pattern is proportional to the two-dimensional spatial Fourier transform of the exit aperture (a disk at the end of the bell) [308]. The theory of the single reed is described, *e.g.*, in [102,249,308]. In the digital waveguide clarinet model described below [431], the reed is modeled as a signal- and embouchure-dependent *nonlinear reflection coefficient* terminating the bore. Such a model is possible because the reed mass is neglected. The player's embouchure controls the damping of the reed, the reed aperture width, and other parameters, and these can be implemented as parameters of the lookup table or nonlinear function.

#### Digital Waveguide Single-Reed Implementation

A diagram of the basic clarinet model is shown in Fig.9.39. The delay lines carry left-going and right-going *pressure* samples and (respectively), which sample the traveling pressure-wave components within the bore. The reflection filter at the right implements the bell or tone-hole losses, as well as the round-trip attenuation losses from traveling back and forth in the bore. The bell output filter is highpass, and *power complementary* with respect to the bell reflection filter [500]. Power complementarity follows from the assumption that the bell itself does not vibrate or otherwise absorb sound. The bell is also *amplitude complementary*. As a result, given a reflection filter designed to match measured mode decay-rates in the bore, the transmission filter can be written down simply as for velocity waves, or for pressure waves. It is easy to show that such amplitude-complementary filters are also power complementary by summing the transmitted and reflected power waves:

Here denotes the *z* transform of the incident pressure wave, and denotes the *z* transform of the incident volume velocity. (All *z* transforms have omitted arguments , where denotes the sampling interval in seconds.) At the far left is the reed mouthpiece, controlled by *mouth pressure*. Another control is *embouchure*, changed in general by modifying the *reflection-coefficient* function , where . A simple choice of embouchure control is an offset in the reed-table address. Since the main feature of the reed table is the pressure-drop at which the reed begins to open, a simple embouchure offset can implement the effect of biting harder or softer on the reed, or of changing the reed stiffness.

### A View of Single-Reed Oscillation

To start the oscillation, the player applies a pressure at the mouthpiece which ``biases'' the reed in a ``negative-resistance'' region. (The pressure drop across the reed tends to close the air gap at the tip of the reed, so that an increase in pressure results in a net decrease in volume velocity--this is negative resistance.) The high-pressure front travels down the bore at the speed of sound until it encounters an open air hole or the bell. To a first approximation, the high-pressure wave reflects with a sign inversion and travels back up the bore. (In reality a lowpass filtering accompanies the reflection, and the complementary highpass filter shapes the spectrum that emanates away from the bore.) As the negated pressure wave travels back up the bore, it cancels the elevated pressure that was established by the passage of the first wave. When the negated pressure front gets back to the mouthpiece, it is reflected again, this time with no sign inversion (because the mouthpiece looks like a closed end to a first approximation). Therefore, as the wave travels back down the bore, a *negative* pressure zone is left behind. Reflecting from the open end again with a sign inversion brings a return-to-zero wave traveling back to the mouthpiece. Finally the positive traveling wave reaches the mouthpiece and starts the second ``period'' of oscillation, after four trips along the bore. So far, we have produced oscillation without making any use of the negative resistance of the reed aperture. This is merely the start-up transient. Since in reality there are places of pressure loss in the bore, some mechanism is needed to feed energy back into the bore and prevent the oscillation just described from decaying exponentially to zero. This is the function of the reed: when a traveling pressure-drop reflects from the mouthpiece, making pressure at the mouthpiece switch from high to low, the reed changes from open to closed (to first order).
The closing of the reed increases the reflection coefficient ``seen'' by the impinging traveling wave, and so, as the pressure falls, it is amplified by an increasing gain (whose maximum is unity when the reed shuts completely). This process sharpens the falling edge of the pressure drop. But this is not all. The closing of the reed also cuts back the steady incoming flow from the mouth. This causes the pressure to drop even more, potentially providing effective *amplification* by more than unity. An analogous story can be followed through for a rising pressure appearing at the mouthpiece. However, in the rising-pressure case, the reflection coefficient falls as the pressure rises, resulting in a progressive attenuation of the reflected wave; on the other hand, the increased pressure let in from the mouth amplifies the reflecting wave. It turns out that the reflection of a positive wave is boosted when the incoming wave is below a certain level and attenuated above that level. When the oscillation reaches a very high amplitude, it is limited on the negative side by the shutting of the reed, which sets a maximum reflective amplification for the negative excursions, and it is limited on the positive side by the attenuation described above.

*Unlike* classical negative-resistance oscillators, in which the negative-resistance device is terminated by a simple resistance instead of a lossy transmission line, a *dynamic equilibrium* is established between the amplification of the negative excursion and the dissipation of the positive excursion. In the first-order case, where the reflection coefficient varies linearly with pressure drop, it is easy to obtain an exact quantitative description of the entire process. In this case it can be shown, for example, that amplification occurs only on the positive half of the cycle, and that the amplitude of oscillation is typically close to half the incoming mouth pressure (when losses in the bore are small). The threshold blowing pressure (which is relatively high in this simplified case) can also be computed in closed form.

### Single-Reed Theory

A simplified diagram of the clarinet mouthpiece is shown in Fig. 9.40. The pressure in the mouth is assumed to be a constant value , and the bore pressure is defined at the mouthpiece. Any pressure drop across the mouthpiece causes a flow into the mouthpiece through the reed-aperture impedance , which changes as a function of , since the reed position is affected by . To a first approximation, the clarinet reed can be regarded as a spring flap regulating a Bernoulli flow (§B.7.5) [249]. This model has been verified well experimentally until the reed is about to close, at which point viscosity effects begin to appear [102]. It has also been verified that the mass of the reed can be neglected to first order,^{10.18} so that is a positive real number for all values of . Possibly the most important neglected phenomenon in this model is sound generation due to turbulence of the flow, especially near reed closure. Practical synthesis models have always included a noise component of some sort, modulated by the reed [431], despite a lack of firm basis in acoustic measurements to date. The fundamental equation governing the action of the reed is *continuity of volume velocity*, *i.e.*,

where

and

is the volume velocity corresponding to the incoming pressure wave and outgoing pressure wave . (The physical pressure in the bore at the mouthpiece is of course .) The wave impedance of the bore air-column is denoted (computable as the air density times sound speed divided by cross-sectional area). In operation, the mouth pressure and incoming traveling bore pressure are given, and the reed computation must produce an outgoing bore pressure which satisfies (9.35), *i.e.*, such that

Solving for is not immediate because of the dependence of on which, in turn, depends on . A graphical solution technique was proposed [151,244,308] which, in effect, consists of finding the intersection of the two terms of the equation as they are plotted individually on the same graph, varying . This is analogous to finding the operating point of a transistor by intersecting its operating curve with the ``load line'' determined by the load resistance. It is helpful to normalize (9.38) as follows: Define , and note that , where . Then (9.38) can be multiplied through by and written as , or

The solution is obtained by plotting and on the same graph, finding the point of intersection at coordinates , and computing finally the outgoing pressure wave sample as


An example of the qualitative appearance of overlaying is shown in Fig. 9.41.

#### Scattering-Theoretic Formulation

Equation (9.38) can be solved for to obtain

where


We interpret as a *signal-dependent reflection coefficient*. Since the mouthpiece of a clarinet is nearly closed, the reed-aperture impedance is large compared with the bore wave impedance, which implies and . In the limit as goes to infinity relative to , (9.42) reduces to the simple form of a rigidly capped acoustic tube, *i.e.*, . If it were possible to open the reed wide enough to achieve matched impedance, , then we would have and , in which case , with no reflection of , as expected. If the mouthpiece is removed altogether to give (regarding it now as a tube section of infinite radius), then , , and .

#### Computational Methods

Since finding the intersection of and requires an expensive iterative algorithm with variable convergence times, it is not well suited for real-time operation. In this section, fast algorithms based on precomputed nonlinearities are described. Let denote *half-pressure*, *i.e.*, and . Then (9.43) becomes

Subtracting this equation from gives

The last expression above can be used to precompute as a function of . Denoting this newly defined function as


(9.45) becomes

This is the form chosen for implementation in Fig. 9.39 [431]. The control variable is mouth half-pressure , and is computed from the incoming bore pressure using only a single subtraction. The table is indexed by , and the result of the lookup is then multiplied by . Finally, the result of the multiplication is subtracted from . The cost of the reed simulation is only two subtractions, one multiplication, and one table lookup per sample. Because the table contains a coefficient rather than a signal value, it can be more heavily quantized, both in address space and word length, than a direct lookup of a signal value such as or the like. A direct signal lookup, though requiring much higher resolution, would eliminate the multiplication associated with the scattering coefficient. For example, if and are 16-bit signal samples, the table would contain on the order of 64K 16-bit samples. Clearly, some compression of this table would be desirable. Since is smoothly varying, significant compression is in fact possible. However, because the table is directly in the signal path, comparatively little compression can be done while maintaining full audio quality (such as 16-bit accuracy). In the field of computer music, it is customary to use simple piecewise-linear functions for functions other than signals at the audio sampling rate, *e.g.*, for amplitude envelopes, FM-index functions, and so on [382,380]. Along these lines, good initial results were obtained [431] using the simplified, *qualitatively* chosen reed table depicted in Fig. 9.42 for . The corner point is the smallest pressure difference giving reed closure.

^{10.19}Embouchure and reed stiffness correspond to the choice of offset and slope . Brighter tones are obtained by increasing the curvature of the function as the reed begins to open; for example, one can use for increasing . Another variation is to replace the table-lookup contents with a piecewise polynomial approximation. While less general, good results have been obtained in practice [89,91,92]. For example, one of the SynthBuilder [353] clarinet patches employs this technique using a cubic polynomial. An intermediate approach between table lookups and polynomial approximations is to use interpolated table lookups. Typically, linear interpolation is used, but higher-order polynomial interpolation can also be considered (see Chapter 4).
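To make the table-lookup computation concrete, here is a sketch in Python assuming the half-pressure formulation above; the corner, slope, and clipping range are illustrative, and `reed_table` stands in for the precomputed lookup table:

```python
import numpy as np

def reed_table(h_delta, corner=0.2, slope=4.0):
    """Qualitative piecewise-linear reed reflection coefficient: unity (reed
    fully closed) at and beyond the corner pressure difference, decreasing
    linearly below it.  Corner, slope, and clip range are illustrative."""
    return float(np.clip(1.0 + slope * (h_delta - corner), -1.0, 1.0))

def reed_junction(p_plus, p_mouth):
    """One sample of the scattering-theoretic reed computation:
    two subtractions, one table lookup, and one multiply."""
    h_m = 0.5 * p_mouth            # mouth half-pressure
    h_delta = h_m - p_plus         # subtraction 1: half-pressure difference
    rho = reed_table(h_delta)      # table lookup (here a function call)
    return h_m - rho * h_delta     # multiply + subtraction 2 -> outgoing wave
```

In the limits discussed earlier, this sketch behaves as expected: when the pressure difference closes the reed completely, the coefficient is unity and the outgoing wave equals the incoming wave, as for a rigidly capped tube; when the coefficient is zero (matched impedance), the output is just the constant mouth half-pressure, with no reflection of the incoming wave.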

#### Clarinet Synthesis Implementation Details

To finish off the clarinet example, the remaining details of the SynthBuilder clarinet patch `Clarinet2.sb` are described. The input mouth pressure is summed with a small amount of white noise, corresponding to turbulence. For example, 0.1% is generally used as a minimum, and larger amounts are appropriate during the attack of a note. Ideally, the turbulence level should be computed automatically as a function of pressure drop and reed opening geometry [141,530]. It should also be lowpass filtered as predicted by theory. Referring to Fig. 9.39, the reflection filter is a simple one-pole with transfer function


where , , is vibrato amplitude (*e.g.*, ), and is vibrato frequency (*e.g.*, Hz). Further loop filtering occurs as a result of using simple linear interpolation of the delay line. (There is only one delay line in the actual implementation, since the lower delay line of Fig. 9.39 can be commuted with the reflection filter and combined with the upper delay line, ignoring the path to the output filter; a pure delay of less than one period in the final output sound is inconsequential.) There is no transmission filter or tone-hole modeling. Legato note transitions are managed using two delay-line taps and cross-fading from one to the other during a transition [208,441,405]. In general, legato problems arise when the bore length is changed suddenly while sounding, corresponding to a new fingering. The reason is that the model itself should really be changed during a fingering change, from that of a statically terminated bore to that of a bore with a new scattering junction appearing where each ``finger'' is lifting, and with disappearing scattering junctions where tone holes are being covered. In addition, if a hole is covered abruptly (especially when there are large mechanical caps, as on the saxophone), new signal energy is also injected in both directions on the bore in superposition with the signal scattering. Given this ideal picture, it is difficult to get high-quality legato performance using only a single delay line. A reduced-cost, approximate solution for obtaining good-sounding note transitions in a single delay-line model was proposed in [208]. In this technique, the bore delay line is ``branched'' during the transition, *i.e.*, a second feedback loop is formed at the new loop delay, thus forming two delay lines sharing the same memory, one corresponding to the old pitch and the other to the new pitch. A cross-fade from the old-pitch delay to the new-pitch delay sounds good if the cross-fade time and duration are carefully chosen. Another way to look at this algorithm is in terms of ``read pointers'' and ``write pointers.'' A normal delay line consists of a single write pointer followed by a single read pointer, delayed by one period. During a legato transition, we simply cross-fade from a read-pointer at the old-pitch delay to a read-pointer at the new-pitch delay. In this type of implementation, the write-pointer always traverses the full delay memory corresponding to the minimum supported pitch, so that read-pointers may be instantiated at any pitch-period delay at any time. Conceptually, this simplified model of note transitions can be derived from the more rigorous model by replacing the tone-hole scattering junction with a single reflection coefficient. STK software implementing a model as in Fig.9.39 can be found in the file

`Clarinet.cpp`.
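The read-pointer view of the legato cross-fade can be sketched as follows (a hypothetical helper; a real implementation would also interpolate fractional delays and run per-sample):

```python
import numpy as np

def legato_output(buf, write_idx, old_delay, new_delay, fade):
    """Cross-fade between two read pointers sharing one delay-line memory
    during a legato transition.  'fade' ramps from 0 (old pitch) to 1 (new
    pitch); 'buf' is sized for the lowest supported pitch, so read pointers
    can be placed at any pitch-period delay behind the write pointer."""
    N = len(buf)
    old_read = buf[(write_idx - old_delay) % N]   # tap at the old pitch period
    new_read = buf[(write_idx - new_delay) % N]   # tap at the new pitch period
    return (1.0 - fade) * old_read + fade * new_read
```

Because both taps read from the same circular memory, the two ``delay lines'' of the transition cost no extra storage, only one extra read and a cross-fade multiply-add.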

### Tonehole Modeling

Toneholes in woodwind instruments are essentially cylindrical holes in the bore. One modeling approach is to treat the tonehole as a small waveguide connected to the main bore via one port of a three-port junction. However, since the tonehole length is small compared with the distance sound travels in one sampling instant, it is more straightforward to treat the tonehole as a lumped load along the bore, and most modeling efforts have taken this approach. The musical acoustics literature contains experimentally verified models of tone-hole acoustics, such as that by Keefe [238]. Keefe's tonehole model is formulated as a ``transmission matrix'' description, which we may convert to a traveling-wave formulation by a simple linear transformation (described in §9.5.4 below) [465]. For typical fingerings, the first few open tone holes jointly provide a bore termination [38]. Either the individual tone holes can be modeled as (interpolated) scattering junctions, or the whole ensemble of terminating tone holes can be modeled in aggregate using a single reflection and transmission filter, like the bell model. Since the tone-hole diameters are small compared with most audio-frequency wavelengths, the reflection and transmission coefficients can be implemented to a reasonable approximation as constants, as opposed to cross-over filters as in the bell. Taking into account the inertance of the air mass in the tone hole, the tone hole can be modeled as a two-port loaded junction having load impedance equal to the air-mass inertance [143,509]. At a higher level of accuracy, adapting transmission-matrix parameters from the existing musical acoustics literature leads to first-order reflection and transmission filters [238,406,403,404,465]. The individual tone-hole models can be simple lossy two-port junctions, modeling only the internal bore-loss characteristics, or three-port junctions, modeling also the transmission characteristics to the outside air.
Another approach to modeling toneholes is the ``wave digital'' model [527] (see §F.1 for a tutorial introduction to this approach). The subject of tonehole modeling is elaborated further in [406,502]. For the simplest practical implementation, the bell model can be used unchanged for all tunings, as if the bore were being cut to a new length for each note and the same bell were attached. However, for best results in dynamic performance, the tonehole model should additionally include an explicit valve model for physically accurate behavior when slowly opening or closing the tonehole [405].

#### The Clarinet Tonehole as a Two-Port Junction

The clarinet tonehole model developed by Keefe [240] is parametrized in terms of series and shunt resistance and reactance, as shown in Fig. 9.43. The *transmission matrix* description of this two-port is given by the product of the transmission matrices for the series impedance, shunt impedance, and series impedance, respectively:

where all quantities are written in the frequency domain, and the impedance parameters (10.51) are the open-hole shunt impedance, the closed-hole shunt impedance, the open-hole series impedance, and the closed-hole series impedance,

where the wave impedance at the tonehole entrance is that of an acoustic tube whose cross-sectional area equals that of the tonehole (with air density and sound speed as usual). The remaining physical parameters are the tonehole radius; the wavenumber (radian spatial frequency); the open-tonehole effective length, which is slightly greater than its physical length due to the formation of a small air piston inside the open tonehole; the ``specific resistance'' of the open tonehole due to air viscosity in, and radiation from, the hole; and the closed-tonehole height, defined such that its product with the cross-sectional area of the tonehole exactly equals the geometric volume of the closed tonehole. The equivalent *series* lengths of the open and closed tonehole are given by formulas involving the radius of the main bore. Finally, the closed-tonehole height can be estimated as in [240] from the radius of curvature of the tonehole and the viscous boundary-layer thickness, which is expressible in terms of the shear viscosity of air; in the limit where the tube radius is large compared with the viscous boundary layer, we have

(10.52)
(10.53)

Substituting these expressions into (9.51) to convert physical variables to wave variables (where the bore wave impedance appears as the normalization), we may solve for the outgoing waves in terms of the incoming waves. Mathematica code for obtaining the general conversion formula from lumped parameters to scattering parameters is as follows:

```
Clear["t*", "p*", "u*", "r*"]
transmissionMatrix = {{t11, t12}, {t21, t22}};
leftPort = {{p2p+p2m}, {(p2p-p2m)/r2}};
rightPort = {{p1p+p1m}, {(p1p-p1m)/r1}};
Format[t11, TeXForm] := "{T_{11}}"
Format[p1p, TeXForm] := "{P_1^+}"
... (etc. for all variables) ...
TeXForm[Simplify[Solve[leftPort == transmissionMatrix . rightPort, {p1m, p2p}]]]
```

The above code produces the following formulas:

Substituting relevant values for Keefe's tonehole model, we obtain, in matrix notation,
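The same transmission-matrix-to-scattering-parameter conversion can be reproduced symbolically in Python. The following SymPy sketch is our transliteration of the Mathematica code above (variable names follow that code), solving the transmission-matrix relation for the outgoing waves:

```python
import sympy as sp

# symbols mirroring the Mathematica variables above
t11, t12, t21, t22 = sp.symbols('t11 t12 t21 t22')
p1p, p1m, p2p, p2m = sp.symbols('p1p p1m p2p p2m')
r1, r2 = sp.symbols('r1 r2', positive=True)

T = sp.Matrix([[t11, t12], [t21, t22]])
left = sp.Matrix([p2p + p2m, (p2p - p2m) / r2])    # pressure and flow at port 2
right = sp.Matrix([p1p + p1m, (p1p - p1m) / r1])   # pressure and flow at port 1

# solve the transmission-matrix relation for the outgoing waves p1m, p2p
sol = sp.solve(list(left - T * right), [p1m, p2p], dict=True)[0]
```

As a sanity check, substituting the identity transmission matrix with equal port impedances reduces the solution to pure pass-through (p1m = p2m and p2p = p1p).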

We thus obtain the scattering formulation depicted in Fig. 9.44, where

the *reflectance* of the tonehole is the same from either direction, and the *transmittance* of the tonehole is also the same from either direction. The notation used here for reflectance reflects the fact that every reflectance is a *Schur function* (stable and not exceeding unit magnitude on the unit circle in the z plane) [428, p. 221]. The approximate forms in (9.57) and (9.58) are obtained by neglecting the negative series inertance, which serves to adjust the effective length of the bore and can therefore be implemented elsewhere in the interpolated delay-line calculation, as discussed further below. The open and closed tonehole cases are obtained by substituting the corresponding impedances from (9.53). In a manner analogous to converting the four-multiply Kelly-Lochbaum (KL) scattering junction [245] into a one-multiply form (cf. (C.60) and (C.62)), we may pursue a ``one-filter'' form of the waveguide tonehole model. However, the series inertance gives some initial trouble, since

*I.e.*, if we denote the transmittance from one branch to all other branches meeting at the junction, then that transmittance minus one is the reflectance seen on the same branch. Substituting

(10.58)

and, similarly,

(10.59)

The resulting tonehole implementation is shown in Fig. 9.45. We call this the ``shared reflectance'' form of the tonehole junction. In the same way, an alternate form is obtained from the substitution

(10.60)
(10.61)

shown in Fig. 9.46.

In this form, it is clear that the series-inertance factor is a first-order *allpass* filter with a single pole-zero pair near infinity. Unfortunately, the pole is in the right-half-plane and hence *unstable*. We cannot therefore implement it as shown in Fig. 9.45 or Fig. 9.46. Using elementary manipulations, the unstable allpasses in Figs. 9.45 and 9.46 can be moved to the configuration shown in Figs. 9.47 and 9.48, respectively. Note that the resulting one-filter reflectance is stable whenever the tonehole reflectance is stable. The unstable allpasses now operate only on the two incoming wave variables, and they can be implemented implicitly by slightly reducing the (interpolated) delay lines leading to the junction from either side. The tonehole then requires only one filter. We now see precisely how the negative series inertance provides a

*negative, frequency-dependent length correction* for the bore. From (9.63), the phase delay of this allpass can be computed as

#### Tonehole Filter Design

The tonehole reflectance and transmittance must be converted to discrete-time form for implementation in a digital waveguide model. Figure 9.49 plots the responses of second-order discrete-time filters designed to approximate the continuous-time magnitude and phase characteristics of the reflectances for closed and open toneholes, as carried out in [403,406]. These filter designs assumed particular values for the tonehole radius, minimum tonehole height, tonehole radius of curvature, and air-column radius. Since the measurements of Keefe do not extend above 5 kHz, the continuous-time responses in the figures are extrapolated above this limit. Correspondingly, the filter designs were weighted to produce best results below 5 kHz.

The closed-hole filter design was carried out using weighted equation-error minimization [428, p. 47], *i.e.*, by minimizing the sum over frequency of |W(ω)[A(e^{jω})H(e^{jω}) − B(e^{jω})]|², where W(ω) is the weighting function, H(e^{jω}) is the desired frequency response, ω denotes discrete-time radian frequency, and the designed filter response is B(e^{jω})/A(e^{jω}). Note that both phase and magnitude are matched by equation-error minimization, and this error criterion is used extensively in the field of system identification [288] due to its ability to design optimal IIR filters via quadratic minimization. In the spirit of the well-known Steiglitz-McBride algorithm [287], equation-error minimization can be iterated, setting the weighting function at each iteration to the inverse of the inherent weighting |A(e^{jω})| of the previous iteration, *i.e.*, W_i(ω) = 1/|A_{i−1}(e^{jω})|. However, for this study, the weighting was used only to increase accuracy at low frequencies relative to high frequencies. Weighted equation-error minimization is implemented in the matlab function `invfreqz()` (§8.6.4).

The open-hole discrete-time filter was designed using Kopec's method [297], [428, p. 46] in conjunction with weighted equation-error minimization. Kopec's method is based on linear prediction:

- Given a desired complex frequency response, compute an allpole model using linear prediction.
- Compute the error spectrum (the desired response divided by the allpole model).
- Compute an allpole model of the inverse error spectrum by minimizing the same *ratio error*; inverting this fit supplies the zeros.

Linear prediction tends to match the *upper spectral envelope* of the desired frequency response. Since the first step of Kopec's method captures the upper spectral envelope, the ``nulls'' and ``valleys'' are largely ``saved'' for the next step, which computes zeros. When computing the zeros, the spectral ``dips'' become ``peaks,'' thereby receiving more weight under the ratio-error norm. Thus, in Kopec's method, the poles model the upper spectral envelope, while the zeros model the lower spectral envelope. To apply Kopec's method to the design of an open-tonehole filter, a one-pole model was first fit to the continuous-time response. Subsequently, the inverse error spectrum was modeled with a two-pole digital filter. The discrete-time approximation to the desired response was then given by combining the two fits, the second contributing the zeros.
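Because the equation error is linear in the unknown filter coefficients, the weighted equation-error criterion used for the closed-hole filter reduces to a single linear least-squares solve. The following is a minimal `invfreqz`-style design sketch in NumPy, our own reduction under the assumptions of a given frequency grid `w`, desired response `D`, filter orders `nb` and `na`, and a0 fixed at 1:

```python
import numpy as np

def eqerror_design(w, D, nb, na, weight=None):
    """Weighted equation-error fit of B(z)/A(z) to a desired response D
    sampled at radian frequencies w: minimize |W*(A*D - B)|^2."""
    if weight is None:
        weight = np.ones_like(w)
    E = np.exp(-1j * np.outer(w, np.arange(max(nb, na) + 1)))
    # unknowns are b[0..nb] and a[1..na]; a[0] is fixed at 1
    M = np.hstack([E[:, :nb + 1], -D[:, None] * E[:, 1:na + 1]])
    # stack real and imaginary parts to solve one real least-squares problem
    G = np.vstack([(weight[:, None] * M).real, (weight[:, None] * M).imag])
    y = np.concatenate([(weight * D).real, (weight * D).imag])
    theta = np.linalg.lstsq(G, y, rcond=None)[0]
    return theta[:nb + 1], np.concatenate([[1.0], theta[nb + 1:]])
```

Iterating with `weight` set to 1/|A| of the previous fit recovers the Steiglitz-McBride-style refinement mentioned above. When the data exactly fit a filter of the chosen orders, the equation error can be driven to zero and the true coefficients are recovered.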

#### The Tonehole as a Two-Port Loaded Junction

It seems reasonable to expect that the tonehole should be representable as a *load* along a waveguide bore model, thus creating a loaded two-port junction with two identical bore ports on either side of the tonehole. From the relations for the loaded parallel junction (C.101), in the two-port case, and considering pressure waves rather than force waves, we have

(10.63)
(10.64)
(10.65)

Thus, the loaded two-port junction can be implemented in ``one-filter form'' as shown in Fig. 9.48, *i.e.*, the parallel load impedance is simply the shunt impedance in the tonehole model. Each series impedance in the split-T model of Fig. 9.43 can be modeled as a *series* waveguide junction with a load equal to that impedance. To see this, set the transmission-matrix parameters in (9.55) to the values from (9.51) to get

(10.66)

where the alpha parameter is that of a series-loaded waveguide junction involving two waveguides of equal impedance joined in series with each other and with a load impedance, as can be seen from (C.99). To obtain exactly the loaded series scattering relations (C.100), we first switch to the more general convention in which the ``+'' superscript denotes waves traveling *into* a junction of any number of waveguides. This exchanges ``+'' with ``−'' at port 2 to yield

(10.67)

Next we convert pressure to velocity using the traveling-wave relations between pressure and velocity to obtain

(10.68)

Finally, we toggle the reference direction of port 2 (the ``current'' arrow on port 2 in Fig. 9.43) so that velocity is positive flowing *into* the junction on both ports (which is the convention used to derive (C.100) and which is typically followed in circuit theory). This amounts to negating the port-2 velocity waves, giving

(10.69)

This is then the canonical form (C.100).

## Bowed Strings

A schematic block diagram for bowed strings is shown in Fig. 9.51 [431]. The bow divides the string into two sections, so the bow model is a nonlinear two-port, in contrast with the reed, which was a nonlinear one-port terminating the bore at the mouthpiece. In the case of bowed strings, the primary control variable is bow velocity, so *velocity waves* are the natural choice for the delay lines.

### Digital Waveguide Bowed-String

A more detailed diagram of the digital waveguide implementation of the bowed-string instrument model is shown in Fig. 9.52. The right delay-line pair carries left-going and right-going velocity-wave samples, which represent the traveling-wave components within the string to the right of the bow, and similarly for the section of string to the left of the bow. The ``+'' superscript refers to waves traveling *into* the bow. String velocity at any point is obtained by adding a left-going velocity sample to the right-going velocity sample immediately opposite in the other delay line, as indicated in Fig. 9.52 at the bowing point.

The reflection filter at the right implements the losses at the bridge, bow, nut, or finger terminations (when stopped), as well as the round-trip attenuation/dispersion from traveling back and forth on the string. To a very good degree of approximation, the nut reflects incoming velocity waves (with a sign inversion) at all audio wavelengths. The bridge behaves similarly to first order, but there are additional (complex) losses due to the finite bridge driving-point impedance (necessary for transducing sound from the string into the resonating body). According to [95, page 27], the bridge of a violin can be modeled, up to a limiting frequency in the kHz range and for purposes of computing string loss, as a single spring in parallel with a frequency-independent resistance (``dashpot''). Bridge-filter modeling is discussed further in §9.2.1.

Figure 9.52 is drawn for the case of the lowest note. For higher notes, the delay lines between the bow and nut are shortened according to the distance between the bow and the finger termination. The bow-string interface is controlled by *differential velocity*, defined as the bow velocity minus the total incoming string velocity. Other controls include *bow force* and *angle*, which are changed by modifying the reflection coefficient. Bow position is changed by taking samples from one delay-line pair and appending them to the other delay-line pair. Delay-line interpolation can be used to provide continuous change of bow position [267].

### The Bow-String Scattering Junction

The theory of bow-string interaction is described in [95,151,244,307,308]. The basic operation of the bow is to reconcile the nonlinear bow-string friction curve with the string wave impedance: the friction force exerted by the bow must equal the force implied by the string's wave impedance and the change in string velocity at the bow. (Force and velocity point in the same direction when they have the same sign.) Here we distinguish the transverse velocity waves on the segment of the bowed string to the *right* of the bow from those to the *left* of the bow. The corresponding normalized functions used in the Friedlander-Keller graphical solution technique, expressed in terms of the differential velocity between bow and string, are depicted in Fig. 9.53.
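The reconciliation of friction curve and wave impedance can be sketched numerically. In this toy version (our own construction, not the McIntyre-Woodhouse solver), the string velocity at the bow is found where the bow force, given by an assumed smoothed friction curve times a force scale, balances the restoring force of the two string halves (each of wave impedance R); a dense grid search stands in for the Friedlander-Keller graphical intersection:

```python
import numpy as np

def friction(dv, mu_s=0.8, mu_d=0.3, v0=0.2, eps=0.01):
    """Assumed bow-friction coefficient vs. differential (bow-minus-string)
    velocity: near-static friction at small slip, falling toward dynamic
    friction at large slip (tanh regularizes the sign discontinuity)."""
    return np.tanh(dv / eps) * (mu_d + (mu_s - mu_d) * np.exp(-np.abs(dv) / v0))

def bow_junction(v_in, v_bow, F_b, R):
    """Solve  F_b * friction(v_bow - v) = 2*R*(v - v_in)  for the string
    velocity v at the bow (v_in = sum of incoming velocity waves)."""
    v = np.linspace(min(v_in, v_bow) - 1.0, max(v_in, v_bow) + 1.0, 4001)
    resid = F_b * friction(v_bow - v) - 2.0 * R * (v - v_in)
    return v[np.argmin(np.abs(resid))]
```

With zero bow force the string is unaffected (v = v_in); with bow and string already moving together, the sticking solution v = v_bow results. When the friction curve is steep relative to the load line 2R(v − v_in), several intersections can coexist, which is precisely the ambiguity underlying the Friedlander-Keller construction.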

### Bowed String Synthesis Extensions

For maximally natural interaction between the bow and string, bow-hair dynamics should be included in a proper bow model [176]. In addition, the width of the bow should be variable, depending on bow ``angle''. A finite-difference model for the finite-width bow was reported in [351,352]. The bow-string friction characteristic should employ a *thermodynamic model* of bow friction in which the bow rosin has a *time-varying viscosity* due to temperature variations within a period of sound [549]. It is well known by bowed-string players that rosin is sensitive to temperature. Thermal models of dynamic friction in bowed strings are described in [426], and they have been used in virtual bowed strings for computer music [419,423,21].

Given a good model of a bowed-string instrument, it is most natural to interface it to a physical bow-type controller, with sensors for force, velocity, position, angle, and so on [420,294]. A real-time software implementation of a bowed-string model, similar to that shown in Fig. 9.52, is available in the Synthesis Tool Kit (STK) distribution as `Bowed.cpp`. It provides a convenient starting point for more refined bowed-string models [276].

### Linear Commuted Bowed Strings

The commuted synthesis technique (§8.7) can be extended to bowed strings in the special case of ``ideal'' bowed attacks [440]. Here, an ideal attack is defined as one in which *Helmholtz motion*^{10.20} is instantly achieved. This technique will be called ``linear commuted synthesis'' of bowed strings. Additionally, the linear commuted-synthesis model for bowed strings can be driven by a separate nonlinear model of bowed-string dynamics. This gives the desirable combination of a full range of complex bow-string interaction behavior together with an efficiently implemented body resonator.

#### Bowing as Periodic Plucking

The ``leaning sawtooth'' waveforms observed by Helmholtz for steady-state bowed strings can be obtained by periodically ``plucking'' the string in only one direction along the string [428]. In principle, a traveling impulsive excitation is introduced into the string in the right-going direction for a ``down bow'' and in the left-going direction for an ``up bow.'' This simplified bowing simulation works best for smooth bowing styles in which the notes have slow attacks. More varied types of attack can be achieved using the more physically accurate McIntyre-Woodhouse theory [307,431].

Commuting the string and resonator means that the string is now plucked by a *periodically repeated resonator impulse response*. A nice simplified vibrato implementation is available by varying the impulse-response retriggering period, *i.e.*, the vibrato is implemented in the excitation oscillator and not in the delay loop. The string loop delay need not be modulated at all. While this departs from being a physical model, the vibrato quality is satisfying and qualitatively similar to that obtained by a rigorous physical model. Figure 9.55 illustrates the overall block diagram of the simplified bowed string and its commuted and response-excited versions.

The resonator impulse responses used to excite the string loop are stored in *string excitation tables*. The excitation can contribute to many aspects of the tone to be synthesized, such as whether it is a violin or a cello, the force of the bow, and where the bow is playing on the string. Also, graphical equalization and other time-invariant filtering can be provided in the form of alternate excitation-table choices.

During the synthesis of a single bowed-string tone, the excitation signal is played into the string quasi-periodically. Since the excitation signal is typically longer than one period of the tone, it is necessary either to (1) interrupt the excitation playback to replay it from the beginning, or (2) start a new playback which overlaps with the playback in progress. Variant (2) requires a separate incrementing pointer and addition for each instance of the excitation playback; it is thus more expensive, but it is preferred from a quality standpoint. This same issue also arises in the Chant synthesis technique for the singing voice [389]. In the Chant technique, a sum of three or more enveloped sinusoids (called FOFs) is periodically played out to synthesize a sung vowel tone. In the unlikely event that the excitation table is less than one period long, it is of course extended by zeros, as is done in the VOSIM voice synthesis technique [219], which can be considered a simplified forerunner of Chant. Sound examples for linear commuted bowed-string synthesis may be heard here: (WAV) (MP3).

Of course, ordinary wavetable synthesis [303,199,327] or any other type of synthesis can also be used as an excitation signal, in which case the string loop behaves as a pitch-synchronous comb filter following the wavetable oscillator. Interesting effects can be obtained by slightly detuning the wavetable oscillator and delay loop; tuning the wavetable oscillator to a harmonic of the delay loop can also produce an ethereal effect.
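The overlapping-playback variant (2) is simple to sketch: each period launches a fresh read of the excitation table, and the running playbacks are summed before entering the string loop. This is a minimal sketch of our own; the Hann window merely stands in for a resonator impulse response, and the loop filter is reduced to a single loss gain:

```python
import numpy as np

def retriggered_excitation(table, period, n_samples):
    """Variant (2): start a new, overlapping playback of the excitation
    table every `period` samples and sum all active playbacks."""
    out = np.zeros(n_samples)
    for start in range(0, n_samples, period):
        n = min(len(table), n_samples - start)
        out[start:start + n] += table[:n]
    return out

def string_loop(x, period, g=0.99):
    """Filtered delay loop reduced to a one-tap loss for clarity."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n] + (g * y[n - period] if n >= period else 0.0)
    return y

exc = np.hanning(300)   # stand-in for a resonator impulse response
x = retriggered_excitation(exc, period=200, n_samples=2000)
y = string_loop(x, period=200)
```

Because the 300-sample table outlasts the 200-sample period, two playbacks overlap at any time, which is exactly the situation variant (2) handles gracefully.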

#### More General Quasi-Periodic Synthesis

Figure 9.56 illustrates a more general version of the table-excited, filtered delay loop synthesis system [440]. The generalizations help to obtain a wider class of timbres. The multiple excitations summed together through time-varying gains provide for timbral evolution of the tone. For example, a violin can transform smoothly into a cello, or the bow can move smoothly toward the bridge, by interpolating among two or more tables. Alternatively, the tables may contain ``principal components'' which can be scaled and added together to approximate a wider variety of excitation timbres. An excellent review of multiple wavetable synthesis appears in [199]. The nonlinearity is useful for obtaining distortion guitar sounds and other interesting evolving timbres [387].

Finally, the ``attack signal'' path around the string has been found to be useful for reducing the cost of implementation: the highest frequency components of a struck string, say, tend to emanate immediately from the string to the resonator with very little reflection back into the string (or pipe, in the case of wind-instrument simulation). Injecting them into the delay loop increases the burden on the loop filter to quickly filter them out. Bypassing the delay loop altogether alleviates requirements on the loop filter and even allows the filtered delay loop to operate at a lower sampling rate; in this case, a signal interpolator would appear between the string output and the summer which adds in the scaled attack signal in Fig. 9.56. For example, it was found that the low E of an electric guitar (Gibson Les Paul) can be synthesized quite well using a filtered delay loop running at a sampling rate of 3 kHz. (The pickups do not pick up much energy above 1.5 kHz.) Similar savings can be obtained for any instrument having high-frequency content which decays much more quickly than its low-frequency content.

#### Stochastic Excitation for Quasi-Periodic Synthesis

For good generality, at least one of the excitation signals should be a filtered noise signal [440]. An example implementation is shown in Fig. 9.57, in which a free-running bandlimited noise generator is filtered by a finite impulse response (FIR) digital filter. As noted in §9.4.4, such a filtered-noise signal can synthesize the perceptual equivalent of the impulse response of many high-frequency modes that have been separated from the lower-frequency modes in commuted synthesis (§8.7.1). It can also handle pluck models in which successive plucking variations are imposed by the FIR filter coefficients. In a simple implementation, only two gains might be used, allowing simple interpolation from one filter to the next, and providing an overall amplitude control for the noise component of the excitation signal.

## Brasses

Brass instruments include the trumpet, trombone, French horn, tuba, and other instruments consisting of a horn driven by lip vibrations at one end. The mouthpiece of a brass instrument is generally cup-shaped, and the lips vibrate at a frequency near that of the sounding note. There are three main components needed in synthesis models for brass instruments: the mouthpiece, bore, and bell. Both cylindrical and conical bores are accurately modeled using digital waveguide techniques, as discussed in previous chapters (see §C.18 for further discussion). However, bore models may need to simulate nonlinear shock-wave formation at high dynamic levels [198,341], which is relatively expensive to do [536]. The bell, which flares more rapidly than a cone (typically *exponentially*^{10.21}), does not support true traveling waves [357], and so it must be modeled using a more general simulation technique, such as by means of reflection and transmission filters (discussed in §9.7.2 below).

### Modeling the Lips and Mouthpiece

Figure 9.58 depicts a schematic of a brass-player's lips positioned within a mouthpiece. The air pressure^{10.22} is denoted separately inside the mouth, between the lips, and within the mouthpiece cup. A simple first approximation is to assume that the air flow through the lips is *inviscid* (``frictionless'') and *incompressible*. In this case, the pressure between the lips is governed by *Bernoulli's equation* (see §B.7.4), relating it to the mouth pressure and the *particle velocity* (see §B.7.1) (distance per unit time) between the lips. The particle velocity in the mouth is so small that it is usually neglected (approximated by zero). Following Cullen et al. [98] and previous researchers [159,161], one may assume that the flow between the lips creates a jet in the mouthpiece. The pressure within the jet is the same as that between the lips. The jet is assumed to mix turbulently with the air in the mouthpiece, so that it dissipates its energy without raising the pressure in the mouthpiece.
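The Bernoulli relation for the lip jet is easy to state in code. The sketch below is our own helper (function and symbol names are assumptions): it computes the jet particle velocity from the mouth-minus-jet pressure drop, neglecting the particle velocity in the mouth as above, and returns the volume velocity entering the mouthpiece for a given lip-opening area:

```python
import math

RHO = 1.2  # air density, kg/m^3 (approximate room-temperature value)

def jet_flow(p_mouth, p_jet, lip_area):
    """Volume velocity through the lip channel from Bernoulli's equation:
    p_mouth = p_jet + (rho/2) v^2, with mouth particle velocity neglected."""
    dp = p_mouth - p_jet
    v = math.copysign(math.sqrt(2.0 * abs(dp) / RHO), dp)  # jet particle velocity
    return v * lip_area  # volume velocity into the mouthpiece
```

For example, a mouth overpressure of a few kPa across a lip opening of order 1 cm² yields a jet velocity of tens of meters per second; the `copysign` allows reversed flow when the mouthpiece pressure momentarily exceeds the mouth pressure.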

^{10.23} One- and two-mass lip models were also developed by Rodet and Vergez [388,533]. An early one-mass, ``swinging door'' model that works surprisingly well appears in Cook's `HosePlayer` [89] and `TBone` [88]. It is also used in the `Brass.cpp` patch in the Synthesis Tool Kit [91]; in this simplified model, the lip valve is modeled as a second-order resonator whose output is squared and hard-clipped to a maximum magnitude.

The acoustic tube in a horn can be modeled as a simple digital waveguide [22,37,436,506,528], whether it is cylindrical or conical. More complicated, flaring horn sections may be modeled as two-port digital filters which provide the appropriate reflection and transmission from the bell [39,50,71,118,406,528]. Section 9.7.2 below summarizes a particularly economical approach to brass-bell modeling.

It is known that nonlinear *shock waves* occur in the bore for large-amplitude vibrations [198,341]. A simple nonlinear waveguide simulation of this effect can be implemented by shortening the delay elements slightly when the instantaneous amplitude is large. (The first-order effect of air nonlinearity in large-amplitude wave propagation is an increase in sound speed at very high pressures [52].) It is also known that the bores of ``bent horns,'' such as the trumpet, do not behave as ideal waveguides [425]. This is because sharp bends in the metal tube cause some scattering and mode conversion at high frequencies. Such points along the horn can be modeled using high-frequency loss and reflection.
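The simplified lip valve described above for the STK brass patch (a second-order resonator whose output is squared and hard-clipped) can be sketched as follows. The resonator coefficients and the clipping threshold here are our own illustrative choices, not those of `Brass.cpp`:

```python
import math

class LipValve:
    """Second-order resonator whose output is squared and hard-clipped,
    in the spirit of the simplified STK brass lip model."""
    def __init__(self, freq, fs, radius=0.997, clip=1.0):
        self.b0 = 1.0 - radius                                  # gain normalization
        self.a1 = -2.0 * radius * math.cos(2 * math.pi * freq / fs)
        self.a2 = radius * radius
        self.y1 = self.y2 = 0.0
        self.clip = clip

    def tick(self, x):
        y = self.b0 * x - self.a1 * self.y1 - self.a2 * self.y2  # resonator
        self.y2, self.y1 = self.y1, y
        g = y * y                                                # squaring nonlinearity
        return max(-self.clip, min(self.clip, g))                # hard clip
```

The squaring guarantees a nonnegative valve opening, and the clip bounds it, so driving the resonator near its tuned frequency produces the pulsed, unidirectional flow modulation characteristic of buzzing lips.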

### Bell Models

The flaring bell of a horn cannot be accurately modeled as a sparse digital waveguide, because traveling pressure waves only propagate without reflection in conical bores (which include cylindrical bores as a special case) [357].^{10.24} Digital waveguides are ``sparse'' (free of internal scattering) only when there are long sections at a constant wave impedance. The most cost-effective bell filters (and, more generally, ``flare filters'') to date appear to be *truncated IIR* (TIIR) digital filters [540]. These filters use an unstable pole to produce exponentially rising components in the impulse response, but the response is cut off after a finite time, as is needed in the case of a bell impulse response. By fitting a piecewise polynomial/exponential approximation to the reflection impulse response of the trumpet bell, very good approximations can be had for the computational equivalent of approximately a 10th-order IIR filter (but using more memory in the form of a delay line, which costs very little computation). In more detail, the most efficient computational model for flaring bells in brass instruments seems to be one consisting of one or more sections having an impulse response given by the sum of a growing exponential and a constant; a simpler brass model may be found in the STK `Brass.cpp`.
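The TIIR idea of a truncated growing exponential can be illustrated with a one-pole sketch: an unstable recursion paired with a delayed feedforward tap that cancels the pole after N samples. This is a bare-bones illustration of the principle, not the filter structure of [540]; in particular, a practical TIIR implementation must also manage round-off growth in the unstable state (e.g., by switching between two periodically reset filter copies):

```python
from collections import deque

class TruncatedGrowingExp:
    """Truncated IIR (TIIR) sketch with impulse response a^0, a^1, ...,
    a^(N-1), then zero, where |a| > 1 is allowed. One unstable pole plus a
    pole-cancelling tap N samples later: y[n] = a*y[n-1] + x[n] - a^N*x[n-N]."""
    def __init__(self, a, N):
        self.a, self.aN = a, a ** N
        self.y1 = 0.0
        self.xdelay = deque([0.0] * N, maxlen=N)  # N-sample input delay line

    def tick(self, x):
        xN = self.xdelay[0]          # x[n-N]
        self.xdelay.append(x)
        y = self.a * self.y1 + x - self.aN * xN
        self.y1 = y
        return y
```

Only two multiplies per sample are needed regardless of N, which is the source of the TIIR cost advantage: the delay line is cheap memory, while an FIR realization of the same truncated exponential would need N multiplies.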

#### Literature Relevant to Brasses

A nice summary of the musical acoustics of brass instruments is given by Campbell [70]. Research by Cullen et al. [98] and Gilbert et al. [159,161] has been concerned with artificial-mouth experiments to verify a theoretical model of vibrating lips, and to calibrate the model to realistic playing conditions. These data can be used in the construction of improved lip-valve models for virtual brass instruments. Additional literature relevant to brass instruments includes [187,186,188,247,529,319,331,336,534,535,536,445,93].

## Other Instruments

One of the most interesting aspects of building virtual musical instruments is that every instrument has its own unique characteristics and modeling considerations. One disadvantage of this, however, is that it is not feasible to cover all cases in a single book. This section provides some literature pointers that can hopefully serve as a starting point for further research in some of the instrument areas not discussed here. While these pointers are neither complete nor up to date, the most recent papers and dissertations by the cited researchers can help fill in the gaps. (This is true of the preceding sections as well.)

### Singing Voice

A historical summary of voice modeling and synthesis appears in §A.6.3. In [290], a model is proposed for the singing voice in which the driving glottal pulse train is estimated jointly with filter parameters describing the shape of the vocal tract (the complete airway from the base of the throat to the lip opening). The model can be seen as an improvement over linear-predictive coding (LPC) of voice in the direction of a more accurate physical model of voice production, while maintaining a low computational cost relative to more complex articulatory models of voice production. In particular, the parameter estimation involves only convex optimization plus a one-dimensional (possibly non-convex) line search over a compact interval. The line search determines the so-called ``open quotient,'' which is the fraction of time there is glottal flow within each period. The glottal pulse parameters are based on the derivative-glottal-wave models of Liljencrants, Fant, and Klatt [133,257]. Portions of this research have been published in the ICMC-00 [291] and WASPAA-01 [292] proceedings. Related subsequent work includes [250,213,251,214,212]. Earlier work in voice synthesis, some summarized in Appendix A, includes [40,81,87,90,206,257,389,492]; see also the KTH ``Research Topics'' home page.

### Flutes, Recorders, and Pipe Organs

A chapter on the fundamental aero-acoustics of wind instruments appears in [196], and a comprehensive treatment of the acoustics of air jets in recorder-like instruments is given in [530]. A comprehensive review article on lumped models for flue instruments appears in [130], in a special issue (July/August 2000) of the *Acustica* journal on ``musical wind instrument acoustics.'' An overview of research on woodwinds and organs appears in [129]. Follow-up publications by this research group include papers concerning the influence of mouth geometry on sound production in flue instruments [108,418].

### Percussion Instruments

While sample-playback synthesis is especially effective for percussion instruments that are supposed to play ``in the background,'' some form of parametrization is needed for more expressive performances, or for highly variable percussion instruments such as the Indian tabla. Highly efficient computational models of 2D membranes and 3D volumes may be built using the *digital waveguide mesh* [146,520,518,521]. More recently, Savioja and Välimäki have developed a frequency-warping approach to compensating for dispersive wave propagation in a variety of mesh types [398,399,401]. The 2001 thesis of Bilbao [54] provides a unified view of the digital waveguide mesh and *wave digital filters* [136] as particular classes of energy-invariant finite difference schemes [481]. The problem of modeling diffusion at a mesh boundary was addressed in [268].
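A rectilinear digital waveguide mesh is compact to sketch in its equivalent finite-difference form, where each interior junction is updated from the average of its four neighbors at the previous step minus its own value two steps back. This is a minimal sketch of our own, with fixed (zero) boundaries; the grid size and excitation are arbitrary choices:

```python
import numpy as np

def mesh_step(p, p_prev):
    """One update of the rectilinear 2D digital waveguide mesh in its
    finite-difference form, with fixed (zero) boundaries:
    p_new = (sum of 4 neighbors)/2 - p_prev at interior points."""
    p_new = np.zeros_like(p)
    p_new[1:-1, 1:-1] = 0.5 * (p[2:, 1:-1] + p[:-2, 1:-1] +
                               p[1:-1, 2:] + p[1:-1, :-2]) - p_prev[1:-1, 1:-1]
    return p_new

# excite a small square "membrane" at its center and run a few steps
p = np.zeros((16, 16))
p_prev = np.zeros_like(p)
p[8, 8] = 1.0
for _ in range(50):
    p, p_prev = mesh_step(p, p_prev), p
```

The scheme is lossless, so the wave rings indefinitely; in a percussion model, boundary losses or a loss factor in the update would supply the decay, and the frequency-warping methods cited above address the scheme's direction-dependent dispersion.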
