# Virtual Musical Instruments

This chapter discusses aspects of synthesis models for specific musical instruments such as guitars, pianos, woodwinds, bowed strings, and brasses. Initially we look at model components for electric guitars, followed by some special considerations for acoustic guitars. Next we focus on basic models for string excitation, first by a mass (a simplified piano hammer), then by a damped spring (plectrum model). Piano modeling is then considered, using commuted synthesis for the acoustic resonators and a parametric signal-model for the force-pulse of the hammer. Finally, elementary synthesis of single-reed woodwinds, such as the clarinet, and bowed-string instruments, such as the violin, are summarized, and some pointers are given regarding other wind and percussion instruments.

## Electric Guitars

For electric guitars, which are mainly about the vibrating string and subsequent effects processing, we pick up where §6.7 left off, and work our way up to practical complexity step by step. We begin with practical enhancements for the waveguide string model that are useful for all kinds of guitar models.

### Length Three FIR Loop Filter

The simplest nondegenerate example of the loop filters of §6.8
is the three-tap FIR case

$$\hat{G}(z) \;=\; g_0 + g_1 z^{-1} + g_2 z^{-2}.$$

The symmetry constraint $g_2 = g_0$ leaves two degrees of freedom in the frequency response:^{10.1}

$$\hat{G}(e^{j\omega T}) \;=\; e^{-j\omega T}\left[g_1 + 2 g_0 \cos(\omega T)\right].$$

One of these parameters can serve as a *damping* control. As damping is increased, the duration of free vibration is reduced at all nonzero frequencies, and the decay of higher frequencies is accelerated relative to lower frequencies, provided $g_0 > 0$.

### Length Three FIR Loop Filter Controlled by ``Brightness'' and ``Sustain''

Another convenient parametrization of the second-order symmetric FIR
case is when the dc normalization is relaxed so that two degrees of
freedom are retained. It is then convenient to control them
as *brightness* $B$ and *sustain* $t_{60}$ according to the
formulas

$$g_0 \;=\; g\,\frac{1-B}{4}, \qquad g_1 \;=\; g\,\frac{1+B}{2}, \qquad g \;=\; 0.001^{P/t_{60}},$$

where $P$ is the period in seconds (total loop delay), $t_{60}$ is the desired sustain time in seconds, and $B$ is the brightness parameter in the interval $[0,1]$. The sustain parameter is defined here as the time to decay by 60 dB ($\ln(1000)\approx 6.91$ time constants) when brightness is maximum ($B=1$), in which case the loop gain is $g$ at all frequencies, or $\hat{G}(e^{j\omega T}) = g$. As the brightness is lowered, the dc gain remains fixed at $g$ while higher frequencies decay faster. At the minimum brightness ($B=0$), the gain at half the sampling rate reaches zero, and the loop-filter amplitude response assumes the raised-cosine form

$$\left|\hat{G}(e^{j\omega T})\right| \;=\; \frac{g}{2}\left[1 + \cos(\omega T)\right].$$
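This parametrization is easy to sketch in code. The following Python fragment assumes the loop gain $g = 0.001^{P/t_{60}}$ (a 60 dB decay in $t_{60}$ seconds) and the tap values $g_0 = g(1-B)/4$, $g_1 = g(1+B)/2$, which reproduce the dc and half-sampling-rate behavior described above:

```python
import numpy as np

def brightness_sustain_taps(P, t60, B):
    """Symmetric 3-tap FIR loop-filter coefficients [g0, g1, g0].

    P   : loop period in seconds (total loop delay)
    t60 : desired sustain time in seconds (time to decay 60 dB at B=1)
    B   : brightness parameter in [0, 1]
    """
    g = 0.001 ** (P / t60)   # per-period loop gain giving -60 dB in t60
    g0 = g * (1 - B) / 4     # outer taps
    g1 = g * (1 + B) / 2     # center tap
    return np.array([g0, g1, g0])

h = brightness_sustain_taps(P=1/440, t60=2.0, B=0.5)
dc = h.sum()          # gain at dc:  H(1)  = g for any brightness
nyq = h[1] - 2*h[0]   # gain at fs/2: H(-1) = g*B, vanishing at B=0
```

Here `dc` equals $g$ for any brightness, while the gain at half the sampling rate equals $gB$, reaching zero at minimum brightness, as described above.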

### One-Zero Loop Filter

If we relax the constraint that the loop-filter length
be odd, then the simplest case
becomes the *one-zero* digital filter:

$$\hat{G}(z) \;=\; g_0 + g_1 z^{-1}.$$

Choosing $g_1 = g_0$ makes this filter *linear phase*, and its phase delay and group delay are equal to half a sample [362]. In practice, the half-sample delay must be compensated elsewhere in the filtered delay loop, such as in the delay-line interpolation filter [207]. Normalizing the dc gain to unity removes the last degree of freedom so that $g_0 = g_1 = 1/2$, and $\hat{G}(z) = (1 + z^{-1})/2$.
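The half-sample delay of the dc-normalized one-zero loop filter is easy to verify numerically; this minimal sketch evaluates $H(e^{j\omega}) = (1+e^{-j\omega})/2$ on a frequency grid:

```python
import numpy as np

# Frequency response of the dc-normalized one-zero loop filter
# H(z) = (1 + z^-1)/2; its phase delay should be 1/2 sample everywhere.
w = np.linspace(1e-3, np.pi - 1e-3, 512)   # radians per sample
H = 0.5 * (1 + np.exp(-1j * w))
phase_delay = -np.angle(H) / w             # in samples

# the magnitude falls monotonically from 1 at dc toward 0 at fs/2
mag = np.abs(H)
```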

See [454] for related discussion from a software implementation perspective.

### The Karplus-Strong Algorithm

The simulation diagram for the ideal string with the simplest
frequency-dependent loss filter is shown in
Fig. 9.1. Readers of the computer music
literature will recognize this as the structure of the
*Karplus-Strong algorithm*
[236,207,489].

The Karplus-Strong algorithm, *per se,* is obtained when the
delay-line initial conditions used to ``pluck'' the string consist of
random numbers, or ``white noise.'' We know the initial *shape*
of the string is obtained by *adding* the upper and lower delay
lines of Fig. 6.11. It is shown in §C.7.4 that the initial
*velocity* distribution along the string is determined by the
*difference* between the upper and lower delay lines. Thus, in
the Karplus-Strong algorithm, the string is ``plucked'' by a
*random initial displacement and initial velocity distribution.*
This is a very energetic excitation, and usually in practice the white
noise is lowpass filtered; the lowpass cut-off frequency gives an
effective *dynamic level* control since natural stringed
instruments are typically brighter at louder dynamic levels
[428,207].
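The plucked-string behavior described above can be sketched in a few lines. The following minimal Python implementation (buffer length, sampling rate, and random seed are arbitrary illustrative choices) fills a delay line with white noise and recirculates it through a two-point average, the classic one-zero loop filter:

```python
import numpy as np

def karplus_strong(f0, fs=44100, dur=1.0, seed=0):
    """Minimal Karplus-Strong pluck: a noise-filled delay line
    recirculated through a two-point-average loop filter."""
    rng = np.random.default_rng(seed)
    N = int(round(fs / f0))          # integer delay-line length
    buf = rng.uniform(-1, 1, N)      # random initial conditions ("pluck")
    out = np.empty(int(fs * dur))
    for n in range(out.size):
        out[n] = buf[n % N]
        # loop filter: average the current sample with its successor
        buf[n % N] = 0.5 * (buf[n % N] + buf[(n + 1) % N])
    return out

y = karplus_strong(220.0, fs=8000, dur=0.5)
```

Because the averaging filter attenuates high frequencies fastest, the tone darkens and decays as it rings, just as described above.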

Karplus-Strong sound examples are available on the Web.

An implementation of the Karplus-Strong algorithm in the Faust programming language is described (and provided) in [454].

### The Extended Karplus-Strong Algorithm

Figure 9.2 shows a block diagram of the *Extended Karplus-Strong* (EKS)
algorithm [207].

The EKS adds the following features to the KS algorithm:

where

Note that while the tuning allpass can be designed for any fractional delay, it is better to offset the target delay away from zero to avoid delays close to zero in the tuning allpass. (A zero delay is obtained by a pole-zero cancellation on the unit circle.) First-order allpass interpolation of delay lines was discussed in §4.1.2.

A history of the Karplus-Strong algorithm and its extensions is given in §A.8. EKS sound examples are also available on the Web. Techniques for designing the string-damping filter and/or the string-stiffness allpass filter are summarized below in §6.11.

An implementation of the Extended Karplus-Strong (EKS) algorithm in the Faust programming language is described (and provided) in [454].

### Nonlinear Distortion

In §6.13, *nonlinear elements* were introduced in the
context of general digital waveguide synthesis. In this section, we
discuss specifically virtual electric guitar distortion, and mention
other instances of audible nonlinearity in stringed musical
instruments.

As discussed in Chapter 6, typical vibrating strings in
musical acoustics are well approximated as linear, time-invariant
systems; nevertheless, there are special cases in which *nonlinear* behavior is
desired.

#### Tension Modulation

In every freely vibrating string, the fundamental frequency declines
over time as the amplitude of vibration decays. This is due to
*tension modulation*, which is often audible at the beginning of
plucked-string tones, especially for low-tension strings. It happens
because higher-amplitude vibrations *stretch* the string to a
longer average length, raising the average string tension, which in
turn yields faster wave propagation and hence a higher fundamental
frequency.

There are several methods in the literature for simulating tension
modulation in a digital waveguide string model
[494,233,508,512,513,495,283],
as well as in membrane models [298].
The methods can be classified into two categories, *local* and
*global*.

Local tension-modulation methods modulate the speed of sound locally as a function of amplitude. For example, opposite delay cells in a force-wave digital waveguide string can be summed to obtain the instantaneous vertical force across that string sample, and the length of the adjacent propagation delay can be modulated using a first-order allpass filter. In principle the string slope reduces as the local tension increases. (Recall from Chapter 6 or Appendix C that force waves are minus the string tension times slope.)

Global tension-modulation methods [495,494] essentially modulate the string delay-line length as a function of the total energy in the string.

#### String Length Modulation

A number of stringed musical instruments have a ``nonlinear sound'' that comes from modulating the physical termination of the string (as opposed to its acoustic length in the case of tension modulation).

The Finnish Kantele [231,513] has a different effective string-length in the vertical and horizontal vibration planes due to a loose knot attaching the string to a metal bar. There is also nonlinear feeding of the second harmonic due to a nonrigid tuning peg.

Perhaps a better known example is the Indian *sitar*, in which a
curved ``jawari'' (functioning as a nonlinear bridge) serves to
shorten the string gradually as it displaces toward the bridge.

The Indian *tambura* also employs a *thread* perpendicular
to the strings a short distance from the bridge, which serves to
shorten the string whenever string displacement toward the bridge
exceeds a certain distance.

Finally, the *slap bass* playing technique for bass guitars
involves hitting the string hard enough to cause it to beat against
the neck during vibration
[263,366].

In all of these cases, the string length is physically modulated in some manner each period, at least when the amplitude is sufficiently large.

#### Hard Clipping

A widespread class of distortion used with *electric guitars* is
*clipping* of the guitar waveform. It is easy to add this effect
to any string-simulation algorithm by passing the output signal
through a nonlinear *clipping function*. For example, a
*hard clipper* has the characteristic (in normalized form)

$$f(x) \;=\; \begin{cases} -1, & x \le -1 \\ x, & -1 \le x \le 1 \\ 1, & x \ge 1, \end{cases}$$

where $x$ denotes the current input sample $x(n)$, and $f(x)$ denotes the output of the nonlinearity.
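This hard-clipping characteristic is a one-liner in most languages; a Python sketch:

```python
import numpy as np

def hard_clip(x):
    """Normalized hard clipper: identity on [-1, 1], saturating outside."""
    return np.clip(x, -1.0, 1.0)

y = hard_clip(np.array([-2.0, -0.5, 0.0, 0.5, 2.0]))
```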

#### Soft Clipping

A *soft clipper* is similar to a hard clipper, but with the
corners smoothed. A common choice of soft clipper is the *cubic
nonlinearity*, *e.g.* [489],

$$f(x) \;=\; \begin{cases} -\frac{2}{3}, & x \le -1 \\[2pt] x - \frac{x^3}{3}, & -1 \le x \le 1 \\[2pt] \frac{2}{3}, & x \ge 1. \end{cases}$$

This particular soft-clipping characteristic is diagrammed in Fig. 9.3. An analysis of its spectral characteristics, with some discussion of aliasing it may cause, was given in §6.13. An input gain may be used to set the desired degree of distortion.
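A Python sketch of this cubic soft clipper, with an input `pregain` (an illustrative parameter, not from the text) to set the degree of distortion:

```python
import numpy as np

def soft_clip(x, pregain=1.0):
    """Cubic soft clipper: f(x) = x - x^3/3 on [-1, 1],
    saturating smoothly at +/- 2/3 outside that range.
    pregain scales the input to set the degree of distortion."""
    x = np.clip(pregain * np.asarray(x, dtype=float), -1.0, 1.0)
    return x - x**3 / 3.0

y = soft_clip(np.array([-3.0, 0.0, 0.5, 3.0]))
```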

#### Enhancing Even Harmonics

A cubic nonlinearity, as well as *any* odd distortion
law,^{10.2} generates only odd-numbered harmonics (as in a square
wave). For best results, and in particular for *tube distortion*
simulation [31,395], it has been argued that some
amount of even-numbered harmonics should also be present.
Breaking the odd symmetry in any way will add even-numbered harmonics
to the output as well. One simple way to accomplish this is to add a
constant *offset* to the input signal before it enters the nonlinearity.

Another method for breaking the odd symmetry is to add some square-law nonlinearity, obtaining a more general third-order polynomial in which one coefficient controls the amount of square-law distortion. The square-law is the gentlest nonlinear distortion in existence, adding only some second harmonic to a sinusoidal input signal. An additive constant can be included to zero the mean, on average: if the input signal is zero-mean with unit variance, setting the constant to minus the square-law coefficient cancels the nonzero mean induced by the squaring term. Typically, the output of any audio effect is *mixed* with the original input signal to allow easy control over the amount of effect. The linear term of the polynomial can be used for this, provided the constant gains on the linear and distorted terms are modified accordingly, or the input is hard-clipped to the desired range.
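The effect of the square-law term can be checked numerically. In the sketch below (the test-tone frequency, FFT size, and square-law amount `alpha` are arbitrary illustrative choices), a sinusoid is passed through the odd cubic law, then a mean-corrected `alpha*x**2` term is added, and the second harmonic is measured via an FFT; the measured mean of `x**2` is used for exact mean cancellation:

```python
import numpy as np

fs, f0, N = 8192, 128, 8192            # f0 divides fs, so harmonics hit bins
n = np.arange(N)
x = np.sin(2 * np.pi * f0 * n / fs)    # zero-mean test tone

odd = x - x**3 / 3                     # odd law: odd harmonics only
alpha = 0.5                            # amount of square-law distortion
even = odd + alpha * x**2 - alpha * np.mean(x**2)  # mean-corrected

def harmonic(y, k):
    """Normalized magnitude of the k-th harmonic of f0."""
    return np.abs(np.fft.rfft(y))[k * f0 * N // fs] / (N / 2)

# second harmonic: absent for the odd law, present once x^2 is added
h2_odd, h2_even = harmonic(odd, 2), harmonic(even, 2)
```

For a unit sinusoid, $x^2 = (1 - \cos 2\theta)/2$, so the added second harmonic has amplitude `alpha/2` while the output remains zero-mean.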

#### Software for Cubic Nonlinear Distortion

The function `cubicnl` in the Faust software distribution (file
`effect.lib`) implements cubic nonlinearity distortion
[454]. The Faust programming example
`cubic_distortion.dsp` provides a real-time demo with
adjustable parameters, including an offset for bringing in even
harmonics. The free, open-source `guitarix` application (for
Linux platforms) uses this type of distortion effect along with many
other guitar effects.

In the Synthesis Tool Kit (STK) [91], the class
`Cubicnl.cpp` implements a general cubic distortion law.

### Amplifier Feedback

A more extreme effect used with distorted electric guitars is
*amplifier feedback*. In this case, the amplified guitar
waveforms couple back to the strings with some gain and delay, as
depicted schematically in Fig.9.4 [489].

The Amplifier Feedback Delay in the figure can be adjusted to
emphasize certain partial overtones over others. If the loop gain,
controllable via the Amplifier Feedback Gain, is greater than 1 at any
frequency, a sustained ``feedback howl'' will be
produced. Note that in commercial devices, the Pre-distortion gain
and Post-distortion gain are frequency-dependent, *i.e.*, they are
implemented as pre- and post-*equalizers* (typically only a few
bands, such as three). Another simple choice is an integrator
for the pre-distortion gain, and a differentiator
for the post-distortion gain.

Faust software implementing electric-guitar amplifier feedback may be found in [454].
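A toy numerical illustration of the feedback-howl behavior (a hypothetical simplification that omits the string model and the pre- and post-equalizers, keeping only a cubic soft clipper inside a gain-and-delay loop):

```python
import numpy as np

def feedback_loop(x, delay, gain, n_out):
    """Toy amplifier-feedback loop: the distorted output is delayed,
    scaled by the feedback gain, and summed back into the input."""
    buf = np.zeros(delay)                  # feedback delay line
    y = np.zeros(n_out)
    for n in range(n_out):
        u = (x[n] if n < x.size else 0.0) + gain * buf[n % delay]
        if abs(u) <= 1.0:                  # cubic soft clipper in the loop
            v = u - u**3 / 3
        else:
            v = np.sign(u) * 2 / 3
        y[n] = v
        buf[n % delay] = v
    return y

# a short burst excites the loop; with loop gain > 1 the clipper
# limits the growth and the loop sustains ("feedback howl"),
# while with loop gain < 1 the response dies away
burst = np.full(8, 0.5)
howl = feedback_loop(burst, delay=32, gain=1.5, n_out=2000)
decay = feedback_loop(burst, delay=32, gain=0.5, n_out=2000)
```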

### Cabinet Filtering

The distortion output signal is often further filtered by an
*amplifier cabinet filter*, representing the enclosure,
driver responses, etc. It is straightforward to measure the impulse
response (or frequency response) of a speaker cabinet and fit it with
a recursive digital filter design function such as `invfreqz`
in matlab (§8.6.4). Individual speakers generally have one major
resonance--the ``cone resonance''--which is determined by the mass,
compliance, and damping of the driver subassembly. The enclosing
cabinet introduces many more resonances in the audio range, generally
tuned to even out the overall response as much as possible.

Faust software implementing an idealized speaker filter may be found in [454].

### Duty-Cycle Modulation

For class A tube amplifier simulation, there should be level-dependent
*duty-cycle modulation* as a function of amplitude:^{10.3}

- 50% duty cycle at low amplitude levels (no duty-cycle modulation)
- 55-65% duty cycle at high amplitude levels

### Vacuum Tube Modeling

The elementary nonlinearities discussed above are aimed at
approximating the sound of *amplifier distortion*,
such as often used by guitar players. Generally speaking, the distortion
characteristics of vacuum-tube amplifiers are considered superior to
those of solid-state amplifiers. A real-time numerically solved
nonlinear differential algebraic model for class A single-ended guitar
power amplification is presented in [84].
Real-time simulation of a tube guitar-amp (triode preamp, inverter,
and push-pull amplifier) based on a precomputed approximate solution
of the nonlinear differential equations is presented in
[293]. Other references in this area include
[338,228,339].

## Acoustic Guitars

This section addresses some modeling considerations specific to
*acoustic* guitars. The following subsections discuss special properties
of acoustic guitar *bridges* and give ways
to simulate passive bridges, including a look at some laboratory
measurements. For further reading in this area, see, *e.g.*,
[281,128].

### Bridge Modeling

In §6.3 we analyzed the effect of *rigid string
terminations* on traveling waves. We found that
waves defined in terms of displacement and its time derivatives (displacement,
velocity, acceleration, and so on) reflect with a sign inversion,
while waves defined in terms of the first spatial derivative of
displacement (force, slope) reflect with no sign inversion. We
now look at the more realistic case of *yielding
terminations* for strings. This analysis can be considered a special
case of the loaded string junction analyzed in §C.12.

Yielding string terminations (at the bridge) have a large effect on
the sound produced by acoustic stringed instruments. Rigid
terminations can be considered a reasonable model for the
*solid-body electric guitar* in which *maximum sustain* is
desired for played notes. Acoustic guitars, on the other hand, must
transduce sound energy from the strings into the body of the
instrument, and from there to the surrounding air. All audible sound
energy comes from the string vibrational energy, thereby reducing the
sustain (decay time) of each played note. Furthermore, because the
bridge vibrates more easily in one direction than another, a kind of
``chorus effect'' is created from the detuning of the horizontal and
vertical planes of string vibration (as discussed further in
§6.12.1). A perfectly rigid bridge, in contrast,
cannot transmit any sound into the body of the instrument, thereby
requiring some other transducer, such as the magnetic pickups used in
electric guitars, to extract sound for output.^{10.4}

#### Passive String Terminations

When a traveling wave reflects from the bridge of a real stringed
instrument, the bridge moves, transmitting sound energy into the
instrument body. How far the bridge moves is determined by the
*driving-point impedance* of the bridge, denoted $R_b(s)$. The
driving-point impedance is the ratio of the Laplace transform of the force
on the bridge, $F(s)$, to the Laplace transform of the resulting velocity of
motion, $V(s)$. That is,
$R_b(s) = F(s)/V(s)$.

For passive systems (*i.e.*, for all unamplified acoustic musical
instruments), the driving-point impedance is *positive
real* (a property defined and discussed in §C.11.2). Being
positive real has strong implications on the nature of $R_b(s)$. In
particular, the phase of $R_b(j\omega)$
cannot exceed plus or minus 90
degrees at any frequency, and in the lossless case, all poles and
zeros must *interlace* along the $j\omega$ axis. Another
implication is that the *reflectance* of a passive bridge, as
seen by traveling waves on the string, is a so-called *Schur
function* (defined and discussed in §C.11); a Schur
reflectance is a stable filter having gain not exceeding 1 at any
frequency. In summary, a guitar bridge is passive if and only if its
driving-point impedance is positive real and (equivalently) its
reflectance is Schur. See §C.11 for a fuller discussion of
this point.

At the bridge, the force applied by the string is given by (§C.7.2) the
string tension times the *slope* of the string at the termination. In the
Laplace domain, the bridge force and velocity are then related through the
driving-point impedance by $F(s) = R_b(s)\,V(s)$.

#### A Terminating Resonator

Suppose a guitar bridge couples an ideal vibrating string to a single resonance, as depicted schematically in Fig. 9.5. This is often an accurate model of an acoustic bridge impedance in a narrow frequency range, especially at low frequencies where the resonances are well separated. Then, as developed in Chapter 7, the driving-point impedance seen by the string at the bridge is

$$R_b(s) \;=\; ms + \mu + \frac{k}{s},$$

where $m$, $\mu$, and $k$ denote the mass, damping, and stiffness constants of the resonator, respectively; such an impedance is *positive real* (§C.11.2).

#### Bridge Reflectance

The bridge *reflectance* is needed as part of the loop filter in
a digital waveguide model (Chapter 6).

As derived in §C.11.1, the force-wave reflectance of the bridge seen on the string is

$$\hat{\rho}_f(s) \;=\; \frac{R_b(s) - R}{R_b(s) + R},$$

where $R$ denotes the wave impedance of the ideal string. (Substituting the single-resonance impedance $R_b(s) = ms + \mu + k/s$ gives a resonance frequency of $\omega_0 = \sqrt{k/m}$ radians per second.) The velocity reflectance is simply minus the force reflectance (§C.11.1).
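The Schur property of this reflectance is easy to check numerically. The sketch below takes the resonator driving-point impedance to be of mass-spring-damper form $ms + \mu + k/s$; the constants are illustrative, not measured values:

```python
import numpy as np

# Force-wave reflectance of a single-resonance bridge, evaluated on
# the s = j*omega axis. All parameter values are illustrative.
m, mu, k = 0.01, 2.0, 1e5      # mass, damping, stiffness of the resonator
R = 0.5                        # wave impedance of the ideal string

w = np.linspace(10.0, 1e5, 4000)
s = 1j * w
Rb = m * s + mu + k / s        # driving-point impedance (positive real)
rho = (Rb - R) / (Rb + R)      # force-wave reflectance seen by the string

# passivity: a positive-real impedance yields a Schur reflectance,
# i.e., gain strictly below 1 at every frequency when damping > 0
max_gain = np.max(np.abs(rho))
```

The gain dips furthest below 1 near the resonance $\omega_0=\sqrt{k/m}$, where the bridge absorbs energy most readily, and approaches 1 (allpass behavior) far from it.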

#### Bridge Transmittance

The bridge *transmittance* is the filter needed for the signal
path from the vibrating string to the resonant acoustic body.

Since the bridge velocity equals the string endpoint velocity (a ``series'' connection), the velocity transmittance is simply $\hat{\tau}_v(s) = 1 + \hat{\rho}_v(s)$, where $\hat{\rho}_v(s)$ denotes the velocity reflectance of the bridge.

If the bridge is rigid, then its motion becomes a velocity input to the acoustic resonator. In principle, there are three such velocity inputs for each point along the bridge. However, it is typical in stringed instrument models to consider only the vertical transverse velocity on the string as significant, which results in one (vertical) driving velocity at the base of the bridge. In violin models (§9.6), the use of a ``sound post'' on one side of the bridge and ``bass bar'' on the other strongly suggests supporting a rocking motion along the bridge.

#### Digitizing Bridge Reflectance

Converting continuous-time transfer functions such as
and
to the digital domain is analogous to converting an analog
electrical filter to a corresponding digital filter--a problem which
has been well studied [343]. For this task, the
*bilinear transform* (§7.3.2) is a good choice. In
addition to preserving order and being free of aliasing, the bilinear
transform preserves the positive-real property of passive impedances
(§C.11.2).

Digitizing the bridge reflectance via the bilinear transform (§7.3.2) yields a digital filter of the same order; *i.e.*, it is a *Schur* filter that becomes an allpass as the damping approaches zero. The bilinear-transform constant can be chosen to map the peak frequency without error (see Problem 4).
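A sketch of this digitization using `scipy.signal.bilinear`, again assuming the mass-spring-damper bridge impedance $R_b(s)=ms+\mu+k/s$, so that $\hat\rho_f(s) = \bigl(ms^2+(\mu-R)s+k\bigr)/\bigl(ms^2+(\mu+R)s+k\bigr)$. Parameter values are illustrative, and SciPy's routine uses the default bilinear constant, so the peak frequency is only approximately preserved unless prewarping is added:

```python
import numpy as np
from scipy.signal import bilinear, freqz

# Illustrative single-resonance bridge reflectance, digitized via the
# bilinear transform.
m, mu, k, R = 0.01, 2.0, 1e5, 0.5
fs = 44100.0

# rho(s) = (m s^2 + (mu - R) s + k) / (m s^2 + (mu + R) s + k)
b, a = bilinear([m, mu - R, k], [m, mu + R, k], fs)

# check the Schur property of the digital reflectance on a dense grid
w, H = freqz(b, a, worN=4096)
max_gain = np.max(np.abs(H))
```

Because the bilinear transform maps the $j\omega$ axis onto the unit circle, the digital gain inherits the bound of the analog reflectance, and the poles land inside the unit circle.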

#### A Two-Resonance Guitar Bridge

Now let's consider a two-resonance guitar bridge, as shown in Fig. 9.6.

Like all mechanical systems that don't ``slide away'' in response to a
constant applied input force, the bridge must ``look like a spring''
at zero frequency. Similarly, it is typical for systems to ``look
like a mass'' at very high frequencies, because the driving-point
typically has mass (unless the driver is spring-coupled by what seems
to be massless spring). This implies the driving point admittance
should have a zero at dc and a pole at infinity. If we neglect
losses, as frequency increases up from zero, the first thing we
encounter in the admittance is a pole (a ``resonance'' frequency at
which energy is readily accepted by the bridge from the strings). As
we pass the admittance peak going up in frequency, the phase switches
around from being near $+90$ degrees (``spring like'') to being closer to
$-90$ degrees (``mass like''). (Recall the *graphical method* for
calculating the *phase response* of a linear system
[449].) Below the first resonance, we may say that the system
is *stiffness controlled* (admittance phase near $+90$ degrees),
while above the first resonance, we say it is *mass controlled*
(admittance phase near $-90$ degrees). This qualitative description is
typical of any lightly damped, linear, dynamic system. As we proceed
up the $j\omega$ axis, we'll next encounter a near-zero, or
``anti-resonance,'' above which the system again appears ``stiffness
controlled,'' or spring-like, and so on in alternation to infinity.
The strict alternation of poles and zeros near the $j\omega$ axis is
required by the *positive real* property of all passive
admittances (§C.11.2).

#### Measured Guitar-Bridge Admittance

A *measured* driving-point admittance [269]
for a real guitar bridge is shown in Fig. 9.7. Note
that at very low frequencies, the phase does not appear to
be bounded by $\pm 90$ degrees, as it must be for the admittance to be
positive real (passive). This indicates a poor signal-to-noise ratio
in the measurements at very low frequencies. This can be verified by
computing and plotting the *coherence* function between the
bridge input and output signals using multiple physical
measurements.^{10.5}

Figures 9.8 and 9.9 show overlays of the admittance magnitudes and phases, and also the input-output coherence, for three separate measurements. A coherence of 1 means that all the measurements are identical, while a coherence less than 1 indicates variation from measurement to measurement, implying a low signal-to-noise ratio. As can be seen in the figures, at frequencies for which the coherence is very close to 1, successive measurements are strongly in agreement, and the data are reliable. Where the coherence drops below 1, successive measurements disagree, and the measured admittance is not even positive real at very low frequencies.
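The use of coherence as a reliability check can be illustrated on synthetic data. In the sketch below, the one-pole "bridge" response and the additive measurement noise are both invented for illustration; the coherence comes out near 1 where the response dominates the noise, and lower where it does not:

```python
import numpy as np
from scipy.signal import coherence, lfilter

rng = np.random.default_rng(1)
fs = 2048
x = rng.standard_normal(16 * fs)             # broadband excitation
y_clean = lfilter([1.0], [1.0, -0.95], x)    # toy one-pole "bridge" response
y = y_clean + 0.5 * rng.standard_normal(x.size)  # measurement noise

# magnitude-squared coherence between excitation and measured response
f, Cxy = coherence(x, y, fs=fs, nperseg=1024)

# the one-pole response is strong at low frequencies (coherence ~ 1)
# and weak near fs/2, where the noise drags the coherence down
```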

#### Building a Synthetic Guitar Bridge Admittance

To construct a synthetic guitar bridge model, we can first measure
empirically the admittance of a real guitar bridge, or we can work from
measurements published in the literature, as shown in
Fig. 9.7. Each peak in the admittance curve corresponds
to a *resonance* in the guitar body that is well coupled to the strings
via the bridge. Whether or not the corresponding vibrational mode *radiates efficiently* depends on the geometry of the vibrational mode, and
how well it couples to the surrounding air. Thus, a complete bridge model
requires not only a synthetic bridge admittance which determines the
reflectance ``seen'' on the string, but also a separate filter which models
the *transmittance* from the bridge to the outside air; the
transmittance filter normally contains the same poles as the reflectance
filter, but the zeros are different. Moreover, keep in mind that each
string sees a slightly different reflectance and transmittance because it
is located at a slightly different point relative to the guitar top plate;
this changes the coupling coefficients to the various resonating modes to
some extent. (We normally ignore this for simplicity and use the same
bridge filters for all the strings.)

Finally, also keep in mind that each string excites the bridge in *three dimensions*. The two most important are the *horizontal and
vertical planes of vibration*, corresponding to the two planes of
transverse vibration on the string. The vertical plane is normal to the
guitar top plate, while the horizontal plane is parallel to the top plate.
Longitudinal waves also excite the bridge, and they can be important as
well, especially in the piano. Since longitudinal waves are much faster in
strings than transverse waves, the corresponding overtones in the sound are
normally inharmonically related to the main (nearly harmonic) overtones set
up by the transverse string vibrations.

The frequency, complex amplitude, and width of each peak in the
measured admittance of a guitar bridge can be used to determine the
parameters of a second-order digital resonator in a parallel bank of
such resonators being used to model the bridge impedance. This is a
variation on *modal synthesis* [5,299].
However, for the bridge model to be *passive* when attached to a
string, its transfer function must be *positive real*, as
discussed previously. Since strings are very nearly lossless,
passivity of the bridge model is actually quite critical in practice.
If the bridge model is even slightly active at some frequency, it can
either make the whole guitar model unstable, or it might objectionably
lengthen the decay time of certain string overtones.

We will describe two methods of constructing passive ``bridge filters'' from measured modal parameters. The first is guaranteed to be positive real but has some drawbacks. The second method gives better results, but it has to be checked for passivity and possibly modified to give a positive real admittance. Both methods illustrate more generally applicable signal processing methods.

#### Passive Reflectance Synthesis--Method 1

The first method is based on constructing a *passive reflectance*
having the desired poles, and then converting to an
admittance via the fundamental relation

As we saw in §C.11.1, every passive impedance corresponds to a passive reflectance which is a Schur function (stable and having gain not exceeding around the unit circle). Since damping is light in a guitar bridge impedance (otherwise the strings would not vibrate very long, and sustain is a highly prized feature of real guitars), we can expect the bridge reflectance to be close to an allpass transfer function .

It is well known that every allpass transfer function can be expressed as

$$\hat{S}_a(z) \;=\; \pm z^{-N}\,\frac{\hat{A}(z^{-1})}{\hat{A}(z)},$$

where $\hat{A}(z)$ denotes a polynomial of order $N$ having all of its roots inside the unit circle. We will then construct a Schur function as the allpass $\hat{S}_a(z)$ times a zero-phase shaping filter.^{10.6} We require the shaping filter to be zero phase, which means its frequency response is real; it must furthermore be positive. Note that being a symmetric FIR filter is not sufficient; there can also be no zero-crossings in the frequency response, which tends to limit how far the impedance magnitudes can be shaped. (For simplicity, feel free to omit the shaping filter and accept that all the admittance resonances will be constrained to the same height. The second method is easier anyway.)

Recall that in every allpass filter with real coefficients, to every pole
at radius $R$ there corresponds a zero at radius $1/R$.^{10.7}

Because the impedance is lightly damped, the poles and zeros of the corresponding reflectance are close to the unit circle. This means that at points along the unit circle between the poles, the poles and zeros tend to cancel. It can be easily seen using the graphical method for computing the phase of the frequency response that the pole-zero angles in the allpass filter are very close to the resonance frequencies in the corresponding passive impedance [429]. Furthermore, the distance of the allpass poles to the unit circle controls the bandwidth of the impedance peaks. Therefore, to a first approximation, we can treat the allpass pole-angles as the same as those of the impedance pole angles, and the pole radii in the allpass can be set to give the desired impedance peak bandwidth. The zero-phase shaping filter gives the desired mode height.

From the measured peak frequencies $f_i$ (Hz) and bandwidths $B_i$ (Hz) in the guitar bridge admittance, we may approximate the pole locations as

$$p_i \;=\; R_i e^{j\theta_i}, \qquad R_i = e^{-\pi B_i T}, \qquad \theta_i = 2\pi f_i T,$$

where $T$ denotes the sampling interval as usual. Next we construct the allpass denominator as the product of elementary second-order sections:

$$\hat{A}(z) \;=\; \prod_i \left(1 - p_i z^{-1}\right)\left(1 - \overline{p}_i z^{-1}\right) \;=\; \prod_i \left(1 - 2R_i\cos(\theta_i)\,z^{-1} + R_i^2\,z^{-2}\right).$$

Now that we've constructed a Schur function, a passive admittance can be computed using (9.2.1). While it is guaranteed to be positive real, the modal frequencies, bandwidths, and amplitudes are only indirectly controlled and therefore approximated. (Of course, this would provide a good initial guess for an iterative procedure which computes an optimal approximation directly.)

A simple example of a synthetic bridge constructed using this method is shown in Fig. 9.10.

#### Passive Reflectance Synthesis--Method 2

The second method is based on constructing a partial fraction expansion of the admittance directly:

$$\hat{\Gamma}(z) \;=\; \left(1 - a z^{-1}\right)\sum_{i=1}^{N_s} \frac{1}{\hat{A}_i(z)},$$

where $\hat{A}_i(z)$ denotes the $i$th second-order resonator denominator, and $a$ is slightly less than 1 so as to place a near-zero at dc. The reason this form *tends* to be positive real is that by using the same phase for each section (the 1 in all the section numerators), we are sure to get a zero forming at some frequency near the middle between the resonance frequencies, and at a similar distance from the unit circle. This means we are constructing interlacing poles and zeros by simply adding the resonators in parallel. The extra zero near dc is to ensure that the admittance looks like a lightly damped spring at zero frequency. Since half the sampling rate merely ``cuts off'' the frequency response, there is no corresponding requirement to add a pole there as if it were the point at infinity. However, the phase should be checked not to exceed plus or minus 90 degrees there (or at any other frequency), and a pole or zero added if necessary to pull it into the positive-real range.

A simple example of a synthetic bridge constructed using this method is shown in Fig. 9.11.

#### Matlab for Passive Reflectance Synthesis Method 1

```matlab
fs = 8192;  % Sampling rate in Hz (small for display convenience)
fc = 300;   % Upper frequency to look at
nfft = 8192;% FFT size (spectral grid density)
nspec = nfft/2+1;
nc = round(nfft*fc/fs);
f = ([0:nc-1]/nfft)*fs;

% Measured guitar body resonances
F = [4.64 96.52 189.33 219.95]; % frequencies
B = [ 10    10     10     10 ]; % bandwidths

nsec = length(F);
R = exp(-pi*B/fs);     % Pole radii
theta = 2*pi*F/fs;     % Pole angles
poles = R .* exp(j*theta);
A1 = -2*R.*cos(theta); % 2nd-order section coeff
A2 = R.*R;             % 2nd-order section coeff
denoms = [ones(size(A1)); A1; A2]'
A = [1,zeros(1,2*nsec)];
for i=1:nsec,
  % polynomial multiplication = FIR filtering:
  A = filter(denoms(i,:),1,A);
end;
```

Now `A` contains the (stable) denominator of the desired bridge admittance. We now want to construct a numerator giving a positive-real result. We'll do this by first creating a passive reflectance and then computing the corresponding PR admittance.

```matlab
g = 0.9;          % Uniform loss factor
B = g*fliplr(A);  % Flip(A)/A = desired allpass
Badm = A - B;
Aadm = A + B;
Badm = Badm/Aadm(1); % Renormalize
Aadm = Aadm/Aadm(1);

% Plots
fr = freqz(Badm,Aadm,nfft,'whole');
nc = round(nfft*fc/fs);
spec = fr(1:nc);
f = ([0:nc-1]/nfft)*fs;
dbmag = db(spec);
phase = angle(spec)*180/pi;
subplot(2,1,1);
plot(f,dbmag); grid;
title('Synthetic Guitar Bridge Admittance');
ylabel('Magnitude (dB)');
subplot(2,1,2);
plot(f,phase); grid;
ylabel('Phase (degrees)');
xlabel('Frequency (Hz)');
```

#### Matlab for Passive Reflectance Synthesis Method 2

```matlab
... as in Method 1 for poles ...

% Construct a resonator as a sum of arbitrary modes
% with unit residues, adding a near-zero at dc.
B = zeros(1,2*nsec+1);
impulse = [1,zeros(1,2*nsec)];
for i=1:nsec,
  % filter() does polynomial division here:
  B = B + filter(A,denoms(i,:),impulse);
end;
% add a near-zero at dc
B = filter([1 -0.995],1,B);

... as in Method 1 for display ...
```

#### Matrix Bridge Impedance

A passive *matrix* bridge admittance (*e.g.*, taking all six strings
of the guitar together) can be conveniently synthesized as a vector of
positive-real scalar admittances multiplied by a positive definite
matrix [25]. In [25], a
dual-polarization guitar-string model is connected to a matrix
reflectance computed from the matrix driving-point impedance at the
bridge.

### Body Modeling

As introduced in §8.7, the *commuted synthesis*
technique involves *commuting* the LTI filter representing the
acoustic resonator (guitar body) with the string, so that the string
is excited by the *pluck response* of the body in place of the
plectrum directly. The techniques described in §8.8 can be
used to build practical computational synthesis models of acoustic
guitar bodies and the like [229].

## String Excitation

In §2.4 and §6.10 we looked at the basic architecture of
a digital waveguide string excited by some external disturbance. We
now consider the specific cases of string excitation by a
*hammer* (mass) and plectrum (spring).

### Ideal String Struck by a Mass

In §6.6, the ideal struck string was modeled as a simple
initial velocity distribution along the string, corresponding to an
instantaneous transfer of linear momentum from the striking hammer
into the transverse motion of a string segment at time zero. (See
Fig.6.10 for a diagram of the initial traveling velocity
waves.) In that model, we neglected any effect of the striking hammer
after time zero, as if it had bounced away at time 0 due to a
so-called *elastic collision*. In this section, we consider the
more realistic case of an *inelastic collision*, *i.e.*, where the
mass hits the string and *remains in contact* until something,
such as a wave, or gravity, causes the mass and string to separate.

For simplicity, let the string length be infinite, and denote its wave
impedance by $R$. Denote the colliding mass by $m$ and its speed
prior to collision by $v_0$. It will turn out in this analysis that
we may approximate the size of the mass by zero (a so-called
*point mass*). Finally, we neglect the effects of gravity and
drag by the surrounding air. When the mass collides with the string,
our model must switch from two separate models (mass-in-flight and
ideal string) to that of two ideal strings joined by a mass at the
striking point, as depicted in Fig.9.12. The ``force-velocity
port'' connections of the mass and two semi-infinite string endpoints
are formally in *series* because they all move together; that is,
the mass velocity equals the velocity of each of the two string
endpoints connected to the mass (see §7.2 for a fuller
discussion of impedances and their parallel/series connection).

The equivalent circuit for the mass-string assembly after time zero is shown in Fig.9.13. Note that the string wave impedance appears twice, once for each string segment on the left and right. Also note that there is a single common velocity for the two string endpoints and mass. LTI circuit elements in series can be arranged in any order.

From the equivalent circuit, it is easy to solve for the velocity $v(t)$. Formally, this is accomplished by applying Kirchhoff's Loop Rule, which states that the sum of voltages (``forces'') around any series loop is zero:

$$
f_m(t) + f_1(t) + f_2(t) = 0.
$$

All of the signs are `$+$' in this equation because the ``current'' (velocity $v$) flows into the `$+$' sign of each element. These *reference directions*, indicated by `$+$' on each element in Fig.9.13, may be chosen arbitrarily, but it is convenient to adopt the convention that $v$ flows into the `$+$' sign for each *reaction force*, and into the `$-$' sign for each *action force* (or *driving force*, *voltage/current source*, etc.). With this convention, Kirchhoff's Loop Rule effectively states that ``the sum of all action forces equals the sum of all reaction forces,'' where the forces are understood to act on a common point. This is essentially Newton's third law of motion, ``for every action there is an equal and opposite reaction'' (§B.1). In our mass-string example, all three forces are defined to be reaction forces because there is no external driving force on the mass, and no incoming waves from the string segments. External driving forces on the mass-string junction will be formulated later below.

^{10.8}We could equally well have defined the mass inertial force as a driving force on the two string segments, or the string segments could comprise a pair of driving forces for the mass. Ultimately, any force polarities may be assigned and consistently handled.

Taking the Laplace transform^{10.9} of Eq.(9.8) yields, by linearity,

$$
F_m(s) + F_1(s) + F_2(s) = 0,
$$

where $F_i(s)$ and $V(s)$ denote the Laplace transforms of $f_i(t)$ and $v(t)$, respectively. As discussed above, the impedance relation for each string endpoint is given by

$$
F_1(s) = F_2(s) = R\,V(s).
$$

For the mass, we have^{10.10}

$$
F_m(s) = m\left[sV(s) - v_0\right].
$$

Note that the mass is characterized by its *impedance* $ms$ when the initial velocity is zero (§7.1).

Substituting these relations into Eq.(9.9) yields

$$
m\left[sV(s) - v_0\right] + 2R\,V(s) = 0.
$$

We see that the initial momentum of the mass effectively provides an *impulsive external driving force*

$$
f_{\rm ext}(t) = m\,v_0\,\delta(t) \;\leftrightarrow\; F_{\rm ext}(s) = m\,v_0.
$$

An advantage of the external-impulse formulation is that the system
has a zero initial state, so that an *impedance description*
(§7.1) is complete. In other words, the system can be
fully described as a series combination of the three impedances $ms$,
$R$ (on the left), and $R$ (on the right), driven by an external
force-source $F_{\rm ext}(s) = m\,v_0$.

Solving Eq.(9.10) for $V(s)$ yields

$$
V(s) = \frac{m\,v_0}{ms + 2R} = \frac{v_0}{s + 2R/m}.
$$

Thus,^{10.11} the velocity of the contact point is

$$
v(t) = v_0\,e^{-(2R/m)t}, \qquad t \ge 0.
$$

The displacement of the string at the contact point is given by the time integral of the velocity:

$$
y(t) = \int_0^t v(\tau)\,d\tau = \frac{m\,v_0}{2R}\left[1 - e^{-(2R/m)t}\right].
$$

The momentum of the mass before time zero is $m\,v_0$, and after time zero it is

$$
m\,v(t) = m\,v_0\,e^{-(2R/m)t}.
$$

The force applied to the two string endpoints by the mass is given by
$f(t) = 2R\,v(t)$. From Newton's Law, force equals the time-derivative
of momentum, so the momentum $m\,v_0$, delivered by the mass to the string,
can be calculated as the *time integral* of applied force:

$$
\int_0^\infty 2R\,v(t)\,dt = 2R\,v_0\,\frac{m}{2R} = m\,v_0.
$$
In a real piano, the hammer, which strikes in an upward (grand) or sideways (upright) direction, falls away from the string a short time after collision, but it may remain in contact with the string for a substantial fraction of a period (see §9.4 on piano modeling).
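As a quick numerical check on the collision dynamics above, the following Python sketch (with illustrative values for $m$, $R$, and $v_0$) integrates the series loop equation $m\dot{v}(t) + 2R\,v(t) = 0$ and verifies that the momentum delivered to the string approaches the initial momentum $m\,v_0$:

```python
import numpy as np

# Point mass m striking an ideal string of wave impedance R on each side.
# Series loop equation (from the analysis above): m*dv/dt + 2*R*v = 0.
m, R, v0 = 0.01, 0.5, 1.0      # illustrative values (kg, kg/s, m/s)

def v_closed(t):
    """Closed-form contact-point velocity: exponential decay, tau = m/(2R)."""
    return v0 * np.exp(-2*R*t/m)

# Forward-Euler solution of the same ODE, as an independent check
dt = 1e-6
t = np.arange(0.0, 0.05, dt)                     # about 5 time-constants
v = v0 * (1.0 - dt*2*R/m) ** np.arange(t.size)   # Euler recursion in closed form

# Momentum delivered to the string = time-integral of the applied force 2*R*v(t)
p_delivered = 2*R*dt*v.sum()
print(abs(v[-1] - v_closed(t[-1])))   # small discretization error
print(p_delivered)                    # approaches m*v0 = 0.01
```

The momentum estimate converges to $m\,v_0$ as the integration interval covers more time-constants of the decay.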

#### Mass Termination Model

The previous discussion solved for the motion of an ideal mass striking an ideal string of infinite length. We now investigate the same model from the string's point of view. As before, we will be interested in a digital waveguide (sampled traveling-wave) model of the string, for efficiency's sake (Chapter 6), and we therefore will need to know what the mass ``looks like'' at the end of each string segment. For this we will find that the impedance description (§7.1) is especially convenient.

Let's number the string segments to the left and right of the mass by 1 and 2, respectively, as shown in Fig.9.15. Then Eq.(9.8) above may be written

where $f_1(t)$ denotes the force applied by string-segment 1 to the mass (defined as positive in the ``up'', or positive-$y$, direction), $f_2(t)$ is the force applied by string-segment 2 to the mass (again positive upwards), and $f_m(t)$ denotes the inertial force applied by the mass to both string endpoints (where again, a positive force points up).

To derive the traveling-wave relations in a digital waveguide model,
we want to use the force-wave variables that we defined for vibrating
strings in §6.1.5; *i.e.*, we defined $f \triangleq -Ky'$, where $K$
is the string tension and $y' \triangleq \partial y/\partial x$ is the
string slope.

As shown in Fig.9.16, a negative string slope pulls ``up'' to the right. Therefore, at the mass point we have $f_2(t) = f(x_0,t)$, where $x_0$ denotes the position of the mass along the string. On the other hand, the figure also shows that a negative string slope pulls ``down'' to the left, so that $f_1(t) = -f(x_0,t)$. These relations connect the forces defined at the mass-string junction to the force-wave variables in the string.

Thus, we can rewrite Eq.(9.11) in terms of string wave variables as

or, substituting the definitions of these forces,

The inertial force of the mass is $f_m(t) = -m\dot{v}(t)$ because the mass must be accelerated downward in order to produce an upward reaction force. The signs of the two string forces follow from the definition of force-wave variables on the string, as discussed above.

The force relations can be checked individually. For string 1,

Now that we have expressed the string forces in terms of string force-wave variables, we can derive digital waveguide models by performing the traveling-wave decompositions $f_i = f_i^{+} + f_i^{-}$ and $v_i = v_i^{+} + v_i^{-}$, and using the Ohm's-law relations $f_i^{+} = R\,v_i^{+}$ and $f_i^{-} = -R\,v_i^{-}$ for $i = 1, 2$ (introduced above near Eq.(6.6)).

#### Mass Reflectance from Either String

Let's first consider how the mass looks from the viewpoint of string 1, assuming string 2 is at rest. In this situation (no incoming wave from string 2), string 2 will appear to string 1 as a simple resistor (or dashpot) of $R$ Ohms in series with the mass impedance $ms$. (This observation will be used as the basis of a rapid solution method in §9.3.1 below.)

When a wave from string 1 hits the mass, it will cause the mass to
move. This motion carries both string endpoints along with it.
Therefore, both the reflected and transmitted waves include this mass
motion. We can say that we see a ``dispersive transmitted wave'' on
string 2, and a dispersive reflection back onto string 1. Our object
in this section is to calculate the transmission and reflection
*filters* corresponding to these transmitted and reflected waves.

By physical symmetry the *velocity* reflection and transmission
will be the same from string 1 as it is from string 2. We can say the
same about force waves, but we will be more careful because the sign
of the transverse force flips when the direction of travel is
reversed.^{10.12}Thus, we expect a scattering junction of the form shown in
Fig.9.17 (recall the discussion of physically
interacting waveguide inputs in §2.4.3). This much invokes
the superposition principle (for simultaneous reflection and transmission),
and imposes the expected symmetry: equal reflection filters
and
equal transmission filters
(for either force or velocity waves).

Let's begin with Eq.(9.12) above, restated as follows:

The traveling-wave decompositions can be written out as

where a ``+'' superscript means ``right-going'' and a ``-'' superscript means ``left-going'' on either string.

^{10.13}

Let's define the mass position to be zero, so that Eq.(9.14) with the substitutions Eq.(9.15) becomes

In the Laplace domain, dropping the common ``(s)'' arguments,

To compute the reflection coefficient of the mass seen on string 1, we may set $F_2^{-} = 0$ (no incoming wave on string 2), which means $V_2^{-} = 0$, so that we have

$$
\left(ms + 2R\right)V_1^{-}(s) = -ms\,V_1^{+}(s).
$$

From this, the reflected velocity is immediate, giving the *velocity reflection transfer function* (or velocity *reflectance*) of the mass as seen from string 1:

$$
\hat{\rho}_v(s) \;\triangleq\; \frac{V_1^{-}(s)}{V_1^{+}(s)} \;=\; -\frac{ms}{ms + 2R};
$$

by the symmetry noted above, the transverse-velocity reflectance is the same on either side of the mass. Thus we have found both velocity reflection filters at the mass-string junction. In summary, the velocity reflectance of the mass is $-ms/(ms+2R)$ from either string segment. This is a simple first-order filter model of what the mass (and string beyond) looks like dynamically from either string.

It is always good to check that our answers make physical sense in limiting cases. For this problem, easy cases to check are $m = 0$ and $m = \infty$. When the mass is zero, the reflectance goes to zero (no reflected wave at all). When the mass goes to infinity, the reflectance approaches $-1$, corresponding to a rigid termination, which also makes sense.

The results of this section can be more quickly obtained as a special case of the main result of §C.12, by choosing two waveguides meeting at a load impedance $R_L(s) = ms$. The next section gives another fast-calculation method based on a standard formula.

#### Simplified Impedance Analysis

The above results are quickly derived from the general reflection coefficient for force waves (or voltage waves, pressure waves, etc.):

$$
\rho = \frac{R_2 - R_1}{R_2 + R_1},
$$

where $\rho$ is the reflection coefficient of impedance $R_2$ as ``seen'' from impedance $R_1$. If a force wave $f^{+}$ traveling along in impedance $R_1$ suddenly hits a new impedance $R_2$, the wave will split into a reflected wave $\rho f^{+}$ and a transmitted wave $(1+\rho)f^{+}$. It therefore follows that a velocity wave $v^{+}$ will split into a reflected wave $-\rho v^{+}$ and a transmitted wave $(1-\rho)v^{+}$. This rule is derived in §C.8.4 (and implicitly above as well).

In the mass-string-collision problem, we can immediately write down
the *force reflectance* of the mass as seen from either string:

$$
\hat{\rho}_f(s) = \frac{R_2(s) - R_1}{R_2(s) + R_1} = \frac{(ms + R) - R}{(ms + R) + R} = \frac{ms}{ms + 2R},
$$

since the ``new'' impedance is the mass in series with the far string segment, $R_2(s) = ms + R$, while $R_1 = R$. Since, by the Ohm's-law relations, the velocity reflectance is the negative of the force reflectance, the *velocity reflectance* is simply

$$
\hat{\rho}_v(s) = -\hat{\rho}_f(s) = -\frac{ms}{ms + 2R}.
$$
#### Mass Transmittance from String to String

Referring to Fig.9.15, the *velocity
transmittance* from string 1 to string 2 may be defined as

$$
\hat{\tau}_v(s) \;\triangleq\; \frac{V_2^{+}(s)}{V_1^{+}(s)} \;=\; 1 + \hat{\rho}_v(s) \;=\; \frac{2R}{ms + 2R}.
$$
We can now refine the picture of our scattering junction Fig.9.17 to obtain the form shown in Fig.9.18.

#### Force Wave Mass-String Model

The velocity transmittance is readily converted to a *force
transmittance* using the Ohm's-law relations: since $f_2^{+} = R\,v_2^{+}$ and $f_1^{+} = R\,v_1^{+}$, we obtain

$$
\hat{\tau}_f(s) = \frac{F_2^{+}(s)}{F_1^{+}(s)} = \hat{\tau}_v(s) = \frac{2R}{ms + 2R}.
$$

Checking as before, we see that $m \to \infty$ corresponds to $\hat{\tau}_f \to 0$, which means no force is transmitted through an infinite mass, which is reasonable. As $m \to 0$, the force transmittance becomes 1 and the mass has no effect, as desired.

#### Summary of Mass-String Scattering Junction

In summary, we have characterized the mass on the string in terms of its reflectance and transmittance from either string. For force waves, each outgoing wave is given by a reflection of the incoming wave on the same side plus a transmission of the incoming wave from the opposite side. This defines a *dynamic scattering junction* on the string. (If there were no dependency on $s$, such as when a dashpot is affixed to the string, we would simply call it a *scattering junction*.) The above form of the dynamic scattering junction is analogous to the *Kelly-Lochbaum scattering junction* (§C.8.4). The general relation $\hat{\tau} = 1 + \hat{\rho}$ can be used to simplify the Kelly-Lochbaum form to a *one-filter scattering junction* analogous to the one-multiply scattering junction (§C.8.5). The one-filter form follows from the observation that $\hat{\rho}$ appears in both computations, and therefore need only be implemented once:

This structure is diagrammed in Fig.9.20.

Again, the above results follow immediately from the more general formulation of §C.12.

#### Digital Waveguide Mass-String Model

To obtain a force-wave digital waveguide model of the string-mass assembly after the mass has struck the string, it only remains to digitize the model of Fig.9.20. The delays are obviously to be implemented using digital delay lines. For the mass, we must digitize the force reflectance $\hat{\rho}_f(s) = ms/(ms+2R)$ appearing in the one-filter model of Fig.9.20:

A common choice of digitization method is the *bilinear transform* (§7.3.2) because it preserves losslessness and does not alias. This will effectively yield a *wave digital filter* model for the mass in this context (see Appendix F for a tutorial on wave digital filters).

The bilinear transform is typically scaled as

$$
s \;\leftarrow\; c\,\frac{1 - z^{-1}}{1 + z^{-1}},
$$

where $c$ is a free scaling parameter (typically $c = 2f_s$). Applying this to the force reflectance gives

$$
\hat{\rho}_f(z) = g\,\frac{1 - z^{-1}}{1 - p\,z^{-1}},
$$

where the gain coefficient $g$ and pole $p$ are given by

$$
g = \frac{mc}{mc + 2R}, \qquad p = \frac{mc - 2R}{mc + 2R}.
$$

Thus, the reflectance of the mass is a one-pole, one-zero filter. The
zero is exactly at dc, the real pole $p$ is close to dc, and the gain at
half the sampling rate is 1. We may recognize this as the classic
*dc-blocking filter*
[449]. Comparing with Eq.(9.18), we see that the behavior
at dc is correct, and that the behavior at infinite frequency
($s = \infty$) is now the behavior at half the sampling rate
($z = -1$).

Physically, the mass reflectance is zero at dc because sufficiently slow waves can freely move a mass of any finite size. The reflectance is 1 at infinite frequency because there is no time for the mass to start moving before it is pushed in the opposite direction. In short, a mass behaves like a rigid termination at infinite frequency, and a free end (no termination) at zero frequency. The reflectance of a mass is therefore a ``dc blocker''.
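The dc-blocking behavior can be verified directly. This Python sketch assumes the bilinear-transform scaling $c = 2f_s$ and illustrative values for the mass and wave impedance:

```python
import numpy as np

# Bilinear-transform digitization of the mass force-reflectance
# rho_f(s) = m*s/(m*s + 2*R)  ->  H(z) = g*(1 - 1/z)/(1 - p/z),
# the classic dc-blocking filter. All values are illustrative.
m, R, fs = 0.01, 0.5, 44100.0
c = 2*fs                         # bilinear scaling s <- c*(1 - 1/z)/(1 + 1/z)
g = m*c / (m*c + 2*R)            # gain coefficient
p = (m*c - 2*R) / (m*c + 2*R)    # real pole just below z = 1

def H(z):
    return g * (1 - z**-1) / (1 - p*z**-1)

print(H(1.0))    # 0 at dc: slow waves move any finite mass freely
print(H(-1.0))   # 1 at half the sampling rate: rigid-termination behavior
```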

The final digital waveguide model of the mass-string combination is shown in Fig.9.21.

Additional examples of lumped-element modeling, including masses,
springs, dashpots, and their various interconnections, are discussed in
the Wave Digital Filters (WDF) appendix (Appendix F). A nice feature of WDFs is that they
employ traveling-wave input/output signals which are ideal for
operate over a *warped frequency axis* (due to the bilinear
transform), while digital delay lines have a normal (unwarped)
frequency axis. On the plus side, WDFs cannot *alias*, while
digital waveguides do alias in the frequency domain for signals that
are not bandlimited to less than half the sampling rate. At low
frequencies (or given sufficient oversampling), the WDF frequency
warping is minimal, and in such cases, WDF ``lumped element models''
may be connected directly to digital waveguides, which are ``sampled-wave
distributed parameter'' models.

Even when the bilinear-transform frequency-warping is severe, it is often well tolerated when the frequency response has only one ``important frequency'', such as a second-order resonator, lowpass, or highpass response. In other words, the bilinear transform can be scaled to map any single analog frequency to any desired corresponding digital frequency (see §7.3.2 for details), and the frequency-warped responses above and below the exactly mapped frequency may ``sound as good as'' the unwarped responses for musical purposes. If not, higher order filters can be used to model lumped elements (Chapter 7).

#### Displacement-Wave Simulation

As discussed in [121], displacement waves are often preferred over force or velocity waves for guitar-string simulations, because such strings often hit obstacles such as frets or the neck. To obtain displacement from velocity at a given position, we may time-integrate velocity as above to produce displacement at any spatial sample along the string where a collision might be possible. However, all these integrators can be eliminated by simply going to a displacement-wave simulation, as has been done in nearly all papers to date on plucking models for digital waveguide strings.

To convert our force-wave simulation to a displacement-wave simulation, we may first convert force to velocity using the Ohm's-law relations $f^{+} = R\,v^{+}$ and $f^{-} = -R\,v^{-}$, and then conceptually integrate all signals with respect to time (in advance of the simulation). The wave impedance $R$ is the same on both sides of the finger-junction, which means we can convert from force to velocity by simply negating all left-going signals. (Conceptually, all signals are converted from force to velocity by the Ohm's-law relations and then divided by $R$, but the common scaling by $1/R$ can be omitted (or postponed) unless signal values are desired in particular physical units.)

An all-velocity-wave simulation can be converted to displacement waves even more easily by simply changing $v$ to $y$ everywhere, because velocity and displacement waves scatter identically. In more general situations, we can go to the Laplace domain and replace each occurrence of $V(s)$ by $sY(s)$, divide all signals by $s$, and push any leftover factors of $s$ around for maximum simplification, perhaps absorbing them into a nearby filter. In an all-velocity-wave simulation, each signal gets multiplied by $1/s$ in this procedure, which means the factor cancels out of all definable transfer functions. All filters in the diagram (just $\hat{\rho}(s)$ in this example) can be left alone because their inputs and outputs are still force-valued in principle. (We expressed each force wave in terms of velocity and wave impedance without changing the signal flow diagram, which remains a force-wave simulation until minus signs, scalings, and operators are moved around and combined.) Of course, one can absorb scalings and sign reversals into the filter(s) to change the physical input/output units as desired. Since we routinely assume zero initial conditions in an impedance description, the integration constants obtained by time-integrating velocities to get displacements are all defined to be zero.

Additional considerations regarding the choice of displacement waves over velocity (or force) waves are given in §E.3.3. In particular, their initial conditions can be very different, and traveling-wave components tend not to be as well behaved for displacement waves.

### Piano Hammer Modeling

The previous section treated an ideal point-mass striking an ideal
string. This can be considered a simplified piano-hammer model. The
model can be improved by adding a damped spring to the point-mass, as
shown in Fig.9.22 (*cf.* Fig.9.12).

The impedance of this hammer system, as seen by the string, is the parallel combination of the mass impedance $ms$ and the damped-spring impedance $\mu + k/s$. (The damper and spring are formally in series--see §7.2 for a refresher on series versus parallel connection.) Denoting the driving-point impedance of the hammer at the string contact-point by $R_h(s)$, we have

$$
R_h(s) = \frac{ms\left(\mu + k/s\right)}{ms + \mu + k/s}.
$$

Thus, the scattering filters in the digital waveguide model are second order (biquads), while for the string struck by a mass (§9.3.1) we had first-order scattering filters. This is expected because we added another energy-storage element (a spring).

The impedance formulation of Eq.(9.19) assumes all elements are
linear and time-invariant (LTI), but in practice one can normally
modulate element values as a function of time and/or state-variables
and obtain realistic results for low-order elements. For this we must
maintain filter-coefficient formulas that are explicit functions of
physical state and/or time. For best results, state variables should
be chosen so that any nonlinearities remain *memoryless* in the
digitization
[361,348,554,555].

#### Nonlinear Spring Model

In the musical acoustics literature, the piano hammer is classically
modeled as a *nonlinear spring*
[493,63,178,76,60,486,164].^{10.14} Specifically, the piano-hammer damping in Fig.9.22 is
typically approximated by $\mu = 0$, and the spring is
*nonlinear* and *memoryless* according to a simple power
law in the felt compression:

The upward force applied to the string by the hammer is therefore this power law evaluated at the instantaneous felt compression (Eq.(9.20)). This force is balanced at all times by the downward string force (string tension times slope difference), exactly as analyzed in §9.3.1 above.
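A minimal sketch of such a memoryless power-law felt spring, with purely illustrative constants (real hammer stiffness and exponent vary by key and by reference):

```python
import numpy as np

# Memoryless power-law model of the compressed hammer felt:
# f(xc) = k * xc**p for compression xc > 0, else 0.
# k and p are illustrative constants, not measured hammer data.
k, p = 4.0e8, 2.5

def hammer_force(xc):
    xc = np.maximum(xc, 0.0)   # no force once hammer and string separate
    return k * xc**p

print(hammer_force(-1e-3))                       # out of contact: 0
print(hammer_force(2e-3) / hammer_force(1e-3))   # = 2**p (power-law behavior)
```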

#### Including Hysteresis

Since the compressed hammer-felt (wool) on real piano hammers shows
significant *hysteresis memory*, an improved piano-hammer felt
model is

where the hysteresis time-constant is given in seconds, and $n$ again denotes the piano key number [487].

Equation (9.21) is said to be a good approximation under normal playing conditions. A more complete hysteresis model is [487]

Relating to Eq.(9.21) above, we have (N/mm).

#### Piano Hammer Mass

The piano-hammer *mass* may be approximated across the keyboard by
[487]

### Pluck Modeling

The piano-hammer model of the previous section can also be configured
as a *plectrum* by making the mass and damping small or zero, and
by releasing the string when the contact force exceeds some threshold
$f_{\max}$. That is, to a first approximation, a plectrum can be modeled
as a *spring* (linear or nonlinear) that disengages when either
it is far from the string or a maximum spring-force is exceeded. To
avoid discontinuities when the plectrum and string engage/disengage,
it is good to taper both the damping and spring-constant to zero at
the point of contact (as shown
below).

Starting with the piano-hammer impedance of Eq.(9.19) and setting
the mass to infinity (the plectrum holder is immovable), we define
the *plectrum impedance* as

$$
R_p(s) = \mu + \frac{k}{s}.
$$

The force-wave reflectance of the impedance $R_p(s)$ in Eq.(9.22), as
seen from the string, may be computed exactly as in
§9.3.1:

$$
\hat{\rho}_f(s) = \frac{R_p(s)}{R_p(s) + 2R} = \frac{\mu s + k}{(\mu + 2R)s + k}.
$$

If the spring damping $\mu$ is much greater than twice the string wave impedance ($\mu \gg 2R$), then the plectrum looks like a rigid termination to the string (force reflectance $\hat{\rho}_f \approx 1$), which makes physical sense.

Again following §9.3.1, the transmittance for force waves is given by

$$
\hat{\tau}_f(s) = 1 - \hat{\rho}_f(s) = \frac{2Rs}{(\mu + 2R)s + k}.
$$

If the damping $\mu$ is set to zero, *i.e.*, if the plectrum is to be modeled
as a simple linear spring, then the impedance becomes $R_p(s) = k/s$,
and the force-wave reflectance becomes
[128]

$$
\hat{\rho}_f(s) = \frac{k}{2Rs + k}.
$$
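The limiting cases above can be checked numerically. This sketch assumes the series-junction force reflectance $R_p/(R_p + 2R)$, as for the mass, with illustrative element values:

```python
import numpy as np

# Damped-spring plectrum impedance R_p(s) = mu + k/s, bridging two string
# segments of wave impedance R each. Assumed force reflectance (same
# series-junction pattern as for the mass): R_p/(R_p + 2R).
mu, k, R = 0.05, 2.0e3, 0.5        # illustrative element values

def rho_f(s, mu=mu):
    Rp = mu + k/s
    return Rp / (Rp + 2*R)

s = 1j * 2*np.pi*440.0
same_filter = np.isclose(rho_f(s), (mu*s + k) / ((mu + 2*R)*s + k))
nearly_rigid = abs(rho_f(s, mu=1e6)) > 0.999   # heavy damping: rho -> 1
print(same_filter, nearly_rigid)
```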

#### Digital Waveguide Plucked-String Model

When plucking a string, it is necessary to detect ``collisions''
between the plectrum and string. Also, more complete plucked-string
models will allow the string to ``buzz'' on the frets and ``slap''
against obstacles such as the fingerboard. For these reasons, it is
convenient to choose *displacement waves* for the waveguide
string model. The reflection and transmission filters for
displacement waves are the same as for velocity, namely,
$-\hat{\rho}_f(s)$ for reflection and $1 - \hat{\rho}_f(s)$ for transmission.

As in the mass-string collision case, we obtain the one-filter scattering-junction implementation shown in Fig.9.23. The filter may now be digitized using the bilinear transform as previously (§9.3.1).

#### Incorporating Control Motion

Let $x_m(t)$ denote the vertical position of the *mass*
in Fig.9.22. (We still assume $m = \infty$.) We can think of
$x_m(t)$ as the position of the *control point* on the
plectrum, *e.g.*, the position of the ``pinch-point'' holding the
plectrum while plucking the string. In a harpsichord, $x_m(t)$ can be
considered the *jack* position [347].

Also denote by $l_0$ the *rest length* of the spring
in Fig.9.22, and let $x_e(t)$ denote the
position of the ``end'' of the spring while not in contact with the
string (offset from the control point by the rest length $l_0$). Then the plectrum makes *contact* with the string when
$x_e(t)$ reaches the string displacement at the plucking point; this comparison is the
*collision detection* equation.

Let the subscripts 1 and 2 each denote one side of the scattering system, as indicated in Fig.9.23. Then, for example, $y_1$ is the displacement of the string on the left (side 1) of the plucking point, and $y_2$ is on the right side (but still located at the plucking point $x_0$). By continuity of the string, we have

When the spring engages the string and begins to compress, the upward force on the string at the contact point is given by the spring constant times the compression.^{10.15} When the plectrum is out of contact, or at the moment of release, the applied force is zero and the entire plucking system disappears to leave $y_1(t) = y_2(t)$; equivalently, the force reflectance becomes $0$ and the transmittance becomes $1$.

During contact, force equilibrium at the plucking point requires
(*cf.* §9.3.1)

where $f \triangleq -Ky'$ as usual (§6.1), with $K$ denoting the string tension. Using Ohm's laws for traveling-wave components, we have

Substituting and taking the Laplace transform yields

where, as first noted at Eq.(9.24) above,

This system is diagrammed in Fig.9.24. The manipulation of the
minus signs relative to Fig.9.23 makes it convenient for
restricting to positive values only (as shown in the
figure), corresponding to the plectrum engaging the string going up.
This uses the approximation
,
which is exact when
, *i.e.*, when the plectrum does not affect
the string displacement at the current time. It is therefore exact at
the time of collision and also applicable just after release.
Similarly,
can be used to trigger a release of the
string from the plectrum.

#### Successive Pluck Collision Detection

As discussed above, in a simple 1D plucking model, the plectrum comes
up and engages the string when its spring end reaches the string, and above some
maximum force the plectrum releases the string. At this point, it is
``above'' the string. To pluck again in the same direction, the
collision-detection must be disabled until the plectrum again falls below the string,
requiring one bit of state to keep track of that.^{10.16} The harpsichord jack
plucks the string only in the ``up'' direction due to its asymmetric
behavior in the two directions [143]. If only
``up picks'' are supported, then engagement can be suppressed after a
release until the plectrum comes back down below the envelope of string
vibration. Note that
intermittent disengagements as a plucking cycle begins are normal;
there is often an audible ``buzzing'' or ``chattering'' when plucking
an already vibrating string.

When plucking up and down in alternation, as in the *tremolo*
technique (common on mandolins), the collision detection alternates
between the upward and downward contact conditions, and again a bit of state is needed to
keep track of which comparison to use.
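The one-bit engagement logic described above might be sketched as follows (the class, method names, and thresholds are illustrative, not from the text):

```python
class UpPickDetector:
    """One-bit engagement logic for repeated 'up picks' (illustrative
    sketch). After a release, the detector is disarmed until the
    plectrum falls back below the string, suppressing spurious
    re-engagement."""
    def __init__(self, f_max):
        self.f_max = f_max   # maximum spring force before release
        self.armed = True    # collision detection enabled?

    def step(self, xp, ys, force):
        """xp: plectrum spring-end position, ys: string displacement at
        the plucking point, force: current contact force.
        Returns True while plectrum and string are engaged."""
        if not self.armed:
            if xp < ys:            # plectrum back below the string
                self.armed = True  # re-arm for the next up-pick
            return False
        if xp >= ys:               # collision-detection comparison
            if force > self.f_max: # release threshold exceeded
                self.armed = False
                return False
            return True
        return False

d = UpPickDetector(f_max=1.0)
states = [d.step(0.0, 0.1, 0.0),  # below string
          d.step(0.2, 0.1, 0.5),  # engaged
          d.step(0.3, 0.1, 2.0),  # force too large: release
          d.step(0.2, 0.1, 0.0),  # still above, disarmed
          d.step(0.0, 0.1, 0.0)]  # back below: re-armed
print(states)   # [False, True, False, False, False]
```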

#### Plectrum Damping

To include damping in the plectrum model, the load impedance goes back to the damped-spring form of Eq.(9.22):

$$
R_p(s) = \mu + \frac{k}{s}.
$$

#### Digitization of the Damped-Spring Plectrum

Applying the bilinear transformation (§7.3.2) to the reflectance in Eq.(9.23) (including damping) yields the following first-order digital force-reflectance filter:

The transmittance filter is again $1 - \hat{\rho}_f$, and there is a one-filter form for the scattering junction as usual.

#### Feathering

Since the pluck model is linear, the parameters are not
signal-dependent. As a result, when the string and spring separate,
there is a discontinuous change in the reflection and transmission
coefficients. In practice, it is useful to ``feather'' the
switch-over from one model to the next [470]. In
this instance, one appealing choice is to introduce a *nonlinear
spring*, as is commonly used for piano-hammer models (see
§9.3.2 for details).

Let the nonlinear spring model take the form

The foregoing suggests a nonlinear tapering of the damping in addition to the tapering of the stiffness as the spring compression approaches zero. One natural choice would be

In summary, the engagement and disengagement of the plucking system can be ``feathered'' by a nonlinear spring and damper in the plectrum model.
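One possible realization of such feathering, assuming a quadratic taper law and illustrative constants (the text does not prescribe specific values):

```python
import numpy as np

# Feathering sketch: taper stiffness k and damping mu smoothly to zero as
# the spring compression xc approaches zero. The quadratic taper law and
# the constants are illustrative assumptions, not from the text.
k0, mu0, x0 = 2.0e3, 0.05, 1e-3   # full stiffness, full damping, taper depth

def plectrum_coeffs(xc):
    w = np.clip(xc / x0, 0.0, 1.0)**2   # smooth 0..1 ramp over depth x0
    return k0 * w, mu0 * w

k_off, mu_off = plectrum_coeffs(0.0)   # at first contact
k_on,  mu_on  = plectrum_coeffs(x0)    # fully compressed
print(k_off, mu_off)                   # both 0: no discontinuity at contact
print(np.isclose(k_on, k0), np.isclose(mu_on, mu0))
```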

## Piano Synthesis

The main elements of the piano we need to model are the

- hammer (§9.3.2),
- string (Chapter 6, §6.6),
- bridge (§9.2.1), and
- soundboard, enclosure, acoustic space, mic placement, etc.

Since piano strings are *stiff*, we need to include a *dispersion allpass filter* in the string model, as introduced in §6.11.3. The next section provides additional details specific to piano strings.

From the bridge (which runs along the soundboard) to each ``virtual microphone'' (or ``virtual ear''), the soundboard, enclosure, and listening space can be well modeled as a high-order LTI filter characterized by its impulse response (Chapter 8). Such long impulse responses can be converted to more efficient recursive digital filters by various means (Chapter 3,§8.6.2).

A straightforward piano model along the lines indicated above turns
out to be relatively expensive computationally. Therefore, we also
discuss, in §9.4.4 below, the more specialized *commuted
piano synthesis* technique [467], which is capable of high
quality piano synthesis at low cost relative to other model-based
approaches. It can be seen as a *hybrid* method in which the
string is *physically* modeled, the hammer is represented by a
*signal* model, and the acoustic resonators (soundboard,
enclosure, etc.) are handled by ordinary *sampling* (of their
impulse response).

### Stiff Piano Strings

Piano strings are audibly *inharmonic* due to stiffness
[211,210]. General stiff-string modeling
was introduced in §6.9. In this section, we discuss
further details specific to piano strings.

The main effect of string stiffness is to *stretch* the partial
overtone series, so that piano tones are not precisely harmonic
[144]. As a result, piano tuners typically
stretch the tuning of the piano slightly. For example, the total
amount of tuning stretch from the lowest to highest note has been
measured to be approximately 35 cents on a Kurzweil PC88, 45 cents on
a Steinway Model M, and 60 cents on a Steinway Model D.^{10.17}

#### Piano String Wave Equation

A wave equation suitable for modeling linearized piano strings is given by [77,45,317,517]

where the dot and prime denote partial derivatives with respect to time and position, respectively, and

Young's modulus $E$ and the radius of gyration $\kappa$ are defined in Appendix B.

The first two terms on the right-hand side of Eq.(9.30) come from
the ideal string wave equation (see Eq.(C.1)), and they model
transverse acceleration and transverse restoring force due to tension,
respectively. The fourth-derivative term approximates the transverse
restoring force exerted by a stiff string when it is bent. In an
ideal string with zero diameter, this force is zero; in an *ideal
rod* (or *bar*), this term is dominant [317,261,169].
The final two terms provide *damping*. The damping associated
with the first damping term is frequency-independent, while the damping due
to the second increases with frequency.

#### Damping-Filter Design

In [46], the damping in real piano strings was modeled using a length 17 FIR filter for the lowest strings, and a length 9 FIR filter for the remaining strings. Such FIR filters (or nonparametric measured frequency-response data) can be converted to lower-order IIR filters by the usual methods (§8.6.2). It is convenient to have separate damping and dispersion filters in the string model. The damping filter in piano strings is significantly less demanding than the dispersion filter.

#### Dispersion Filter-Design

In the context of a digital waveguide string model, dispersion
associated with stiff strings can be supplied by an *allpass
filter* in the basic feedback loop. Methods for designing dispersion
allpass filters were summarized in §6.11.3. In this
section, we are mainly concerned with how to *specify* the
desired dispersion allpass filter for piano strings.

Perceptual studies regarding the audibility of inharmonicity in stringed instrument sounds [211] indicate that the just noticeable coefficient of inharmonicity is given approximately by

where $f_1$ is the fundamental frequency of the string vibration in hertz, and $B$--the so-called *coefficient of inharmonicity*--affects the $n$th partial overtone tuning via

$$
f_n = n\,f_1\sqrt{1 + B\,n^2}.
$$

For a stiff string with Young's modulus , radius , length , and tension , the coefficient of inharmonicity is predicted from theory [144, p. 65],[211] to be

where is the string cross-sectional area, and is the radius of gyration of the string cross-section (see §B.4.9).
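Given a coefficient of inharmonicity $B$, the stretched partial series follows directly from the tuning formula above; the fundamental and the value of $B$ below are merely illustrative:

```python
import numpy as np

# Stretched partials of a stiff string: f_n = n*f1*sqrt(1 + B*n**2).
# f1 and B are illustrative (B is only of a typical order of magnitude
# for midrange piano strings, not a measured value).
f1, B = 261.6, 4e-4

def partial(n):
    return n * f1 * np.sqrt(1.0 + B * n**2)

# Sharpening of the 10th partial relative to the harmonic 10*f1, in cents:
stretch_cents = 1200.0 * np.log2(partial(10) / (10 * f1))
print(partial(1) > f1)        # even the first partial is slightly sharp
print(stretch_cents)          # positive stretch, a few tens of cents here
```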

In general, when designing dispersion filters for vibrating string models, it is highly cost-effective to obtain an allpass filter which correctly tunes only the lowest-frequency partial overtones, where the number of partials correctly tuned is significantly less than the total number of partials present, as in [384].

Application of the method of [2] to piano-string dispersion-filter design is reported in [1].

A Faust implementation of a closed-form expression
[367] for dispersion allpass coefficients as a
function of inharmonicity coefficient may be found in the function
`piano_dispersion_filter` within `effect.lib`.

### Nonlinear Piano Strings

It turns out that piano strings exhibit audible nonlinear effects, especially in the first three octaves of the piano's pitch range at fortissimo playing levels and beyond [26]. As a result, for highest-quality piano synthesis, we need more than what is obtainable from a linearized wave equation such as Eq.(9.30).

As can be seen from a derivation of the wave equation for an ideal
string vibrating in 3D space (§B.6), there is
fundamentally *nonlinear coupling* between transverse and
longitudinal string vibrations. It turns out that the coupling from
transverse-to-longitudinal is much stronger than vice versa, so that
piano synthesis models can get by with one-way coupling at normal
dynamic playing levels [30,163]. As
elaborated in §B.6 and the references cited there, the
longitudinal displacement is driven by longitudinal changes in the
*squared slope* of the string, *i.e.*, by the spatial derivative of the squared string slope,

$$ \frac{\partial}{\partial x}\left[\left(\frac{\partial y}{\partial x}\right)^{2}\right], $$

giving rise to *high-frequency inharmonic overtones* (corresponding to longitudinal modes of vibration) in the sound. Since the nonlinear coupling is *distributed* along the length of the string, the longitudinal modes are continuously being excited by the transverse vibration across time and position along the string.

In addition to the excitation of longitudinal modes, the nonlinear
transverse-to-longitudinal coupling results in a powerful
*longitudinal attack pulse*, which is the leading component of the initial ``shock
noise'' audible in a piano tone. This longitudinal attack pulse hits
the bridge well before the first transverse wave and is therefore
quite significant perceptually. A detailed simulation of both
longitudinal and transverse waves in an ideal string excited by a
Gaussian pulse is given in [391].

Another important (*i.e.*, audible) effect due to nonlinear
transverse-to-longitudinal coupling is so-called *phantom
partials*. Phantom partials are
ongoing *intermodulation products* from the transverse partials
as they transduce (nonlinearly) into longitudinal waves. The term
``phantom partial'' was coined by Conklin [85]. The Web
version of [18] includes some illuminating sound
examples by Conklin.

#### Nonlinear Piano-String Synthesis

When one-way transverse-to-longitudinal coupling is sufficiently
accurate, we can model its effects *separately* based on
*observations* of transverse waves.
For example [30,28],

- Longitudinal modes can be implemented as *second-order resonators* (``modal synthesis''), with driving coefficients given by *orthogonally projecting* [451] the spatial derivative of the squared string slope onto the longitudinal mode shapes (both functions of position).
- If tension variations along the string are neglected (reasonable since longitudinal waves are so much faster than transverse waves), then the longitudinal force on the bridge can be derived from the estimated *instantaneous tension* in the string; efficient methods for this have been developed for guitar-string synthesis, particularly by Tolonen (§9.1.6).

An excellent review of nonlinear piano-string synthesis (emphasizing the modal synthesis approach) is given in [30].
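
The modal-synthesis idea can be sketched numerically. The following is a hedged illustration, not the scheme of [30,28]: the spatial sampling, sinusoidal mode shapes, resonator tunings, and unit coupling gains are all placeholder assumptions.

```python
import numpy as np

N, L, fs = 64, 1.0, 44100.0      # spatial samples, string length, sample rate
x = np.linspace(0.0, L, N)
dx = x[1] - x[0]

def longitudinal_drive(y, n_modes=4):
    """Project the spatial derivative of the squared string slope onto
    the first n_modes assumed mode shapes sin(m*pi*x/L)."""
    slope2 = np.gradient(y, dx) ** 2          # squared transverse slope
    force_density = np.gradient(slope2, dx)   # its spatial derivative
    return np.array([np.sum(force_density * np.sin(m * np.pi * x / L)) * dx
                     for m in range(1, n_modes + 1)])

class Resonator:
    """Second-order resonator at frequency f (Hz) with pole radius r."""
    def __init__(self, f, r):
        self.a1 = 2.0 * r * np.cos(2.0 * np.pi * f / fs)
        self.a2 = -r * r
        self.y1 = self.y2 = 0.0
    def tick(self, u):
        y = u + self.a1 * self.y1 + self.a2 * self.y2
        self.y2, self.y1 = self.y1, y
        return y

# A transverse bump produces a nonzero drive for the longitudinal modes:
y = 0.01 * np.exp(-(((x - 0.3 * L) / 0.05) ** 2))
drive = longitudinal_drive(y)
modes = [Resonator(1000.0 * m, 0.999) for m in range(1, 5)]  # placeholder tunings
out = []
for t in range(64):
    d = drive if t == 0 else [0.0] * 4        # impulsive drive at t = 0
    out.append(float(sum(r.tick(di) for r, di in zip(modes, d))))
```

In a real model the drive would be recomputed every sample from the evolving transverse string state, and the mode frequencies and dampings taken from measurements or theory.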

#### Regimes of Piano-String Vibration

In general, more complex synthesis models are needed at higher dynamic playing levels. The main three regimes of vibration are as follows [27]:

- The simplest piano-string vibration regime is characterized by *linear superposition*, in which transverse and longitudinal waves *decouple* into separate modes, as implied by Eq.(9.30). In this case, transverse and longitudinal waves can be simulated in separate digital waveguides (Ch. 6). The longitudinal waveguide is of course an order of magnitude shorter than the transverse waveguide(s).
- As dynamic playing level is increased,
transverse-to-longitudinal coupling becomes audible
[26].
- At very high dynamic levels, the model should also include
longitudinal-to-transverse coupling. However, this is usually
neglected.

#### Efficient Waveguide Synthesis of Nonlinear Piano Strings

The following nonlinear waveguide piano-synthesis model is under current consideration [427]:

- Initial striking force determines the starting regime (1, 2, or 3 above).
- The string model is simplified as it decays.

When the striking force is large enough to make nonlinear coupling audible (*e.g.*, fortissimo for the first three octaves of the piano keyboard [26]), the string model starts in regime 2.

Because the energy stored in a piano string decays monotonically after
a hammer strike (neglecting coupling from other strings), we may
*switch to progressively simpler models* as the energy of
vibration falls below corresponding thresholds. Since the purpose of
model-switching is merely to save computation, it need not happen
immediately, so it may be triggered by a string-energy estimate based
on observing the string at a *single point* over the past
period or so of vibration. Perhaps most simply, the model-regime
classifier can be based on the maximum magnitude of the *bridge
force* over at least one period. If the regime 2 model includes an
instantaneous string-length (tension) estimate, one may simply compare
that to a threshold to determine when the simpler model can be used.
If the longitudinal components and/or phantom partials are not
completely inaudible when the model switches, then standard
cross-fading techniques should be applied so that inharmonic partials
are faded out rather than abruptly and audibly cut off.
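
The single-point regime classifier described above might be sketched as follows; the threshold value, window length, and two-regime simplification are illustrative assumptions:

```python
from collections import deque

def make_regime_classifier(period_samples, force_threshold):
    """Classify the string-model regime from the peak |bridge force|
    observed over roughly the past period of vibration. The threshold
    and the two-regime simplification are illustrative assumptions."""
    window = deque(maxlen=period_samples)   # sliding one-period window
    def classify(bridge_force):
        window.append(abs(bridge_force))
        return 2 if max(window) >= force_threshold else 1
    return classify

# A decaying "bridge force" eventually lets the model drop to regime 1:
classify = make_regime_classifier(period_samples=100, force_threshold=0.5)
regimes = [classify(0.99 ** n) for n in range(2000)]
```

In practice the switch would also trigger the cross-fade mentioned above, rather than an instantaneous model change.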

To obtain a realistic initial shock noise in the tone for regime 1 (and for any model that does not compute it automatically), the appropriate shock pulse, computed as a function of hammer striking force (or velocity, displacement, etc.), can be summed into the longitudinal waveguide during the hammer contact period.

The longitudinal bridge force may be generated from the estimated string length (§9.1.6). This force should be exerted on the bridge in the direction coaxial with the string at the bridge (a direction available from the two transverse displacements one sample away from the bridge).

Phantom partials may be generated in the longitudinal waveguide as explicit intermodulation products based on the transverse-wave overtones known to contribute most; for example, the Goertzel algorithm [451] could be used to track the relevant partial amplitudes for this purpose. Such explicit synthesis of phantom partials, however, makes modal synthesis more compelling for the longitudinal component [30]; in a modal synthesis model, the longitudinal attack pulse can be replaced by a one-shot (per hammer strike) table playback, scaled and perhaps filtered as a function of the hammer striking velocity.
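
For reference, a minimal sketch of the Goertzel algorithm applied to tracking the power of one partial (the test signal and frequencies are illustrative):

```python
import math

def goertzel_power(x, f, fs):
    """Goertzel algorithm: squared magnitude of the DFT of x at
    frequency f (Hz) for sampling rate fs."""
    coeff = 2.0 * math.cos(2.0 * math.pi * f / fs)
    s1 = s2 = 0.0
    for sample in x:
        s0 = sample + coeff * s1 - s2   # second-order resonator recursion
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

# Track the strength of one partial in a synthetic tone:
fs = 8000.0
tone = [math.sin(2.0 * math.pi * 440.0 * n / fs) for n in range(800)]
```

Evaluating the tracker at the partial frequency yields a large power reading, while an off-partial frequency yields essentially zero.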

On the high end for regime 2 modeling, a full nonlinear coupling may be implemented along the three waveguides (two transverse and one longitudinal). At this level of complexity, a wide variety of finite-difference schemes should also be considered (§7.3.1) [53,555].

#### Checking the Approximations

The preceding discussion considered *several* possible
approximations for nonlinear piano-string synthesis. Other neglected
terms in the stiff-string wave equation were not even discussed, such
as terms due to *shear deformation* and *rotary inertia*
that are included in the (highly accurate) *Timoshenko beam theory*
formulation [261,169]. The following questions naturally
arise:

- How do we know for sure our approximations are inaudible?
- We can listen, but could we miss an audible effect?
- Could a difference become audible after more listening?

Answering such questions with confidence requires a *truth reference*--a ``perceptually exact'' model.

Note that there are software tools (*e.g.*, from the world of
*perceptual audio coding*
[62,472]) that can be used to measure
the *audible equivalence* of two sounds [456]. For an
audio coder, these tools predict the audibility of the difference
between original and compressed sounds. For sound synthesis
applications, we want to compare our ``exact'' and ``computationally
efficient'' synthesis models.

### High-Accuracy Piano-String Modeling

In [265,266], an extension of the mass-spring model of [391] was presented for the purpose of high-accuracy modeling of nonlinear piano strings struck by a hammer model such as described in §9.3.2. This section provides a brief overview.

Figure 9.25 shows a mass-spring model in 3D space. From
*Hooke's Law* (§B.1.3), the force exerted by the spring on each
mass is proportional to the spring's displacement from its natural length.
The *vector equation of motion* for mass 1 is then given by
Newton's second law (force equals mass times acceleration), and
similarly for mass 2, with the position of each mass in 3D space
treated as a vector.

A generalization to a *chain* of masses and
springs is shown in Fig.9.26. Mass-spring chains--also
called *beaded strings*--have been analyzed in numerous
textbooks (*e.g.*, [295,318]), and numerical software
simulation is described in [391].

The force on the $i$th mass can then be expressed as the sum of the forces exerted by the two springs attached to it.

#### A Stiff Mass-Spring String Model

Following the classical derivation of the stiff-string wave equation
[317,144], an obvious way to introduce
*stiffness* in the mass-spring chain is to use a *bundle* of
mass-spring chains to form a kind of ``lumped stranded cable''. One
section of such a model is shown in Fig.9.27. Each mass
is now modeled as a 2D *mass disk*. Complicated rotational
dynamics can be avoided by assuming *no torsional waves* (no
``twisting'' motion) (§B.4.20).

A three-spring-per-mass model is shown in Fig.9.28
[266]. The spring positions alternate between angles
, say, on one side of a mass disk and
on the other side in order to provide effectively
*six* spring-connection points around the mass disk for only
three connecting springs per section. This improves *isotropy*
of the string model with respect to bending direction.

A problem with the simple mass-spring-chain-bundle is that there is no
resistance whatsoever to *shear deformation*, as is clear from
Fig.9.29. To rectify this problem (which does not
arise in the classical derivation of the stiff-string wave equation,
owing to implicit assumptions made there), diagonal springs can be
added to the model, as shown in
Fig.9.30.

In the simulation results reported in [266], the spring-constants of the shear springs were chosen so that their stiffness in the longitudinal direction would equal that of the longitudinal springs.

#### Nonlinear Piano-String Equations of Motion in State-Space Form

For the flexible (non-stiff) mass-spring string, referring to Fig.9.26 and Eq.(9.34), the equations of motion follow from Newton's second law applied to each mass, and they may be collected into a single vector equation for the whole chain.
#### Finite Difference Implementation

Digitizing via the centered second-order difference [Eq.(7.5)] yields an *explicit finite-difference scheme* (§D.1).

Note that requiring three adjacent spatial string samples to be in contact with the piano hammer during the attack (which helps to suppress aliasing of spatial frequencies on the string during the attack) implies a sampling rate in the vicinity of 6 megahertz [265]. Thus, the model is expensive to compute! However, results to date show a high degree of accuracy, as desired. In particular, the stretching of the partial overtones in the stiff-string model has been measured to be highly accurate despite using only three spring attachment points on one side of each mass disk [265].
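
A minimal sketch of such an explicit scheme, applied to a one-dimensional (transverse-only) flexible mass-spring chain; all parameter values are illustrative, and the time step is chosen well below the stability limit:

```python
import math

# Explicit finite-difference simulation of a mass-spring chain via the
# centered second-order difference:
#   y[n+1] = 2*y[n] - y[n-1] + (dt^2/m) * f(y[n])
# Stability requires roughly dt <= sqrt(m/k) for this chain
# (here sqrt(m/k) = 1e-3 s, and we use dt = 1e-4 s).

M, m, k, dt = 32, 1e-3, 1e3, 1e-4

def forces(y):
    """Net spring force on each mass from its two neighbors (fixed ends)."""
    f = []
    for i in range(M):
        left = y[i - 1] if i > 0 else 0.0
        right = y[i + 1] if i < M - 1 else 0.0
        f.append(k * (left - 2.0 * y[i] + right))
    return f

def step(y_now, y_prev):
    f = forces(y_now)
    y_next = [2.0 * y_now[i] - y_prev[i] + (dt * dt / m) * f[i]
              for i in range(M)]
    return y_next, y_now

# Initialize in the fundamental mode shape with zero initial velocity:
y = [math.sin(math.pi * (i + 1) / (M + 1)) for i in range(M)]
y_prev = y[:]
for _ in range(2000):
    y, y_prev = step(y, y_prev)
```

Since the scheme is run inside its stability region, the chain oscillates indefinitely without blowing up (there is no damping term here).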

See [53] for alternative finite-difference formulations that better preserve physical energy and have other nice properties worth considering.

### Commuted Piano Synthesis

The general technique of *commuted synthesis* was introduced in
§8.7. This method is enabled by the
*linearity* and *time invariance* of both the vibrating
string and its acoustic resonator, allowing them to be interchanged,
in principle, without altering the overall transfer function
[449].

While the piano-hammer is a strongly nonlinear element (§9.3.2),
it is nevertheless possible to synthesize a piano using commuted
synthesis with high fidelity and low computational cost given the
approximation that the string itself is linear and time-invariant
(§9.4.2) [467,519,30].
The key observation is that the interaction
between the hammer and string is essentially *discrete* (after
deconvolution) at only one or a few time instants per hammer strike.
The deconvolution needed is a function of the hammer-string collision
velocity $v_c$. As a result, *the hammer-string interaction can
be modeled as one or a few discrete impulses that are filtered in a
$v_c$-dependent way.*

Figure 9.31 illustrates a typical series of interaction force-pulses at the contact point between a piano hammer and string. The vertical lines indicate the locations and amplitudes of three single-sample impulses passed through three single-pulse filters to produce the overlapping pulses shown.

#### Force-Pulse Synthesis

The creation of a single force-pulse for a given hammer-string
collision velocity $v_c$ (a specific ``dynamic level'') is shown in
Fig.9.32. The filter input is an impulse, and the
output is the desired hammer-string force pulse. As $v_c$ increases,
the output pulse increases in amplitude *and decreases in width*,
which means the filter is nonlinear. In other words, the force pulse
gets ``brighter'' as its amplitude (dynamic level) increases. In a
real piano, this brightness increase is caused by the nonlinear
felt-compression in the piano hammer. Recall from §9.3.2
that piano-hammer felt is typically modeled as a nonlinear spring
whose force is a power law of the felt compression. Here, the
brightness is increased by shrinking the duration of the filter
impulse response as $v_c$ increases. The key property enabling
commuted synthesis is that, when $v_c$ is constant, the filter
operates as a normal LTI filter. In this way, the entire piano has
been ``linearized'' with respect to a given collision velocity $v_c$.

#### Multiple Force-Pulse Synthesis

One method of generating multiple force pulses as shown in Fig.9.31 is to sum several single-pulse filters, as shown in Fig.9.33. Furthermore, the three input impulses can be generated using a single impulse into a tapped delay line (§2.5). The sum of the three filter outputs gives the desired superposition of three hammer-string force pulses. As the collision velocity increases, the output pulses become taller and thinner, showing less overlap. The filters LPF1-LPF3 can be given as side information, or the input impulse amplitudes can be set as functions of the collision velocity, or the like.
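
One way to sketch a velocity-dependent single-pulse filter response is a raised cosine whose width shrinks and whose peak grows with collision velocity. The functional forms and constants below are illustrative assumptions, not measured hammer data:

```python
import math

def force_pulse(v, fs=44100.0, t0=0.002, alpha=0.5, p=1.2):
    """Hedged sketch of a velocity-dependent hammer force pulse: a
    raised cosine whose width shrinks (t0 / v**alpha seconds) and whose
    peak grows (v**p) with collision velocity v. The constants t0,
    alpha, and p are illustrative, not fitted to piano measurements."""
    width = max(2, int(fs * t0 / (v ** alpha)))
    peak = v ** p
    return [peak * 0.5 * (1.0 - math.cos(2.0 * math.pi * n / width))
            for n in range(width)]

soft, hard = force_pulse(0.5), force_pulse(4.0)
```

The harder strike yields a taller, narrower (``brighter'') pulse, reproducing the qualitative behavior described above.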

#### Commuted Piano Synthesis Architecture

Figure 9.34 shows a complete piano synthesis system along
the lines discussed. At a fixed dynamic level, we have the critical
feature that the model is *linear* and *time
invariant*. Therefore we may *commute* the ``Soundboard &
Enclosure'' filter with not only the string, but with the hammer
filter-bank as well. The result is shown in Fig.9.35.

As in the case of commuted guitar synthesis (§8.7),
we replace a high-order digital filter (at least thousands of poles
and zeros [229]) with a simple *excitation
table* containing that filter's impulse response. Thus, the large
digital filter required to implement the soundboard, piano enclosure,
and surrounding acoustic space, has been eliminated. At the same
time, we still have explicit models for the hammer and string, so
physical variations can be implemented, such as harder or softer
hammers, more or less string stiffness, and so on.

Some resonator modifications remain possible, such as changing the virtual mic positions (§2.2.7). However, if we now want to ``open the top'' of our virtual piano, we have to measure the impulse response for that case separately and store it in a separate table. This is not a problem, but it means we're doing ``sampling'' rather than ``modeling'' of the various resonators.

An approximation made in the commuted technique is that detailed reflections generated on the string during hammer-string contact are neglected; that is, the hammer force-pulse depends only on the hammer-string velocity at the time of their initial contact, and the string velocity is treated as remaining constant throughout the contact duration.

#### Progressive Filtering

To save computation in the filtering, we can make use of the
observation that, under the assumption of a string initially at rest,
each interaction pulse is smoother than the one before it. That
suggests applying the force-pulse filtering *progressively*, as
was done with Leslie cabinet reflections in §5.7.6. In other
words, the second force-pulse is generated as a filtering of the first
force-pulse. This arrangement is shown in
Fig.9.36.

With progressive filtering, each filter need only supply the mild smoothing (and perhaps dispersion) associated with traveling from the hammer to the agraffe and back, plus the mild attenuation associated with reflection from the felt-covered hammer (a nonlinear mass-spring system as described in §9.3.2).

Referring to Fig.9.36, the first filter LPF1 can shape a velocity-independent excitation signal to obtain the appropriate ``shock spectrum'' for a given hammer velocity. Alternatively, the Excitation Table itself can be varied with velocity to produce the needed signal, in which case filter LPF1 can be eliminated entirely by applying it in advance to the excitation signal. It is possible to interpolate between tables for two different striking velocities; in this case, the tables should be pre-processed to eliminate phase cancellations during cross-fade.

Assuming the first filter in Fig.9.36 is ``weakest'' at the highest hammer velocity (MIDI velocity 127), that filtering can be applied to the excitation table in advance; the first filter then requires no computation at the highest velocity, and as velocity is lowered, it only needs to make up the difference between what was done in advance to the table and what is desired at that velocity.

Since, for most keys, only a few interaction impulses per hammer strike are observed in real pianos, this computational model of the piano achieves a high degree of realism for a price comparable to the cost of the strings alone. The soundboard and enclosure filtering have been eliminated and replaced by a look-up table using a few read-pointers per note, and the hammer costs only one or a few low-order filters which in principle convert each interaction impulse into an accurate force pulse.

#### Excitation Factoring

As another refinement, it is typically more efficient to implement the highest-Q resonances of the soundboard and piano enclosure using actual digital filters (see §8.8). By factoring these out, the impulse response is shortened, and thus the required excitation-table length is reduced. This provides a classical computation vs. memory trade-off which can be optimized as needed in a given implementation. For lack of a better name, let us refer to such a resonator bank as an ``equalizer,'' since it can be conveniently implemented using parametric equalizer sections, one per high-Q resonance.

A possible placement of the comb filter and equalizer is shown in Fig.9.37. The resonator/eq/reverb/comb-filter block may include filtering for partially implementing the response of the soundboard and enclosure, equalization sections for piano color variations, reverberation, comb-filter(s) for flanging, chorus, and simulated hammer-strike echoes on the string. Multiple outputs having different spectral characteristics can be extracted at various points in the processing.

#### Excitation Synthesis

For a further reduction in cost of implementation, particularly with
respect to memory usage, it is possible to *synthesize* the
excitation using a noise signal through a time-varying lowpass filter
[440]. The synthesis replaces the recorded
soundboard/enclosure sound. It turns out that the sound produced by
tapping on a piano soundboard sounds very similar to noise which is
lowpass-filtered such that the filter bandwidth contracts over time.
This approach is especially effective when applied only to the
high-frequency residual excitation after factoring out all
long-ringing modes as biquads.
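
A minimal sketch of this excitation synthesis (the cutoff trajectory, duration, and all constants are illustrative):

```python
import math, random

def synth_excitation(dur=0.5, fs=44100.0, f_start=8000.0, f_end=200.0):
    """Noise through a time-varying one-pole lowpass whose cutoff decays
    exponentially from f_start to f_end over the duration (all values
    illustrative). Stands in for a stored soundboard/enclosure excitation."""
    rng = random.Random(0)
    n = int(dur * fs)
    out, state = [], 0.0
    for i in range(n):
        fc = f_start * (f_end / f_start) ** (i / n)  # contracting cutoff
        a = math.exp(-2.0 * math.pi * fc / fs)       # one-pole coefficient
        state = (1.0 - a) * rng.uniform(-1.0, 1.0) + a * state
        out.append(state)
    return out

sig = synth_excitation()
```

The result starts bright and noise-like and becomes progressively darker, qualitatively matching a soundboard tap.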

#### Coupled Piano Strings

Perhaps it should be emphasized that for good string-tone synthesis,
*multiple* filtered delay loops should be employed for each note
rather than the single-loop case so often used for simplicity
(§6.12; §C.13). For piano, two or
three delay loops can correspond to the two or three strings hit by
the same hammer in the plane orthogonal to both the string and bridge.
This number of delay loops is *doubled* from there if the
transverse vibration planes *parallel* to the bridge are added;
since these are slightly detuned, they are worth considering.
Finally, another 2-3 (shorter) delay loops are needed to incorporate
longitudinal waves for each string, and all loops should be
appropriately coupled. In fact, for full accuracy, longitudinal
waves and transverse waves should be *nonlinearly* coupled along
the length of the string [391] (see §B.6.2).

#### Commuted Synthesis of String Reverberation

The sound of all strings ringing can be summed with the excitation to simulate the effect of many strings resonating with the played string when the sustain pedal is down [519]. The string loop filters out the unwanted frequencies in this signal and selects only the overtones which would be excited by the played string.

#### High Piano Key Numbers

At very high pitches, the delay-line length used in the string may
become too short for the implementation used, especially when module
signals are vectorized, *i.e.*, processed in ``chunks'' rather than
individual samples [355]. In this case, good results can be obtained by
replacing the filtered-delay-loop string model by a simpler model
consisting only of a few parallel, second-order filter sections or
enveloped sinusoidal oscillators, etc. In other words, *modal
synthesis* (§8.5) is a good choice for very high key
numbers.

#### Force-Pulse Filter Design

In the commuted piano synthesis technique, it is necessary to fit a
filter impulse response to the desired hammer-string force-pulse shape
*for each key* and *for each key velocity*, as
indicated in Figures 9.32 and 9.33. Such a
desired curve can be found in the musical acoustics literature
[488] or it can be easily generated from a good piano-hammer
model (§9.3.2) striking the chosen piano string model.

For psychoacoustic reasons, it is preferable to optimize the
force-pulse filter-coefficients in the *frequency domain*
(§8.6.2). Some additional suggestions are given in
[467,519].

### Literature on Piano Acoustics and Synthesis

There has been much research into the musical acoustics of the piano, including models of piano strings [543,18,75,76,144,42] and the piano hammer [493,63,178,60,486,164,487]. Additionally, there has been considerable work to date on the problem of piano synthesis by digital waveguide methods [467,519,523,56,42].

A careful computational model of the piano, partially based on physical modeling and suitable for high-quality real-time sound synthesis, was developed by Bensa in the context of his thesis work [42]. Related publications include [43,45,46].

An excellent simulation of the *clavichord*, also using the
commuted waveguide synthesis technique, is described in
[501]. A detailed simulation of the
*harpsichord*, along similar lines, but with some interesting new
variations, may be found in [514].

The 2003 paper of Bank et al. [29] includes good summaries of prior research on elements of piano modeling such as efficient computational string models, loss-filter design, dispersion filter design, and soundboard models. A more recent paper in this area is [30].

## Woodwinds

We now consider models of woodwind instruments, focusing on the single-reed case. In §9.5.4, tonehole modeling is described in some detail.

### Single-Reed Instruments

A simplified model for a single-reed woodwind instrument is shown in Fig. 9.38 [431].

If the bore is cylindrical, as in the clarinet, it can be modeled
quite simply using a bidirectional delay line. If the bore is
conical, such as in a saxophone, it can still be modeled as a
bidirectional delay line, but interfacing to it is slightly more
complex, especially at the mouthpiece
[37,7,160,436,506,507,502,526,406,528].
Because the main control variable for the instrument is air pressure
in the mouth at the reed, it is convenient to choose *pressure
wave variables*.

To first order, the bell passes high frequencies and reflects low
frequencies, where ``high'' and ``low'' are divided at the frequency
whose wavelength equals the bell's diameter. Thus, the bell can be regarded
as a simple ``cross-over'' network, as is used to split signal energy
between a woofer and tweeter in a loudspeaker cabinet. For a clarinet
bore, the nominal ``cross-over frequency'' is around Hz
[38]. The flare of the bell lowers the cross-over frequency by
decreasing the bore characteristic impedance toward the end in an
approximately non-reflecting manner [51]. Bell flare can
therefore be
considered analogous to a transmission-line *transformer*.

Since the length of the clarinet bore is only a quarter wavelength at the fundamental frequency (in the lowest, or ``chalumeau,'' register), and since the bell diameter is much smaller than the bore length, most of the sound energy traveling into the bell reflects back into the bore. The low-frequency energy that makes it out of the bore radiates in a fairly omnidirectional pattern. Very high-frequency traveling waves do not ``see'' the enclosing bell and pass right through it, radiating in a more directional beam. The directionality of the beam is proportional to how many wavelengths fit along the bell diameter; in fact, many wavelengths away from the bell, the radiation pattern is proportional to the two-dimensional spatial Fourier transform of the exit aperture (a disk at the end of the bell) [308].

The theory of the single reed is described, *e.g.*, in
[102,249,308].
In the digital waveguide clarinet model described below [431],
the reed is modeled as a signal- and embouchure-dependent
*nonlinear reflection coefficient* terminating the bore. Such a
model is possible because the reed mass is neglected. The player's
embouchure controls damping of the reed, reed aperture width, and
other parameters, and these can be implemented as parameters on the
contents of the lookup table or nonlinear function.

#### Digital Waveguide Single-Reed Implementation

A diagram of the basic clarinet model is shown in
Fig.9.39. The delay lines carry the left-going and
right-going *pressure* samples which together represent the traveling
pressure-wave components within the bore.

The reflection filter at the right implements the bell or tone-hole
losses as well as the round-trip attenuation losses from traveling
back and forth in the bore. The bell output filter is highpass, and
*power complementary* with respect to the bell reflection filter
[500]. Power complementarity follows from the
assumption that the bell itself does not vibrate or otherwise absorb
sound. The bell is also *amplitude complementary*. As a result,
given a reflection filter designed to match measured mode
decay-rates in the bore, the transmission filter can be written down
simply as $T(z) = 1 - R(z)$ for velocity waves, or $T(z) = 1 + R(z)$
for pressure waves. It is easy to show that such
amplitude-complementary filters are also power complementary by
summing the transmitted and reflected power waves:

$$ [1 + R(z)]P^{+}\cdot[1 - R(z)]U^{+} \;+\; R(z)P^{+}\cdot R(z)U^{+} \;=\; P^{+}U^{+}, $$

where $P^{+}$ denotes the *z* transform of the incident pressure wave,
and $U^{+}$ denotes the *z* transform of the incident volume-velocity. (All
*z* transforms have omitted arguments $z = e^{j\omega T}$, where $T$
denotes the sampling interval in seconds.)

At the far left is the reed mouthpiece controlled by *mouth
pressure* . Another control is *embouchure,* changed in
general by modifying the *reflection-coefficient* function of the
pressure drop across the reed. A simple choice of embouchure control is an
offset in the reed-table address. Since the main feature of the reed table
is the pressure-drop where the reed begins to open, a simple embouchure
offset can implement the effect of biting harder or softer on the reed, or
changing the reed stiffness.
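
Pulling the pieces of Fig.9.39 together, here is a minimal waveguide clarinet sketch. The reed table, bell filter, bore length, and all numeric constants are illustrative assumptions rather than calibrated values:

```python
# Minimal digital waveguide clarinet sketch: two pressure-wave delay lines,
# an inverting lossy lowpass reflection at the bell, and a memoryless
# pressure-dependent reed reflection coefficient at the mouthpiece.

fs = 44100.0
f0 = 147.0                         # target pitch near D3
N = int(round(fs / f0 / 2.0))      # one-way bore delay in samples

def reed_rho(p_delta):
    """Reed reflection coefficient vs. pressure drop: approaches 1
    (closed reed) as the drop grows (linear table, clamped; illustrative)."""
    return min(1.0, max(-1.0, 0.7 + 0.4 * p_delta))

upper = [0.0] * N                  # right-going pressure waves (toward bell)
lower = [0.0] * N                  # left-going pressure waves (toward reed)
bell_lp = 0.0                      # one-pole lowpass state at the bell
pm = 1.0                           # steady mouth pressure
out = []
for _ in range(4000):
    p_to_bell = upper[-1]
    bell_lp = 0.7 * bell_lp + 0.3 * p_to_bell
    p_from_bell = -0.95 * bell_lp          # inverting, lossy reflection
    p_to_reed = lower[-1]
    p_delta = pm / 2.0 - p_to_reed         # pressure drop across the reed
    p_from_reed = pm / 2.0 + reed_rho(p_delta) * (p_to_reed - pm / 2.0)
    upper = [p_from_reed] + upper[:-1]
    lower = [p_from_bell] + lower[:-1]
    out.append(p_to_bell + p_from_bell)    # bore pressure near the bell
```

With these placeholder values, the loop self-oscillates at roughly the quarter-wave resonance of the bore; a real model would radiate through the power-complementary transmission filter rather than tapping the bore pressure directly.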

### A View of Single-Reed Oscillation

To start the oscillation, the player applies a pressure at the mouthpiece which ``biases'' the reed in a ``negative-resistance'' region. (The pressure drop across the reed tends to close the air gap at the tip of the reed so that an increase in pressure will result in a net decrease in volume velocity--this is negative resistance.) The high-pressure front travels down the bore at the speed of sound until it encounters an open air hole or the bell. To a first approximation, the high-pressure wave reflects with a sign inversion and travels back up the bore. (In reality a lowpass filtering accompanies the reflection, and the complementary highpass filter shapes the spectrum that emanates away from the bore.)

As the negated pressure wave travels back up the bore, it cancels the
elevated pressure that was established by the passage of the first wave.
When the negated pressure front gets back to the mouthpiece, it is
reflected again, this time with no sign inversion (because the mouthpiece
looks like a closed end to a first approximation). Therefore, as the wave
travels back down the bore, a *negative* pressure zone is left
behind. Reflecting from the open end again with a sign inversion brings a
return-to-zero wave traveling back to the mouthpiece. Finally the positive
traveling wave reaches the mouthpiece and starts the second ``period'' of
oscillation, after four trips along the bore.

So far, we have produced oscillation without making any use of the
negative-resistance of the reed aperture. This is merely the start-up
transient. Since in reality there are places of pressure loss in the bore,
some mechanism is needed to feed energy back into the bore and prevent the
oscillation just described from decaying exponentially to zero. This is
the function of the reed: When a traveling pressure-drop reflects from
the mouthpiece, making pressure at the mouthpiece switch from high to low,
the reed changes from open to closed (to first order). The closing of
the reed increases the reflection coefficient ``seen'' by the impinging
traveling wave, and so as the pressure falls, it is amplified by an
increasing gain (whose maximum is unity when the reed shuts completely).
This process sharpens the falling edge of the pressure drop. But this is
not all. The closing of the reed also cuts back on the steady incoming
flow from the mouth. This causes the pressure to drop even more,
potentially providing effective *amplification* by more than unity.

An analogous story can be followed through for a rising pressure appearing
at the mouthpiece. However, in the rising pressure case, the reflection
coefficient falls as the pressure rises, resulting in a progressive
attenuation of the reflected wave; however, the increased pressure let in
from the mouth amplifies the reflecting wave. It turns out that the
reflection of a positive wave is boosted when the incoming wave is below a
certain level and it is attenuated above that level. When the oscillation
reaches a very high amplitude, it is limited on the negative side by the
shutting of the reed, which sets a maximum reflective amplification for
the negative excursions, and it is limited on the positive side by the
attenuation described above. *Unlike* classical negative-resistance
oscillators, in which the negative-resistance device is
terminated by a simple resistance instead of a lossy transmission line, a
*dynamic equilibrium* is established between the amplification of
the negative excursion and the dissipation of the positive excursion.

In the first-order case, where the reflection-coefficient varies linearly with pressure drop, it is easy to obtain an exact quantitative description of the entire process. In this case it can be shown, for example, that amplification occurs only on the positive half of the cycle, and the amplitude of oscillation is typically close to half the incoming mouth pressure (when losses in the bore are small). The threshold blowing pressure (which is relatively high in this simplified case) can also be computed in closed form.

### Single-Reed Theory

A simplified diagram of the clarinet mouthpiece is shown in
Fig. 9.40. The pressure in the mouth is assumed
to be a constant value $p_m$, and the bore pressure $p_b$ is defined
at the mouthpiece. Any pressure drop $p_\Delta = p_m - p_b$ across
the mouthpiece causes a flow into the mouthpiece through the
reed-aperture impedance $R_m(p_\Delta)$, which changes as a function of
$p_\Delta$ since the reed position is affected by the pressure drop. To a first
approximation, the clarinet reed can be regarded as a spring flap
regulating a Bernoulli flow (§B.7.5) [249].
This model agrees well with experiment until the reed is
about to close, at which point viscosity effects begin to appear
[102]. It has also been verified that the mass
of the reed can be neglected to first order,^{10.18} so that
$R_m(p_\Delta)$ is a positive real number for all values of
$p_\Delta$. Possibly the most
important neglected phenomenon in this model is sound generation due
to turbulence of the flow, especially near reed closure. Practical
synthesis models have always included a noise component of some sort
which is modulated by the reed [431], despite a lack of firm
basis in acoustic measurements to date.

The fundamental equation governing the action of the reed is
*continuity of volume velocity,* *i.e.*,

$$ u_m = u_b, $$

where

$$ u_m = \frac{p_\Delta}{R_m(p_\Delta)}, \qquad p_\Delta \triangleq p_m - p_b, $$

and

$$ u_b = \frac{p_b^- - p_b^+}{R_b} $$

is the volume velocity corresponding to the incoming pressure wave $p_b^+$ and outgoing pressure wave $p_b^-$. (The physical pressure in the bore at the mouthpiece is of course $p_b = p_b^+ + p_b^-$.) The wave impedance of the bore air-column is denoted $R_b$ (computable as the air density times sound speed divided by cross-sectional area).

In operation, the mouth pressure and incoming traveling bore pressure
are given, and the reed computation must produce an outgoing bore
pressure which satisfies (9.35), *i.e.*, such that

$$ \frac{p_m - p_b^+ - p_b^-}{R_m(p_\Delta)} = \frac{p_b^- - p_b^+}{R_b}. $$

Solving for $p_b^-$ is not immediate because of the dependence of $R_m$ on $p_\Delta$ which, in turn, depends on $p_b^-$. A graphical solution technique was proposed [151,244,308] which, in effect, consists of finding the intersection of the two terms of the equation as they are plotted individually on the same graph, varying $p_\Delta$. This is analogous to finding the operating point of a transistor by intersecting its operating curve with the ``load line'' determined by the load resistance.
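As a numeric illustration of this load-line idea, one can tabulate the reed flow and the bore flow over a range of candidate outgoing pressures and keep the point where they intersect. The reed admittance curve below is an entirely hypothetical placeholder, and all impedances are normalized:

```python
def reed_flow(p_delta, p_close=1.0):
    # Hypothetical reed aperture: conductance falls linearly to zero
    # as the pressure drop approaches the closure pressure p_close.
    g = max(0.0, 1.0 - p_delta / p_close)
    return g * p_delta

def bore_flow(pb_in, pb_out, Rb=1.0):
    # Volume velocity entering the bore for a given incoming/outgoing
    # traveling-wave pair (Rb = bore wave impedance, normalized here).
    return (pb_out - pb_in) / Rb

def solve_outgoing(pm, pb_in, lo=-2.0, hi=2.0, n=40001):
    # "Graphical" solution: scan candidate outgoing waves and keep the
    # one where reed flow and bore flow intersect.
    best, best_err = lo, float("inf")
    for k in range(n):
        pb_out = lo + (hi - lo) * k / (n - 1)
        err = abs(reed_flow(pm - pb_in - pb_out) - bore_flow(pb_in, pb_out))
        if err < best_err:
            best, best_err = pb_out, err
    return best
```

In a real-time synthesis loop this search would of course be precomputed into a lookup table, as discussed later in this section.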

It is helpful to normalize (9.38) as follows: Define $h_\Delta^+ \triangleq p_m/2 - p_b^+$, and note that $p_b^- - p_b^+ = 2h_\Delta^+ - p_\Delta$, where $p_\Delta = p_m - p_b^+ - p_b^-$. Then (9.38) can be multiplied through by $R_b$ and written as $p_\Delta R_b / R_m(p_\Delta) = 2h_\Delta^+ - p_\Delta$, or

$$ p_\Delta\left[\frac{R_b}{R_m(p_\Delta)} + 1\right] = 2h_\Delta^+. $$

The solution is obtained by plotting the left- and right-hand sides on the same graph, finding the point of intersection at coordinates $(p_\Delta,\, R_b u)$, and computing finally the outgoing pressure wave sample as

$$ p_b^- = p_b^+ + R_b\,u. \tag{10.40} $$

An example of the qualitative appearance of the two overlaid terms is shown in Fig. 9.41.

#### Scattering-Theoretic Formulation

Equation (9.38) can be solved for $p_b^-$ to obtain

$$ p_b^- = \rho(p_\Delta)\,p_b^+ + \left[1 - \rho(p_\Delta)\right]\frac{p_m}{2}, $$

where

$$ \rho(p_\Delta) = \frac{R_m(p_\Delta) - R_b}{R_m(p_\Delta) + R_b}. \tag{10.44} $$

We interpret $\rho(p_\Delta)$ as a

*signal-dependent reflection coefficient.*

Since the mouthpiece of a clarinet is nearly closed, $R_m \gg R_b$, which
implies $\rho \approx 1$ and $p_b^- \approx p_b^+$. In the limit as $R_m$ goes to
infinity relative to $R_b$, (9.42) reduces to the simple form of a
rigidly capped acoustic tube, *i.e.*, $p_b^- = p_b^+$.
If it were possible to open the reed wide enough to achieve
matched impedance, $R_m = R_b$, then we would have $\rho = 0$ and $1 - \rho = 1$, in
which case $p_b^- = p_m/2$, with no reflection of $p_b^+$, as expected. If
the mouthpiece is removed altogether to give $R_m = 0$ (regarding it now as a
tube section of infinite radius), then $\rho = -1$, $1 - \rho = 2$, and
$p_b^- = p_m - p_b^+$.

#### Computational Methods

Since finding the intersection of and requires an expensive iterative algorithm with variable convergence times, it is not well suited for real-time operation. In this section, fast algorithms based on precomputed nonlinearities are described.

Let $h$ denote *half-pressure*, *i.e.*,
$h_m \triangleq p_m/2$ and
$h_\Delta^+ \triangleq h_m - p_b^+$. Then (9.43) becomes

$$ p_b^- = h_m - \rho(p_\Delta)\,h_\Delta^+. $$

Subtracting this equation from $h_m$ gives

$$ h_\Delta^- \triangleq h_m - p_b^- = \rho(p_\Delta)\,h_\Delta^+, \qquad \hbox{so that} \qquad p_\Delta = h_\Delta^+ + h_\Delta^- = \left[1 + \rho(p_\Delta)\right]h_\Delta^+. $$

The last expression above can be used to precompute $\rho$ as a function of $h_\Delta^+$. Denoting this newly defined function as

$$ \hat{\rho}(h_\Delta^+) \triangleq \rho\!\left(p_\Delta(h_\Delta^+)\right), \tag{10.47} $$

(9.45) becomes

$$ p_b^- = h_m - \hat{\rho}(h_\Delta^+)\,h_\Delta^+. $$

This is the form chosen for implementation in Fig. 9.39 [431]. The control variable is mouth half-pressure $h_m$, and $h_\Delta^+ = h_m - p_b^+$ is computed from the incoming bore pressure using only a single subtraction. The table is indexed by $h_\Delta^+$, and the result of the lookup is then multiplied by $h_\Delta^+$. Finally, the result of the multiplication is subtracted from $h_m$. The cost of the reed simulation is only two subtractions, one multiplication, and one table lookup per sample.
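The per-sample computation just described can be sketched as follows; the table contents below are a hypothetical smooth reflection curve for illustration, not measured reed data:

```python
N = 512
HD_MIN, HD_MAX = -1.0, 1.0      # assumed range of h_delta covered by table

def rho_hat(h_delta):
    # Toy reflection coefficient: 0 at the open extreme, 1 at closure.
    return max(0.0, min(1.0, 0.5 + 0.5 * h_delta))

TABLE = [rho_hat(HD_MIN + (HD_MAX - HD_MIN) * i / (N - 1)) for i in range(N)]

def reed_sample(h_m, pb_in):
    # One subtraction, one table lookup, one multiply, one subtraction.
    h_delta = h_m - pb_in
    i = int((h_delta - HD_MIN) / (HD_MAX - HD_MIN) * (N - 1))
    i = max(0, min(N - 1, i))   # clamp out-of-range inputs
    return h_m - TABLE[i] * h_delta
```

At the two extremes of this toy table, a lookup value of 1 reproduces the rigidly capped tube (pure reflection), and a value of 0 sends the mouth half-pressure into the bore with no reflection.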

Because the table contains a coefficient rather than a signal value, it can be more heavily quantized both in address space and word length than a direct lookup of a signal value such as or the like. A direct signal lookup, though requiring much higher resolution, would eliminate the multiplication associated with the scattering coefficient. For example, if and are 16-bit signal samples, the table would contain on the order of 64K 16-bit samples. Clearly, some compression of this table would be desirable. Since is smoothly varying, significant compression is in fact possible. However, because the table is directly in the signal path, comparatively little compression can be done while maintaining full audio quality (such as 16-bit accuracy).

In the field of computer music, it is customary to use simple piecewise
linear functions for functions other than signals at the audio sampling
rate, *e.g.*, for amplitude envelopes, FM-index functions, and so on
[382,380]. Along these lines, good initial results were
obtained [431] using the simplified *qualitatively* chosen
reed table

depicted in Fig. 9.42. The corner point is the smallest pressure difference giving reed closure.

^{10.19}Embouchure and reed stiffness correspond to the choice of offset and slope. Brighter tones are obtained by increasing the curvature of the function as the reed begins to open; for example, the sloped segment can be raised to successively higher powers.

Another variation is to replace the table-lookup contents by a piecewise polynomial approximation. While less general, good results have been obtained in practice [89,91,92]. For example, one of the SynthBuilder [353] clarinet patches employs this technique using a cubic polynomial.

An intermediate approach between table lookups and polynomial approximations is to use interpolated table lookups. Typically, linear interpolation is used, but higher order polynomial interpolation can also be considered (see Chapter 4).
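A linearly interpolated lookup along these lines might read as follows (a generic sketch, not tied to any particular table contents):

```python
def lerp_lookup(table, x):
    # Read table at fractional index x with linear interpolation;
    # the index is clamped so x may extrapolate slightly at the edges.
    i = max(0, min(len(table) - 2, int(x)))
    frac = x - i
    return table[i] + frac * (table[i + 1] - table[i])
```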

#### Clarinet Synthesis Implementation Details

To finish off the clarinet example, the remaining details of the
SynthBuilder clarinet patch `Clarinet2.sb` are described.

The input mouth pressure is summed with a small amount of white noise, corresponding to turbulence. For example, 0.1% is generally used as a minimum, and larger amounts are appropriate during the attack of a note. Ideally, the turbulence level should be computed automatically as a function of pressure drop and reed opening geometry [141,530]. It should also be lowpass filtered as predicted by theory.
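A minimal sketch of this noise injection follows; the 0.1% floor is from the text, while the attack boost value is an arbitrary placeholder:

```python
import random

def mouth_pressure_with_turbulence(pm, attack=False, rng=random.Random(0)):
    # Add white noise proportional to the mouth pressure: 0.1% as a
    # minimum, more during the attack (the boost level is illustrative).
    level = 0.02 if attack else 0.001
    return pm * (1.0 + level * (2.0 * rng.random() - 1.0))
```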

Referring to Fig. 9.39, the reflection filter is a simple one-pole with transfer function

(10.50)

whose coefficients are modulated according to the desired vibrato amplitude and vibrato frequency. Further loop filtering occurs as a result of using simple linear interpolation of the delay line. (There is only one delay line in the actual implementation, since the lower delay line of Fig. 9.39 can be commuted with the reflection filter and combined with the upper delay line, ignoring the path to the output filter, since a pure delay of less than a period in the final output sound is inconsequential.) There is no transmission filter or tone-hole modeling.
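For illustration, a one-pole reflection filter of this general kind can be sketched as follows; the coefficient values are hypothetical, chosen only to give an inverting dc gain of about -0.95 with a gentle high-frequency rolloff:

```python
class OnePoleReflection:
    # y[n] = b0*x[n] + a1*y[n-1]; dc gain = b0 / (1 - a1).
    def __init__(self, gain=-0.95, pole=0.6):
        self.b0 = gain * (1.0 - pole)
        self.a1 = pole
        self.y1 = 0.0

    def tick(self, x):
        self.y1 = self.b0 * x + self.a1 * self.y1
        return self.y1
```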

Legato note transitions are managed using two delay-line taps and cross-fading from one to the other during a transition [208,441,405]. In general, legato problems arise when the bore length is changed suddenly while sounding, corresponding to a new fingering. The reason is that the model itself should really be changed during a fingering change, from a statically terminated bore to a bore with a new scattering junction appearing where each ``finger'' is lifting, and with disappearing scattering junctions where tone holes are being covered. In addition, if a hole is covered abruptly (especially when there are large mechanical caps, as in the saxophone), new signal energy is injected in both directions on the bore in superposition with the signal scattering. As a result of this ideal picture, it is difficult to get high-quality legato performance using only a single delay line.

A reduced-cost, approximate solution for obtaining good sounding note
transitions in a single delay-line model was proposed in
[208]. In this technique, the bore delay line is
``branched'' during the transition, *i.e.*, a second feedback loop is formed
at the new loop delay, thus forming two delay lines sharing the same
memory, one corresponding to the old pitch and the other corresponding to
the new pitch. A cross-fade from the old-pitch delay to the new-pitch
delay sounds good if the cross-fade time and duration are carefully chosen.
Another way to look at this algorithm is in terms of ``read pointers'' and
``write pointers.'' A normal delay line consists of a single write pointer
followed by a single read pointer, delayed by one period. During a legato
transition, we simply cross-fade from a read-pointer at the old-pitch delay
to a read-pointer at the new-pitch delay. In this type of implementation,
the write-pointer always traverses the full delay memory corresponding to
the minimum supported pitch in order that read-pointers may be instantiated
at any pitch-period delay at any time. Conceptually, this simplified model
of note transitions can be derived from the more rigorous model by
replacing the tone-hole scattering junction by a single reflection
coefficient.
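The read-pointer cross-fade can be sketched as follows (buffer sizing and fade scheduling are simplified):

```python
class CrossfadeDelay:
    # One write pointer traverses the full buffer; the output fades
    # linearly from the old-period read tap to the new-period read tap.
    def __init__(self, max_delay):
        self.buf = [0.0] * max_delay
        self.w = 0

    def tick(self, x, old_delay, new_delay, fade):
        # fade = 0.0 -> old tap only; fade = 1.0 -> new tap only
        self.buf[self.w] = x
        n = len(self.buf)
        r_old = self.buf[(self.w - old_delay) % n]
        r_new = self.buf[(self.w - new_delay) % n]
        self.w = (self.w + 1) % n
        return (1.0 - fade) * r_old + fade * r_new
```

During a legato transition, `fade` would be ramped smoothly from 0 to 1 over the chosen cross-fade duration.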

STK software implementing a model as in Fig.9.39 can be
found in the file `Clarinet.cpp`.

### Tonehole Modeling

Toneholes in woodwind instruments are essentially cylindrical holes in
the bore. One modeling approach would be to treat the tonehole as a
small waveguide which connects to the main bore via one port on a
three-port junction. However, since the tonehole length is small
compared with the distance sound travels in one sampling instant, it
is more straightforward to treat the tonehole as a lumped load along
the bore, and most modeling efforts have taken this approach.

The musical acoustics literature contains experimentally verified models of tone-hole acoustics, such as by Keefe [238]. Keefe's tonehole model is formulated as a ``transmission matrix'' description, which we may convert to a traveling-wave formulation by a simple linear transformation (described in §9.5.4 below) [465]. For typical fingerings, the first few open tone holes jointly provide a bore termination [38]. Either the individual tone holes can be modeled as (interpolated) scattering junctions, or the whole ensemble of terminating tone holes can be modeled in aggregate using a single reflection and transmission filter, like the bell model. Since the tone hole diameters are small compared with most audio frequency wavelengths, the reflection and transmission coefficients can be implemented to a reasonable approximation as constants, as opposed to cross-over filters as in the bell. Taking into account the inertance of the air mass in the tone hole, the tone hole can be modeled as a two-port loaded junction having load impedance equal to the air-mass inertance [143,509]. At a higher level of accuracy, adapting transmission-matrix parameters from the existing musical acoustics literature leads to first-order reflection and transmission filters [238,406,403,404,465]. The individual tone-hole models can be simple lossy two-port junctions, modeling only the internal bore loss characteristics, or three-port junctions, modeling also the transmission characteristics to the outside air. Another approach to modeling toneholes is the ``wave digital'' model [527] (see §F.1 for a tutorial introduction to this approach). The subject of tone-hole modeling is elaborated further in [406,502]. For simplest practical implementation, the bell model can be used unchanged for all tunings, as if the bore were being cut to a new length for each note and the same bell were attached. 
However, for best results in dynamic performance, the tonehole model should additionally include an explicit valve model for physically accurate behavior when slowly opening or closing the tonehole [405].

#### The Clarinet Tonehole as a Two-Port Junction

The clarinet tonehole model developed by Keefe [240] is
parametrized in terms of series and shunt resistance and reactance, as
shown in Fig. 9.43. The *transmission
matrix* description of this two-port is given by the product of the
transmission matrices for the first series impedance, the shunt
impedance, and the second series impedance, respectively:

where all quantities are written in the frequency domain, and the impedance parameters (10.51) comprise the open-hole shunt impedance, the closed-hole shunt impedance, the open-hole series impedance, and the closed-hole series impedance,

where the wave impedance of the tonehole entrance is that of an acoustic tube matching the tonehole's cross-sectional area ($\rho$ is air density and $c$ is sound speed as usual). The remaining parameters are the tonehole radius; the wavenumber (radian spatial frequency); the open-tonehole effective length (slightly greater than its physical length due to the formation of a small air-piston inside the open tonehole); the ``specific resistance'' of the open tonehole, due to air viscosity in, and radiation from, the hole; and the closed-tonehole height, defined such that its product with the cross-sectional area of the tonehole exactly equals the geometric volume of the closed tonehole. Finally, the equivalent *series* lengths of the open and closed tonehole, respectively, are given by

where $b$ is the radius of the main bore. The closed-tonehole height can be estimated as [240]

Note that the specific resistance of the open tonehole is the only real impedance, and therefore the only source of wave energy loss at the tonehole. In the limit in which the tube radius is large compared with the viscous boundary layer, it is given by [240]

where

The open-hole effective length , assuming no pad above the hole, is given in [240] as

For implementation in a digital waveguide model, the lumped parameters above must be converted to scattering parameters. Such formulations of toneholes have appeared in the literature: Vesa Välimäki [509,502] developed tonehole models based on a ``three-port'' digital waveguide junction loaded by an inertance, as described in Fletcher and Rossing [143], and also extended his results to the case of interpolated digital waveguides. It should be noted in this context, however, that in the terminology of Appendix C, Välimäki's tonehole representation is a loaded 2-port junction rather than a three-port junction. (A load can be considered formally equivalent to a ``waveguide'' having wave impedance given by the load impedance.) Scavone and Smith [402] developed digital waveguide tonehole models based on the more rigorous ``symmetric T'' acoustic model of Keefe [240], using general purpose digital filter design techniques to obtain rational approximations to the ideal tonehole frequency response. A detailed treatment appears in Scavone's CCRMA Ph.D. thesis [406]. This section, adapted from [465], considers an exact translation of the Keefe tonehole model, obtaining two one-filter implementations: the ``shared reflectance'' and ``shared transmittance'' forms. These forms are shown to be stable without introducing an approximation which neglects the series inertance terms in the tonehole model.

By substituting $k = \omega/c$ in (9.53) to convert spatial
frequency to temporal frequency, and by substituting

$$ P_i = P_i^+ + P_i^- \tag{10.52} $$

$$ U_i = \frac{P_i^+ - P_i^-}{R_0} \tag{10.53} $$

for $i = 1, 2$, into (9.51) to convert physical variables to wave variables ($R_0$ is the bore wave impedance), we may solve for the outgoing waves $P_1^-$, $P_2^+$ in terms of the incoming waves $P_1^+$, $P_2^-$. Mathematica code for obtaining the general conversion formula from lumped parameters to scattering parameters is as follows:

```mathematica
Clear["t*", "p*", "u*", "r*"]
transmissionMatrix = {{t11, t12}, {t21, t22}};
leftPort  = {{p2p + p2m}, {(p2p - p2m)/r2}};
rightPort = {{p1p + p1m}, {(p1p - p1m)/r1}};
Format[t11, TeXForm] := "{T_{11}}"
Format[p1p, TeXForm] := "{P_1^+}"
... (etc. for all variables) ...
TeXForm[Simplify[Solve[leftPort == transmissionMatrix . rightPort,
                       {p1m, p2p}]]]
```

The above code produces the following formulas:

Substituting relevant values for Keefe's tonehole model, we obtain, in matrix notation,

We thus obtain the scattering formulation depicted in Fig. 9.44, in which one filter is the *reflectance* of the tonehole (the same from either direction), and the other is the *transmittance* of the tonehole (also the same from either direction). The notation chosen for the reflectance reflects the fact that every reflectance is a *Schur function* (stable and not exceeding unit magnitude on the unit circle in the $z$ plane) [428, p. 221].

The approximate forms in (9.57) and (9.58) are obtained by neglecting the negative series inertance which serves to adjust the effective length of the bore, and which therefore can be implemented elsewhere in the interpolated delay-line calculation as discussed further below. The open and closed tonehole cases are obtained by substituting and , respectively, from (9.53).

In a manner analogous to converting the four-multiply Kelly-Lochbaum (KL) scattering junction [245] into a one-multiply form (cf. (C.60) and (C.62)), we may pursue a ``one-filter'' form of the waveguide tonehole model. However, the series inertance gives some initial trouble, since the transmittance and reflectance no longer differ by a simple constant as they do at an unloaded junction; *i.e.*, if $T$ denotes the transmittance from one branch to all other branches meeting at the junction, then $T - 1$ is the reflectance seen on that branch. Substituting

(10.58)

and, similarly,

(10.59)

The resulting tonehole implementation is shown in Fig. 9.45. We call this the ``shared reflectance'' form of the tonehole junction.

In the same way, an alternate form is obtained from the substitution

(10.60)

(10.61)

shown in Fig. 9.46.

Since the series inertance is small, it can be neglected to first order, reducing both of the above forms to an approximate ``one-filter'' tonehole implementation.

Since the series impedance is a pure negative reactance, the associated factor is a first-order *allpass* filter with a single pole-zero pair near infinity. Unfortunately, the pole is in the right-half-plane and hence *unstable*. We cannot therefore implement it as shown in Fig. 9.45 or Fig. 9.46. Using elementary manipulations, the unstable allpasses in Figs. 9.45 and 9.46 can be moved to the configuration shown in Figs. 9.47 and 9.48, respectively. Note that the combined filter remains stable whenever the tonehole filter is stable. The unstable allpasses now operate only on the two incoming wave variables, and they can be implemented implicitly by slightly reducing the (interpolated) delay lines leading to the junction from either side. The tonehole then requires only one filter, the reflectance or the transmittance.

We now see precisely how the negative series inertance provides a
*negative, frequency-dependent, length correction* for the bore. From
(9.63), the phase delay of the allpass can be computed as
In practice, it is common to combine all delay corrections into a single ``tuning allpass filter'' for the whole bore [428,207]. Whenever the desired allpass delay goes negative, we simply add a sample of delay to the desired allpass phase-delay and subtract it from the nearest delay. In other words, negative delays have to be ``pulled out'' of the allpass and used to shorten an adjacent interpolated delay line. Such delay lines are normally available in practical modeling situations.
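The bookkeeping just described can be sketched as follows, using the standard first-order allpass whose low-frequency phase delay approximates its design fraction (the 0.1 guard threshold below is a common but arbitrary choice):

```python
def split_loop_delay(D):
    # Split a desired loop delay D (in samples) into an integer delay-line
    # length and a first-order allpass coefficient.  If the fractional
    # part is too small for a well-conditioned allpass, borrow one sample
    # from the delay line ("pulling" delay out of the allpass).
    n = int(D)
    eta = D - n
    if eta < 0.1:
        n -= 1
        eta += 1.0
    a = (1.0 - eta) / (1.0 + eta)   # low-frequency allpass delay ~= eta
    return n, a
```

The same borrowing applies when a junction's allpass correction is negative: a full sample is subtracted from the adjacent interpolated delay line and added to the allpass phase delay.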

#### Tonehole Filter Design

The tone-hole reflectance and transmittance must be converted to discrete-time form for implementation in a digital waveguide model. Figure 9.49 plots the responses of second-order discrete-time filters designed to approximate the continuous-time magnitude and phase characteristics of the reflectances for closed and open toneholes, as carried out in [403,406]. These filter designs assumed a tonehole of radius mm, minimum tonehole height mm, tonehole radius of curvature mm, and air column radius mm. Since the measurements of Keefe do not extend to 5 kHz, the continuous-time responses in the figures are extrapolated above this limit. Correspondingly, the filter designs were weighted to produce best results below 5 kHz.

The closed-hole filter design was carried out using weighted
equation-error minimization [428, p. 47], *i.e.*, by minimizing
the weighted norm of $A(e^{j\omega})D(e^{j\omega}) - B(e^{j\omega})$,
where $D(e^{j\omega})$ is the desired frequency response, $\omega$ denotes
discrete-time radian frequency, and the designed filter response is
$B(e^{j\omega})/A(e^{j\omega})$. Note that both phase and magnitude are
matched by equation-error minimization, and this error criterion is used
extensively in the field of system identification [288]
due to its ability to design optimal IIR filters via quadratic
minimization. In the spirit of the well-known Steiglitz-McBride algorithm
[287], equation-error minimization can be iterated,
setting the weighting function at each iteration to the inverse of the
inherent weighting $|A(e^{j\omega})|^2$
of the previous iteration. However, for this study, the weighting was used only to
increase accuracy at low frequencies relative to high frequencies.
Weighted equation-error minimization is implemented in the matlab function
`invfreqz()` (§8.6.4).
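A NumPy sketch of (non-iterated) weighted equation-error design over a discrete frequency grid, in the spirit of `invfreqz()`; the problem setup and orders are illustrative:

```python
import numpy as np

def eqnerror_design(w, d, nb=2, na=2, weight=None):
    # Weighted equation-error fit: minimize sum_k W_k |B(e^{jw_k}) -
    # D_k A(e^{jw_k})|^2 with A monic, which is linear in the unknown
    # coefficients and hence solvable by ordinary least squares.
    if weight is None:
        weight = np.ones_like(w)
    E = np.exp(-1j * np.outer(w, np.arange(max(nb, na) + 1)))
    # Unknowns: b_0..b_nb followed by a_1..a_na.
    M = np.hstack([E[:, : nb + 1], -d[:, None] * E[:, 1 : na + 1]])
    sw = np.sqrt(weight)
    Mri = np.vstack([(M * sw[:, None]).real, (M * sw[:, None]).imag])
    rri = np.concatenate([(d * sw).real, (d * sw).imag])
    x, *_ = np.linalg.lstsq(Mri, rri, rcond=None)
    return x[: nb + 1], np.concatenate([[1.0], x[nb + 1 :]])
```

When the desired response is exactly realizable at the given orders, the fit recovers it; otherwise the weighting vector shapes where the (biased) equation-error residual is spent.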

The open-hole discrete-time filter was designed using Kopec's method [297], [428, p. 46] in conjunction with weighted equation-error minimization. Kopec's method is based on linear prediction:

- Given a desired complex frequency response, compute an allpole model for it using linear prediction.
- Compute the error spectrum, *i.e.*, the desired response times the allpole model's denominator.
- Compute an allpole model for the *inverse* of the error spectrum by again minimizing

*ratio error*. Linear prediction minimizes ratio error, so the allpole model tends to follow the *upper spectral envelope* of the desired frequency response. Since the first step of Kopec's method captures the upper spectral envelope, the ``nulls'' and ``valleys'' are largely ``saved'' for the next step, which computes zeros. When computing the zeros, the spectral ``dips'' become ``peaks,'' thereby receiving more weight under the ratio-error norm. Thus, in Kopec's method, the poles model the upper spectral envelope, while the zeros model the lower spectral envelope. To apply Kopec's method to the design of an open-tonehole filter, a one-pole model was first fit to the continuous-time response. Subsequently, the inverse error spectrum was modeled with a two-pole digital filter. The discrete-time approximation was then given by the resulting pole-zero filter.

The reasonably close match in both phase and magnitude by second-order filters indicates that there is in fact only one important tonehole resonance and/or anti-resonance within the audio band, and that the measured frequency responses can be modeled with very high audio accuracy using only second-order filters.

Figure 9.50 plots the reflection function calculated for a six-hole flute bore, as described in [240].

#### The Tonehole as a Two-Port Loaded Junction

It seems reasonable to expect that the tonehole should be
representable as a *load* along a waveguide bore model, thus
creating a loaded two-port junction with two identical bore ports on
either side of the tonehole. From the relations for the loaded
parallel junction (C.101), specialized to the two-port case with equal
bore wave impedances on both sides, and considering pressure waves
rather than force waves, we have

(10.63)

(10.64)

(10.65)

Thus, the loaded two-port junction can be implemented in ``one-filter form'' as shown in Fig. 9.48; *i.e.*, the parallel load impedance is simply the shunt impedance in the tonehole model.

Each series impedance in the split-T model of
Fig. 9.43 can be modeled as a *series* waveguide
junction with a load equal to that series impedance. To see this, set
the transmission-matrix parameters in (9.55) to the values for a
single series impedance from (9.51) to get

(10.66)

where $\alpha$ is the alpha parameter for a series-loaded waveguide junction involving two waveguides of equal impedance joined in series with each other and with the load impedance, as can be seen from (C.99). To obtain exactly the loaded series scattering relations (C.100), we first switch to the more general convention in which the ``+'' superscript denotes waves traveling *into* a junction of any number of waveguides. This exchanges ``+'' with ``$-$'' at port 2 to yield

(10.67)

Next we convert pressure waves to velocity waves using the wave-impedance relations to obtain

(10.68)

Finally, we toggle the reference direction of port 2 (the ``current'' arrow on port 2 in Fig. 9.43) so that velocity is positive flowing *into* the junction on both ports (which is the convention used to derive (C.100), and which is typically followed in circuit theory). This amounts to negating the port-2 velocity waves, giving

(10.69)

This is then the canonical form (C.100).

## Bowed Strings

A schematic block diagram for bowed strings is shown in
Fig.9.51 [431]. The bow divides the string
into two sections, so the bow model is a nonlinear two-port, in
contrast with the reed which was a nonlinear one-port terminating the
bore at the mouthpiece. In the case of bowed strings, the primary
control variable is bow velocity, so *velocity waves* are the
natural choice for the delay lines.

### Digital Waveguide Bowed-String

A more detailed diagram of the digital waveguide implementation of the
bowed-string instrument model is shown in Fig. 9.52.
The right delay-line pair carries left-going and right-going
velocity-wave samples, which sample the traveling-wave
components within the string to the right of the bow, and similarly for the
section of string to the left of the bow. The ``+'' superscript refers to
waves traveling *into* the bow.

String velocity at any point is obtained by adding a left-going velocity sample to the right-going velocity sample immediately opposite in the other delay line, as indicated in Fig. 9.52 at the bowing point. The reflection filter at the right implements the losses at the bridge, bow, nut or finger-terminations (when stopped), and the round-trip attenuation/dispersion from traveling back and forth on the string. To a very good degree of approximation, the nut reflects incoming velocity waves (with a sign inversion) at all audio wavelengths. The bridge behaves similarly to first order, but there are additional (complex) losses due to the finite bridge driving-point impedance (necessary for transducing sound from the string into the resonating body). According to [95, page 27], the bridge of a violin can be modeled, for purposes of computing string loss, as a single spring in parallel with a frequency-independent resistance (``dashpot''). Bridge-filter modeling is discussed further in §9.2.1.

Figure 9.52 is drawn for the case of the lowest
note. For higher notes the delay lines between the bow and nut are
shortened according to the distance between the bow and the finger
termination. The bow-string interface is controlled by *differential
velocity*
which is defined as the bow velocity minus the total
incoming string velocity. Other controls include *bow force* and *angle*, which are changed by modifying the reflection coefficient. Bow position is changed by taking samples from one delay-line
pair and appending them to the other delay-line pair. Delay-line
interpolation can be used to provide continuous change of bow position
[267].

### The Bow-String Scattering Junction

The theory of bow-string interaction is described in [95,151,244,307,308]. The basic operation of the bow is to reconcile the nonlinear bow-string friction curve with the string wave impedance: equating the equal and opposite friction and string-wave forces yields an equation relating bow force to the string velocities on either side of the bow. (Force and velocity point in the same direction when they have the same sign.) Here, transverse velocity on the segment of the bowed string to the *right* of the bow is distinguished from velocity waves to the *left* of the bow. The corresponding normalized functions to be used in the Friedlander-Keller graphical solution technique are depicted in Fig. 9.53.

In a bowed string simulation as in Fig.9.51, a velocity input (which is injected equally in the left- and right-going directions) must be found such that the transverse force of the bow against the string is balanced by the reaction force of the moving string. If bow-hair dynamics are neglected [176], the bow-string interaction can be simulated using a memoryless table lookup or segmented polynomial in a manner similar to single-reed woodwinds [431].

A derivation analogous to that for the single reed is possible for the
simulation of the bow-string interaction. The final result is as follows.

where $v_\Delta$ is the differential velocity, $v_b$ is bow velocity, and $\mu(v_\Delta)$ denotes the friction coefficient.

Nominally, $\mu(v_\Delta)$ is constant (the so-called static coefficient of friction) for $|v_\Delta| \le v_\Delta^c$, where $v_\Delta^c$ is both the capture and break-away differential velocity. For $|v_\Delta| > v_\Delta^c$, $\mu(v_\Delta)$ falls quickly to a low dynamic coefficient of friction. It is customary in the bowed-string physics literature to assume that the dynamic coefficient of friction continues to approach zero with increasing $|v_\Delta|$ [308,95].

Figure 9.54 illustrates a simplified, piecewise-linear bow table. The flat center portion corresponds to a fixed reflection coefficient ``seen'' by a traveling wave encountering the bow stuck against the string, and the outer sections of the curve give a smaller reflection coefficient, corresponding to the reduced bow-string interaction force while the string is slipping under the bow. The corner point marks the capture or break-away differential velocity. Note that hysteresis is neglected.

### Bowed String Synthesis Extensions

For maximally natural interaction between the bow and string, bow-hair
dynamics should be included in a proper bow model
[176]. In addition, the width of the bow should
be variable, depending on bow ``angle''. A finite difference model
for the finite width bow was reported in
[351,352].
The bow-string friction characteristic should employ a
*thermodynamic model* of bow friction in which the bow rosin has
a *time-varying viscosity* due to temperature variations within a
period of sound [549]. It is well known by bowed-string
players that rosin is sensitive to temperature. Thermal models of
dynamic friction in bowed strings are described in
[426], and they have been used in virtual bowed
strings for computer music
[419,423,21].

Given a good model of a bowed-string instrument, it is most natural to interface it to a physical bow-type controller, with sensors for force, velocity, position, angle, and so on [420,294].

A real-time software implementation of a bowed-string model, similar
to that shown in Fig.9.52, is available in the
Synthesis Tool Kit (STK) distribution as `Bowed.cpp`. It
provides a convenient starting point for more refined bowed-string
models [276].

### Linear Commuted Bowed Strings

The commuted synthesis technique (§8.7) can be
extended to bowed strings in the special case of ``ideal'' bowed attacks
[440]. Here, an ideal attack is defined as one in
which *Helmholtz motion*^{10.20} is instantly achieved. This technique will be called ``linear
commuted synthesis'' of bowed strings.

Additionally, the linear commuted-synthesis model for bowed strings can be driven by a separate nonlinear model of bowed-string dynamics. This gives the desirable combination of a full range of complex bow-string interaction behavior together with an efficiently implemented body resonator.

#### Bowing as Periodic Plucking

The ``leaning sawtooth'' waveforms observed by Helmholtz for steady state bowed strings can be obtained by periodically ``plucking'' the string in only one direction along the string [428]. In principle, a traveling impulsive excitation is introduced into the string in the right-going direction for a ``down bow'' and in the left-going direction for an ``up bow.'' This simplified bowing simulation works best for smooth bowing styles in which the notes have slow attacks. More varied types of attack can be achieved using the more physically accurate McIntyre-Woodhouse theory [307,431].
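A toy illustration of periodic one-directional plucking: a feedback delay loop re-excited by a unit impulse once per period (loop gain and period are arbitrary):

```python
def replucked_string(period, loop_gain, n_samples):
    # Feedback comb "string loop" re-excited by a unit impulse each period.
    delay = [0.0] * period
    out, idx = [], 0
    for n in range(n_samples):
        x = 1.0 if n % period == 0 else 0.0    # periodic "pluck"
        y = loop_gain * delay[idx] + x         # loop feedback + excitation
        delay[idx] = y
        idx = (idx + 1) % period
        out.append(y)
    return out
```

In the commuted version described next, the unit impulse is replaced by the resonator impulse response, retriggered at the (possibly vibrato-modulated) pitch period.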

Commuting the string and resonator means that the string is now
plucked by a *periodically repeated resonator impulse response*. A nice simplified vibrato implementation is available by varying the
impulse-response retriggering period, *i.e.*, the vibrato is implemented
in the excitation oscillator and not in the delay loop. The string
loop delay need not be modulated at all. While this departs from
being a physical model, the vibrato quality is satisfying and
qualitatively similar to that obtained by a rigorous physical
model. Figure 9.55 illustrates the overall block
diagram of the simplified bowed string and its commuted and
response-excited versions.
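To make the retriggering idea concrete, the following sketch implements a filtered delay loop that is ``plucked'' by an impulse once per period, with vibrato applied only to the retriggering period while the loop delay stays fixed. All parameter values (loop-filter gain, vibrato depth, and the use of a bare impulse rather than a stored resonator impulse response) are illustrative, not taken from any particular implementation:

```python
import math

def bowed_string_commuted(n_samples, fs=44100.0, f0=220.0,
                          vibrato_hz=5.0, vibrato_depth=0.01):
    # Fixed string-loop delay in samples (the loop length is NOT modulated).
    N = int(round(fs / f0))
    loop = [0.0] * N
    idx = 0                # circular read/write index into the loop
    prev = 0.0             # previous loop output, for the averaging filter
    next_trigger = 0.0     # time (in samples) of the next "pluck"
    out = []
    for n in range(n_samples):
        x = 0.0
        if n >= next_trigger:
            x = 1.0        # unidirectional impulsive "pluck"
            # Vibrato: modulate only the retriggering period.
            period = (fs / f0) * (1.0 + vibrato_depth *
                                  math.sin(2.0 * math.pi * vibrato_hz * n / fs))
            next_trigger += period
        y = loop[idx]                      # output of the delay loop
        loop[idx] = x + 0.45 * (y + prev)  # two-point averaging loop filter
        prev = y
        idx = (idx + 1) % N
        out.append(y)
    return out
```

In a full commuted-synthesis model, the impulse would be replaced by a stored body impulse response, as described above; the structural point is that `next_trigger` carries the vibrato, not `N`.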

In current technology, it is reasonable to store one recording of the
resonator impulse response in digital memory as one of many possible *string excitation tables*. The excitation can contribute to many aspects of
the tone to be synthesized, such as whether it is a violin or a cello, the
force of the bow, and where the bow is playing on the string. Also,
graphical equalization and other time-invariant filtering can be provided in
the form of alternate excitation-table choices.

During the synthesis of a single bowed-string tone, the excitation signal is played into the string quasi-periodically. Since the excitation signal is typically longer than one period of the tone, it is necessary to either (1) interrupt the excitation playback to replay it from the beginning, or (2) start a new playback which overlaps with the playback in progress. Variant (2) requires a separate incrementing pointer and addition for each instance of the excitation playback; thus it is more expensive, but it is preferred from a quality standpoint. This same issue also arises in the Chant synthesis technique for the singing voice [389]. In the Chant technique, a sum of three or more enveloped sinusoids (called FOFs) is periodically played out to synthesize a sung vowel tone. In the unlikely event that the excitation table is less than one period long, it is of course extended by zeros, as is done in the VOSIM voice synthesis technique [219] which can be considered a simplified forerunner of Chant.
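Variant (2) above can be sketched as follows: each retrigger starts a new read pointer into the excitation table, and all concurrent playbacks are summed, rather than restarting a single pointer. The function below is a minimal illustration (a fixed integer retrigger period is assumed for simplicity):

```python
def overlapped_excitation(table, period, n_samples):
    # Active read positions into the excitation table; each retrigger
    # appends a new pointer, so overlapping playbacks are summed.
    pointers = []
    out = []
    for n in range(n_samples):
        if n % period == 0:          # quasi-periodic retrigger
            pointers.append(0)
        s = sum(table[p] for p in pointers)
        # Advance all pointers, dropping any that ran off the table.
        pointers = [p + 1 for p in pointers if p + 1 < len(table)]
        out.append(s)
    return out
```

The extra cost of variant (2) is visible here: one incrementing pointer and one addition per active playback instance.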

Sound examples for linear commuted bowed-string synthesis may be heard here: (WAV) (MP3).

Of course, ordinary wavetable synthesis [303,199,327] or any other type of synthesis can also be used as an excitation signal in which case the string loop behaves as a pitch-synchronous comb filter following the wavetable oscillator. Interesting effects can be obtained by slightly detuning the wavetable oscillator and delay loop; tuning the wavetable oscillator to a harmonic of the delay loop can also produce an ethereal effect.

#### More General Quasi-Periodic Synthesis

Figure 9.56 illustrates a more general version of the table-excited, filtered delay loop synthesis system [440]. The generalizations help to obtain a wider class of timbres. The multiple excitations summed together through time-varying gains provide for timbral evolution of the tone. For example, a violin can transform smoothly into a cello, or the bow can move smoothly toward the bridge by interpolating among two or more tables. Alternatively, the tables may contain ``principal components'' which can be scaled and added together to approximate a wider variety of excitation timbres. An excellent review of multiple wavetable synthesis appears in [199]. The nonlinearity is useful for obtaining distortion guitar sounds and other interesting evolving timbres [387].

Finally, the ``attack signal'' path around the string has been found to be useful for reducing the cost of implementation: the highest frequency components of a struck string, say, tend to emanate immediately from the string to the resonator with very little reflection back into the string (or pipe, in the case of wind instrument simulation). Injecting them into the delay loop increases the burden on the loop filter to quickly filter them out. Bypassing the delay loop altogether alleviates requirements on the loop filter and even allows the filtered delay loop to operate at a lower sampling rate; in this case, a signal interpolator would appear between the string output and the summer which adds in the scaled attack signal in Fig. 9.56. For example, it was found that the low E of an electric guitar (Gibson Les Paul) can be synthesized quite well using a filtered delay loop running at a sampling rate of 3 kHz. (The pickups do not pick up much energy above 1.5 kHz.) Similar savings can be obtained for any instrument having a high-frequency content which decays much more quickly than its low-frequency content.

#### Stochastic Excitation for Quasi-Periodic Synthesis

For good generality, at least one of the excitation signals should be a filtered noise signal [440]. An example implementation is shown in Fig. 9.57, in which there is a free-running bandlimited noise generator filtered by a finite impulse response (FIR) digital filter. As noted in §9.4.4, such a filtered-noise signal can synthesize the perceptual equivalent of the impulse response of many high-frequency modes that have been separated from the lower frequency modes in commuted synthesis (§8.7.1). It can also handle pluck models in which successive plucking variations are imposed by the FIR filter coefficients.

In a simple implementation, only two gains might be used, allowing simple interpolation from one filter to the next, and providing an overall amplitude control for the noise component of the excitation signal.
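A minimal sketch of such a stochastic excitation is shown below: a free-running noise generator feeds two FIR filters whose outputs are mixed by two gains, providing both interpolation between filter ``colors'' and overall amplitude control. The function name and parameters are illustrative, not from any published implementation:

```python
import random

def stochastic_excitation(n_samples, fir_a, fir_b, gain_a, gain_b, seed=0):
    rng = random.Random(seed)
    # Shared noise history, long enough for the longer of the two FIRs.
    hist = [0.0] * max(len(fir_a), len(fir_b))
    out = []
    for _ in range(n_samples):
        hist = [rng.uniform(-1.0, 1.0)] + hist[:-1]   # push new noise sample
        ya = sum(c * x for c, x in zip(fir_a, hist))  # filter A output
        yb = sum(c * x for c, x in zip(fir_b, hist))  # filter B output
        out.append(gain_a * ya + gain_b * yb)         # two-gain mix
    return out
```

Crossfading `gain_a` toward `gain_b` over time interpolates from one excitation color to the next, while scaling both gains together controls the noise component's overall level.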

## Brasses

Brass instruments include the trumpet, trombone, French horn, tuba, and other instruments consisting of a horn driven by lip vibrations at one end. The mouthpiece of a brass instrument is generally cup-shaped, and the lips vibrate at a frequency near that of the sounding note.

There are three main components needed in synthesis models for brass
instruments: the mouthpiece, bore, and bell. Both cylindrical and
conical bores
are accurately modeled using digital waveguide techniques, as
discussed in previous chapters (see §C.18 for further discussion).
However, bore models may need to
simulate nonlinear shock wave formation at high dynamic levels
[198,341], which is relatively expensive to
do [536]. The bell, which flares more rapidly
than conical (typically
*exponentially*^{10.21}), does not support true traveling
waves [357], and so it must be modeled using a more general
simulation technique, such as by means of reflection and transmission
filters (discussed in §9.7.2 below).

### Modeling the Lips and Mouthpiece

Figure 9.58 depicts a schematic of a brass-player's lips
positioned within a mouthpiece. The air pressure^{10.22} is denoted $p_m$ inside the mouth, $p_l$ between the lips, and $p_c$
within the mouthpiece cup.

A simple first approximation is to assume that the air flow through
the lips is *inviscid* (``frictionless'') and
*incompressible*. In this case, the
pressure between the lips is governed by *Bernoulli's
equation* (see §B.7.4):

$$ p_m + \frac{1}{2}\rho v_m^2 \;=\; p_l + \frac{1}{2}\rho v_l^2 $$

where $\rho$ denotes the air density and $v_l$ is the *particle velocity* (see §B.7.1) (distance per unit time) between the lips. The particle velocity $v_m$ in the mouth is so small that it is usually neglected (approximated by zero), leaving $p_l \approx p_m - \frac{1}{2}\rho v_l^2$.

Following Cullen et al. [98] and previous researchers
[159,161],
one may assume that the flow between the lips creates a jet in the
mouthpiece having velocity $v_l$. The pressure within the jet is
therefore $p_l$, the same as between the lips. The jet is assumed to
mix turbulently with the air in the mouthpiece so that it dissipates
its energy without raising pressure in the mouthpiece.^{10.23}

One- and two-mass models were also
developed by Rodet and Vergez [388,533]. An
early one-mass, ``swinging door'' model that works surprisingly well
is found in Cook's `HosePlayer` [89] and `TBone`
[88]. It is also used in the `Brass.cpp` patch in the
Synthesis Tool Kit [91]; in this simplified model, the
lip-valve is modeled as a second-order resonator whose output is
squared and hard-clipped to a maximum magnitude.
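The structure of such a simplified lip-valve (second-order resonator, output squared, hard-clipped) can be sketched as below. This is an illustrative reconstruction, not the actual STK code; the resonator gain, damping value, and clip threshold are all assumptions:

```python
import math

def lip_valve(pressure_in, lip_freq, fs=44100.0, damping=0.995, clip=1.0):
    w = 2.0 * math.pi * lip_freq / fs
    b0 = math.sin(w)                  # resonator input gain (assumed)
    a1 = -2.0 * damping * math.cos(w) # two-pole resonator coefficients
    a2 = damping * damping
    y1 = y2 = 0.0
    out = []
    for x in pressure_in:
        # Second-order resonator tuned near the lip frequency.
        y = b0 * x - a1 * y1 - a2 * y2
        y2, y1 = y1, y
        g = y * y                     # squaring nonlinearity (g >= 0)
        out.append(min(g, clip))      # hard clip to a maximum magnitude
    return out
```

The squared, clipped output plays the role of a time-varying transmission coefficient modulating the flow into the bore.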

The acoustic tube in a horn can be modeled as a simple digital waveguide [22,37,436,506,528], whether it is cylindrical or conical. More complicated, flaring horn sections may be modeled as two-port digital filters which provide the appropriate reflection and transmission from the bell [39,50,71,118,406,528]. Section 9.7.2 below summarizes a particularly economical approach to brass bell modeling.

It is known that nonlinear *shock waves* occur in the bore for
large amplitude vibrations [198,341]. A
simple nonlinear waveguide simulation of this effect can be
implemented by shortening the delay elements slightly when the
instantaneous amplitude is large. (The first-order effect of air
nonlinearity in large-amplitude wave propagation is an increase in
sound speed at very high pressures [52].)
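A sketch of this amplitude-dependent delay shortening is given below, using linear interpolation for the fractional delay read. The function name and the nonlinearity coefficient `k` are hypothetical; a real implementation would use a higher-quality interpolator:

```python
def nonlinear_delay_read(buf, write_idx, nominal_delay, amplitude, k=0.5):
    n = len(buf)
    # Shorten the delay as instantaneous amplitude grows (modeling the
    # rise in sound speed at high pressure); keep at least one sample.
    d = max(1.0, nominal_delay - k * abs(amplitude))
    ri = (write_idx - d) % n      # fractional read position, wrapped
    i0 = int(ri)
    frac = ri - i0
    i1 = (i0 + 1) % n
    # Linear interpolation between the two neighboring delay-line cells.
    return (1.0 - frac) * buf[i0] + frac * buf[i1]
```

At low amplitude the nominal delay is used; as amplitude grows, the read point moves closer to the write point, simulating faster propagation of high-pressure wavefronts.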

It is also known that the bores of ``bent horns,'' such as the trumpet, do not behave as ideal waveguides [425]. This is because sharp bends in the metal tube cause some scattering and mode conversion at high frequencies. Such points along the horn can be modeled using high-frequency loss and reflection.

### Bell Models

The flaring bell of a horn cannot be
accurately modeled as a sparse digital waveguide, because traveling pressure
waves only propagate without reflection in conical bores (which
include cylindrical bores as a special case)
[357].^{10.24} Digital waveguides are ``sparse''
(free of internal scattering) only when there are long sections at a
constant wave impedance.

The most cost-effective bell filters (and, more generally, ``flare
filters'') to date appear to be *truncated IIR* (TIIR) digital filters
[540]. These filters use an unstable pole to produce
exponentially rising components in the impulse response, but the
response is cut off after a finite time, as is needed in the case of a
bell impulse response. By fitting a piecewise polynomial/exponential
approximation to the reflection impulse response of the trumpet bell,
very good approximations can be had for the computational equivalent
of approximately a 10th order IIR filter (but using more memory in the
form of a delay line, which costs very little computation).
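The TIIR trick for a growing exponential can be sketched as follows: a one-pole filter with an unstable pole $g > 1$ is cancelled after $N$ samples by a delayed, scaled feedforward term, giving the truncated impulse response $g^n$ for $0 \le n < N$ and zero thereafter. This toy version ignores the round-off-noise management (e.g., periodic state resets or filter switching) that a practical TIIR implementation requires:

```python
def tiir_growing_exp(x, g, N):
    # Implements y(n) = x(n) - g**N * x(n-N) + g * y(n-1),
    # i.e., H(z) = (1 - g^N z^-N) / (1 - g z^-1):
    # impulse response g^n for 0 <= n < N, zero afterwards.
    gN = g ** N
    y_prev = 0.0
    out = []
    for n, xn in enumerate(x):
        xN = x[n - N] if n >= N else 0.0   # delay-line tap, N samples back
        y = xn - gN * xN + g * y_prev
        y_prev = y
        out.append(y)
    return out
```

The delay-line tap is what makes the filter cheap: one multiply-add cancels the unstable recursion exactly (in infinite precision) after the truncation point.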

In more detail, the most efficient computational model for flaring
bells in brass instruments seems to be one that consists of one or
more sections having an impulse response given by the sum of a growing
exponential and a constant, *i.e.*,

$$ h(n) = c_0 + c_1 g^n, \qquad n = 0, 1, \ldots, N-1, $$

with $g > 1$ and $h(n) = 0$ for $n \ge N$ (the growing exponential is truncated after $N$ samples).

The C++ class in the Synthesis Tool Kit (STK) implementing a basic
brass synthesis model is called `Brass.cpp`.

#### Literature Relevant to Brasses

A nice summary of the musical acoustics of brass instruments is given by Campbell [70].

Research by Cullen et al. [98] and Gilbert et al. [159,161] has been concerned with artificial mouth experiments to verify a theoretical model of vibrating lips, and to calibrate the model to realistic playing conditions. These data can be used in the construction of improved lip-valve models for virtual brass instruments. Additional literature relevant to brass instruments includes [187,186,188,247,529,319,331,336,534,535,536,445,93].

## Other Instruments

One of the most interesting aspects of building virtual musical instruments is that every instrument has its own unique characteristics and modeling considerations. One disadvantage of this, however, is that it is not feasible to cover all cases in a single book. This section provides some literature pointers that can hopefully serve as a starting point for further research in some of the instrument areas not discussed here. While these pointers are neither complete nor up to date, the most recent papers and dissertations by the cited researchers can help fill in the gaps. (This is true of the preceding sections as well.)

### Singing Voice

An historical summary of voice modeling and synthesis appears in §A.6.3.

In [290], a model is proposed for the singing voice in which the driving glottal pulse train is estimated jointly with filter parameters describing the shape of the vocal tract (the complete airway from the base of the throat to the lip opening). The model can be seen as an improvement over linear-predictive coding (LPC) of voice in the direction of a more accurate physical model of voice production, while maintaining a low computational cost relative to more complex articulatory models of voice production. In particular, the parameter estimation involves only convex optimization plus a one-dimensional (possibly non-convex) line search over a compact interval. The line search determines the so-called ``open quotient,'' which is the fraction of the time there is glottal flow within each period. The glottal pulse parameters are based on the derivative-glottal-wave models of Liljencrants, Fant, and Klatt [133,257]. Portions of this research have been published in the ICMC-00 [291] and WASPAA-01 [292] proceedings. Related subsequent work includes [250,213,251,214,212].

Earlier work in voice synthesis, some summarized in Appendix A, includes [40,81,87,90,206,257,389,492]; see also the KTH ``Research Topics'' home page.

### Flutes, Recorders, and Pipe Organs

A chapter on the fundamental aero-acoustics of wind instruments appears in [196], and a comprehensive treatment of the acoustics of air jets in recorder-like instruments is given in [530].

A comprehensive review article on lumped models for flue instruments
appears in [130], in a special issue (July/August 2000) of
the *Acustica* journal on ``musical wind instrument
acoustics.'' An overview of research on woodwinds and organs appears
in [129]. Follow-up publications by this research group
include papers concerning the influence of mouth geometry on sound
production in flue instruments [108,418].

### Percussion Instruments

While sample-playback synthesis is especially effective for percussion
instruments that are supposed to play ``in the background,'' some form
of parametrization is needed for the more expressive performances, or
for highly variable percussion instruments such as the Indian tabla.
Highly efficient computational models of 2D membranes and 3D volumes
may be built using the *digital waveguide mesh*
[146,520,518,521]. More recently,
Savioja and Välimäki have developed a frequency-warping approach to
compensating for dispersive wave propagation in a variety of mesh
types [398,399,401]. The 2001
thesis of Bilbao [54] provides a unified view of the digital
waveguide mesh and *wave digital filters* [136] as
particular classes of energy invariant finite difference schemes
[481]. The problem of modeling diffusion at a mesh
boundary was addressed in [268].
