The main elements of the piano we need to model are the
- hammer (§9.3.2),
- string (Chapter 6, §6.6),
- bridge (§9.2.1), and
- soundboard, enclosure, acoustic space, mic placement, etc.
From the bridge (which runs along the soundboard) to each ``virtual microphone'' (or ``virtual ear''), the soundboard, enclosure, and listening space can be well modeled as a high-order LTI filter characterized by its impulse response (Chapter 8). Such long impulse responses can be converted to more efficient recursive digital filters by various means (Chapter 3,§8.6.2).
A straightforward piano model along the lines indicated above turns out to be relatively expensive computationally. Therefore, we also discuss, in §9.4.4 below, the more specialized commuted piano synthesis technique , which is capable of high quality piano synthesis at low cost relative to other model-based approaches. It can be seen as a hybrid method in which the string is physically modeled, the hammer is represented by a signal model, and the acoustic resonators (soundboard, enclosure, etc.) are handled by ordinary sampling (of their impulse response).
Stiff Piano Strings
The main effect of string stiffness is to stretch the partial overtone series, so that piano tones are not precisely harmonic . As a result, piano tuners typically stretch the tuning of the piano slightly. For example, the total amount of tuning stretch from the lowest to highest note has been measured to be approximately 35 cents on a Kurzweil PC88, 45 cents on a Steinway Model M, and 60 cents on a Steinway Model D.10.17
where the partial derivative notation and are defined on page , and
The first two terms on the right-hand side of Eq.(9.30) come from the ideal string wave equation (see Eq.(C.1)), and they model transverse acceleration and transverse restoring force due to tension, respectively. The term approximates the transverse restoring force exerted by a stiff string when it is bent. In an ideal string with zero diameter, this force is zero; in an ideal rod (or bar), this term is dominant [317,261,169]. The final two terms provide damping. The damping associated with is frequency-independent, while the damping due increases with frequency.
In , the damping in real piano strings was modeled using a length 17 FIR filter for the lowest strings, and a length 9 FIR filter for the remaining strings. Such FIR filters (or nonparametric measured frequency-response data) can be converted to lower-order IIR filters by the usual methods (§8.6.2). It is convenient to have separate damping and dispersion filters in the string model. The damping filter in piano strings is significantly less demanding than the dispersion filter.
In the context of a digital waveguide string model, dispersion associated with stiff strings can be supplied by an allpass filter in the basic feedback loop. Methods for designing dispersion allpass filters were summarized in §6.11.3. In this section, we are mainly concerned with how to specify the desired dispersion allpass filter for piano strings.
Perceptual studies regarding the audibility of inharmonicity in stringed instrument sounds  indicate that the just noticeable coefficient of inharmonicity is given approximately by
where is the fundamental frequency of the string vibration in hertz, and --the so-called coefficient of inharmonicity--affects the th partial overtone tuning via
For a stiff string with Young's modulus , radius , length , and tension , the coefficient of inharmonicity is predicted from theory [144, p. 65], to be
where is the string cross-sectional area, and is the radius of gyration of the string cross-section (see §B.4.9).
In general, when designing dispersion filters for vibrating string models, it is highly cost-effective to obtain an allpass filter which correctly tunes only the lowest-frequency partial overtones, where the number of partials correctly tuned is significantly less than the total number of partials present, as in .
A Faust implementation of a closed-form expression  for dispersion allpass coefficients as a function of inharmonicity coefficient may be found in the function piano_dispersion_filter within effect.lib.
It turns out that piano strings exhibit audible nonlinear effects, especially in the first three octaves of its pitch range at fortissimo playing levels and beyond . As a result, for highest quality piano synthesis, we need more than what is obtainable from a linearized wave equation such as Eq.(9.30).
As can be seen from a derivation of the wave equation for an ideal string vibrating in 3D space (§B.6), there is fundamentally nonlinear coupling between transverse and longitudinal string vibrations. It turns out that the coupling from transverse-to-longitudinal is much stronger than vice versa, so that piano synthesis models can get by with one-way coupling at normal dynamic playing levels [30,163]. As elaborated in §B.6 and the references cited there, the longitudinal displacement is driven by longitudinal changes in the squared slope of the string:
In addition to the excitation of longitudinal modes, the nonlinear transverse-to-longitudinal coupling results in a powerful longitudinal attack pulse, which is the leading component of the initial ``shock noise'' audible in a piano tone. This longitudinal attack pulse hits the bridge well before the first transverse wave and is therefore quite significant perceptually. A detailed simulation of both longitudinal and transverse waves in an ideal string excited by a Gaussian pulse is given in .
Another important (i.e., audible) effect due to nonlinear transverse-to-longitudinal coupling is so-called phantom partials. Phantom partials are ongoing intermodulation products from the transverse partials as they transduce (nonlinearly) into longitudinal waves. The term ``phantom partial'' was coined by Conklin . The Web version of  includes some illuminating sound examples by Conklin.
- longitudinal modes can be implemented as second-order
resonators (``modal synthesis''), with driving coefficients
- orthogonally projecting  the spatial
derivative of the squared string slope onto the longitudinal mode
shapes (both functions of position ).
- If tension variations along the string are neglected (reasonable
since longitudinal waves are so much faster than transverse waves),
then the longitudinal force on the bridge can be derived from the
estimated instantaneous tension in the string, and efficient
methods for this have been developed for guitar-string synthesis,
particularly by Tolonen (§9.1.6).
An excellent review of nonlinear piano-string synthesis (emphasizing the modal synthesis approach) is given in .
- The simplest piano-string vibration regime is characterized by
linear superposition in which transverse and longitudinal
waves decouple into separate modes, as implied by
Eq.(9.30). In this case, transverse and longitudinal waves can
be simulated in separate digital waveguides (Ch. 6).
The longitudinal waveguide is of course an order of magnitude
shorter than the transverse waveguide(s).
- As dynamic playing level is increased,
transverse-to-longitudinal coupling becomes audible
- At very high dynamic levels, the model should also include
longitudinal-to-transverse coupling. However, this is usually
- Initial striking force determines the starting regime (1, 2, or 3 above).
- The string model is simplified as it decays.
Because the energy stored in a piano string decays monotonically after a hammer strike (neglecting coupling from other strings), we may switch to progressively simpler models as the energy of vibration falls below corresponding thresholds. Since the purpose of model-switching is merely to save computation, it need not happen immediately, so it may be triggered by a string-energy estimate based on observing the string at a single point over the past period or so of vibration. Perhaps most simply, the model-regime classifier can be based on the maximum magnitude of the bridge force over at least one period. If the regime 2 model includes an instantaneous string-length (tension) estimate, one may simply compare that to a threshold to determine when the simpler model can be used. If the longitudinal components and/or phantom partials are not completely inaudible when the model switches, then standard cross-fading techniques should be applied so that inharmonic partials are faded out rather than abruptly and audibly cut off.
To obtain a realistic initial shock noise in the tone for regime 1 (and for any model that does not compute it automatically), the appropriate shock pulse, computed as a function of hammer striking force (or velocity, displacement, etc.), can be summed into the longitudinal waveguide during the hammer contact period.
The longitudinal bridge force may be generated from the estimated string length (§9.1.6). This force should be exerted on the bridge in the direction coaxial with the string at the bridge (a direction available from the two transverse displacements one sample away from the bridge).
Phantom partials may be generated in the longitudinal waveguide as explicit intermodulation products based on the transverse-wave overtones known to be most contributing; for example, the Goetzel algorithm  could be used to track relevant partial amplitudes for this purpose. Such explicit synthesis of phantom partials, however, makes modal synthesis more compelling for the longitudinal component ; in a modal synthesis model, the longitudinal attack pulse can be replaced by a one-shot (per hammer strike) table playback, scaled and perhaps filtered as a function of the hammer striking velocity.
On the high end for regime 2 modeling, a full nonlinear coupling may be implemented along the three waveguides (two transverse and one longitudinal). At this level of complexity, a wide variety of finite-difference schemes should also be considered (§7.3.1) [53,555].
The preceding discussion considered several possible approximations for nonlinear piano-string synthesis. Other neglected terms in the stiff-string wave equation were not even discussed, such as terms due to shear deformation and rotary inertia that are included in the (highly accurate) Timoshenko beam theory formulation [261,169]. The following questions naturally arise:
- How do we know for sure our approximations are inaudible?
- We can listen, but could we miss an audible effect?
- Could a difference become audible after more listening?
Note that there are software tools (e.g., from the world of perceptual audio coding [62,472]) that can be used to measure the audible equivalence of two sounds . For an audio coder, these tools predict the audibility of the difference between original and compressed sounds. For sound synthesis applications, we want to compare our ``exact'' and ``computationally efficient'' synthesis models.
In [265,266], an extension of the mass-spring model of  was presented for the purpose of high-accuracy modeling of nonlinear piano strings struck by a hammer model such as described in §9.3.2. This section provides a brief overview.
and similarly for mass 2, where is the vector position of mass in 3D space.
Generalizing to a chain of masses and spring is shown in Fig.9.26. Mass-spring chains--also called beaded strings--have been analyzed in numerous textbooks (e.g., [295,318]), and numerical software simulation is described in .
The force on the th mass can be expressed as
Following the classical derivation of the stiff-string wave equation [317,144], an obvious way to introduce stiffness in the mass-spring chain is to use a bundle of mass-spring chains to form a kind of ``lumped stranded cable''. One section of such a model is shown in Fig.9.27. Each mass is now modeled as a 2D mass disk. Complicated rotational dynamics can be avoided by assuming no torsional waves (no ``twisting'' motion) (§B.4.20).
A three-spring-per-mass model is shown in Fig.9.28 . The spring positions alternate between angles , say, on one side of a mass disk and on the other side in order to provide effectively six spring-connection points around the mass disk for only three connecting springs per section. This improves isotropy of the string model with respect to bending direction.
A problem with the simple mass-spring-chain-bundle is that there is no resistance whatsoever to shear deformation, as is clear from Fig.9.29. To rectify this problem (which does not arise due implicit assumptions when classically deriving the stiff-string wave equation), diagonal springs can be added to the model, as shown in Fig..
or, in vector form,
Digitizing via the centered second-order difference [Eq.(7.5)]
Note that requiring three adjacent spatial string samples to be in contact with the piano hammer during the attack (which helps to suppress aliasing of spatial frequencies on the string during the attack) implies a sampling rate in the vicinity of 6 megahertz . Thus, the model is expensive to compute! However, results to date show a high degree of accuracy, as desired. In particular, the stretching of the partial overtones in the stiff-string model of Fig. has been measured to be highly accurate despite using only three spring attachment points on one side of each mass disk .
The general technique of commuted synthesis was introduced in §8.7. This method is enabled by the linearity and time invariance of both the vibrating string and its acoustic resonator, allowing them to be interchanged, in principle, without altering the overall transfer function .
While the piano-hammer is a strongly nonlinear element (§9.3.2), it is nevertheless possible to synthesize a piano using commuted synthesis with high fidelity and low computational cost given the approximation that the string itself is linear and time-invariant (§9.4.2) [467,519,30]. The key observation is that the interaction between the hammer and string is essentially discrete (after deconvolution) at only one or a few time instants per hammer strike. The deconvolution needed is a function of the hammer-string collision velocity . As a result, the hammer-string interaction can be modeled as one or a few discrete impulses that are filtered in a -dependent way.
Figure 9.31 illustrates a typical series of interaction forces-pulses at the contact point between a piano-hammer and string. The vertical lines indicate the locations and amplitudes of three single-sample impulses passed through three single-pulse filters to produce the overlapping pulses shown.
The creation of a single force-pulse for a given hammer-string collision velocity (a specific ``dynamic level'') is shown in Fig.9.32. The filter input is an impulse, and the output is the desired hammer-string force pulse. As increases, the output pulse increases in amplitude and decreases in width, which means the filter is nonlinear. In other words, the force pulse gets ``brighter'' as its amplitude (dynamic level) increases. In a real piano, this brightness increase is caused by the nonlinear felt-compression in the piano hammer. Recall from §9.3.2 that piano-hammer felt is typically modeled as a nonlinear spring described by , where is felt compression. Here, the brightness is increased by shrinking the duration of the filter impulse response as increases. The key property enabling commuted synthesis is that, when is constant, the filter operates as a normal LTI filter. In this way, the entire piano has been ``linearized'' with respect to a given collision velocity .
One method of generating multiple force pulses as shown in Fig.9.31 is to sum several single-pulse filters, as shown in Fig.9.33. Furthermore, the three input impulses can be generated using a single impulse into a tapped delay line (§2.5). The sum of the three filter outputs gives the desired superposition of three hammer-string force pulses. As the collision velocity increases, the output pulses become taller and thinner, showing less overlap. The filters LPF1-LPF3 can be given as side information, or the input impulse amplitudes can be set to , or the like.
Figure 9.34 shows a complete piano synthesis system along the lines discussed. At a fixed dynamic level, we have the critical feature that the model is linear and time invariant. Therefore we may commute the ``Soundboard & Enclosure'' filter with not only the string, but with the hammer filter-bank as well. The result is shown in Fig.9.35.
As in the case of commuted guitar synthesis (§8.7), we replace a high-order digital filter (at least thousands of poles and zeros ) with a simple excitation table containing that filter's impulse response. Thus, the large digital filter required to implement the soundboard, piano enclosure, and surrounding acoustic space, has been eliminated. At the same time, we still have explicit models for the hammer and string, so physical variations can be implemented, such as harder or softer hammers, more or less string stiffness, and so on.
Even some resonator modifications remain possible, however, such as changing the virtual mic positions (§2.2.7). However, if we now want to ``open the top'' of our virtual piano, we have to measure the impulse response that case separately and have a separate table for it, which is not a problem, but it means we're doing ``sampling'' instead of ``modeling'' of the various resonators.
An approximation made in the commuted technique is that detailed reflections generated on the string during hammer-string contact are neglected; that is, the hammer force-pulse depends only on the hammer-string velocity at the time of their initial contact, and the string velocity is treated as remaining constant throughout the contact duration.
To save computation in the filtering, we can make use of the observation that, under the assumption of a string initially at rest, each interaction pulse is smoother than the one before it. That suggests applying the force-pulse filtering progressively, as was done with Leslie cabinet reflections in §5.7.6. In other words, the second force-pulse is generated as a filtering of the first force-pulse. This arrangement is shown in Fig.9.36.
With progressive filtering, each filter need only supply the mild smoothing (and perhaps dispersion) associated with traveling from the hammer to the agraffe and back, plus the mild attenuation associated with reflection from the felt-covered hammer (a nonlinear mass-spring system as described in §9.3.2).
Referring to Fig.9.36, The first filter LPF1 can shape a velocity-independent excitation signal to obtain the appropriate ``shock spectrum'' for that hammer velocity. Alternatively, the Excitation Table itself can be varied with velocity to produce the needed signal. In this case, filter LPF1 can be eliminated entirely by applying it in advance to the excitation signal. It is possible to interpolate between tables for two different striking velocities; in this case, the tables should be pre-processed to eliminate phase cancellations during cross-fade.
Assuming the first filter in Fig.9.36 is ``weakest'' at the highest hammer velocity (MIDI velocity ), that filtering can be applied to the excitation table in advance, and the first filter then becomes no computation for MIDI velocity , and as velocity is lowered, the filter only needs to make up the difference between what was done in advance to the table and what is desired at that velocity.
Since, for most keys, only a few interaction impulses are observed, per hammer strike, in real pianos, this computational model of the piano achieves a high degree of realism for a price comparable to the cost of the strings only. The soundboard and enclosure filtering have been eliminated and replaced by a look-up table using a few read-pointers per note, and the hammer costs only one or a few low-order filters which in principle convert the interaction impulse into an accurate force pulse.
As another refinement, it is typically more efficient to implement the highest Q resonances of the soundboard and piano enclosure using actual digital filters (see §8.8). By factoring these out, the impulse response is shortened and thus the required excitable length is reduced. This provides a classical computation vs. memory trade-off which can be optimized as needed in a given implementation. For lack of a better name, let us refer to such a resonator bank as an ``equalizer'' since it can be conveniently implemented using parametric equalizer sections, one per high-Q resonance.
A possible placement of the comb filter and equalizer is shown in Fig.9.37. The resonator/eq/reverb/comb-filter block may include filtering for partially implementing the response of the soundboard and enclosure, equalization sections for piano color variations, reverberation, comb-filter(s) for flanging, chorus, and simulated hammer-strike echoes on the string. Multiple outputs having different spectral characteristics can be extracted at various points in the processing.
For a further reduction in cost of implementation, particularly with respect to memory usage, it is possible to synthesize the excitation using a noise signal through a time-varying lowpass filter . The synthesis replaces the recorded soundbard/enclosure sound. It turns out that the sound produced by tapping on a piano soundboard sounds very similar to noise which is lowpass-filtered such that the filter bandwidth contracts over time. This approach is especially effective when applied only to the high-frequency residual excitation after factoring out all long-ringing modes as biquads.
Perhaps it should be emphasized that for good string-tone synthesis, multiple filtered delay loops should be employed for each note rather than the single-loop case so often used for simplicity (§6.12; §C.13). For piano, two or three delay loops can correspond to the two or three strings hit by the same hammer in the plane orthogonal to both the string and bridge. This number of delay loops is doubled from there if the transverse vibration planes parallel to the bridge are added; since these are slightly detuned, they are worth considering. Finally, another 2-3 (shorter) delay loops are needed to incorporate longitudinal waves for each string, and all loops should be appropriately coupled. In fact, for full accurancy, longitudinal waves and transverse waves should be nonlinearly coupled along the length of the string  (see §B.6.2).
The sound of all strings ringing can be summed with the excitation to simulate the effect of many strings resonating with the played string when the sustain pedal is down . The string loop filters out the unwanted frequencies in this signal and selects only the overtones which would be excited by the played string.
At very high pitches, the delay-line length used in the string may become too short for the implementation used, especially when using vectorized module signals (sometimes called ``chunks'' in place of samples ). In this case, good results can be obtained by replacing the filtered-delay-loop string model by a simpler model consisting only of a few parallel, second-order filter sections or enveloped sinusoidal oscillators, etc. In other words, modal synthesis (§8.5) is a good choice for very high key numbers.
In the commuted piano synthesis technique, it is necessary to fit a filter impulse response to the desired hammer-string force-pulse shape for each key and for each key velocity , as indicated in Figures 9.32 and 9.33. Such a desired curve can be found in the musical acoustics literature  or it can be easily generated from a good piano-hammer model (§9.3.2) striking the chosen piano string model.
Literature on Piano Acoustics and Synthesis
There has been much research into the musical acoustics of the piano, including models of piano strings [543,18,75,76,144,42] and the piano hammer [493,63,178,60,486,164,487]. Additionally, there has been considerable work to date on the problem of piano synthesis by digital waveguide methods [467,519,523,56,42].
A careful computational model of the piano, partially based on physical modeling and suitable for high-quality real-time sound synthesis, was developed by Bensa in the context of his thesis work . Related publications include [43,45,46].
An excellent simulation of the clavichord, also using the commuted waveguide synthesis technique, is described in . A detailed simulation of the harpsichord, along similar lines, but with some interesting new variations, may be found in .
The 2003 paper of Bank et al.  includes good summaries of prior research on elements of piano modeling such as efficient computational string models, loss-filter design, dispersion filter design, and soundboard models. A more recent paper in this area is .