Signal Models

To simplify terminology, the term ``signal model'' will henceforth mean ``non-physical signal model''.

Recordings (Samples)

Perhaps the simplest possible signal ``model'' is a recording of the desired sound indexed by the controller state used to produce it. In other words, for each possible input condition (such as a key-press on a keyboard, pedal press, etc.) we record the sound produced in the desired acoustic space (which itself can be a recording parameter). Such a procedure is called instrument sampling [303]. Sampling is of course highly laborious, but it is actually the current common practice for electronic musical instruments, such as ``sampled pianos''.
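As a minimal sketch of what ``indexed by the controller state'' can mean in practice, consider a lookup table keyed by key number and velocity layer. The names `SAMPLE_BANK`, `velocity_layer`, and `play_note` are hypothetical, the two-layer split is an assumption, and the short lists merely stand in for recorded waveforms.

```python
# Hypothetical sample bank: one recording per (MIDI key number, velocity layer).
# The short lists stand in for recorded audio; a real bank holds long waveforms.
SAMPLE_BANK = {
    (60, 0): [0.0, 0.10, 0.05],  # middle C, soft layer
    (60, 1): [0.0, 0.80, 0.40],  # middle C, loud layer
    (62, 0): [0.0, 0.12, 0.06],  # D above middle C, soft layer
    (62, 1): [0.0, 0.75, 0.35],  # D above middle C, loud layer
}


def velocity_layer(velocity, num_layers=2):
    """Map a MIDI velocity (1..127) to one of num_layers recorded layers."""
    return min((velocity - 1) * num_layers // 127, num_layers - 1)


def play_note(key, velocity):
    """Return the stored recording for this controller state (no synthesis)."""
    return SAMPLE_BANK[(key, velocity_layer(velocity))]
```

Every controller state that is not in the table simply has no sound, which is one way to see why the recording effort grows with each added control dimension.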

A problem with sample-based synthesizers, aside from the enormous recording effort required, is that the dimensions of expressive performance are invariably limited. While impressive results have been obtained for struck-string instruments such as the piano, continuously-controlled instruments such as bowed-string, wind, and brass instruments are reduced to highly oversimplified shadows of themselves. This can be appreciated by considering that the only non-pedal control parameters for the piano are key-number and key-velocity, while for bowed-string and wind instruments, the player manages multiple continuous dimensions of control. Skilled performers do not wish to give up these dimensions.

Another source of sonic richness routinely given up by sampling synthesis is the interaction between the performer and the instrument. For example, in a long-sustaining electric guitar performance, there is significant interaction between a ringing string and its subsequent re-excitation. Such effects are also audible (though subtle) on a piano when restriking a ringing string (with the sustain pedal down).

While sample-based sound synthesis can be frustrating (or at least constraining) for the performing musician, its best feature is the high quality of the ultimate sound to the listener. The sound quality is limited only by the quality of the original recordings and subsequent signal processing.

A major advantage of physical models, especially relative to sample-based signal models, is that the internal state is automatically maintained. That is, sample-based models should in principle index each acoustic recording by not only the input state, but also by the internal state of the instrument (which is prohibitive and rarely done). Physical models, in contrast, propagate some kind of simulation of the internal state, so that realistic interactions between the external inputs and the internal state are provided ``for free''.
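The role of internal state can be illustrated with a toy string model. The sketch below is a simple delay-line loop in the style of the Karplus-Strong algorithm, chosen here only for brevity; the class name `PluckedString`, the ramp-shaped excitation, and the loss factor 0.996 are all assumptions made for illustration.

```python
class PluckedString:
    """Toy string: a delay line with a two-point averaging loop filter."""

    def __init__(self, period):
        # Internal state: one period of string displacement (period >= 2).
        self.delay = [0.0] * period

    def pluck(self, amplitude):
        # The excitation is ADDED to whatever is already ringing in the string.
        n = len(self.delay)
        for i in range(n):
            self.delay[i] += amplitude * (1.0 - 2.0 * i / n)  # simple ramp shape

    def tick(self):
        # Output the oldest sample; feed back a lowpassed, slightly damped copy.
        out = self.delay.pop(0)
        self.delay.append(0.996 * 0.5 * (out + self.delay[0]))
        return out


# Pluck a string at rest.
fresh = PluckedString(50)
fresh.pluck(1.0)
rest_note = [fresh.tick() for _ in range(100)]

# Apply the same pluck to a string that is still ringing from an earlier note.
ringing = PluckedString(50)
ringing.pluck(1.0)
for _ in range(120):
    ringing.tick()
ringing.pluck(1.0)
restruck_note = [ringing.tick() for _ in range(100)]
```

The same input (`pluck(1.0)`) yields different output in the two cases because the delay line still holds the earlier vibration. A sample-based model would need a separate recording for each such internal state to reproduce the difference.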

Structured Sampling

Structured sampling refers to the use of a combination of sampling and model-based methods. Instead of sampling the acoustic pressure wave, as in any typical audio recording, we sample more fundamental physical quantities such as an impulse response [449] that can be used to provide the desired level of both audio quality and model flexibility.

For example, in ``commuted waveguide synthesis'' (§8.7), the body resonator of a stringed instrument is efficiently modeled by its impulse response.
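To make the idea of a resonator ``modeled by its impulse response'' concrete, the following sketch applies a short, made-up impulse response to a string signal by direct convolution. It shows only the underlying filtering operation, not commuted synthesis itself; `body_ir` and `string_out` are hypothetical values, and a measured body response would be far longer.

```python
def convolve(x, h):
    """Direct-form linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


# Hypothetical measured body impulse response.
body_ir = [1.0, 0.5, -0.25, 0.125]

# Hypothetical string output: two impulses of different strength.
string_out = [1.0, 0.0, 0.0, 0.5]

# Filtering the string signal by the body is convolution with its impulse response.
instrument_out = convolve(string_out, body_ir)

# Convolution is commutative, so the order of string and body can be swapped.
swapped_out = convolve(body_ir, string_out)
```

Here `instrument_out` and `swapped_out` are identical, reflecting the fact that linear time-invariant systems commute.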

Another example is measuring the frequency response of a vibrating string so that a digital filter can be fit to that instead of being designed from first principles.

An advantage of sampling more fundamental characteristic signals such as impulse-responses is that they are often largely invariant with respect to controller state. This yields a far smaller memory footprint relative to brute force sampling of the acoustic pressure wave as a function of controller state.
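A rough back-of-the-envelope comparison shows the scale of the saving. Every number below (sampling rate, word length, 88 keys, 8 velocity layers, 10-second notes, a half-second impulse response) is an assumption chosen for illustration, and the estimate ignores the storage needed by the model itself.

```python
SAMPLE_RATE = 44100    # samples per second (assumed)
BYTES_PER_SAMPLE = 2   # 16-bit mono (assumed)


def recording_bytes(seconds):
    """Storage for one mono recording of the given duration."""
    return int(seconds * SAMPLE_RATE) * BYTES_PER_SAMPLE


# Brute force: one 10-second recording per (key, velocity layer) pair.
brute_force = 88 * 8 * recording_bytes(10.0)

# Structured: a single 0.5-second impulse response shared by all controller states.
structured = recording_bytes(0.5)

ratio = brute_force // structured
```

Under these assumptions the brute-force bank needs about 621 MB while the shared impulse response needs about 44 kB, a ratio of 14080 to 1.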

There is an approximate continuum between sampling and physical modeling. That is, there is a wide range of possible hybrids between computational physical modeling and interpolation/manipulation of recorded samples. More computing power generally enables more accurate modeling and less memory usage.

Spectral Models

As discussed in [456], spectral models are inspired by the mechanics of hearing. Typically they are based on the Short-Time Fourier Transform (STFT), but there are also signal models, such as Linear Predictive Coding (LPC), whose success derives from how well they match spectral characteristics of hearing [456]. Additionally, Frequency-Modulation (FM) synthesis is typically developed by tweaking FM parameters to match short-term audio spectra [82]. Other well-known signal models rooted in the spectral point of view include the phase vocoder, additive synthesis, and so-called spectral modeling synthesis [456].
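As a small illustration of the spectral point of view, the sketch below builds one frame of sound by additive synthesis and then inspects it with a single DFT, which amounts to one STFT frame with a rectangular window. The function names and partial amplitudes are hypothetical, and the partials are deliberately placed on exact bin centers so that no spectral leakage occurs.

```python
import cmath
import math


def additive_frame(partials, num_samples):
    """Sum of sinusoids; partials are (cycles_per_frame, amplitude) pairs."""
    return [
        sum(a * math.sin(2 * math.pi * k * n / num_samples) for k, a in partials)
        for n in range(num_samples)
    ]


def dft_magnitudes(x):
    """DFT magnitudes for bins 0..N/2, scaled so a unit sinusoid reads 1.0.

    The 2/N scaling is correct for bins strictly between 0 and N/2.
    """
    N = len(x)
    return [
        abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))) * 2 / N
        for k in range(N // 2 + 1)
    ]


# Three harmonically related partials with decreasing amplitude.
frame = additive_frame([(4, 1.0), (8, 0.5), (12, 0.25)], 64)
spectrum = dft_magnitudes(frame)
```

The magnitudes in `spectrum` at bins 4, 8, and 12 recover the amplitudes used for synthesis, and the remaining bins are zero to within rounding error.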

Virtual Analog

Analog synthesizers, such as the modular Moog and ARP synthesizers, typically used elementary waveforms such as sawtooth, pulse train, and the like. A variety of signal models has been developed for generating such waveforms digitally without incurring the aliasing associated with simply sampling the ideal waveforms. A special issue of the IEEE Transactions on Audio, Speech, and Language Processing was devoted to this area in 2010. Specific related papers include [516,324]. The same issue also included papers on more model-based approaches in which the original analog circuit is digitized using general methods [554], some of which will be discussed here in later chapters. Waveform signal models will not be discussed further in this book, so see the references for further guidance in this area [516,324,323,344,504,505,477,503,478,67,78].
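The aliasing problem can be seen in a small numerical experiment. The sketch below compares a directly sampled sawtooth with one built by summing only the harmonics below half the sampling rate. Additive synthesis is used here because it is easy to verify, not because it is the method of the cited papers; the function names and the choice of 5 cycles in a 64-sample frame are assumptions.

```python
import cmath
import math


def naive_saw(f0, fs, num_samples):
    """Sample the ideal rising sawtooth directly; harmonics above fs/2 alias."""
    return [2.0 * ((f0 * n / fs) % 1.0) - 1.0 for n in range(num_samples)]


def harmonics_below_nyquist(f0, fs):
    """Harmonic numbers k whose frequency k*f0 lies strictly below fs/2."""
    return [k for k in range(1, int(fs / (2 * f0)) + 1) if k * f0 < fs / 2]


def bandlimited_saw(f0, fs, num_samples):
    """Truncated Fourier series of the same sawtooth: -(2/pi) * sum sin(k w t)/k."""
    ks = harmonics_below_nyquist(f0, fs)
    return [
        -(2.0 / math.pi) * sum(math.sin(2 * math.pi * k * f0 * n / fs) / k for k in ks)
        for n in range(num_samples)
    ]


def bin_magnitude(x, k):
    """Magnitude of DFT bin k, scaled so a unit sinusoid reads 1.0."""
    N = len(x)
    return abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))) * 2 / N


# 5 cycles per 64-sample frame, so true harmonics fall on bins 5, 10, ..., 30.
f0, fs, N = 5, 64, 64
naive = naive_saw(f0, fs, N)
clean = bandlimited_saw(f0, fs, N)

# Bin 29 is not a multiple of 5: any energy there is aliasing (harmonic 7 at
# frequency 35 folds down to 64 - 35 = 29).
alias_naive = bin_magnitude(naive, 29)
alias_clean = bin_magnitude(clean, 29)
```

In this setup `alias_naive` is roughly 0.09, comparable to the amplitude of the seventh harmonic, while `alias_clean` is zero to within rounding error.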
