## Phase Vocoder

In the 1960s, the phase vocoder was introduced by Flanagan and Golden, based on interpreting the classical vocoder (§G.5) filter bank as a sliding short-time Fourier transform [76,243]. The digital computer made it easy for the phase vocoder to support phase modulation of the synthesis oscillators in addition to implementing their amplitude envelopes. Thus, in addition to computing the instantaneous amplitude at the output of each (complex) bandpass filter, the instantaneous phase was also computed. (Phase could be converted to frequency by taking a time derivative.) Complex bandpass filters were implemented by multiplying the incoming signal by $e^{-j\omega_k t}$, where $\omega_k$ is the $k$th channel radian center-frequency, and lowpass-filtering using a sixth-order Bessel filter.
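The analysis of a single channel described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `analyze_channel`, its parameters, and the test signal are hypothetical, and a Hann-windowed sinc FIR lowpass is substituted for the sixth-order Bessel filter to keep the example dependency-free.

```python
import numpy as np

def analyze_channel(x, fs, f_center, cutoff_hz=50.0, n_taps=801):
    """Heterodyne one channel; return per-sample (amplitude, frequency in Hz)."""
    n = np.arange(len(x))
    w_k = 2.0 * np.pi * f_center / fs           # channel center frequency (rad/sample)
    baseband = x * np.exp(-1j * w_k * n)        # shift the channel down to 0 Hz
    # Windowed-sinc FIR lowpass (assumption: stands in for the Bessel filter).
    m = np.arange(n_taps) - (n_taps - 1) / 2
    h = np.sinc(2.0 * cutoff_hz / fs * m) * np.hanning(n_taps)
    h /= h.sum()                                # unity gain at DC
    y = np.convolve(baseband, h, mode="same")   # complex channel signal
    amplitude = 2.0 * np.abs(y)                 # factor 2: a real cosine splits into +/- frequencies
    phase = np.unwrap(np.angle(y))              # instantaneous phase relative to the center
    # The time derivative of phase is the frequency offset from the channel center.
    freq = f_center + np.gradient(phase) * fs / (2.0 * np.pi)
    return amplitude, freq

# Hypothetical test signal: a 443 Hz cosine analyzed by a channel centered at 440 Hz.
fs = 8000
n = np.arange(fs)
x = 0.5 * np.cos(2.0 * np.pi * 443.0 * n / fs)
amp, freq = analyze_channel(x, fs, f_center=440.0)
mid = slice(2000, 6000)                         # skip the filter's edge transients
print(amp[mid].mean(), freq[mid].mean())
```

Away from the signal edges, the printed values should come out close to the true amplitude (0.5) and frequency (443 Hz), showing how the phase derivative fine-tunes the estimate within the channel.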

The phase vocoder also relaxed the requirement of pitch-following (needed in the vocoder), because the phase modulation computed by the analysis stage automatically fine-tuned each sinusoidal component within its filter-bank channel. The main remaining requirement was that only one sinusoidal component be present in any given channel of the filter bank. Otherwise, the instantaneous amplitude and frequency computations would be based on "beating" waveforms instead of single quasi-sinusoids, which produce the smooth amplitude and frequency envelopes necessary for good data compression.
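To see where the beating comes from, consider a channel output containing two complex sinusoids (the symbols $A_1$, $A_2$, $\omega_1$, $\omega_2$ are introduced here only for illustration). Its squared instantaneous amplitude is

$$
\left|A_1 e^{j\omega_1 t} + A_2 e^{j\omega_2 t}\right|^2
= A_1^2 + A_2^2 + 2 A_1 A_2 \cos\left[(\omega_1 - \omega_2)\,t\right],
$$

so the measured envelope oscillates at the difference frequency $\omega_1 - \omega_2$ instead of varying slowly, as it would for a single component ($A_2 = 0$).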

Unlike the hardware implementations of the channel vocoder, the phase
vocoder is typically implemented in software on top of a
*Short-Time Fourier Transform* (STFT), and
originally reconstructed the signal from its amplitude spectrum and
"phase derivative" (instantaneous frequency) spectrum
[76]. Time scale modification (§10.5) and
frequency shifting were early applications of the phase vocoder
[76].
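Reconstruction from amplitude and phase-derivative data can be sketched as an oscillator bank: each channel's instantaneous frequency is integrated to recover phase, which then drives a sinusoid scaled by the channel's amplitude envelope. The function name `oscillator_bank` and the array layout below are assumptions made for illustration, not the original implementation. Note that integrating the phase derivative discards each channel's initial phase.

```python
import numpy as np

def oscillator_bank(amps, freqs_hz, fs):
    """Sum sinusoids driven by per-sample amplitude and frequency envelopes.

    amps, freqs_hz: arrays of shape (num_channels, num_samples).
    """
    # Integrate instantaneous frequency (the phase derivative) to get phase.
    phases = 2.0 * np.pi * np.cumsum(freqs_hz, axis=1) / fs
    # One cosine oscillator per channel, summed across channels.
    return np.sum(amps * np.cos(phases), axis=0)

# Hypothetical single-channel data: constant amplitude 0.5 at 443 Hz.
fs = 8000
amps = np.full((1, fs), 0.5)
freqs_hz = np.full((1, fs), 443.0)
y = oscillator_bank(amps, freqs_hz, fs)
```

In a sketch like this, modifications amount to editing the envelopes before resynthesis: resampling them along the time axis changes duration without changing pitch, and adding an offset to `freqs_hz` shifts the frequencies.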

The phase vocoder can also be considered an early *subband coder*
[284]. Since the mid-1970s, subband coders have
typically been implemented using the STFT
[76,212,9]. In the
field of perceptual audio coding, additional compression has been
obtained using *undersampled* filter banks that provide
*aliasing cancellation* [287], the first example
being the Princen-Bradley filter bank [214].

The phase vocoder was also adopted as the analysis framework of choice
for *additive synthesis* (later called sinusoidal modeling) in
computer music [186]. (See §G.8 for more
about additive synthesis.)

Today, the term "vocoder" has become somewhat synonymous in the
audio research world with "modified short-time Fourier transform"
[212,62]. In the commercial musical
instrument world, it connotes a keyboard instrument with a microphone
that performs *cross-synthesis* (§10.2).

### FFT Implementation of the Phase Vocoder

In the 1970s, the phase vocoder was reimplemented using the FFT for
increased computational efficiency [212]. The FFT
window (analysis lowpass filter) was also improved to yield exact
reconstruction of the original signal when synthesizing without
modifications. Shortly thereafter, the FFT-based phase vocoder became
the basis for additive synthesis in computer music
[187,62]. A generic diagram of
phase-vocoder (or vocoder) processing is given in Fig. G.4.
Since then, numerous variations and improvements of the phase vocoder
have appeared, *e.g.*,
[99,215,140,138,143,142,139]. A summary
of vocoder research from the 1930s to the 1960s appears in a review
article by Manfred Schroeder [245]. The phase vocoder
and its descendants (STFT modification/resynthesis, sinusoidal
modeling) have been used for many audio applications, such as speech
coding and transmission, data compression, noise reduction,
reverberation suppression, cross-synthesis (§10.2), time scale
modification (§10.5), frequency shifting, and much more.
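The exact-reconstruction property mentioned above can be illustrated with a minimal FFT-based analysis/resynthesis loop. The sketch below assumes a periodic Hann window with 50% overlap, chosen because its shifted copies sum to exactly one. It is not necessarily the window used in [212], and the names `stft` and `overlap_add` are hypothetical.

```python
import numpy as np

n_fft, hop = 512, 256
# Periodic Hann window: copies shifted by n_fft/2 sum to exactly 1.
window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(n_fft) / n_fft)

def stft(x):
    """Windowed FFT of each frame; one row per frame."""
    starts = range(0, len(x) - n_fft + 1, hop)
    return np.array([np.fft.rfft(window * x[s:s + n_fft]) for s in starts])

def overlap_add(frames, length):
    """Inverse-FFT each frame and sum the frames at their original positions."""
    y = np.zeros(length)
    for i, spectrum in enumerate(frames):
        # A spectral modification would be applied to `spectrum` here.
        y[i * hop:i * hop + n_fft] += np.fft.irfft(spectrum, n_fft)
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
y = overlap_add(stft(x), len(x))
interior = slice(hop, len(x) - hop)    # samples covered by two overlapping frames
print(np.max(np.abs(y[interior] - x[interior])))
```

With no modification between analysis and synthesis, the interior of `y` should match `x` to within floating-point rounding. The first and last half-frames are covered by only one window and are therefore attenuated. Once spectra are modified, a synthesis window is commonly applied as well, which this sketch omits.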
