Following Spectral Peaks

In the analysis phase, sinusoidal peaks are measured over time in a sequence of FFTs, and these peaks are grouped into "tracks" across time. A detailed discussion of various options for this can be found in [228], and a particular case is detailed in Appendix I.

The end result of the analysis pass is a collection of amplitude and frequency envelopes, one pair for each spectral peak, as functions of time. If the time advance from one FFT to the next is fixed (5 ms is a typical choice for speech analysis), then the analysis yields uniformly sampled amplitude and frequency trajectories. The sampling rate of these envelopes is equal to the frame rate of the analysis. (If the time advance between FFTs is $ \Delta t=5$ ms, then the frame rate is defined as $ 1/\Delta t = 200$ Hz.) For resynthesis using inverse FFTs, these data may be used unmodified. For resynthesis using a bank of sinusoidal oscillators, on the other hand, we must somehow interpolate the envelopes to create envelopes at the signal sampling rate (typically $ 44.1$ kHz or higher).
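
As a small worked illustration of the frame-rate arithmetic above, the following sketch computes the envelope sampling rate and the number of signal samples spanned by one frame advance. The function names are hypothetical and chosen only for this example.

```python
def frame_rate_hz(hop_seconds):
    """Frame rate (envelope sampling rate) in Hz for a fixed FFT time advance."""
    return 1.0 / hop_seconds


def samples_per_frame(hop_seconds, fs_hz):
    """Number of signal samples spanned by one frame advance (may be fractional)."""
    return hop_seconds * fs_hz


if __name__ == "__main__":
    hop = 0.005  # 5 ms time advance, as in the text
    print("frame rate (Hz):", frame_rate_hz(hop))
    print("samples per frame at 44.1 kHz:", samples_per_frame(hop, 44100.0))
```

Note that a 5 ms advance at 44.1 kHz spans 220.5 samples, which is not an integer. An implementation would likely either specify the hop directly in samples or allow fractional breakpoint times when interpolating.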

It is typical in computer music to linearly interpolate the amplitude and frequency trajectories from one frame to the next [255]. Higher-order interpolations of so-called envelope breakpoints were also developed at CCRMA in the late 1970s (e.g., using cubic splines), but for tonal sounds, linear interpolation is usually sufficient. The higher-order envelopes did not see much use, presumably because of the greater complexity of dealing with them coupled with the lack of significant benefit. Let's call the piecewise-linear upsampled envelopes $ \hat{A}_k(n)$ and $ \hat{F}_k(n)$, defined now for all $ n$ at the normal signal sampling rate. For steady-state tonal sounds, the phase may be discarded at this stage and redefined, when needed, as the integral of the instantaneous frequency (a running sum in discrete time, with $ T$ denoting the sampling interval in seconds):

$\displaystyle \hat{\Theta}_k(n) \triangleq \hat{\Theta}_k(n-1) + 2\pi T \hat{F}_k(n).$
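
The interpolation and phase recursion above can be sketched for a single partial as follows. This is a minimal illustration, not the book's implementation. It assumes an integer hop size in samples, a cosine oscillator, and a zero initial phase, and the function names `upsample_linear` and `synthesize_partial` are hypothetical.

```python
import math


def upsample_linear(breakpoints, hop):
    """Linearly interpolate frame-rate breakpoints up to the signal sampling rate.

    breakpoints: one envelope value per analysis frame.
    hop: integer number of signal samples between frames (an assumption here).
    """
    out = []
    for a, b in zip(breakpoints, breakpoints[1:]):
        for i in range(hop):
            out.append(a + (b - a) * i / hop)
    out.append(breakpoints[-1])  # include the final breakpoint itself
    return out


def synthesize_partial(amp_frames, freq_frames_hz, hop, fs_hz, phase0=0.0):
    """Synthesize one sinusoidal partial from its amplitude and frequency envelopes."""
    T = 1.0 / fs_hz  # sampling interval in seconds
    A = upsample_linear(amp_frames, hop)
    F = upsample_linear(freq_frames_hz, hop)
    theta = phase0
    y = []
    for a, f in zip(A, F):
        # Phase recursion: Theta(n) = Theta(n-1) + 2*pi*T*F(n)
        theta += 2.0 * math.pi * T * f
        y.append(a * math.cos(theta))
    return y


if __name__ == "__main__":
    # Two frames of a partial gliding from 440 Hz to 450 Hz while fading in.
    y = synthesize_partial([0.0, 1.0], [440.0, 450.0], hop=240, fs_hz=48000.0)
    print(len(y), "samples synthesized")
```

A full oscillator bank would sum the outputs of `synthesize_partial` over all tracks $ k$. Because the phase is rebuilt from the frequency envelope, the measured phases are not preserved, which is consistent with the steady-state assumption stated above.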

When phase must be matched in a given frame, such as when the frame is known to contain a transient event, there are two options. The frequency can instead move quadratically across the frame, which provides cubic polynomial phase interpolation [164]. Alternatively, a second linear breakpoint can be introduced somewhere in the frame for the frequency trajectory, in which case the area under the triangle formed by the second breakpoint, multiplied by $ 2\pi$ when frequency is in Hz, equals the added phase in radians at the end of the segment.
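
One common way to realize cubic phase interpolation is to fit a cubic polynomial that matches the measured phase and frequency at both ends of the frame, choosing the phase-unwrapping integer that gives the smoothest frequency trajectory. The sketch below follows that widely used formulation. Whether it matches [164] term for term is an assumption, and the function names, the units (radians and radians per sample), and the frame length `N` in samples are choices made for this example.

```python
import math


def cubic_phase_coeffs(theta0, omega0, theta1, omega1, N):
    """Fit theta(n) = theta0 + omega0*n + alpha*n**2 + beta*n**3 over one frame.

    theta0, theta1: measured phases (radians) at the frame start and end.
    omega0, omega1: measured frequencies (radians/sample) at start and end.
    N: frame length in samples.
    Returns (alpha, beta, M), where M is the number of 2*pi wraps added
    to theta1 so that the frequency trajectory is as smooth as possible.
    """
    two_pi = 2.0 * math.pi
    M = round(((theta0 + omega0 * N - theta1) + (omega1 - omega0) * N / 2.0) / two_pi)
    D = theta1 + two_pi * M - theta0 - omega0 * N  # residual phase to absorb
    E = omega1 - omega0                            # frequency change over frame
    alpha = 3.0 * D / N**2 - E / N
    beta = -2.0 * D / N**3 + E / N**2
    return alpha, beta, M


def cubic_phase(theta0, omega0, alpha, beta, n):
    """Interpolated phase at sample n within the frame."""
    return theta0 + omega0 * n + alpha * n**2 + beta * n**3


def quadratic_freq(omega0, alpha, beta, n):
    """Instantaneous frequency (derivative of the cubic phase) at sample n."""
    return omega0 + 2.0 * alpha * n + 3.0 * beta * n**2


if __name__ == "__main__":
    alpha, beta, M = cubic_phase_coeffs(0.3, 0.10, 1.0, 0.12, 100)
    print("end phase mod 2*pi:", cubic_phase(0.3, 0.10, alpha, beta, 100) % (2.0 * math.pi))
    print("end frequency:", quadratic_freq(0.10, alpha, beta, 100))
```

By construction, the interpolated phase equals the measured end phase modulo $ 2\pi$, and the instantaneous frequency, which is quadratic in $ n$, equals the measured frequency at both frame boundaries.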





About the Author: Julius Orion Smith III
Julius Smith's background is in electrical engineering (BS Rice 1975, PhD Stanford 1983). He is presently Professor of Music and Associate Professor (by courtesy) of Electrical Engineering at Stanford's Center for Computer Research in Music and Acoustics (CCRMA), teaching courses and pursuing research related to signal processing applied to music and audio systems. See http://ccrma.stanford.edu/~jos/ for details.

