Sines + Noise + Transients Models

This section describes further details of the sines+noise+transients (S+N+T) system of Scott Levine [132], which can be considered a modern descendant of the vocoder.

Figure 10.17 shows the time-frequency map used in the S+N+T system of Scott Levine [132]. Vertical line spacing in the time-frequency map indicates the time resolution of the underlying multiresolution STFT, and horizontal line spacing indicates its frequency resolution. The time waveform appears below the time-frequency map. For transients, an interval of data including the transient is simply encoded using MPEG-2 AAC. The transient interval in Fig. 10.17 extends from approximately 47 to 115 ms. (This interval can be tighter, as discussed further below.) Between transients, the signal model consists of sines+noise below 5 kHz and amplitude-modulated noise above. The spectrum from 0 to 5 kHz is divided into three octaves ("multiresolution sinusoidal modeling"). The time step-size varies from 25 ms in the low-frequency band (where the frequency resolution is highest), down to 6 ms in the third octave (where frequency resolution is four times lower). In the 0-5 kHz band, sines+noise modeling is carried out. Above 5 kHz, noise substitution is performed, as discussed further below.
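As a rough illustration of the band layout just described, the following Python sketch tabulates three bands below 5 kHz together with their time steps. The 25 ms low-band step and the roughly 6 ms third-octave step come from the text. The band edges at 1.25 and 2.5 kHz, the halving of the step per octave, and the function name `octave_bands` are assumptions made for this example, not details confirmed by the source.

```python
# Sketch of the multiresolution layout below 5 kHz.
# From the text: three bands, 25 ms step in the lowest, about 6 ms in the highest.
# Assumed here: band edges at 1.25 and 2.5 kHz, and a step that halves per octave.

def octave_bands(f_max_hz=5000.0, n_bands=3, low_step_ms=25.0):
    """Return a list of bands, each with its frequency range and time step."""
    bands = []
    for k in range(n_bands):
        hi = f_max_hz / 2 ** (n_bands - 1 - k)   # upper edge of band k
        lo = 0.0 if k == 0 else hi / 2            # lowest band extends down to 0 Hz
        step_ms = low_step_ms / 2 ** k            # assumed: step halves per octave
        bands.append({"lo_hz": lo, "hi_hz": hi, "step_ms": step_ms})
    return bands

layout = octave_bands()
for band in layout:
    print(f"{band['lo_hz']:7.1f} to {band['hi_hz']:7.1f} Hz: step {band['step_ms']} ms")
```

Under these assumptions the highest band gets a step of 6.25 ms, one quarter of the low-band step, which matches the "four times lower" frequency resolution quoted above.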

Figure 10.17: Sines + Noise + Transients Time-Frequency Map (from [132]).

Figure 10.18 shows a similar time-frequency map in which the transient interval depends on frequency. This enables a tighter interval enclosing the transient, and follows audio perception more closely (see Appendix E).

Figure 10.18: Quasi-Constant-Q (Wavelet) Time-Frequency Map (from [132]).

Figure 10.19 illustrates the nature of the noise modeling used. The energy in each Bark band is summed, and this is used as the gain for the noise in that band at that frame time.
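The per-band energy summation can be sketched in Python with NumPy as follows. This is a minimal illustration, not the implementation of [132]: the Hann window, the use of the standard Zwicker critical-band edges, and the function name `bark_band_energies` are all assumptions made for the example.

```python
import numpy as np

# Zwicker critical-band (Bark) edges in Hz: 24 bands up to 15.5 kHz.
# Assumed here; the source does not list the band edges it uses.
BARK_EDGES_HZ = np.array(
    [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
     2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500],
    dtype=float,
)

def bark_band_energies(frame, sample_rate):
    """Sum the power spectrum of one frame within each Bark band."""
    window = np.hanning(len(frame))                 # assumed analysis window
    power = np.abs(np.fft.rfft(frame * window)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    # Band index of each FFT bin; bins at or above the last edge are dropped.
    band = np.searchsorted(BARK_EDGES_HZ, freqs, side="right") - 1
    n_bands = len(BARK_EDGES_HZ) - 1
    keep = band < n_bands
    return np.bincount(band[keep], weights=power[keep], minlength=n_bands)

# Example: a 1 kHz sinusoid should put most of its energy in the 920-1080 Hz band.
fs = 44100
n = np.arange(4096)
frame = np.sin(2 * np.pi * 1000.0 * n / fs)
energies = bark_band_energies(frame, fs)
print(int(np.argmax(energies)))
```

In a noise-substitution setting, only the bands above the crossover frequency would be retained, and one such energy vector would be computed per frame.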

Figure 10.19: Bark-band noise modeling (from [132]).

Figure 10.20 shows the frame gain versus time for a particular Bark band (top) and the piecewise linear envelope made from it (bottom). As illustrated in Figures 10.17 and 10.18, the step size for all of the Bark bands above 5 kHz is approximately 3 ms.
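One simple way to reduce a sequence of frame gains to a piecewise linear envelope is sketched below in Python with NumPy. The text does not say how the breakpoints are chosen, so the greedy rule used here (repeatedly add the frame with the largest error until every frame is within a tolerance) is an assumption for illustration only, as are the function name `piecewise_linear_envelope` and the `tol` parameter. The 3 ms frame step is taken from the text.

```python
import numpy as np

def piecewise_linear_envelope(gains, tol):
    """Greedy breakpoint selection (assumed rule, not from the source).

    Returns the breakpoint frame indices and the envelope obtained by
    linear interpolation between them, evaluated at every frame.
    """
    gains = np.asarray(gains, dtype=float)
    if len(gains) < 2:
        raise ValueError("need at least two frames")
    t = np.arange(len(gains))
    keep = [0, len(gains) - 1]                  # always keep the endpoints
    while True:
        keep.sort()
        approx = np.interp(t, keep, gains[keep])
        err = np.abs(gains - approx)
        worst = int(np.argmax(err))
        if err[worst] <= tol:
            return np.array(keep), approx
        keep.append(worst)                      # add the worst-fit frame as a breakpoint

# Example: a triangular gain trajectory needs only three breakpoints.
step_ms = 3.0                                   # frame step above 5 kHz, from the text
gains = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
breakpoints, envelope = piecewise_linear_envelope(gains, tol=1e-9)
print(breakpoints * step_ms)                    # breakpoint times in ms
```

A larger tolerance yields fewer breakpoints at the cost of a coarser envelope.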

Figure 10.20: Amplitude envelope for one noise band (from [132]).

For more information on this sines+noise+transients system, see Scott Levine's CCRMA PhD/EE thesis [132].



written by Julius Orion Smith III
Julius Smith's background is in electrical engineering (BS Rice 1975, PhD Stanford 1983). He is presently Professor of Music and Associate Professor (by courtesy) of Electrical Engineering at Stanford's Center for Computer Research in Music and Acoustics (CCRMA), teaching courses and pursuing research related to signal processing applied to music and audio systems. See http://ccrma.stanford.edu/~jos/ for details.

