Pitch Detection from Sinusoidal Peaks
Given a set of sinusoidal peak frequencies $f_i$, $i = 1, \ldots, N$, it
is straightforward to form a pitch estimate. Here we assume
that ``pitch'' means the same thing as ``fundamental frequency'' or
``F0'', i.e., that the signal is periodic, so that all of its
sinusoidal components are harmonics of the fundamental
frequency F0. For inharmonic sounds, the perceived pitch can be complex
to predict.
The F0-detection algorithm described in this section consists of the
following steps:
- Find the peak of the histogram of the peak-frequency-differences
in order to find the most common harmonic spacing. This is the nominal
pitch estimate.
- Refine the nominal pitch estimate using linear regression. Linear
regression simply fits a straight line through the data to
give a least-squares fit.
The slope of the fitted line gives the pitch estimate.
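The two steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the book's reference implementation: the function name `estimate_f0`, the `bin_width` parameter, the use of differences between adjacent sorted peaks, and the assignment of harmonic numbers by rounding each peak frequency divided by the nominal estimate are all choices made here for the example.

```python
import numpy as np

def estimate_f0(peak_freqs, bin_width=5.0):
    """Return (nominal, refined) F0 estimates in Hz from peak frequencies in Hz.

    Hypothetical helper: bin_width (Hz) is an assumed histogram resolution.
    """
    f = np.sort(np.asarray(peak_freqs, dtype=float))
    if f.size < 2:
        raise ValueError("need at least two peak frequencies")

    # Step 1: histogram of differences between adjacent peak frequencies.
    diffs = np.diff(f)
    n_bins = max(1, int(np.ceil((diffs.max() - diffs.min()) / bin_width)))
    counts, edges = np.histogram(diffs, bins=n_bins)
    k = int(np.argmax(counts))
    in_bin = (diffs >= edges[k]) & (diffs <= edges[k + 1])
    f0_nominal = diffs[in_bin].mean()  # most common harmonic spacing

    # Step 2: assign a harmonic number to each peak (assumed: nearest integer
    # multiple of the nominal estimate), then fit a straight line by least
    # squares. The slope of that line is the refined F0 estimate.
    harmonics = np.round(f / f0_nominal)
    keep = harmonics >= 1
    slope, _intercept = np.polyfit(harmonics[keep], f[keep], 1)
    return float(f0_nominal), float(slope)

# Example: slightly perturbed harmonics of 220 Hz with the 6th harmonic missing.
peaks = [220.4, 439.7, 660.2, 879.9, 1100.3, 1540.1]
nominal, refined = estimate_f0(peaks)
print(nominal, refined)
```

The missing harmonic produces one difference near twice the fundamental, but the histogram peak still lands on the most common spacing, which is the point of the first step. The regression then averages out the small frequency errors across all peaks.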
In many cases, results are improved through the use of preprocessing
of the spectrum prior to peak finding. Examples include the following:
- Pre-emphasis: Equalize the spectrum so as to flatten it.
- Masking: Small peaks close to much larger peaks are often
masked by the auditory system. Therefore, it is good practice to reject
all peaks below an inaudibility threshold which is the maximum of the
threshold of hearing (versus frequency) and the masking pattern
generated by the largest peaks. Since it is simple to extract peaks
in descending magnitude order, each removed peak can be replaced by
its masking pattern, which elevates the assumed inaudibility
threshold.
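The masking step can be sketched as a greedy pass over the peaks in descending magnitude order. The sketch below is only illustrative: the function name `reject_masked_peaks`, the flat 0 dB default for the threshold of hearing, and the triangular masking pattern (a fixed offset below the masker, falling off linearly in dB per octave) are assumptions made for this example and are not a calibrated psychoacoustic model. It also assumes that only peaks judged audible contribute masking patterns.

```python
import numpy as np

def reject_masked_peaks(freqs_hz, levels_db,
                        hearing_threshold_db=lambda f: 0.0,
                        mask_offset_db=15.0,
                        mask_slope_db_per_octave=30.0):
    """Return sorted indices of peaks above the inaudibility threshold.

    All parameters are hypothetical placeholders for illustration.
    """
    freqs = np.asarray(freqs_hz, dtype=float)
    levels = np.asarray(levels_db, dtype=float)
    order = np.argsort(levels)[::-1]  # visit peaks from largest to smallest

    maskers = []  # (frequency, level) of peaks accepted so far
    kept = []
    for i in order:
        f, level = freqs[i], levels[i]
        # Inaudibility threshold at f: the maximum of the threshold of
        # hearing and the masking patterns of all larger accepted peaks.
        threshold = hearing_threshold_db(f)
        for fm, lm in maskers:
            octaves = abs(np.log2(f / fm))
            pattern = lm - mask_offset_db - mask_slope_db_per_octave * octaves
            threshold = max(threshold, pattern)
        if level > threshold:
            kept.append(int(i))
            maskers.append((f, level))  # its pattern raises the threshold
    return sorted(kept)

# Example: a weak peak at 450 Hz sits next to a strong peak at 440 Hz.
audible = reject_masked_peaks([440.0, 450.0, 880.0, 3000.0],
                              [60.0, 20.0, 50.0, -5.0])
print(audible)
```

Because the peaks are visited in descending magnitude order, every masker that could affect a given peak has already been added to the threshold by the time that peak is tested.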
written by Julius Orion Smith III
Julius Smith's background is in electrical engineering (BS Rice 1975, PhD Stanford 1983). He is presently Professor of Music and Associate Professor (by courtesy) of Electrical Engineering at
Stanford's Center for Computer Research in Music and Acoustics (CCRMA), teaching courses and pursuing research related to signal processing applied to music and audio systems. See
http://ccrma.stanford.edu/~jos/ for details.