DSPRelated.com

Pitch Estimation using Autocorrelation

Started by olivers September 7, 2005
From what I understand the first minimum of an autocorrelation function
(say 1200 samples with a lag between 0 and 600) will give me the sample
value which can be directly mapped to frequency and thus pitch.

I have tried to extract this minimum from my autocorrelation result with
varied results. I tried using a C (third fret 5th string) on my guitar and
got a sample value for the first minimum which varied between 68 and 75.
From my conversion chart
(http://grace.evergreen.edu/~arunc/intro_doc/node12.htm#SECTION00092000000000000000)

this corresponds to a note which varies between d4 and e4 which is clearly
incorrect.
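
The lag-to-note mapping described above can be sketched as follows. This is only an illustration, not the linked chart's method: the function names, the 44100 Hz rate in the usage lines, and A4 = 440 Hz equal temperament are all assumptions, since the post does not state the sampling rate.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def lag_to_frequency(lag_samples, sample_rate_hz):
    """A period of `lag_samples` samples corresponds to fs / lag hertz."""
    return sample_rate_hz / lag_samples

def frequency_to_note(freq_hz, a4_hz=440.0):
    """Nearest equal-tempered note name (assumes A4 = 440 Hz, MIDI numbering)."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"

# Hypothetical usage: the same lag at two assumed sampling rates.
for fs in (44100, 22050):
    f = lag_to_frequency(150, fs)
    print(fs, round(f, 1), frequency_to_note(f))
```

The same lag maps to very different notes at different sampling rates, so a lag-to-note chart only applies if it was built for the rate the recording actually uses. For reference, the C at the third fret of the 5th string is about 130.8 Hz in standard tuning, which is a period of roughly 337 samples at 44100 Hz.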

I thought maybe that I am doing something wrong in the extraction of the
first minimum. Another article I have just read indicates that the minimum
represents half the period, where the waveform is out of phase, and thus the
maximum indicates the period of the waveform and directly relates to the
pitch. At a guess, my value of 75 (half the period), which translates to 150,
is still wrong.

I am identifying the first minimum by searching for the first change in
sign indicating the first zero crossing. Is this how it's done?

Any help will be great!

Thanks





		
olivers wrote:
> I am identifying the first minimum by searching for the first change in
> sign indicating the first zero crossing.
-1 < 0

--
Jim Thomas
Principal Applications Engineer
Bittware, Inc
jthomas@bittware.com
http://www.bittware.com
(603) 226-0404 x536
Sometimes experience is the only teacher that works - Mike Rosing
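
To make the distinction concrete: a sign change locates a zero crossing, while the minimum lies further along where the function is negative. The sketch below compares the two on an idealized autocorrelation (a cosine with a 100-sample period). The function names and the test signal are assumptions chosen for illustration, not anything from the thread.

```python
import math

def first_zero_crossing(r):
    """First lag where r goes from positive to zero or below."""
    for k in range(1, len(r)):
        if r[k - 1] > 0 >= r[k]:
            return k
    return None

def first_local_minimum(r):
    """First lag that is lower than its left neighbour and not above its right."""
    for k in range(1, len(r) - 1):
        if r[k] < r[k - 1] and r[k] <= r[k + 1]:
            return k
    return None

def peak_after_crossing(r):
    """Lag of the largest value found after the first zero crossing."""
    start = first_zero_crossing(r)
    if start is None:
        return None
    return max(range(start, len(r)), key=lambda k: r[k])

# Idealized autocorrelation of a sinusoid with a 100-sample period.
r = [math.cos(2 * math.pi * k / 100) for k in range(180)]
print(first_zero_crossing(r), first_local_minimum(r), peak_after_crossing(r))
```

For this idealized case the zero crossing falls near a quarter period, the minimum at half a period, and the peak at the full period, so only the last one gives the period directly.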
Autocorrelation measures the degree of similarity of a signal with a
delayed version of itself.  Therefore you should look not for a
minimum, but for the maximum.

This will often, but not always correspond to the fundamental
frequency; sometimes you'll get a harmonic or sub-harmonic, depending
on the shape of the spectrum.  Whitening the signal by center-clipping
is effective in minimizing this problem.  I don't have a reference
handy, but it's commonly done in speech analysis.

cheers,
  jerry
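
A minimal sketch of center-clipping followed by an autocorrelation peak search, in pure Python. The 30% clipping ratio, the lag search range, the function names, and the two-harmonic test signal are all assumptions made for illustration; this is not a reference implementation from the speech literature.

```python
import math

def center_clip(x, ratio=0.3):
    """Zero everything within +/- c of zero and pull the rest toward zero by c,
    where c is `ratio` times the peak magnitude (ratio is an assumed value)."""
    c = ratio * max(abs(v) for v in x)
    out = []
    for v in x:
        if v > c:
            out.append(v - c)
        elif v < -c:
            out.append(v + c)
        else:
            out.append(0.0)
    return out

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation sums for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag, ratio=0.3):
    """Lag of the largest autocorrelation value in [min_lag, max_lag]."""
    r = autocorrelation(center_clip(x, ratio), max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])

# Hypothetical test signal: 150-sample period with a strong second harmonic.
period = 150
x = [math.sin(2 * math.pi * n / period) + 0.5 * math.sin(4 * math.pi * n / period)
     for n in range(1200)]
print(estimate_period(x, 40, 600))
```

The search starts at `min_lag` rather than 0 so that the trivial zero-lag maximum is excluded. Because the sums are not normalized by the number of terms, later multiples of the period score slightly lower than the first one, which nudges the search toward the shortest period.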

olivers wrote:
> From what I understand the first minimum of an autocorrelation function
> (say 1200 samples with a lag between 0 and 600) will give me the sample
> value which can be directly mapped to frequency and thus pitch.
Where did you find that? I am not aware of any simple relation between the time-domain autocorrelation function and the pitch of the signal.
> I have tried to extract this minimum from my autocorrelation result with
> varied results. I tried using a C (third fret 5th string) on my guitar and
> got a sample value for the first minimum which varied between 68 and 75.
> From my conversion chart
> (http://grace.evergreen.edu/~arunc/intro_doc/node12.htm#SECTION00092000000000000000)
>
> this corresponds to a note which varies between d4 and e4 which is clearly
> incorrect.
Try to compute the DFT of the autocorrelation function, and see if you can find the peak in the spectrum.
> I thought maybe that I am doing something wrong in the extraction of the
> first minimum. Another article I have just read indicates that the minimum
> represents half the period, where the waveform is out of phase, and thus the
> maximum indicates the period of the waveform and directly relates to the
> pitch. At a guess, my value of 75 (half the period), which translates to 150,
> is still wrong.
>
> I am identifying the first minimum by searching for the first change in
> sign indicating the first zero crossing. Is this how it's done?
IF the pitch can be extracted from the time-domain autocorrelation
function (I am not sure it can, but I may be wrong) it would be based on
the peak in the autocorrelation function.

For the guitar string, try to look at the peaks in the power spectrum,
i.e. the DFT of the autocorrelation function.

Rune
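
The DFT of the (circular) autocorrelation equals the squared magnitude of the signal's DFT, so the power spectrum can be computed from the signal directly. The sketch below does that with a naive DFT and reports the largest non-DC peak. The function names, the 8000 Hz sample rate, and the test tone are assumptions for illustration.

```python
import cmath
import math

def power_spectrum(x):
    """|X[k]|^2 / N for bins 0..N/2, using a naive O(N^2) DFT."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def peak_frequency(x, sample_rate_hz):
    """Frequency in Hz of the strongest bin, skipping the DC bin."""
    p = power_spectrum(x)
    k = max(range(1, len(p)), key=lambda i: p[i])
    return k * sample_rate_hz / len(x)

# Hypothetical test tone: exactly 8 cycles in 256 samples.
x = [math.sin(2 * math.pi * 8 * n / 256) for n in range(256)]
print(peak_frequency(x, 8000))
```

For a harmonic-rich signal such as a plucked string, the strongest bin may belong to a harmonic rather than the fundamental, so the largest peak is a candidate to check rather than a final answer.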
Carlos Moreno wrote:
> Rune Allnor wrote:
> > olivers wrote:
> >
> >>From what I understand the first minimum of an autocorrelation function
> >>(say 1200 samples with a lag between 0 and 600) will give me the sample
> >>value which can be directly mapped to frequency and thus pitch.
> >
> > Where did you find that? I am not aware of any simple relation between
> > the time-domain autocorrelation function and the pitch of the signal.
>
> If the signal does have a sinusoidal component at period T, then when
> correlating with the version of the signal shifted by T, there will be
> a peak, corresponding to 1/T and all of the multiples (the harmonics).
> In fact, when shifted by T/2, there will be a peak with negative value,
> provided that there are no components of lower frequency.
OK, I am sure you are right, provided the signal consists of a single
sinusoidal. If there are more sinusoidals, or noise present...

Rune
Rune Allnor wrote:
> olivers wrote:
>
>>From what I understand the first minimum of an autocorrelation function
>>(say 1200 samples with a lag between 0 and 600) will give me the sample
>>value which can be directly mapped to frequency and thus pitch.
>
> Where did you find that? I am not aware of any simple relation between
> the time-domain autocorrelation function and the pitch of the signal.
If the signal does have a sinusoidal component at period T, then when
correlating with the version of the signal shifted by T, there will be
a peak, corresponding to 1/T and all of the multiples (the harmonics).
In fact, when shifted by T/2, there will be a peak with negative value,
provided that there are no components of lower frequency.

But I agree with you -- in fact, before reading your reply, I was going
to reply to him suggesting that the DFT would be more appropriate for
that. Computing a DFT using FFT is much *much* faster than computing
the autocorrelation *function* (in fact, computing an FFT is faster than
computing *one sample* of the autocorrelation function if you compute
the A.C. the straightforward way)

Carlos
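
On the cost question: the straightforward sum needs about N multiply-adds per lag, so N times L for L lags, while an FFT of length N costs on the order of N log N. The two routes can also be combined by computing the whole autocorrelation through the FFT. The sketch below is one way to do that, with a small recursive radix-2 FFT written out so it runs without third-party libraries. All function names are assumptions, and a library FFT would normally replace the hand-written one.

```python
import cmath

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(a):
    """Inverse FFT via the conjugate trick."""
    n = len(a)
    return [v.conjugate() / n for v in fft([complex(u).conjugate() for u in a])]

def autocorr_fft(x, max_lag):
    """Autocorrelation sums for lags 0..max_lag via |FFT|^2 and an inverse FFT."""
    size = 1
    while size < 2 * len(x):  # zero-pad so circular wrap-around cannot reach these lags
        size *= 2
    spectrum = fft([complex(v) for v in x] + [0j] * (size - len(x)))
    r = ifft([abs(v) ** 2 for v in spectrum])
    return [r[k].real for k in range(max_lag + 1)]

def autocorr_direct(x, max_lag):
    """Straightforward sum, for comparison."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

x = [1.0, 2.0, 3.0, 4.0]
print(autocorr_direct(x, 3))
print([round(v, 9) for v in autocorr_fft(x, 3)])
```

The zero-padding matters: without it the inverse transform returns the circular autocorrelation, in which large lags wrap around and add to small ones.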
Carlos Moreno wrote:
> Rune Allnor wrote:
> >
> > Where did you find that? I am not aware of any simple relation between
> > the time-domain autocorrelation function and the pitch of the signal.
>
> If the signal does have a sinusoidal component at period T, then when
> correlating with the version of the signal shifted by T, there will be
> a peak, corresponding to 1/T and all of the multiples (the harmonics).
> In fact, when shifted by T/2, there will be a peak with negative value,
> provided that there are no components of lower frequency.
there need be no presumption of having a sinusoidal component with period
T. there need only be a presumption that the signal is periodic with
period T. if the window of summation is wide enough, the autocorrelation
function can be directly related to the Average Squared Difference
Function (ASDF) which is a pretty straight-forward approach to determining
the pitch or period of a (quasi)periodic signal.

    Rx[k] = mean{|x|^2} - 1/2 * ASDF(x, k)

where the ASDF comes to a minimum (say at multiples of T), the
autocorrelation becomes maximum. since the ASDF can never be less than
zero, the autocorrelation can never be greater than the power or
mean{|x|^2}.

--
r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
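
The relation above follows from expanding the square: (x[n] - x[n+k])^2 = x[n]^2 + x[n+k]^2 - 2 x[n] x[n+k], so averaging gives ASDF(x, k) = 2 mean{|x|^2} - 2 Rx[k]. The sketch below checks this numerically. It assumes circular (wrap-around) averaging over a whole number of periods, which makes the identity exact; the function names and the test signal are also assumptions.

```python
import math

def mean_square(x):
    return sum(v * v for v in x) / len(x)

def circular_autocorr(x, k):
    """Mean of x[n] * x[n+k], with indices wrapping around the buffer."""
    n = len(x)
    return sum(x[i] * x[(i + k) % n] for i in range(n)) / n

def asdf(x, k):
    """Average squared difference between x and x shifted by k (circular)."""
    n = len(x)
    return sum((x[i] - x[(i + k) % n]) ** 2 for i in range(n)) / n

# Hypothetical periodic signal: 4 periods of 50 samples, fundamental plus 3rd harmonic.
T = 50
x = [math.sin(2 * math.pi * n / T) + 0.5 * math.sin(6 * math.pi * n / T)
     for n in range(4 * T)]

for k in (0, 10, 25, T):
    lhs = circular_autocorr(x, k)
    rhs = mean_square(x) - 0.5 * asdf(x, k)
    print(k, round(lhs, 6), round(rhs, 6), round(asdf(x, k), 6))
```

At a lag of one period the ASDF drops to (numerically) zero and the autocorrelation reaches the signal power, which is the bound described above.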
Rune Allnor wrote:

   ...

> OK, I am sure you are right, provided the signal consists of a single
> sinusoidal. If there are more sinusoidals, or noise present...
As I understand it, that should be a single _dominant_ sinusoid (and its
harmonics).

Jerry
--
Engineering is the art of making what you want from things you can get.
Jerry Avins wrote:

> Rune Allnor wrote:
>
>   ...
>
>> OK, I am sure you are right, provided the signal consists of a single
>> sinusoidal. If there are more sinusoidals, or noise present...
>
> As I understand it, that should be a single _dominant_ sinusoid (and its
> harmonics).
>
> Jerry
Most instrumental and sung music is quite rich in harmonics -- to the
point where the fundamental cannot be counted on to have the majority of
the energy, or even be there at all (bells, IIRC, have the fundamental
entirely suppressed, yet our brains synthesize it out of the harmonics).

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Tim Wescott wrote:
> Jerry Avins wrote:
>
>> Rune Allnor wrote:
>>
>>   ...
>>
>>> OK, I am sure you are right, provided the signal consists of a single
>>> sinusoidal. If there are more sinusoidals, or noise present...
>>
>> As I understand it, that should be a single _dominant_ sinusoid (and
>> its harmonics).
>>
>> Jerry
>
> Most instrumental and sung music is quite rich in harmonics -- to the
> point where the fundamental cannot be counted on to have the majority of
> the energy, or even be there at all (bells, IIRC, have the fundamental
> entirely suppressed, yet our brains synthesize it out of the harmonics).
Sure, but autocorrelation doesn't necessarily fail to find the missing
fundamental. Imagine (or draw) a square wave from which the fundamental
has been removed. The period of the suppressed fundamental clearly
remains the period of that waveform.

Jerry
--
Engineering is the art of making what you want from things you can get.
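
The square-wave thought experiment can be tried numerically. The sketch below builds a band-limited square wave from odd harmonics 3, 5, 7 and 9 only (no fundamental) and looks for the largest autocorrelation value over a range of lags. The 100-sample period, the number of harmonics, the lag range, and the function names are assumptions chosen for illustration.

```python
import math

PERIOD = 100  # samples per period of the (absent) fundamental, an assumed value

def square_without_fundamental(n_samples, period=PERIOD, harmonics=(3, 5, 7, 9)):
    """Odd-harmonic series of a square wave with the first harmonic left out."""
    return [sum(math.sin(2 * math.pi * h * n / period) / h for h in harmonics)
            for n in range(n_samples)]

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation sums for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def strongest_lag(x, min_lag, max_lag):
    """Lag of the largest autocorrelation value in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])

x = square_without_fundamental(800)
print(strongest_lag(x, 20, 300))
```

The third harmonic alone would repeat every third of the period, but the harmonics only line up again together after the full period, so that is where the autocorrelation is largest even though no energy sits at the fundamental frequency.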