
How do I handle harmonics for autocorrelation peaks? Need an algorithm I think.

Started by noodle22 February 20, 2009
Hi,

I am trying to determine the pitch of an audio signal using
autocorrelation.  My autocorrelation peaks come out quite clearly and after
a bit of processing, I have a signal where one of the following is true

a) all the peaks are close to the same height as their neighbors
b) Every second peak is significantly taller than the previous peak
c) Every third peak is significantly taller than the previous 2 peaks

Clearly I am dealing with harmonics.  My problem is trying to get my
program to recognize what the fundamental frequency of the signal is.

Methods I have tried
a) get the average height for all peaks, every second peak, and every
third peak and then compare.  Ideally if one group stands out, it indicates
a harmonic
b) compare the height of a peak with the height of its neighbors for
every peak, every 2nd peak, and every 3rd peak.  If a group stands out, it
indicates the fundamental frequency

These two methods work reasonably well but they do not have the
sophistication to be overly accurate and I often run into issues where
viewing the spectrum, I can tell what the frequency is but the algorithms
above are too finicky to get it right.  This happens because not necessarily
every peak will be identified (to keep from identifying noise as peaks) and
also, the result of the acf is that the farther the signals being correlated
are shifted, the smaller the peaks tend to be.
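
The shrinking of peaks at larger shifts is partly an artifact of fewer samples overlapping at each lag. A minimal sketch of one common correction, dividing each lag by its overlap count, is shown here. The function name `unbiased_acf` is hypothetical, and the sketch assumes the ACF is computed over a single finite frame.

```python
import numpy as np

def unbiased_acf(x):
    """Autocorrelation with each lag divided by its number of overlapping samples."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    full = np.correlate(x, x, mode="full")[n - 1:]  # lags 0 .. n-1
    overlap = n - np.arange(n)                      # samples contributing to each lag
    acf = full / overlap
    return acf / acf[0]                             # normalize so lag 0 equals 1

# Example: a sine with a 50-sample period keeps full-height peaks at large lags.
t = np.arange(1000)
acf = unbiased_acf(np.sin(2 * np.pi * t / 50))
print(acf[50], acf[500])
```

The trade-off is that the largest lags are estimated from very few samples, so they become noisy and are usually ignored.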

I think the computer should be able to do a better job than me identifying
peaks.  So, what I'm wondering is, is there another way I can do this?
Maybe something I can graph and fit a line to?  I have the position of a
number of peaks, and a normalized value for their heights to work with.

Thanks for any assistance.  I am now stumped.
Also, I guess I have the minimum distance between peaks as well (well, I
think I can probably come up with a pretty good way to find it at least) 
On Feb 19, 11:53 pm, "noodle22" <jw970...@yahoo.com> wrote:
> Hi,
>
> I am trying to determine the pitch of an audio signal using
> autocorrelation.  My autocorrelation peaks come out quite clearly and after
> a bit of processing, I have a signal where one of the following is true
>
> a) all the peaks are close to the same height as their neighbors
> b) Every second peak is significantly taller than the previous peak
> c) Every third peak is significantly taller than the previous 2 peaks
>
> Clearly I am dealing with harmonics.  My problem is trying to get my
> program to recognize what the fundamental frequency of the signal is.
first YOU have to decide what you mean by the fundamental frequency. i presume you mean that fundamental frequency is the reciprocal of the period.

now which peak is located at the period?? is it behind door number 1, door number 2, or door number 3? you have a fundamental issue here to think about.

strictly speaking, the highest peak (which you tell us is the 3rd one) represents mathematically the period of the most periodic function. but, if it's audio, maybe we're gonna hear the pitch as what goes with the 1st decent peak and those other peaks represent what happens when you get some inaudible sub-harmonic.

your problem is the issue of SUBharmonics, not harmonics.

r b-j
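
One way to encode the "1st decent peak" idea is to accept the earliest peak that comes within some fraction of the tallest one. This is only a sketch of a common heuristic, not something proposed in this thread. The function name `pick_period` and the `ratio` parameter are hypothetical, and the right value of `ratio` depends on the signal.

```python
def pick_period(peak_lags, peak_heights, ratio=0.8):
    """Return the lag of the earliest peak at least `ratio` times the tallest peak.

    `ratio` is a hypothetical tuning parameter, not a value from the thread.
    """
    if not peak_lags:
        raise ValueError("no peaks given")
    tallest = max(peak_heights)
    for lag, height in sorted(zip(peak_lags, peak_heights)):
        if height >= ratio * tallest:
            return lag

# Peaks of similar height: the first one wins, avoiding a sub-harmonic.
print(pick_period([100, 200, 300], [0.90, 0.92, 1.00]))
# Every third peak much taller: the third one wins.
print(pick_period([100, 200, 300], [0.40, 0.50, 1.00]))
```

A high `ratio` leans toward the mathematically "most periodic" lag, a low one toward the shortest plausible period.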
On 20 Feb, 05:53, "noodle22" <jw970...@yahoo.com> wrote:
> Hi,
>
> I am trying to determine the pitch of an audio signal using
> autocorrelation.  My autocorrelation peaks come out quite clearly and after
> a bit of processing, I have a signal where one of the following is true
>
> a) all the peaks are close to the same height as their neighbors
> b) Every second peak is significantly taller than the previous peak
> c) Every third peak is significantly taller than the previous 2 peaks

One of those awkward details is missing: Are you talking about frequency domain or time domain?

...

> I think the computer should be able to do a better job than me identifying
> peaks.

Why do you think this?

Rune
>Hi,
>
>I am trying to determine the pitch of an audio signal using
>autocorrelation. My autocorrelation peaks come out quite clearly and after
>a bit of processing, I have a signal where one of the following is true
>
>a) all the peaks are close to the same height as their neighbors
>b) Every second peak is significantly taller than the previous peak
>c) Every third peak is significantly taller than the previous 2 peaks
>
>Clearly I am dealing with harmonics. My problem is trying to get my
>program to recognize what the fundamental frequency of the signal is.
>
>Methods I have tried
>a) get the average height for all peaks, every second peak, and every
>third peak and then compare. Ideally if one group stands out, it
>indicates a harmonic
I don't exactly understand the second sentence. In general the idea sounds reasonable, except that you might want to take the distance between peaks into account. If you just take every second one, any spurious peak will ruin the whole scheme completely.

I would try to identify pulse-trains (with regular spacing) and I would suggest trying different start points. Create candidates by starting at the first peak, the second and maybe the third one. And try to rule out in-between peaks that deviate from a regular spacing.
In other words:
1. Take the first peak
2. go to the second and measure the distance
3. consider all following peaks at multiples of this distance as one group

Create some pitch candidates by repeating this, and skipping the first and maybe second peak and build candidates by using every second (or third peak) but also stick to a regular distance. Find the candidate whose pulse-train has the highest energy.

A friend once made a pitch tracker for a tuba using a very simple method (she added DC and measured zero crossing distance). That thing worked fine and outperformed more complex pitch-trackers in terms of latency. What I want to say is that if you can limit the class of signals you want to be able to track, you can use specific characteristics of those signals and you have much better chances to get it to work.

Bjoern
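
The pulse-train idea could be sketched roughly as follows. Several things here are assumptions rather than part of the suggestion: candidate periods are taken from the lags of the first few detected peaks (measured from lag 0), the score is the mean energy per expected pulse rather than the total (a total would always favor the shortest spacing, since it collects every peak), and the names `best_pulse_train`, `n_starts` and `tol` are hypothetical.

```python
def best_pulse_train(lags, heights, n_starts=3, tol=0.05):
    """Score candidate periods taken from the first `n_starts` peaks.

    lags/heights describe detected ACF peaks (lag 0 excluded), sorted by lag.
    n_starts and tol are hypothetical tuning parameters.
    """
    max_lag = lags[-1]
    best_period, best_score = None, -1.0
    for period in lags[:n_starts]:
        expected = int(max_lag / period + tol)   # pulses a clean train would have
        energy = 0.0
        for lag, h in zip(lags, heights):
            k = round(lag / period)              # nearest multiple of the period
            if k >= 1 and abs(lag - k * period) <= tol * period:
                energy += h * h                  # peak sits on the regular grid
        score = energy / expected                # missing pulses count as zero
        if score > best_score:
            best_period, best_score = period, score
    return best_period, best_score

# Case c) from the original post: every third peak is much taller.
lags = [100, 200, 300, 400, 500, 600, 700, 800, 900]
heights = [0.4, 0.4, 1.0] * 3
print(best_pulse_train(lags, heights))
```

Because peaks off the regular grid are ignored and missing pulses lower the score, a single spurious or undetected peak shifts the result less than plain "every second peak" counting does.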
>b) compare the height of a peak with the height of its neighbors for
>every peak, every 2nd peak, and every 3rd peak. If a group stands out, it
>indicates the fundamental frequency
>
>Hi,
>
>I am trying to determine the pitch of an audio signal using
>autocorrelation. My autocorrelation peaks come out quite clearly and after
>a bit of processing, I have a signal where one of the following is true
>
>a) all the peaks are close to the same height as their neighbors
>b) Every second peak is significantly taller than the previous peak
>c) Every third peak is significantly taller than the previous 2 peaks
>
>Clearly I am dealing with harmonics. My problem is trying to get my
>program to recognize what the fundamental frequency of the signal is.
>
>Methods I have tried
>a) get the average height for all peaks, every second peak, and every
>third peak and then compare. Ideally if one group stands out, it indicates
>a harmonic
>b) compare the height of a peak with the height of its neighbors for
>every peak, every 2nd peak, and every 3rd peak. If a group stands out, it
>indicates the fundamental frequency
>
>These two methods work reasonably well but they do not have the
>sophistication to be overly accurate and I often run into issues where
>viewing the spectrum, I can tell what the frequency is but the algorithms
>above are too finicky to get it right. This happens because not necessarily
>every peak will be identified (to keep from identifying noise as peaks) and
>also, the result of the acf is that the farther the signals being correlated
>are shifted, the smaller the peaks tend to be.
>
>I think the computer should be able to do a better job than me identifying
>peaks. So, what I'm wondering is, is there another way I can do this?
>Maybe something I can graph and fit a line to? I have the position of a
>number of peaks, and a normalized value for their heights to work with.
>
>Thanks for any assistance. I am now stumped
>
This is a well-known difficult problem and gets a lot worse when multiple voices are present, for example chords played on a guitar. You do not say how the note is produced; in the case of a plucked string instrument the ratio of harmonics depends on the plucking or bowing position on the string. In some cases the first harmonic can be greater than the fundamental or even missing.

One approach is to perform a constant Q transform instead of an FFT. This will place the spectral lines into semitone space, then your harmonics will lie at a constant interval apart. So first harmonic is 12 semitones above the fundamental, the next is 7, then 4, then 3 (if I remember correctly). You can now get rough results by correlating for this pattern in the CQ spectrum. The problem is now one of pattern matching.

Music forums such as KVR are a rich source of help on these issues.
Ok, thanks for the responses so far.  I have uploaded an image of my
results to help clarify the issue

http://www.studypipe.com/Shared/images/Peaks.png

The x axis is just sample number and it is in the time domain.  I zeroed
the center peak since it is always perfectly correlated but to count peaks,
just start in the center and either go left or right to the end.  


>first YOU have to decide what you mean by the fundamental frequency.
>i presume you mean that fundamental frequency is the reciprocal of the
>period.

Good point.  I'm not sure I always use this term correctly.  What I was referring to is the lowest frequency (and coincidentally, the frequency that I am playing on my keyboard and trying to identify).  If you look at the first graph in the image, the fundamental frequency I was referring to was 1/(the distance between every 3rd peak).  As for sub-harmonics, I'm pretty sure my issue is harmonics since those other smaller peaks are at a higher frequency than the peak I am trying to measure.

> One of those awkward details is missing: Are you talking
> about frequency domain or time domain?

I wasn't very clear about that.  I am talking about time domain.
>Create candidates by starting at the first peak,
>the second and maybe the third one. And try to rule out in-between
>peaks that deviate from a regular spacing.
>In other words:
>1. Take the first peak
>2. go to the second and measure the distance
>3. consider all following peaks at multiples of this distance
>as one group

My algorithm was not really described in a whole lot of detail but that is basically what I am doing.  And then I am using the group with the highest average height to determine the pitch.  It works ok but I was hoping there was a better method.

As for your friend's idea of zero crossings... I think this would be a fantastic way to do it and I know that it does perform well.  However, I have an issue with hum at 60 Hz.  This really throws off the zero-crossing count.  And I don't really want to use a notch/stop band filter because I have frequencies I want to measure near 60 Hz, and also, I want to keep calculations as minimal as possible.  I have clear peaks with my current method so I don't think a notch filter is really necessary.

So, now that you guys have seen my peaks, any other ideas on how I can get the proper frequency that I am looking for?
On 20 Feb, 16:18, "noodle22" <jw970...@yahoo.com> wrote:
> Ok, thanks for the responses so far.  I have uploaded an image of my
> results to help clarify the issue
>
> http://www.studypipe.com/Shared/images/Peaks.png
>
> The x axis is just sample number and it is in the time domain.
With these data I'd just go for a threshold:

- Decide on a threshold T
- Find all intervals [a_m,b_m] such that r[k] > T for all k in [a_m,b_m]
- Find the maximum inside each interval

... and there's your peaks.

Rune
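
The three threshold steps can be sketched as below, assuming the autocorrelation `r` is available as an array and `T` is chosen by hand. The function name `threshold_peaks` is hypothetical.

```python
import numpy as np

def threshold_peaks(r, T):
    """Return the index of the maximum inside every run of samples with r[k] > T."""
    r = np.asarray(r, dtype=float)
    # Pad with False so runs touching either end of the array still get two edges.
    above = np.concatenate(([False], r > T, [False]))
    edges = np.flatnonzero(above[1:] != above[:-1])  # run starts and ends alternate
    starts, ends = edges[0::2], edges[1::2]          # each run covers r[a:b]
    return [int(a + np.argmax(r[a:b])) for a, b in zip(starts, ends)]

r = [0.0, 0.2, 0.9, 0.5, 0.1, 0.6, 0.8, 0.7, 0.0]
print(threshold_peaks(r, 0.4))
```

Taking one maximum per interval means small ripples near the top of a peak are not reported as separate peaks, as long as they stay above T.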
>One approach is to perform a constant Q transform instead of an fft. This
>will place the spectral lines into semitone space, then your harmonics will
>lie at a constant interval apart. So first harmonic is 12 semitones above
>the fundamental, the next is 7, then 4, then 3 (if I remember correctly)
I have never heard of a constant Q transform before but after reading about it a bit, it seems like in some ways it would have been ideal.  I think it is now too late for me to go back to a frequency domain solution as I have done a lot of time domain work and I would like to keep it if possible.

I originally made the pitch detector work with an FFT but I dropped that solution for ACF because
a) To cover the frequency range I was interested in, I needed a lot of samples, which gave me a refresh rate that was too slow
b) Bin sizes were a bit of a problem.  I wanted a fairly accurate result but my bins were too large.  When I shrunk them with zero padding, I only made the refresh rate problem worse since it took longer to compute.
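
The trade-off in a) and b) follows from the bin spacing of an N-point FFT being fs/N. A rough calculation is sketched below. The 44.1 kHz sample rate is an assumption (none is given in the thread), and `fft_tradeoff` is a hypothetical helper name.

```python
FS = 44100.0  # assumed sample rate in Hz, not stated in the thread

def fft_tradeoff(n_fft, fs=FS):
    """Return (bin spacing in Hz, duration in seconds of n_fft samples)."""
    return fs / n_fft, n_fft / fs

for n_fft in (4096, 32768):
    bin_hz, seconds = fft_tradeoff(n_fft)
    print(f"N={n_fft}: {bin_hz:.2f} Hz per bin, {seconds * 1000:.0f} ms of samples")
```

For comparison, adjacent semitones near 60 Hz are only about 3.6 Hz apart, so at this assumed rate a 4096-point transform cannot separate them by bin index alone.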
>Music forums such as KVR are a rich source of help on these issues.
I will check this out more as well. Thanks
On Feb 20, 10:10 am, "sigmonde" <cmacl...@soundmotion.co.uk> wrote:
> >Hi,
> >
> >I am trying to determine the pitch of an audio signal using
> >autocorrelation.  My autocorrelation peaks come out quite clearly and after
> >a bit of processing, I have a signal where one of the following is true
> >
> >a) all the peaks are close to the same height as their neighbors
> >b) Every second peak is significantly taller than the previous peak
> >c) Every third peak is significantly taller than the previous 2 peaks
> >
> >Clearly I am dealing with harmonics.  My problem is trying to get my
> >program to recognize what the fundamental frequency of the signal is.
> >
> >Methods I have tried
> >a) get the average height for all peaks, every second peak, and every
> >third peak and then compare.  Ideally if one group stands out, it indicates
> >a harmonic
> >b) compare the height of a peak with the height of its neighbors for
> >every peak, every 2nd peak, and every 3rd peak.  If a group stands out, it
> >indicates the fundamental frequency
> >
> >These two methods work reasonably well but they do not have the
> >sophistication to be overly accurate and I often run into issues where
> >viewing the spectrum, I can tell what the frequency is but the algorithms
> >above are too finicky to get it right.  This happens because not necessarily
> >every peak will be identified (to keep from identifying noise as peaks) and
> >also, the result of the acf is that the farther the signals being correlated
> >are shifted, the smaller the peaks tend to be.
> >
> >I think the computer should be able to do a better job than me identifying
> >peaks.  So, what I'm wondering is, is there another way I can do this?
> >Maybe something I can graph and fit a line to?  I have the position of a
> >number of peaks, and a normalized value for their heights to work with.
> >
> >Thanks for any assistance.  I am now stumped
please read what i wrote yesterday. there were a few other posts about this to both comp.dsp and the music-dsp mailing list.

autocorrelation or the AMDF (or ASDF) methods get you to about the same place (picking peaks or alternatively notches). there are some nasty problems (sub-harmonics) that you just can't get away from and you have to think hard about how you, as a human viewing the autocorrelation data, would choose which peak (or notch).

what do you think you hear when a strong 440 Hz tone has a very weak 220 Hz tone added to it? is it A above middle C or is it A below middle C? mathematically, what is it?
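
A small numerical illustration of the 440 Hz plus weak 220 Hz question, under stated assumptions: the sample rate of 44000 Hz is chosen only so that both periods are whole numbers of samples, the weak tone has one tenth the amplitude of the strong one, and a circular autocorrelation over exactly one second is used so the result is free of window effects.

```python
import numpy as np

fs = 44000                      # assumed rate: 440 Hz -> 100 samples, 220 Hz -> 200
n = np.arange(fs)               # one second, a whole number of cycles of both tones
x = np.sin(2 * np.pi * 440 * n / fs) + 0.1 * np.sin(2 * np.pi * 220 * n / fs)

# Circular autocorrelation via the power spectrum, normalized so lag 0 equals 1.
spectrum = np.fft.rfft(x)
acf = np.fft.irfft(np.abs(spectrum) ** 2, n=len(x))
acf /= acf[0]

lag_440 = fs // 440             # one period of the strong tone
lag_220 = fs // 220             # one period of the combined signal
print(acf[lag_440], acf[lag_220])
```

The combined signal only repeats exactly every 1/220 s, so the peak at the 220 Hz lag is the tallest, but by about 2% here. A peak picker that takes the maximum reports 220 Hz even though the 220 Hz component carries only 1% of the power of the 440 Hz one.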
> This is a well-known difficult problem and gets a lot worse when multiple
> voices are present, for example chords played on a guitar. You do not say
> how the note is produced,
the only hope you have if you play a chord into a pitch detector is, if the chord is a nice major chord, it might pick the common subharmonic to all of the notes in the chord. otherwise you get shit.
> in the case of a plucked string instrument the
> ratio of harmonics depends on the plucking or bowing position on the
> string. In some cases the first harmonic can be greater than the
> fundamental or even missing.
the fundamental *is* the "first harmonic" the way most of us count 'em. sometimes the "first overtone" refers to the second harmonic.
> One approach is to perform a constant Q transform instead of an fft. This
> will place the spectral lines into semitone space, then your harmonics will
> lie at a constant interval apart. So first harmonic is 12 semitones above
> the fundamental, the next is 7, then 4, then 3 (if I remember correctly)
1  = 2^(0/12)
2  = 2^(12/12)
3 ~= 2^(19/12)
4  = 2^(24/12)   (so it's "5" instead of 4)
5 ~= 2^(28/12)   (so it's "4" instead of 3)
6 ~= 2^(31/12)   (now it's 3...)
7 ~= 2^(34/12)
8  = 2^(36/12)
9 ~= 2^(38/12)
...
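
The table above can be reproduced by rounding 12*log2(h) for each harmonic number h. A minimal sketch (the function name `harmonic_semitones` is hypothetical):

```python
import math

def harmonic_semitones(n_harmonics):
    """Nearest semitone offset above the fundamental for harmonics 1..n_harmonics."""
    return [round(12 * math.log2(h)) for h in range(1, n_harmonics + 1)]

offsets = harmonic_semitones(9)
intervals = [b - a for a, b in zip(offsets, offsets[1:])]
print(offsets)    # semitones above the fundamental
print(intervals)  # spacing between successive harmonics
```

The spacing between successive harmonics comes out as 12, 7, 5, 4, 3, 3, 2, 2 semitones, so in a constant-Q spectrum the harmonics form a fixed pattern that shifts with pitch rather than an evenly spaced comb.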
>
> You can now get rough results by correlating for this pattern in the CQ
> spectrum. The problem is now one of pattern matching.
>
> Music forums such as KVR are a rich source of help on these issues.
i'm unfamiliar with KVR. is it USENET or a mailing list or a Yahoo group? where is it?

another good group about music and dsp is simply the "music-dsp" mailing list.

r b-j