Reply by Martin Eisenberg February 22, 2009
noodle22 wrote:

> http://www.studypipe.com/Shared/images/Peaks.png
>
> The x axis is just sample number and it is in the time domain.

Better to call it the lag domain.

Martin

--
Quidquid latine scriptum est, altum videtur.
Reply by Vladimir Vassilevsky February 20, 2009

noodle22 wrote:

> Hi,
>
> I am trying to determine the pitch of an audio signal using
> autocorrelation.
The standard way of doing that is the normalized autocorrelation. Start searching for the maximum of the normalized ACF from the higher pitch values (the shorter lags). Accept a later maximum only if it is higher than the previous maximum by a factor of X. X depends on the particulars of your setup and the history. BTW, you might need to oversample the signal for the results to be more accurate and clear.
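A minimal sketch of the search described above, in plain Python. The function names, the per-lag energy normalization, and the test tone are assumptions made for illustration; the post does not say which normalization is meant or how X should be chosen.

```python
import math

def normalized_acf(x, max_lag):
    """Normalized autocorrelation r[0..max_lag]. Each lag is scaled by the
    energy of the two overlapping segments (one possible normalization,
    assumed here), so values stay within [-1, 1]."""
    n = len(x)
    r = []
    for k in range(max_lag + 1):
        a, b = x[:n - k], x[k:]
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        r.append(num / den if den > 0.0 else 0.0)
    return r

def pick_pitch_lag(r, min_lag, factor):
    """Scan local maxima from short lags (high pitch) towards long lags.
    A later maximum replaces the current choice only if it exceeds it
    by `factor` (the X in the post)."""
    best_lag, best_val = None, 0.0
    for k in range(max(min_lag, 1), len(r) - 1):
        is_peak = r[k] > r[k - 1] and r[k] >= r[k + 1] and r[k] > 0.0
        if is_peak and (best_lag is None or r[k] > factor * best_val):
            best_lag, best_val = k, r[k]
    return best_lag

# Hypothetical test tone: 50-sample period plus a second harmonic.
x = [math.sin(2 * math.pi * i / 50) + 0.5 * math.sin(4 * math.pi * i / 50)
     for i in range(400)]
lag = pick_pitch_lag(normalized_acf(x, 150), min_lag=10, factor=1.1)
```

With this tone the peaks at lags 50 and 100 are almost equally high, so the factor test keeps the first one and `lag` comes out as the 50-sample period. The pitch in Hz would then be the sample rate divided by `lag`.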
> So, what i'm wondering is, is there another way I can do this?
You can look for the comb period of the harmonics in the FFT domain. This is pretty accurate.

Ah, of course, there is the ultimate pitch detector by Dmitry Teres, but nobody knows what it is and how well it works.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Reply by noodle22 February 20, 2009
>there are some nasty problems (sub-harmonics) that you just can't get away from
I was having some problems understanding this because it is not something I experience with my tests. I've since read a bit about it and I understand that it is common with certain instruments such as bells.

For the sake of simplicity, assume that we will not have sub-harmonics. I will limit my pitch detector to just working for string instruments. It is a project I am doing for fun so it doesn't really have to capture every scenario. Also, might as well limit it to just one note at a time. I think the problem becomes significantly more difficult if I add other tones so one is enough :)

If I can find some way to manipulate the peak data in that pic I posted so that I could obtain the fundamental frequency, then I would be all set.

I want to address the threshold idea:
> With these data I'd just go for a threshold:
>
> - Decide on a threshold T
> - Find all intervals [a_m,b_m] such that r[k] > T, k = [a_m,b_m]
> - Find the maximum inside each interval
> ... and there's your peaks.
I was initially finding peaks just using a threshold and taking the top peaks. However, my threshold was a straight horizontal line across all samples, and since the magnitude decreases the farther I get from center, it did not work overly well. Your method looks like it might be better, but I don't completely understand it the way it is described above. Can you elaborate on it a bit more? What exactly is [a_m,b_m], k = [a_m,b_m], and r[k]? I would guess that r[k] > T are peaks that are above the threshold over the interval a_m to b_m...so I guess my question is, how do you choose the intervals a_m to b_m (assuming there are several rather than just 1), and do you change the threshold T for each different interval?
Reply by robert bristow-johnson February 20, 2009
On Feb 20, 10:10 am, "sigmonde" <cmacl...@soundmotion.co.uk> wrote:
> >Hi,
> >
> >I am trying to determine the pitch of an audio signal using
> >autocorrelation. My autocorrelation peaks come out quite clearly and after
> >a bit of processing, I have a signal where one of the following is true
> >
> >a) all the peaks are close to the same height as their neighbors
> >b) Every second peak is significantly taller than the previous peak
> >c) Every third peak is significantly taller than the previous 2 peaks
> >
> >Clearly I am dealing with harmonics. My problem is trying to get my
> >program to recognize what the fundamental frequency of the signal is.
> >
> >Methods I have tried
> >a) get the average height for all peaks, every second peak, and every
> >third peak and then compare. Ideally if one group stands out, it indicates
> >a harmonic
> >b) compare the height of a peak with the height of its neighbors for
> >every peak, every 2nd peak, and every 3rd peak. If a group stands out, it
> >indicates the fundamental frequency
> >
> >These two methods work reasonably well but they do not have the
> >sophistication to be overly accurate and I often run into issues where
> >viewing the spectrum, I can tell what the frequency is but the algorithms
> >above are too finicky to get it right. This happens because not necessarily
> >every peak will be identified (to keep from identifying noise as peaks) and
> >also, the result of the acf is that the farther the signals being correlated
> >are shifted, the smaller the peaks tend to be.
> >
> >I think the computer should be able to do a better job than me identifying
> >peaks. So, what i'm wondering is, is there another way I can do this?
> >Maybe something I can graph and fit a line to? I have the position of a
> >number of peaks, and a normalized value for their heights to work with.
> >
> >Thanks for any assistance. I am now stumped
please read what i wrote yesterday. there were a few other posts about this to both comp.dsp and the music-dsp mailing list.

autocorrelation or the AMDF (or ASDF) methods get you to about the same place (picking peaks or alternatively notches). there are some nasty problems (sub-harmonics) that you just can't get away from and you have to think hard about how you, as a human viewing the autocorrelation data, would choose which peak (or notch).

what do you think you hear when a strong 440 Hz tone has a very weak 220 Hz tone added to it? is it A above middle C or is it A below middle C? mathematically, what is it?
> This is a well-known difficult problem and gets a lot worse when multiple
> voices are present, for example chords played on a guitar. You do not say
> how the note is produced,
the only hope you have if you play a chord into a pitch detector is, if the chord is a nice major chord, it might pick the common subharmonic to all of the notes in the chord. otherwise you get shit.
> in the case of a plucked string instrument the
> ratio of harmonics depends on the plucking or bowing position on the
> string. In some cases the first harmonic can be greater than the
> fundamental or even missing.
the fundamental *is* the "first harmonic" the way most of us count 'em. sometimes the "first overtone" refers to the second harmonic.
> One approach is to perform a constant Q transform instead of an fft. This
> will place the spectral lines into semitone space, then your harmonics will
> lie at a constant interval apart. So first harmonics is 12 semitones above
> the fundamental the next is 7 then 4 then 3 (if I remember correctly)
1 = 2^(0/12)
2 = 2^(12/12)
3 ~= 2^(19/12)
4 = 2^(24/12)   (so it's "5" instead of 4)
5 ~= 2^(28/12)  (so it's "4" instead of 3)
6 ~= 2^(31/12)  (now it's 3...)
7 ~= 2^(34/12)
8 = 2^(36/12)
9 ~= 2^(38/12)
...
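The table above follows from the semitone offset of the n-th harmonic, 12*log2(n). A small sketch that reproduces it (the function name is made up for illustration):

```python
import math

def harmonic_semitones(n_harmonics):
    """Semitone offset of harmonics 1..n_harmonics above the fundamental."""
    return [12.0 * math.log2(n) for n in range(1, n_harmonics + 1)]

offsets = harmonic_semitones(9)
# Nearest whole semitone for each harmonic.
rounded = [round(s) for s in offsets]
# Interval in semitones from each harmonic to the next one.
steps = [b - a for a, b in zip(rounded, rounded[1:])]

print(rounded)  # [0, 12, 19, 24, 28, 31, 34, 36, 38]
print(steps)    # [12, 7, 5, 4, 3, 3, 2, 2]
```

So the intervals between successive harmonics shrink as 12, 7, 5, 4, 3, ... rather than staying constant, which is the pattern one would correlate against in a semitone-spaced spectrum.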
> You can now get rough results by correlating for this pattern in the CQ
> spectrum. The problem is now one of pattern matching.
>
> Music forums such as KVR are a rich source of help on these issues.
i'm unfamiliar with KVR. is it USENET or a mailing list or a Yahoo group? where is it?

another good group about music and dsp is simply the "music-dsp" mailing list.

r b-j
Reply by noodle22 February 20, 2009
>One approach is to perform a constant Q transform instead of an fft. This
>will place the spectral lines into semitone space, then your harmonics will
>lie at a constant interval apart. So first harmonics is 12 semitones above
>the fundamental the next is 7 then 4 then 3 (if I remember correctly)
I have never heard of a constant Q transform before but after reading about it a bit, it seems like in some ways it would have been ideal. I think it is now too late for me to go back to a frequency domain solution as I have done a lot of time domain work and I would like to keep it if possible.

I originally made the pitch detector work with an FFT but I dropped that solution for ACF because

a) To cover the frequency range I was interested in, I needed a lot of samples, which gave me a refresh rate that was too slow
b) Bin sizes were a bit of a problem. I wanted a fairly accurate result but my bins were too large. When I shrunk them with zero padding, I only made the refresh rate problem worse since the FFT took longer to compute.
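To put rough numbers on the trade-off in a) and b): FFT bins are spaced sample_rate / N apart, so fine resolution at low notes needs long frames. The sketch below is only illustrative; the 44.1 kHz sample rate and the 82.41 Hz lowest note are assumed values, not taken from the posts.

```python
import math

def fft_bin_width(sample_rate, n_samples):
    """Spacing in Hz between adjacent FFT bins for an n_samples-long frame."""
    return sample_rate / n_samples

def samples_for_resolution(sample_rate, resolution_hz):
    """Smallest frame length whose bin spacing is at most resolution_hz."""
    return math.ceil(sample_rate / resolution_hz)

def semitone_gap(freq_hz):
    """Distance in Hz from freq_hz to the note one semitone above it."""
    return freq_hz * (2.0 ** (1.0 / 12.0) - 1.0)

sample_rate = 44100.0   # assumed sample rate (Hz)
low_note = 82.41        # assumed lowest note of interest (Hz)

# Frame length needed so that one bin is no wider than one semitone
# at the lowest note, and how long that frame lasts.
needed = samples_for_resolution(sample_rate, semitone_gap(low_note))
frame_seconds = needed / sample_rate
```

Under these assumptions a semitone near 82 Hz is only about 4.9 Hz wide, so each frame has to span roughly 0.2 s, which matches the slow refresh rate described in a). Zero padding interpolates the spectrum between bins but does not shorten the span of real signal needed to separate close partials.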
>Music forums such as KVR are a rich source of help on these issues.
I will check this out more as well. Thanks
Reply by Rune Allnor February 20, 2009
On 20 Feb, 16:18, "noodle22" <jw970...@yahoo.com> wrote:
> Ok, thanks for the responses so far. I have uploaded an image of my
> results to help clarify the issue
>
> http://www.studypipe.com/Shared/images/Peaks.png
>
> The x axis is just sample number and it is in the time domain.
With these data I'd just go for a threshold:

- Decide on a threshold T
- Find all intervals [a_m,b_m] such that r[k] > T, k = [a_m,b_m]
- Find the maximum inside each interval

... and there's your peaks.

Rune
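One way to read the threshold steps above, as a sketch: each interval [a_m,b_m] is taken to be a maximal run of consecutive lags k where r[k] > T, and one peak is reported per run. The function name and the sample data are invented for illustration, and a single fixed T is assumed.

```python
def threshold_peaks(r, threshold):
    """Return one (lag, value) pair per run of consecutive samples with
    r[k] > threshold, namely the largest sample inside that run."""
    peaks = []
    best = None  # best (lag, value) seen in the current run, if any
    for k, value in enumerate(r):
        if value > threshold:
            if best is None or value > best[1]:
                best = (k, value)
        elif best is not None:
            # The run [a_m, b_m] just ended; keep its maximum.
            peaks.append(best)
            best = None
    if best is not None:
        peaks.append(best)  # run that extends to the end of the data
    return peaks

# Hypothetical ACF values: two runs above T = 0.5.
r = [0.0, 0.2, 0.6, 0.9, 0.7, 0.1, 0.0, 0.55, 0.8, 0.3]
print(threshold_peaks(r, 0.5))  # [(3, 0.9), (8, 0.8)]
```

Here the intervals are [2,4] and [7,8]; they fall out of the data and the threshold rather than being chosen in advance.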
Reply by noodle22 February 20, 2009
Ok, thanks for the responses so far.  I have uploaded an image of my
results to help clarify the issue

http://www.studypipe.com/Shared/images/Peaks.png

The x axis is just sample number and it is in the time domain.  I zeroed
the center peak since it is always perfectly correlated but to count peaks,
just start in the center and either go left or right to the end.  


>first YOU have to decide what you mean by the fundamental frequency.
>i presume you mean that fundamental frequency is the reciprocal of the
>period.
Good point. I'm not sure I always use this term correctly. What I was referring to is the lowest frequency (and coincidentally, the frequency that I am playing on my keyboard and trying to identify). If you look at the first graph in the image, the fundamental frequency I was referring to was 1/the distance between every 3rd peak. As for sub-harmonics, I'm pretty sure my issue is harmonics since those other smaller peaks are at a higher frequency than the peak I am trying to measure
> One of those awkward details is missing: Are you talking
> about frequency domain or time domain?
I wasn't very clear about that. I am talking about time domain.
>Create candidates by starting at the first peak,
>the second and maybe the third one. And try to rule out in-between
>peaks that deviate from a regular spacing.
>In other words:
>1. Take the first peak
>2. go to the second and measure the distance
>3. consider all following peaks at multiples of this distance as one group

My algorithm was not really described in a whole lot of detail but that is basically what I am doing. And then I am using the group with the highest average height to determine the pitch. It works ok but I was hoping there was a better method.

As for your friend's idea of crossovers...I think this would be a fantastic way to do it and I know that it does perform well. However, I have an issue with hum at 60 Hz. This really throws off the crossover count. And I don't really want to use a notch/stop band filter because I have frequencies I want to measure near 60 Hz, and also, I want to keep calculations as minimal as possible. I have clear peaks with my current method so I don't think a notch filter is really necessary.

So, now that you guys have seen my peaks, any other ideas on how I can get the proper frequency that I am looking for?
Reply by sigmonde February 20, 2009
>Hi,
>
>I am trying to determine the pitch of an audio signal using
>autocorrelation. My autocorrelation peaks come out quite clearly and after
>a bit of processing, I have a signal where one of the following is true
>
>a) all the peaks are close to the same height as their neighbors
>b) Every second peak is significantly taller than the previous peak
>c) Every third peak is significantly taller than the previous 2 peaks
>
>Clearly I am dealing with harmonics. My problem is trying to get my
>program to recognize what the fundamental frequency of the signal is.
>
>Methods I have tried
>a) get the average height for all peaks, every second peak, and every
>third peak and then compare. Ideally if one group stands out, it indicates
>a harmonic
>b) compare the height of a peak with the height of its neighbors for
>every peak, every 2nd peak, and every 3rd peak. If a group stands out, it
>indicates the fundamental frequency
>
>These two methods work reasonably well but they do not have the
>sophistication to be overly accurate and I often run into issues where
>viewing the spectrum, I can tell what the frequency is but the algorithms
>above are too finicky to get it right. This happens because not necessarily
>every peak will be identified (to keep from identifying noise as peaks) and
>also, the result of the acf is that the farther the signals being correlated
>are shifted, the smaller the peaks tend to be.
>
>I think the computer should be able to do a better job than me identifying
>peaks. So, what i'm wondering is, is there another way I can do this?
>Maybe something I can graph and fit a line to? I have the position of a
>number of peaks, and a normalized value for their heights to work with.
>
>Thanks for any assistance. I am now stumped
This is a well-known difficult problem and gets a lot worse when multiple voices are present, for example chords played on a guitar. You do not say how the note is produced; in the case of a plucked string instrument the ratio of harmonics depends on the plucking or bowing position on the string. In some cases the first harmonic can be greater than the fundamental or even missing.

One approach is to perform a constant Q transform instead of an fft. This will place the spectral lines into semitone space, then your harmonics will lie at a constant interval apart. So the first harmonic is 12 semitones above the fundamental, the next is 7, then 4, then 3 (if I remember correctly).

You can now get rough results by correlating for this pattern in the CQ spectrum. The problem is now one of pattern matching.

Music forums such as KVR are a rich source of help on these issues.
Reply by banton February 20, 2009
>Hi,
>
>I am trying to determine the pitch of an audio signal using
>autocorrelation. My autocorrelation peaks come out quite clearly and after
>a bit of processing, I have a signal where one of the following is true
>
>a) all the peaks are close to the same height as their neighbors
>b) Every second peak is significantly taller than the previous peak
>c) Every third peak is significantly taller than the previous 2 peaks
>
>Clearly I am dealing with harmonics. My problem is trying to get my
>program to recognize what the fundamental frequency of the signal is.
>
>Methods I have tried
>a) get the average height for all peaks, every second peak, and every
>third peak and then compare. Ideally if one group stands out, it
>indicates a harmonic
I don't exactly understand the second sentence. In general the idea sounds reasonable, except that you might want to take the distance between peaks into account. If you just take every second one, any spurious peak will ruin the whole scheme completely.

I would try to identify pulse-trains (with regular spacing) and I would suggest trying different start points. Create candidates by starting at the first peak, the second and maybe the third one. And try to rule out in-between peaks that deviate from a regular spacing. In other words:

1. Take the first peak
2. go to the second and measure the distance
3. consider all following peaks at multiples of this distance as one group

Create some pitch candidates by repeating this, skipping the first and maybe second peak, and build candidates by using every second (or third) peak, but also stick to a regular distance. Find the candidate whose pulse-train has the highest energy.

A friend once made a pitch tracker for a tuba using a very simple method (she added DC and measured zero crossing distance). That thing worked fine and outperformed more complex pitch-trackers in terms of latency. What I want to say is that if you can limit the class of signals you want to be able to track, you can use specific characteristics of those signals and you have much better chances to get it to work.

Bjoern
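A rough sketch of the pulse-train idea in the reply above. Everything named here is hypothetical: peaks are assumed to be (lag, height) pairs sorted by lag with distinct lags, the tolerance is one sample, and each train is scored by its mean squared height rather than total energy, since a plain sum would always favour the train containing the most peaks.

```python
def pulse_train_candidates(peaks, max_start=3, tolerance=1):
    """Build candidate pulse trains from (lag, height) peaks.
    Each candidate starts at one of the first `max_start` peaks and uses
    the distance to one of the next few peaks as its period.
    Returns a list of (period, mean_energy, members)."""
    candidates = []
    for i in range(min(max_start, len(peaks))):
        start_lag = peaks[i][0]
        for j in range(i + 1, min(i + 1 + max_start, len(peaks))):
            period = peaks[j][0] - start_lag
            if period <= 0:
                continue
            members = []
            for lag, height in peaks[i:]:
                offset = lag - start_lag
                steps = round(offset / period)
                # Keep peaks that sit on a multiple of the period.
                if abs(offset - steps * period) <= tolerance:
                    members.append((lag, height))
            if len(members) >= 2:
                energy = sum(h * h for _, h in members) / len(members)
                candidates.append((period, energy, members))
    return candidates

def best_period(peaks, max_start=3, tolerance=1):
    """Period (in samples) of the candidate with the highest mean energy."""
    candidates = pulse_train_candidates(peaks, max_start, tolerance)
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[1])[0]

# Hypothetical peaks where every third one is taller (case c in the question).
peaks = [(10, 0.3), (20, 0.35), (30, 0.9), (40, 0.25), (50, 0.3), (60, 0.8)]
period = best_period(peaks)
```

For this made-up data the train starting at lag 30 with a spacing of 30 samples scores highest, so `period` is 30. Peaks that fall off the regular grid are simply left out of a train, which is what keeps one spurious peak from shifting the whole "every third peak" count.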
>b) compare the height of a peak with the height of its neighbors for
>every peak, every 2nd peak, and every 3rd peak. If a group stands out, it
>indicates the fundamental frequency
Reply by Rune Allnor February 20, 2009
On 20 Feb, 05:53, "noodle22" <jw970...@yahoo.com> wrote:
> Hi,
>
> I am trying to determine the pitch of an audio signal using
> autocorrelation. My autocorrelation peaks come out quite clearly and after
> a bit of processing, I have a signal where one of the following is true
>
> a) all the peaks are close to the same height as their neighbors
> b) Every second peak is significantly taller than the previous peak
> c) Every third peak is significantly taller than the previous 2 peaks
One of those awkward details is missing: Are you talking about frequency domain or time domain?

...
> I think the computer should be able to do a better job than me identifying
> peaks.
Why do you think this?

Rune