noodle22 wrote:

> Hi,
> 
> I am trying to determine the pitch of an audio signal using
> autocorrelation. 

The standard way of doing that is the normalized autocorrelation. Start 
searching for the maximum of the normalized ACF from the higher pitch 
values (the shorter lags). Accept a later maximum only if it is higher 
than the previous maximum by a factor of X. X depends on the particulars 
of your setup and the history.
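
A minimal sketch of the search described above, in plain Python. It assumes "normalized" means dividing each lag by its overlap length and then by the zero-lag value, which is only one common reading. The function names, the lag limits, the 8 kHz sample rate and the default factor of 1.2 standing in for X are illustrative choices, not values from this thread.

```python
import math


def normalized_acf(x, max_lag):
    """Return r[0..max_lag] with r[0] == 1 (requires max_lag < len(x)).

    Assumption: each lag is divided by its overlap length (n - k), so peaks
    do not shrink merely because fewer samples overlap at larger shifts.
    """
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]          # remove DC before correlating
    r = []
    for k in range(max_lag + 1):
        acc = sum(x[i] * x[i + k] for i in range(n - k))
        r.append(acc / (n - k))
    r0 = r[0]
    return [v / r0 for v in r]


def pick_pitch_lag(r, min_lag, factor=1.2):
    """Scan local maxima from short lags (high pitch) towards long lags.

    A later maximum replaces the current choice only if it is higher by
    `factor` (the X in the post; 1.2 is an arbitrary placeholder).
    """
    best_lag, best_val = None, 0.0
    for k in range(max(min_lag, 1), len(r) - 1):
        is_peak = r[k] > r[k - 1] and r[k] >= r[k + 1] and r[k] > 0
        if is_peak and (best_lag is None or r[k] > factor * best_val):
            best_lag, best_val = k, r[k]
    return best_lag


# Example: a 200 Hz tone with two harmonics, sampled at an assumed 8 kHz.
fs = 8000
x = [math.sin(2 * math.pi * 200 * i / fs)
     + 0.5 * math.sin(2 * math.pi * 400 * i / fs)
     + 0.3 * math.sin(2 * math.pi * 600 * i / fs)
     for i in range(1024)]
r = normalized_acf(x, 200)
lag = pick_pitch_lag(r, min_lag=10)
pitch_hz = fs / lag                    # pitch estimate from the chosen lag
```

Because the peaks at two and three periods are about as tall as the first one, they do not beat it by the factor, so the search settles on the shortest lag that qualifies.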

BTW, you might need to oversample the signal for the results to be more 
accurate and clear.

> So, what i'm wondering is, is there another way I can do this? 

You can look for the comb period of the harmonics in the FFT domain. 
This is pretty accurate.
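
One simple way to read "comb period of the harmonics" is the spacing between neighbouring spectral peaks. The sketch below is that reading only, not necessarily what the poster had in mind. The Hann window, the direct O(N^2) DFT (used to avoid any library dependency), the relative threshold of 0.1 and the test tone are all assumptions.

```python
import cmath
import math
from statistics import median


def magnitude_spectrum(x):
    """Hann-windowed DFT magnitudes for bins 0..N/2 (direct O(N^2) sum)."""
    n = len(x)
    w = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    xw = [a * b for a, b in zip(x, w)]
    return [abs(sum(xw[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n)))
            for k in range(n // 2 + 1)]


def comb_spacing_hz(x, fs, rel_threshold=0.1):
    """Median spacing in Hz between spectral peaks above a relative threshold."""
    mag = magnitude_spectrum(x)
    floor = rel_threshold * max(mag)
    peaks = [k for k in range(1, len(mag) - 1)
             if mag[k] > floor and mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]]
    if len(peaks) < 2:
        return None                    # not enough harmonics to measure a spacing
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return median(gaps) * fs / len(x)


# Example: harmonics 2, 3 and 4 of 250 Hz with the fundamental itself missing.
fs = 8000
x = [0.6 * math.sin(2 * math.pi * 500 * i / fs)
     + 0.4 * math.sin(2 * math.pi * 750 * i / fs)
     + 0.3 * math.sin(2 * math.pi * 1000 * i / fs)
     for i in range(256)]
f0 = comb_spacing_hz(x, fs)
```

The spacing survives a missing fundamental, but a tone with only odd harmonics would show a spacing of twice the fundamental, so this is a starting point rather than a complete detector.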

Ah, of course, there is the ultimate pitch detector by Dmitry Teres, but 
nobody knows what it is and how well it works.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

>> there are some
>nasty problems (sub-harmonics) that you just can't get away from 

I was having some problems understanding this because it is not something
I experience with my tests.  I've since read a bit about it and I understand
that it is common with certain instruments such as bells.  For the sake of
simplicity, assume that we will not have sub-harmonics.  I will limit my
pitch detector to just working for string instruments.  It is a project I
am doing for fun so it doesn't really have to capture every scenario. 
Also, might as well limit it to just one note at a time.  I think the
problem becomes significantly more difficult if I add other tones so one is
enough :)

If I can find some way to manipulate the peak data in that pic I posted so
that I could obtain the fundamental frequency, then I would be all set.

I want to address the threshold idea

> With these data I'd just go for a threshold:
>
> - Decide on a threshold T
> - Find all intervals [a_m,b_m] such that r[k] > T,   k = [a_m,b_m]
> - Find the maximum inside each interval
> ... and there's your peaks.

I was initially finding peaks just using a threshold and taking the top
peaks.  However, my threshold was a straight horizontal line across all
samples and since the magnitude decreases the farther I get from center, it
did not work overly well.  Your method looks like it might be better, but
I don't completely understand it the way it is described above.

Can you elaborate on it a bit more? What exactly is [a_m,b_m], k =
[a_m,b_m], and r[k]?  I would guess that r[k]>T are peaks that are above
the threshold over the interval a_m to b_m...so I guess my question is, how
do you choose the intervals a_m to b_m (assuming there are several rather
than just 1), and do you change the threshold T for each different
interval?

On Feb 20, 10:10 am, "sigmonde" <cmacl...@soundmotion.co.uk> wrote:
> >Hi,
>
> >I am trying to determine the pitch of an audio signal using
> >autocorrelation. My autocorrelation peaks come out quite clearly and
> >after a bit of processing, I have a signal where one of the following
> >is true
>
> >a) all the peaks are close to the same height as their neighbors
> >b) Every second peak is significantly taller than the previous peak
> >c) Every third peak is significantly taller than the previous 2 peaks
>
> >Clearly I am dealing with harmonics. My problem is trying to get my
> >program to recognize what the fundamental frequency of the signal is.
>
> >Methods I have tried
> >a) get the average height for all peaks, every second peak, and every
> >third peak and then compare. Ideally if one group stands out, it
> >indicates a harmonic
> >b) compare the height of a peak with the height of its neighbors for
> >every peak, every 2nd peak, and every 3rd peak. If a group stands out,
> >it indicates the fundamental frequency
>
> >These two methods work reasonably well but they do not have the
> >sophistication to be overly accurate and I often run into issues where
> >viewing the spectrum, I can tell what the frequency is but the
> >algorithms above are too finicky to get it right. This happens because
> >not necessarily every peak will be identified (to keep from identifying
> >noise as peaks) and also, the result of the acf is that the farther the
> >signals being correlated are shifted, the smaller the peaks tend to be.
>
> >I think the computer should be able to do a better job than me
> >identifying peaks. So, what I'm wondering is, is there another way I
> >can do this? Maybe something I can graph and fit a line to? I have the
> >position of a number of peaks, and a normalized value for their heights
> >to work with.
>
> >Thanks for any assistance. I am now stumped


please read what i wrote yesterday.  there were a few other posts
about this to both comp.dsp and the music-dsp mailing list.
autocorrelation or the AMDF (or ASDF) methods get you to about the
same place (picking peaks or alternatively notches).  there are some
nasty problems (sub-harmonics) that you just can't get away from and
you have to think hard about how you, as a human viewing the
autocorrelation data, would choose which peak (or notch).  what do you
think you hear when a strong 440 Hz tone has a very weak 220 Hz tone
added to it?  is it A above middle C or is it A below middle C?
mathematically, what is it?
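
For readers who have not met the AMDF, here is a minimal sketch of it in plain Python, showing that the period appears as a notch rather than a peak. The function names, the 8 kHz sample rate, the lag range and the test tone are assumptions made for the illustration.

```python
import math


def amdf(x, max_lag):
    """Average magnitude difference function for lags 0..max_lag."""
    n = len(x)
    return [sum(abs(x[i] - x[i + k]) for i in range(n - k)) / (n - k)
            for k in range(max_lag + 1)]


def deepest_notch(d, min_lag):
    """Lag of the smallest AMDF value at or beyond min_lag."""
    return min(range(min_lag, len(d)), key=lambda k: d[k])


# Example: 200 Hz tone with a second harmonic, period 40 samples at 8 kHz.
fs = 8000
x = [math.sin(2 * math.pi * 200 * i / fs)
     + 0.5 * math.sin(2 * math.pi * 400 * i / fs)
     for i in range(400)]
d = amdf(x, 60)                 # search range deliberately stops before 2 periods
period = deepest_notch(d, min_lag=10)
```

If the lag range were extended to 80 or 120 samples, the notches at two and three periods would be just as deep as the one at 40, which is the sub-harmonic ambiguity described above.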


> This is a well-known difficult problem and gets a lot worse when multiple
> voices are present, for example chords played on a guitar. You do not say
> how the note is produced,

the only hope you have if you play a chord into a pitch detector is,
if the chord is a nice major chord, it might pick the common
subharmonic to all of the notes in the chord.  otherwise you get shit.

> in the case of a plucked string instrument the
> ratio of harmonics depends on the plucking or bowing position on the
> string. In some cases the first harmonic can be greater than the
> fundamental or even missing.

the fundamental *is* the "first harmonic" the way most of us count
'em.  sometimes the "first overtone" refers to the second harmonic.

> One approach is to perform a constant Q transform instead of an fft. This
> will place the spectral lines into semitone space, then your harmonics will
> lie at a constant interval apart. So the first harmonic is 12 semitones above
> the fundamental, the next is 7 then 4 then 3 (if I remember correctly)


 1  = 2^(0/12)
 2  = 2^(12/12)
 3 ~= 2^(19/12)
 4  = 2^(24/12)  (so it's "5" instead of 4)
 5 ~= 2^(28/12)  (so it's "4" instead of 3)
 6 ~= 2^(31/12)  (now it's 3...)
 7 ~= 2^(34/12)
 8  = 2^(36/12)
 9 ~= 2^(38/12)

 ...
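
The table above can be reproduced from 12*log2(n), rounded to the nearest semitone. The short sketch below does that and also lists the step from each harmonic to the next. The function name and the tuple layout are made up for the example.

```python
import math


def harmonic_semitones(n_harmonics):
    """For harmonics 1..n return (harmonic, nearest semitone offset,
    step from the previous harmonic, rounding error in semitones)."""
    rows = []
    prev = 0
    for n in range(1, n_harmonics + 1):
        exact = 12 * math.log2(n)          # exact offset above the fundamental
        nearest = round(exact)
        rows.append((n, nearest, nearest - prev, exact - nearest))
        prev = nearest
    return rows


rows = harmonic_semitones(9)
offsets = [r[1] for r in rows]             # 0, 12, 19, 24, 28, 31, 34, 36, 38
steps = [r[2] for r in rows[1:]]           # 12, 7, 5, 4, 3, 3, 2, 2
```

The steps shrink as the harmonic number grows, so the harmonics form a fixed pattern in semitone space rather than a constant interval, and the 7th harmonic is the worst fit to the semitone grid (about a third of a semitone off).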

>
> You can now get rough results by correlating for this pattern in the CQ
> spectrum. The problem is now one of pattern matching.
>
> Music forums such as KVR are a rich source of help on these issues.

i'm unfamiliar with KVR.  is it USENET or a mailing list or a Yahoo
group?  where is it?

another good group about music and dsp is simply the "music-dsp"
mailing list.

r b-j

>One approach is to perform a constant Q transform instead of an fft. This
>will place the spectral lines into semitone space, then your harmonics
>will lie at a constant interval apart. So the first harmonic is 12 semitones
>above the fundamental, the next is 7 then 4 then 3 (if I remember correctly)

I have never heard of a constant Q transform before but after reading
about it a bit, it seems like in some ways it would have been ideal.  I
think it is now too late for me to go back to a frequency domain solution
as I have done a lot of time domain work and I would like to keep it if
possible.

I originally made the pitch detector work with an FFT but I dropped the
solution for ACF because
a) To cover the frequency range I was interested in, I needed a lot of
samples which gave me a refresh rate that was too slow
b) Bin sizes were a bit of a problem.  I wanted a fairly accurate result
but my bins were too large.  When I shrunk them with 0 padding, I would
only increase the refresh rate problem since it took longer to compute.
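
The trade-off in (a) and (b) follows from two relations: the bin width is fs / N and the frame duration is N / fs. A small sketch of the arithmetic follows, where the 44100 Hz sample rate and the 2 Hz target resolution are assumed example values, not figures from this post.

```python
def frame_for_resolution(fs, max_bin_hz):
    """Smallest power-of-two frame length whose bin width fs/N is <= max_bin_hz.

    Returns (N, bin width in Hz, frame duration in seconds).
    """
    n = 1
    while fs / n > max_bin_hz:
        n *= 2
    return n, fs / n, n / fs


# Assumed example values: 44.1 kHz audio, bins no wider than 2 Hz.
n, bin_hz, seconds = frame_for_resolution(44100, 2.0)
```

With these numbers the frame comes out at 32768 samples, roughly three quarters of a second of audio per estimate. Zero padding makes the bins finer without lengthening the recording, but the transform itself still gets longer, which matches the slowdown described in (b).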

>Music forums such as KVR are a rich source of help on these issues.

I will check this out more as well.  Thanks

On 20 Feb, 16:18, "noodle22" <jw970...@yahoo.com> wrote:
> Ok, thanks for the responses so far. I have uploaded an image of my
> results to help clarify the issue
>
> http://www.studypipe.com/Shared/images/Peaks.png
>
> The x axis is just sample number and it is in the time domain.

With these data I'd just go for a threshold:

- Decide on a threshold T
- Find all intervals [a_m,b_m] such that r[k] > T,   k = [a_m,b_m]
- Find the maximum inside each interval

... and there's your peaks.
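
The three steps above can be sketched as follows. The intervals [a_m,b_m] are not chosen in advance; they are simply the runs of consecutive samples where r[k] stays above T. The function name and the sample data are made up for the illustration, and a single fixed T is assumed.

```python
def threshold_peaks(r, T):
    """Return (index, value) of the maximum inside each run where r[k] > T."""
    peaks = []
    start = None                         # index a_m of the run currently open
    for k, v in enumerate(r):
        if v > T and start is None:
            start = k                    # run begins: this is a_m
        elif v <= T and start is not None:
            best = max(range(start, k), key=lambda i: r[i])   # k - 1 is b_m
            peaks.append((best, r[best]))
            start = None
    if start is not None:                # run still open at the end of the data
        best = max(range(start, len(r)), key=lambda i: r[i])
        peaks.append((best, r[best]))
    return peaks


r = [0.0, 0.2, 0.6, 0.9, 0.7, 0.1, 0.0, 0.5, 0.8, 0.4, 0.1]
found = threshold_peaks(r, 0.3)          # [(3, 0.9), (8, 0.8)]
```

Each run yields exactly one peak, so ripples on the flank of a peak are not counted separately as long as they stay above T.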

Rune

Ok, thanks for the responses so far.  I have uploaded an image of my
results to help clarify the issue

http://www.studypipe.com/Shared/images/Peaks.png

The x axis is just sample number and it is in the time domain.  I zeroed
the center peak since it is always perfectly correlated but to count peaks,
just start in the center and either go left or right to the end.  


>first YOU have to decide what you mean by the fundamental frequency.
>i presume you mean that fundamental frequency is the reciprocal of the
>period.

Good point.  I'm not sure I always use this term correctly.  What I was
referring to is the lowest frequency (and coincidentally, the frequency
that I am playing on my keyboard and trying to identify).  If you look at
the first graph in the image, the fundamental frequency I was referring to
was 1/the distance between every 3rd peak.

As for sub-harmonics, I'm pretty sure my issue is harmonics since those
other smaller peaks are at a higher frequency than the peak I am trying to
measure

> One of those awkward details is missing: Are you talking
> about frequency domain or time domain?

I wasn't very clear about that.  I am talking about time domain.

>Create candidates by starting at the first peak,
>the second and maybe the third one.  And try to rule out in-between
>peaks that deviate from a regular spacing.
>In other words:
>1. Take the first peak
>2. go to the second and measure the distance
>3. consider all following peaks at multiples of this distance
   as one group

My algorithm was not really described in a whole lot of detail but that is
basically what I am doing.  And then I am using the group with the highest
average height to determine the pitch.  It works ok but I was hoping there
was a better method.  

As for your friend's idea of crossovers...I think this would be a
fantastic way to do it and I know that it does perform well.  However, I
have an issue with hum at 60 Hz.  This really throws off the crossover
count.  And I don't really want to use a notch/stop band filter because I
have frequencies I want to measure near 60 Hz, and also, I want to keep
calculations as minimal as possible.  I have clear peaks with my current
method so I don't think a notch filter is really necessary.


So, now that you guys have seen my peaks, any other ideas on how I can get
the proper frequency that I am looking for?

>Hi,
>
>I am trying to determine the pitch of an audio signal using
>autocorrelation.  My autocorrelation peaks come out quite clearly and
>after a bit of processing, I have a signal where one of the following
>is true
>
>a) all the peaks are close to the same height as their neighbors
>b) Every second peak is significantly taller than the previous peak
>c) Every third peak is significantly taller than the previous 2 peaks
>
>Clearly I am dealing with harmonics.  My problem is trying to get my
>program to recognize what the fundamental frequency of the signal is.
>
>Methods I have tried
>a) get the average height for all peaks, every second peak, and every
>third peak and then compare.  Ideally if one group stands out, it
>indicates a harmonic
>b) compare the height of a peak with the height of its neighbors for
>every peak, every 2nd peak, and every 3rd peak.  If a group stands out,
>it indicates the fundamental frequency
>
>These two methods work reasonably well but they do not have the
>sophistication to be overly accurate and I often run into issues where
>viewing the spectrum, I can tell what the frequency is but the
>algorithms above are too finicky to get it right.  This happens because
>not necessarily every peak will be identified (to keep from identifying
>noise as peaks) and also, the result of the acf is that the farther the
>signals being correlated are shifted, the smaller the peaks tend to be.
>
>I think the computer should be able to do a better job than me
>identifying peaks.  So, what I'm wondering is, is there another way I
>can do this?  Maybe something I can graph and fit a line to?  I have the
>position of a number of peaks, and a normalized value for their heights
>to work with.
>
>Thanks for any assistance.  I am now stumped
>

This is a well-known difficult problem and gets a lot worse when multiple
voices are present, for example chords played on a guitar. You do not say
how the note is produced; in the case of a plucked string instrument the
ratio of harmonics depends on the plucking or bowing position on the
string. In some cases the first harmonic can be greater than the
fundamental or even missing. 

One approach is to perform a constant Q transform instead of an fft. This
will place the spectral lines into semitone space, then your harmonics will
lie at a constant interval apart. So the first harmonic is 12 semitones above
the fundamental, the next is 7 then 4 then 3 (if I remember correctly)

You can now get rough results by correlating for this pattern in the CQ
spectrum. The problem is now one of pattern matching.

Music forums such as KVR are a rich source of help on these issues.

>Hi,
>
>I am trying to determine the pitch of an audio signal using
>autocorrelation.  My autocorrelation peaks come out quite clearly and
after
>a bit of processing, I have a signal where one of the following is true
>
>a) all the peaks are close to the same height as their neighbors
>b) Every second peak is significantly taller than the previous peak
>c) Every third peak is significantly taller than the previous 2 peaks
>
>Clearly I am dealing with harmonics.  My problem is trying to get my
>program to recognize what the fundamental frequency of the signal is.
>
>Methods I have tried
>a) get the average height for all peaks, every second peak, and every
>third peak and then compare.  Ideally if one group stands out, it 
>indicates a harmonic

I don't exactly understand the second sentence.
In general the idea sounds reasonable, except that
you might want to take the distance between peaks into account.
If you just take every second one any spurious peak will ruin the whole
scheme completely.
I would try to identify pulse-trains
(with regular spacing) and I would suggest to try different
start points.  Create candidates by starting at the first peak,
the second and maybe the third one.  And try to rule out in-between
peaks that deviate from a regular spacing.
In other words:
1. Take the first peak
2. go to the second and measure the distance
3. consider all following peaks at multiples of this distance
   as one group

Create some pitch candidates by repeating this, and skipping
the first and maybe second peak and build candidates by using
every second (or third peak) but also stick to a regular distance.
Find the candidate whose pulse-train has the highest energy.
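
A rough sketch of the candidate search described above, under several assumptions that are mine and not from the post: peaks are given as (lag, height) pairs measured from the ACF centre, each candidate spacing is the lag of one of the first few peaks, a peak belongs to a train if it lies within `tol` of a multiple of the spacing, and "energy" is scored as summed squared height divided by the number of pulses a perfect train would have. Without that division the shortest spacing would always win, because it collects every peak.

```python
def best_pulse_train(peaks, max_lag, n_starts=3, tol=0.1):
    """peaks: list of (lag, height) sorted by lag, lag measured from the
    ACF centre.  Returns (spacing, score) of the best-scoring candidate."""
    best = None
    for d, _ in peaks[:n_starts]:        # candidate spacing: 1st, 2nd, 3rd peak
        slots = int(max_lag // d)        # pulses a perfect train would contain
        if slots == 0:
            continue
        energy = 0.0
        for m in range(1, slots + 1):
            target = m * d
            near = [h for lag, h in peaks if abs(lag - target) <= tol * d]
            if near:                     # a missing pulse contributes nothing
                energy += max(near) ** 2
        score = energy / slots
        if best is None or score > best[1]:
            best = (d, score)
    return best


# Made-up data resembling case (c): every third peak is the tall one.
peaks = [(30, 0.4), (60, 0.4), (90, 0.9), (120, 0.35), (150, 0.35),
         (180, 0.8), (210, 0.3), (240, 0.3), (270, 0.7)]
spacing, score = best_pulse_train(peaks, max_lag=280)
```

With this data the train with spacing 90 scores highest, so the fundamental period is taken as 90 samples. A spurious early peak would push the true period out of the first three candidates, in which case `n_starts` has to be raised.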

A friend once made a pitch tracker for a tuba using a very
simple method (she added DC and measured zero crossing distance).
That thing worked fine and outperformed more complex
pitch-trackers in terms of latency.
What I want to say, is that if you can limit the
class of signals you want to be able to track, you can use
specific characteristics of the signals you want to track and
you have much better chances to get it to work.
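
As an illustration of why adding DC can help a zero-crossing tracker, here is a small sketch. The post does not say how large the offset was or which sign it had, so the offset of -1.0, the test waveform and the function name are assumptions chosen so that only the main lobe of each period crosses the shifted zero line.

```python
import math


def crossing_period(x, offset):
    """Average spacing in samples between upward zero crossings of x + offset,
    using linear interpolation between samples.  None if fewer than two."""
    ups = []
    for i in range(1, len(x)):
        a, b = x[i - 1] + offset, x[i] + offset
        if a < 0 <= b:
            ups.append(i - 1 + (-a) / (b - a))
    if len(ups) < 2:
        return None
    return (ups[-1] - ups[0]) / (len(ups) - 1)


# A tone with period 80 samples and a strong second harmonic.
x = [math.sin(2 * math.pi * i / 80) + 0.9 * math.sin(4 * math.pi * i / 80)
     for i in range(800)]
plain = crossing_period(x, 0.0)      # harmonic adds a second crossing per period
shifted = crossing_period(x, -1.0)   # only the tall lobe crosses the line
```

Without the offset the measured spacing is about 40 samples, an octave too high. With it the spacing is 80. This only works when the signal class is known well enough to pick a safe offset, which is the point made above.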

Bjoern

>b) compare the height of a peak with the height of its neighbors for
>every peak, every 2nd peak, and every 3rd peak.  If a group stands out,
>it indicates the fundamental frequency
>
>