# Detecting proximity to known frequencies

Started November 23, 2005
```
Like several others over the years in this group, I'm trying to write
code that can be used as a music tuner.  Specifically, I'm trying to
emulate programs like SmartMusic and Karaoke Revolution.  The former is
a program that accompanies a music performance and can vary its
accompaniment based on what the human performer is playing.  The latter
is a PlayStation 2 game that scores the player on how on-pitch he or she
is singing.

For the Karaoke Revolution case in particular, I notice that the program
always knows what pitch is supposed to be present and merely has to
assess how far from that pitch the singer is.  As far as I could tell
the game is octave-agnostic and for my current purposes I can be as
well; catching the first harmonic rather than the fundamental isn't a
big problem.

Given these restrictions on what I need to analyze, is there an
algorithm I can use to determine with precision of at least 5 cents
(i.e., 1/20th of a semitone) how close the strongest frequency in the
signal is to the desired frequency?  I'm a DSP novice, but I've been
reading several papers the past few days so I at least have a general
understanding of the field.

Thanks!

--
Brian
```
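The 5-cent requirement in the post above can be made concrete. The following is a minimal sketch, assuming the usual equal-tempered definition of 1200 cents per octave; the helper names `cents_between` and `octave_agnostic_cents` are hypothetical and not from the thread. Folding the result into a one-octave range is one possible way to get the octave-agnostic behaviour described.

```python
import math

def cents_between(measured_hz, target_hz):
    """Signed distance from target to measured pitch, in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(measured_hz / target_hz)

def octave_agnostic_cents(measured_hz, target_hz):
    """Fold the distance into [-600, 600) so whole-octave errors are ignored."""
    cents = cents_between(measured_hz, target_hz)
    return (cents + 600.0) % 1200.0 - 600.0

# A tone one octave plus a little above A4 (440 Hz) is scored as only slightly sharp.
print(octave_agnostic_cents(882.0, 440.0))
```

With this definition, a 5-cent tolerance around 440 Hz corresponds to a window of roughly 1.3 Hz on either side, which is why a plain FFT peak search with typical frame sizes is too coarse on its own.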
```
Brian Victor wrote:
> Like several others over the years in this group, I'm trying to write
> code that can be used as a music tuner.  Specifically, I'm trying to
> emulate programs like SmartMusic and Karaoke Revolution.  The former is
> a program that accompanies a music performance and can vary its
> accompaniment based on what the human performer is playing.  The latter
> is a PlayStation 2 game that scores the player on how on-pitch he or she
> is singing.
>
> For the Karaoke Revolution case in particular, I notice that the program
> always knows what pitch is supposed to be present and merely has to
> assess how far from that pitch the singer is.  As far as I could tell
> the game is octave-agnostic and for my current purposes I can be as
> well; catching the first harmonic rather than the fundamental isn't a
> big problem.
>
> Given these restrictions on what I need to analyze, is there an
> algorithm I can use to determine with precision of at least 5 cents
> (i.e., 1/20th of a semitone) how close the strongest frequency in the
> signal is to the desired frequency?  I'm a DSP novice, but I've been
> reading several papers the past few days so I at least have a general
> understanding of the field.

Finding the strongest frequency implies that you might be
using an FFT based approach.  If you use overlapping frames
for your FFT analysis, the complex phase change between
frames of the strongest frequency bin of the two frames might
be able to give you a frequency determination accuracy to
within a few cents for some suitable frame sizes and overlaps.

This is similar to the approach that some phase vocoders use
to determine pitch.

You can also try interpolated autocorrelation for frequency
determination with a greater time resolution (e.g. cycle by
cycle for measuring vibrato, etc.)

IMHO. YMMV.
--
rhn A.T nicholson d.O.t C-o-M

```
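The interpolated autocorrelation suggestion above might look like the following. This is a rough sketch rather than Ron's actual method: it assumes the target pitch is known (as in the original question), searches only lags within about a semitone of the target period, and refines the best lag with a three-point parabolic fit. The function name `autocorr_pitch` and its parameters `window` and `search_ratio` are hypothetical.

```python
import math

def autocorr_pitch(samples, sample_rate, target_hz, window=4096, search_ratio=1.06):
    """Estimate the pitch near target_hz from an interpolated autocorrelation peak."""
    # Lag range covering roughly one semitone either side of the target period.
    lo = max(2, int(sample_rate / (target_hz * search_ratio)))
    hi = int(sample_rate * search_ratio / target_hz) + 1
    if len(samples) < window + hi + 1:
        raise ValueError("need at least window + max lag + 1 samples")

    # One extra lag on each side so the best lag always has two neighbours.
    lags = range(lo - 1, hi + 2)
    values = [sum(samples[n] * samples[n + lag] for n in range(window))
              for lag in lags]

    # Best interior lag, then a parabolic fit through it and its neighbours.
    i = max(range(1, len(values) - 1), key=lambda j: values[j])
    prev, peak, nxt = values[i - 1], values[i], values[i + 1]
    curvature = prev - 2.0 * peak + nxt
    shift = 0.5 * (prev - nxt) / curvature if curvature != 0.0 else 0.0
    return sample_rate / (lags[i] + shift)

# Synthetic test tone: a 445 Hz sine, slightly sharp of a 440 Hz target.
SAMPLE_RATE = 44100
tone = [math.sin(2.0 * math.pi * 445.0 * n / SAMPLE_RATE) for n in range(8192)]
estimate = autocorr_pitch(tone, SAMPLE_RATE, target_hz=440.0)
print(f"estimated {estimate:.2f} Hz")
```

A shorter `window` follows pitch changes such as vibrato more closely, but for this simple estimator it also tends to reduce accuracy, so the value here is only a starting point.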
```
Ron N. wrote:
> Finding the strongest frequency implies that you might be
> using an FFT based approach.  If you use overlapping frames
> for your FFT analysis, the complex phase change between
> frames of the strongest frequency bin of the two frames might
> be able to give you a frequency determination accuracy to
> within a few cents for some suitable frame sizes and overlaps.

This is an interesting idea.  Am I correct in thinking that in order to
do this, I would need to calculate where I would expect the phase to be
at each frame and use the offset from that expected value to determine
sharpness or flatness?  Would that work in cases where no bin directly
corresponds to a frequency I want to measure, as will be the case more
often than not?  Does this technique have a name that I can google for?

Thanks!

--
Brian
```
```
Brian Victor wrote:
> Ron N. wrote:
> > Finding the strongest frequency implies that you might be
> > using an FFT based approach.  If you use overlapping frames
> > for your FFT analysis, the complex phase change between
> > frames of the strongest frequency bin of the two frames might
> > be able to give you a frequency determination accuracy to
> > within a few cents for some suitable frame sizes and overlaps.
>
> This is an interesting idea.  Am I correct in thinking that in order to
> do this, I would need to calculate where I would expect the phase to be
> at each frame and use the offset from that expected value to determine
> sharpness or flatness?

Any phase change between non-overlapped successive frames
will point out a sharpness or flatness relative to the bin center
frequency.  For overlapped frames, even bin center frequencies
might have an expected phase offset, depending on the bin number
versus the frame overlap percentage.

> Would that work in cases where no bin directly
> corresponds to a frequency I want to measure, as will be the
> case more often than not?

You can compare what would be the phase change of some desired
frequency versus a measured phase change to determine how far
apart the two frequencies are, whether or not either one
corresponds to a bin center frequency.

> Does this technique have a name that I can google for?

Something like this is used in the analysis phase of a phase vocoder
for pitch shifting.  Try googling for "phase vocoder" or vocoding.
You can also try googling for a paper on pitch detection by
Judith Brown at Wellesley, whose algorithm might be worth
looking at if you are not already calculating overlapped FFT
frames.

IMHO. YMMV.
--
rhn A.T nicholson d.O.t C-o-M

```
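The comparison described above, between the phase change a desired frequency would produce and the phase change actually measured, can be sketched as follows. This is an illustration under stated assumptions, not code from the thread: it evaluates a Hann-windowed DFT sum directly at the target frequency instead of reading an FFT bin, and the names `windowed_dft_at` and `cents_from_target` and the defaults `size=2048`, `hop=256` are hypothetical.

```python
import cmath
import math

def windowed_dft_at(samples, start, size, freq_hz, sample_rate):
    """Hann-windowed DFT sum of one frame, evaluated at a single frequency."""
    acc = 0j
    for n in range(size):
        w = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / size)
        acc += w * samples[start + n] * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate)
    return acc

def cents_from_target(samples, sample_rate, target_hz, size=2048, hop=256):
    """Return (cents, frequency) of the tone near target_hz from two overlapped frames."""
    first = windowed_dft_at(samples, 0, size, target_hz, sample_rate)
    second = windowed_dft_at(samples, hop, size, target_hz, sample_rate)
    # Measured phase change between the two frames.
    measured = cmath.phase(second * first.conjugate())
    # Phase change a tone exactly at target_hz would show over `hop` samples.
    expected = 2.0 * math.pi * target_hz * hop / sample_rate
    # Wrap the difference into [-pi, pi) and convert it to a frequency offset.
    deviation = (measured - expected + math.pi) % (2.0 * math.pi) - math.pi
    freq = target_hz + deviation * sample_rate / (2.0 * math.pi * hop)
    return 1200.0 * math.log2(freq / target_hz), freq

# Synthetic test tone: a 445 Hz sine against a 440 Hz target.
SAMPLE_RATE = 44100
tone = [math.sin(2.0 * math.pi * 445.0 * n / SAMPLE_RATE + 0.3) for n in range(4096)]
cents, freq = cents_from_target(tone, SAMPLE_RATE, 440.0)
print(f"{freq:.3f} Hz, {cents:+.2f} cents")
```

The wrapped phase difference is only unambiguous while the tone is within `sample_rate / (2 * hop)` Hz of the target, so a smaller `hop` (more overlap) widens the usable range. Real voices with harmonics, noise, and vibrato will behave less cleanly than this pure sine.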
```
Ron N. wrote:
> Any phase change between non-overlapped successive frames
> will point out a sharpness or flatness relative to the bin center
> frequency.  For overlapped frames, even bin center frequencies
> might have an expected phase offset, depending on the bin number
> versus the frame overlap percentage.

This may indicate a lack of my understanding about phase.  I understand
it to mean the point in the sine wave at which the wave begins for the
given bin.  Is this correct?

With this being the case, wouldn't the phase be expected to change from
frame to frame even if the pitch does not unless the frame size is an
integral multiple of the period?  Or perhaps I'm misunderstanding the
phrase "bin center."

> Something like this is used in the analysis phase of a phase vocoder
> for pitch shifting.  Try googling for "phase vocoder" or vocoding.
> You can also try googling for a paper on pitch detection by
> Judith Brown at Wellesley, whose algorithm might be worth
> looking at if you are not already calculating overlapped FFT
> frames.

I'll take a look around for these.  Thanks!

--
Brian
```
```
Brian Victor wrote:
> Ron N. wrote:
> > Any phase change between non-overlapped successive frames
> > will point out a sharpness or flatness relative to the bin center
> > frequency.  For overlapped frames, even bin center frequencies
> > might have an expected phase offset, depending on the bin number
> > versus the frame overlap percentage.
>
> This may indicate a lack of my understanding about phase.  I understand
> it to mean the point in the sine wave at which the wave begins for the
> given bin.  Is this correct?

Close enough.  Some interpret phase as referenced to the cosine wave.

> With this being the case, wouldn't the phase be expected to change from
> frame to frame even if the pitch does not unless the frame size is an
> integral multiple of the period?  Or perhaps I'm misunderstanding the
> phrase "bin center."

I use the term "bin center frequency" for the frequencies whose
periods in samples are integer submultiples of the FFT frame
length (only multiples up to half the FFT frame length of course).
Any frequency whose period is not an integer submultiple of the
offset of two FFT frames will show a phase change between the two
frames.  Note that it's the relationship between the period and
frame offset, not the frame size, which affects the change in phase
of the sine wave between two frames.  You can use a lot of frame
overlap to get a bit more locality in exchange for resolution.

Also, I think that windowing before the FFT will give you a
better measurement of the phase of your fundamental frequency.

IMHO. YMMV.
--
Ron
rhn A.T nicholson d.O.t C-o-M

```
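The point that the phase change depends on the frame offset rather than the frame size can be checked numerically. The sketch below, with the hypothetical helper `expected_phase_advance`, computes how far the phase of a bin-center sinusoid advances when the frame start moves by `hop` samples. It assumes bin `k` of an `fft_size`-point FFT completes exactly `k` cycles per frame.

```python
import math

def expected_phase_advance(bin_index, fft_size, hop):
    """Phase advance in radians, wrapped to [0, 2*pi), of a bin-center
    sinusoid when the frame start moves forward by `hop` samples."""
    # Bin k completes k cycles per fft_size samples, so k * hop / fft_size
    # cycles per hop; only the fractional part of that count matters.
    fractional_cycles = ((bin_index * hop) % fft_size) / fft_size
    return 2.0 * math.pi * fractional_cycles

# Non-overlapped frames (hop == fft_size): every bin center returns to the same phase.
print(expected_phase_advance(5, 1024, 1024))
# 75% overlap (hop == fft_size / 4): the advance now depends on the bin number.
for k in (1, 2, 3, 4):
    print(k, expected_phase_advance(k, 1024, 256))
```

Any measured phase change that differs from this expected value indicates a tone that is sharp or flat relative to the bin center.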