Detecting proximity to known frequencies

Started by Brian Victor November 23, 2005
Like several others over the years in this group, I'm trying to write
code that can be used as a music tuner.  Specifically, I'm trying to
emulate programs like SmartMusic and Karaoke Revolution.  The former is
a program that accompanies a music performance and can vary its
accompaniment based on what the human performer is playing.  The latter
is a PlayStation 2 game that scores the player on how on-pitch he or she
is singing.

For the Karaoke Revolution case in particular, I notice that the program
always knows what pitch is supposed to be present and merely has to
assess how far from that pitch the singer is.  As far as I could tell
the game is octave-agnostic and for my current purposes I can be as
well; catching the first harmonic rather than the fundamental isn't a
big problem.

Given these restrictions on what I need to analyze, is there an
algorithm I can use to determine with precision of at least 5 cents
(i.e., 1/20th of a semitone) how close the strongest frequency in the
signal is to the desired frequency?  I'm a DSP novice, but I've been
reading several papers the past few days so I at least have a general
understanding of the field.

Thanks!

-- 
Brian
Brian Victor wrote:
> Like several others over the years in this group, I'm trying to write
> code that can be used as a music tuner.  Specifically, I'm trying to
> emulate programs like SmartMusic and Karaoke Revolution.  The former is
> a program that accompanies a music performance and can vary its
> accompaniment based on what the human performer is playing.  The latter
> is a PlayStation 2 game that scores the player on how on-pitch he or she
> is singing.
>
> For the Karaoke Revolution case in particular, I notice that the program
> always knows what pitch is supposed to be present and merely has to
> assess how far from that pitch the singer is.  As far as I could tell
> the game is octave-agnostic and for my current purposes I can be as
> well; catching the first harmonic rather than the fundamental isn't a
> big problem.
>
> Given these restrictions on what I need to analyze, is there an
> algorithm I can use to determine with precision of at least 5 cents
> (i.e., 1/20th of a semitone) how close the strongest frequency in the
> signal is to the desired frequency?  I'm a DSP novice, but I've been
> reading several papers the past few days so I at least have a general
> understanding of the field.

Finding the strongest frequency implies that you might be using an
FFT-based approach.  If you use overlapping frames for your FFT
analysis, the change in complex phase of the strongest frequency bin
from one frame to the next might give you a frequency estimate accurate
to within a few cents, for suitable frame sizes and overlaps.  This is
similar to the approach that some phase vocoders use to determine pitch.

You can also try interpolated autocorrelation for frequency
determination with greater time resolution (e.g., cycle by cycle, for
measuring vibrato).

IMHO. YMMV.
-- 
rhn A.T nicholson d.O.t C-o-M

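
The phase-change idea described above can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions, not
necessarily what a phase vocoder or either program does: the names
(hann_window, dft_bin, estimate_peak_frequency), the Hann window, and
the frame and hop sizes are all made up for the example, and the bins
are computed with a slow direct DFT (the same values an FFT would
return) so that nothing beyond the standard library is needed.  The
estimate is only unambiguous while the true frequency lies within
sample_rate / (2 * hop) of the chosen bin center.

```python
import cmath
import math


def hann_window(n):
    """Periodic Hann window of length n (the window choice is an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft_bin(frame, k):
    """Complex value of DFT bin k for one frame (what an FFT would give)."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(frame))


def estimate_peak_frequency(signal, sample_rate, frame_len, hop):
    """Estimate the strongest frequency from the phase change of its bin
    between two frames that start `hop` samples apart."""
    window = hann_window(frame_len)
    frame_a = [s * w for s, w in zip(signal[:frame_len], window)]
    frame_b = [s * w for s, w in zip(signal[hop:hop + frame_len], window)]

    # Strongest bin of the first frame, skipping DC and Nyquist.
    spectrum_a = {k: dft_bin(frame_a, k) for k in range(1, frame_len // 2)}
    k = max(spectrum_a, key=lambda b: abs(spectrum_a[b]))

    # Phase change actually observed in bin k between the two frames.
    measured = cmath.phase(dft_bin(frame_b, k)) - cmath.phase(spectrum_a[k])
    # Phase change a tone exactly at the bin center would show over one hop.
    expected = 2.0 * math.pi * k * hop / frame_len
    # Wrap the difference into [-pi, pi).
    deviation = (measured - expected + math.pi) % (2.0 * math.pi) - math.pi

    # Convert the phase deviation into a fractional bin offset, then to Hz.
    bin_offset = deviation * frame_len / (2.0 * math.pi * hop)
    return (k + bin_offset) * sample_rate / frame_len


if __name__ == "__main__":
    fs = 8000.0
    tone = [math.sin(2.0 * math.pi * 440.0 * i / fs) for i in range(320)]
    print(estimate_peak_frequency(tone, fs, frame_len=256, hop=64))
```

With a clean test tone the result lands much closer to the true
frequency than the bin spacing (31.25 Hz in the example) would suggest.
A real voice or instrument signal has harmonics and noise, so expect
worse accuracy there.
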
Ron N. wrote:
> Finding the strongest frequency implies that you might be
> using an FFT based approach.  If you use overlapping frames
> for your FFT analysis, the complex phase change between
> frames of the strongest frequency bin of the two frames might
> be able to give you a frequency determination accuracy to
> within a few cents for some suitable frame sizes and overlaps.

This is an interesting idea.  Am I correct in thinking that in order to
do this, I would need to calculate where I would expect the phase to be
at each frame and use the offset from that expected value to determine
sharpness or flatness?  Would that work in cases where no bin directly
corresponds to a frequency I want to measure, as will be the case more
often than not?

Does this technique have a name that I can google for?

Thanks!

-- 
Brian
Brian Victor wrote:
> Ron N. wrote:
> > Finding the strongest frequency implies that you might be
> > using an FFT based approach.  If you use overlapping frames
> > for your FFT analysis, the complex phase change between
> > frames of the strongest frequency bin of the two frames might
> > be able to give you a frequency determination accuracy to
> > within a few cents for some suitable frame sizes and overlaps.
>
> This is an interesting idea.  Am I correct in thinking that in order to
> do this, I would need to calculate where I would expect the phase to be
> at each frame and use the offset from that expected value to determine
> sharpness or flatness?

Any phase change between non-overlapped successive frames will point out a sharpness or flatness relative to the bin center frequency. For overlapped frames, even bin center frequencies might have an expected phase offset, depending on the bin number versus the frame overlap percentage.

> Would that work in cases where no bin directly
> corresponds to a frequency I want to measure, as will be the
> case more often than not?

You can compare what would be the phase change of some desired frequency versus a measured phase change to determine how far apart the two frequencies are, whether or not either one corresponds to a bin center frequency.
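
As a rough sketch of that comparison for the "known target pitch" case:
the function below measures the phase change of the bin nearest the
target and compares it with the phase change the target frequency
itself would produce over one hop, then reports the difference in
cents.  The names (windowed_bin_phase, cents_off_target), the Hann
window, and the default frame_len and hop are assumptions made for
illustration only.  The answer is only meaningful while the signal is
within sample_rate / (2 * hop) Hz of the target; beyond that the phase
wraps.

```python
import cmath
import math


def windowed_bin_phase(frame, k):
    """Phase of DFT bin k after applying a Hann window to the frame."""
    n = len(frame)
    total = 0j
    for i, x in enumerate(frame):
        w = 0.5 - 0.5 * math.cos(2.0 * math.pi * i / n)
        total += w * x * cmath.exp(-2j * math.pi * k * i / n)
    return cmath.phase(total)


def cents_off_target(signal, sample_rate, target_hz, frame_len=1024, hop=256):
    """Cents by which the signal is sharp (+) or flat (-) of target_hz.

    Assumes the signal is dominated by one tone near target_hz and that
    signal holds at least hop + frame_len samples.
    """
    # Bin nearest the target; the target need not sit on a bin center.
    k = round(target_hz * frame_len / sample_rate)

    # Measured phase change of that bin between the two frames.
    measured = (windowed_bin_phase(signal[hop:hop + frame_len], k)
                - windowed_bin_phase(signal[:frame_len], k))
    # Phase change the target frequency itself would show over one hop.
    expected = 2.0 * math.pi * target_hz * hop / sample_rate
    # Wrap the difference into [-pi, pi).
    deviation = (measured - expected + math.pi) % (2.0 * math.pi) - math.pi

    # Phase deviation -> frequency offset in Hz -> offset in cents.
    delta_hz = deviation * sample_rate / (2.0 * math.pi * hop)
    return 1200.0 * math.log2((target_hz + delta_hz) / target_hz)


if __name__ == "__main__":
    fs = 44100.0
    sung = 440.0 * 2.0 ** (10.0 / 1200.0)   # a tone 10 cents sharp of A440
    tone = [math.sin(2.0 * math.pi * sung * i / fs) for i in range(1280)]
    print(cents_off_target(tone, fs, 440.0))
```

Because only one bin is needed per frame, a single-bin DFT like this is
enough when the target is known in advance; a full FFT is not required.
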

> Does this technique have a name that I can google for?

Something like this is used in the analysis phase of a phase vocoder
for pitch shifting.  Try googling for "phase vocoder" or vocoding.
You can also try googling for a paper on pitch detection by
Judith Brown at Wellesley, whose algorithm might be worth
looking at if you are not already calculating overlapped FFT
frames.

IMHO. YMMV.
-- 
rhn A.T nicholson d.O.t C-o-M

Ron N. wrote:
> Any phase change between non-overlapped successive frames
> will point out a sharpness or flatness relative to the bin center
> frequency.  For overlapped frames, even bin center frequencies
> might have an expected phase offset, depending on the bin number
> versus the frame overlap percentage.

This may indicate my lack of understanding about phase.  I understand
it to mean the point in the sine wave at which the wave begins for the
given bin.  Is this correct?

With this being the case, wouldn't the phase be expected to change from
frame to frame even if the pitch does not, unless the frame size is an
integral multiple of the period?  Or perhaps I'm misunderstanding the
phrase "bin center."

> Something like this is used in the analysis phase of a phase vocoder
> for pitch shifting.  Try googling for "phase vocoder" or vocoding.
> You can also try googling for a paper on pitch detection by
> Judith Brown at Wellesley, whose algorithm might be worth
> looking at if you are not already calculating overlapped FFT
> frames.

I'll take a look around for these.  Thanks!

-- 
Brian
Brian Victor wrote:
> Ron N. wrote:
> > Any phase change between non-overlapped successive frames
> > will point out a sharpness or flatness relative to the bin center
> > frequency.  For overlapped frames, even bin center frequencies
> > might have an expected phase offset, depending on the bin number
> > versus the frame overlap percentage.
>
> This may indicate a lack of my understanding about phase.  I understand
> it to mean the point in the sine wave at which the wave begins for the
> given bin.  Is this correct?

Close enough.  Some interpret phase as referenced to the cosine wave.

> With this being the case, wouldn't the phase be expected to change from
> frame to frame even if the pitch does not unless the frame size is an
> integral multiple of the period?  Or perhaps I'm misunderstanding the
> phrase "bin center."

I use the term "bin center frequency" for the frequencies whose
periods in samples are integer submultiples of the FFT frame length,
i.e. those that fit a whole number of cycles into one frame (only
up to half the frame length in cycles, of course).

Any frequency whose period is not an integer submultiple of the offset
between two FFT frames will show a phase change between the two frames.
Note that it's the relationship between the period and the frame
offset, not the frame size, which determines the change in phase of the
sine wave between two frames.  You can use a lot of frame overlap to
get a bit more locality in exchange for resolution.

Also, I think that windowing before the FFT will give you a better
measurement of the phase of your fundamental frequency.

IMHO. YMMV.
-- 
Ron
rhn A.T nicholson d.O.t C-o-M
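
To make the period-versus-offset point concrete, here is a small
Python sketch.  The helper names (bin_center_frequencies,
phase_advance) and the numbers in the example are assumptions chosen
for illustration.  Note that the frame length appears only in the list
of bin centers; the expected phase change depends on the frequency and
the frame offset (hop) alone.

```python
import math


def bin_center_frequencies(sample_rate, frame_len):
    """Frequencies that fit a whole number of cycles into one frame,
    from DC up to half the sample rate."""
    return [k * sample_rate / frame_len for k in range(frame_len // 2 + 1)]


def phase_advance(freq_hz, sample_rate, hop):
    """Phase change (radians, wrapped to [-pi, pi]) of a steady tone
    between two frames that start `hop` samples apart."""
    cycles = freq_hz * hop / sample_rate      # cycles elapsed during one hop
    return 2.0 * math.pi * (cycles - round(cycles))


if __name__ == "__main__":
    fs = 8000.0
    f = fs / 64                               # period of exactly 64 samples

    # Hop is a whole number of periods: no phase change between frames.
    print(phase_advance(f, fs, hop=128))      # 0.0

    # Hop is 1.5 periods: the phase changes by half a cycle.
    print(phase_advance(f, fs, hop=96))

    # Bin 14 of a 256-sample frame at 8 kHz is centered on 437.5 Hz.
    print(bin_center_frequencies(fs, 256)[14])  # 437.5
```
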