DSPRelated.com
Forums

Pitch detection in voice (singing)

Started by smuglr May 13, 2005
Hi,

I know the general subject of pitch detection has been flogged to death,
but I am looking specifically for pitch detection in a sung melody. It
will be performed in real-time with high signal-to-noise. 

I'm currently using auto-correlation to find an estimate of the
fundamental frequency, and then using phase-unwrapping to get high
resolution. This works well for many instruments, but seems to get
confused by the formants in certain vowel sounds.

Can anyone suggest a method which would suit my needs?

Thanks,
Dougie


		
This message was sent using the Comp.DSP web interface on
www.DSPRelated.com
>I'm currently using auto-correlation to find an estimate of the
>fundamental frequency, and then using phase-unwrapping to get high
>resolution.

Meant to say Cepstrum, not auto-correlation - oops!

Dougie
Hi Dougie,

if you take some tool to visualize an FFT of your vocal audio-input,
you should see clear peaks for fundamental and several harmonics.

So Cepstrum should give better results, even if the fundamental and
a few harmonics are missing.
I remember several different theories and formulas, so you may check
your Cepstrum-routines with some reference-input, though it may be
an issue of samplerate and FFT-size too.

Hope it helps,

Carsten Neubauer
http://www.c14sw.de

I looked at the code for the first time in a while and realised I am in
fact using autocorrelation. The only difference is the non-linearity, is
that right?
E.g.
AC = FT ( Powerspectrum )
Cepstrum = FT (  Log(PowerSpectrum) )

And if so, there's no practical difference between the two methods for my
purposes is there?
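
To make the comparison concrete, here is a minimal sketch of both quantities computed from the same power spectrum. It uses only the Python standard library and a naive DFT so that it runs anywhere; a real-time tracker would use an FFT library instead. The test tone, the frame length, the lag search range, and the small floor added before the log are all assumptions made for illustration, not values from this thread.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT; adequate for a short demo frame."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def autocorr_and_cepstrum(frame):
    """Return (autocorrelation, cepstrum), both derived from the power spectrum."""
    power = [abs(c) ** 2 for c in dft(frame)]
    eps = 1e-12  # assumed floor so log() never sees zero
    ac = [c.real for c in dft(power, inverse=True)]
    cep = [c.real for c in dft([math.log(p + eps) for p in power], inverse=True)]
    return ac, cep

def peak_lag(seq, lo, hi):
    """Index of the largest value of seq over lags lo..hi-1."""
    return max(range(lo, hi), key=lambda i: seq[i])

# Synthetic tone: 5 harmonics with a period of 32 samples (assumed values).
N, PERIOD = 256, 32
frame = [sum(math.sin(2 * math.pi * h * t / PERIOD) / h for h in range(1, 6))
         for t in range(N)]
ac, cep = autocorr_and_cepstrum(frame)
ac_lag = peak_lag(ac, 16, 48)
cep_lag = peak_lag(cep, 16, 48)
print(ac_lag, cep_lag)
```

On a clean harmonic tone like this one the two methods agree. They can differ on sung vowels because the log turns the product of the glottal source and the vocal-tract (formant) response into a sum, so the slowly varying formant envelope ends up at low quefrencies, away from the pitch peak.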


>if you take some tool to visualize an FFT of your vocal audio-input,
>you should see clear peaks for fundamental and several harmonics.

The code was calculating the correct answer in most cases, but as I
altered the tonal quality of my voice, maintaining pitch as best I could,
the estimated frequency would change with certain sounds.

I have noticed that I temporarily removed the section which windowed
before each FFT. I'll stick that back in and have a go, but I didn't
think that would make too much difference to pitch estimation.

Maybe it's my singing voice. The guys in the lab will be pleased to hear
I'm testing the pitchtracker again :).
Forget about using FFT, cepstrum, auto/cross-correlation, AMDF/ASDF
etc.

Pitch detection problem is solved:

http://www.soundmathtech.com/p=ADitch

Dmitry Terez

dt@soundmathtech.com wrote:

> Forget about using FFT, cepstrum, auto/cross-correlation, AMDF/ASDF
> etc.
>
> Pitch detection problem is solved:
>
> http://www.soundmathtech.com/p=ADitch

link as given gives 404 error
when clicked it goes to
http://www.soundmathtech.com/p=ADitch
not
http://www.soundmathtech.com/pitch
which works

> Dmitry Terez
smuglr wrote:
> The code was calculating the correct answer in most cases, but as I
> altered the tonal quality of my voice maintaining pitch as best I could,
> the estimated frequency would change with certain sounds.

Are you searching for the highest peak in the cepstrum? If so, you'll get
the effect you describe whenever there's a strong overtone.

Perhaps a more robust way is to find the lowest-frequency peak of
relatively high amplitude that is harmonically related to other
(higher-frequency) peaks. That way you are more likely to find the
fundamental tone, although the coding will be a bit more troublesome.

Good luck!
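
One possible reading of that suggestion, applied to a list of spectral peaks, is sketched below. The function name `pick_fundamental`, the thresholds (`rel_amp`, `tol`, `min_harmonics`), and the example peak list are all hypothetical; they are not taken from anyone's code in this thread, and peak picking itself is assumed to have been done already.

```python
def pick_fundamental(peaks, rel_amp=0.2, tol=0.03, min_harmonics=2):
    """Pick the lowest reasonably strong peak that has harmonic support.

    peaks: list of (frequency_hz, amplitude) tuples.
    rel_amp: minimum amplitude relative to the strongest peak (assumed).
    tol: allowed relative deviation from an exact integer multiple (assumed).
    min_harmonics: how many higher peaks must line up (assumed).
    """
    strongest = max(a for _, a in peaks)
    for f0, a0 in sorted(peaks):            # lowest frequency first
        if a0 < rel_amp * strongest:
            continue                        # too weak to be trusted
        support = 0
        for f, _ in peaks:
            ratio = f / f0
            n = round(ratio)
            if n >= 2 and abs(ratio - n) <= tol * n:
                support += 1                # f sits near the n-th harmonic of f0
        if support >= min_harmonics:
            return f0
    # Fall back to the strongest peak if nothing has harmonic support.
    return max(peaks, key=lambda p: p[1])[0]

# Made-up peaks: the 2nd harmonic (441 Hz) is the strongest, as with a
# strong overtone, and there is a weak spurious peak at 50 Hz.
peaks = [(50.0, 0.05), (220.0, 0.3), (441.0, 1.0), (659.0, 0.5), (880.0, 0.4)]
strongest_only = max(peaks, key=lambda p: p[1])[0]
harmonic_aware = pick_fundamental(peaks)
print(strongest_only, harmonic_aware)
```

Taking the highest peak alone reports the overtone, while the harmonic check recovers the lower fundamental. The thresholds would need tuning on real voice data.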
> Looked at code for first time in a while, realised I am using
> autocorrelation - the difference is just using a non-linearity is that
> right?
> Eg.
> AC = FT ( Powerspectrum )
> Cepstrum = FT ( Log(PowerSpectrum) )
> And if so, there's no practical difference between the two methods for my
> purposes is there?

Never tried autocorrelation myself. I use

Cepstrum = DCT( Log(Abs(FFT(WindowedSignal))))

which seems a bit unusual; most formulas I've seen go

Cepstrum = FFT( Log(Abs(FFT(WindowedSignal))))

or

Cepstrum = IFFT( Log(Abs(FFT(WindowedSignal))))

I think the log-stage makes a difference, though I can't remember why.
Does somebody else know?
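
As a small check on the FFT versus IFFT variants: for a real input frame, Log(Abs(FFT(x))) is real and even-symmetric, so a forward and an inverse transform of it should differ only by the 1/N scale factor, and a DCT over the non-redundant half relies on the same symmetry. The sketch below tests the FFT/IFFT part numerically with a naive standard-library DFT. The test signal and the small floor inside the log are arbitrary assumptions.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT; adequate for a short demo frame."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

N = 64
# Arbitrary real test frame: two sinusoids plus a small broadband component
# so that no spectral bin is close to zero (assumed, for illustration only).
frame = [math.sin(2 * math.pi * t / 16)
         + 0.5 * math.sin(2 * math.pi * t / 8 + 0.3)
         + 0.1 * ((t * 7) % 5 - 2)
         for t in range(N)]

log_mag = [math.log(abs(c) + 1e-12) for c in dft(frame)]

via_fft = dft(log_mag)                  # Cepstrum = FFT(Log(Abs(FFT(x))))
via_ifft = dft(log_mag, inverse=True)   # Cepstrum = IFFT(Log(Abs(FFT(x))))

# Both results should be essentially real and equal up to the 1/N factor.
max_imag = max(abs(c.imag) for c in via_fft)
max_diff = max(abs(f / N - i) for f, i in zip(via_fft, via_ifft))
print(max_imag, max_diff)
```

If both printed numbers are at rounding-error level, the choice between the two is a matter of scaling only, and peak positions are unaffected.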
>>if you take some tool to visualize an FFT of your vocal audio-input,
>>you should see clear peaks for fundamental and several harmonics.
> The code was calculating the correct answer in most cases, but as I
> altered the tonal quality of my voice maintaining pitch as best I could,
> the estimated frequency would change with certain sounds.

Are you sure your buffers are big enough to hold a few periods of the
lowest frequencies you expect?
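
The buffer-size question comes down to simple arithmetic, sketched here. The sample rate (44100 Hz), the lowest expected pitch (80 Hz), and the choice of three periods are assumptions for illustration; nobody in the thread states these values.

```python
import math

def min_frame_length(sample_rate, lowest_pitch_hz, periods=3):
    """Samples needed to hold `periods` cycles of the lowest expected pitch,
    plus the next power of two at or above that, as a convenient FFT size."""
    samples = math.ceil(periods * sample_rate / lowest_pitch_hz)
    fft_size = 1 << (samples - 1).bit_length()
    return samples, fft_size

# Assumed values: 44.1 kHz audio, lowest sung note around 80 Hz.
needed, fft_size = min_frame_length(44100, 80.0)
print(needed, fft_size)   # 1654 2048
```

A longer buffer resolves low notes better but adds latency, which matters for real-time use.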
> I have noticed that I temporarily removed the section which windowed
> before each FFT. I'll stick that back in and have a go, but I didn't
> think that would make too much difference to pitch estimation.

Perhaps you had better put it back in and analyze overlapping buffers.

Imagine the short audio buffer you want to analyze. Since the FFT treats
its input as one period of an endlessly repeating signal, imagine playing
the buffer in a loop. In most cases you will hear a loud click each time
it repeats, because the end and the start don't fit together properly.
Windowing removes that click before the FFT. It introduces an error of its
own, of course, but one with a much smaller effect. Hanning or cosine
windows work well.
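
The looping-buffer picture can be checked numerically. The sketch below cuts a signal into overlapping Hann-windowed frames and compares the jump between the last and first sample of a frame with and without the window. The frame size, the hop (50% overlap), and the test sinusoid are assumed values chosen for illustration.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def overlapping_frames(signal, size, hop):
    """Yield Hann-windowed frames of `size` samples, advancing by `hop`."""
    w = hann(size)
    for start in range(0, len(signal) - size + 1, hop):
        yield [s * wi for s, wi in zip(signal[start:start + size], w)]

# Test sinusoid whose period (37 samples) does not divide the frame length,
# so the raw frame does not wrap around cleanly.
signal = [math.sin(2 * math.pi * t / 37.0) for t in range(1024)]
frames = list(overlapping_frames(signal, size=256, hop=128))

# Size of the "click": the jump from the last sample back to the first.
raw_jump = abs(signal[255] - signal[0])
win_jump = abs(frames[0][-1] - frames[0][0])
print(len(frames), raw_jump, win_jump)
```

The windowed frame tapers to nearly zero at both ends, so the wrap-around jump almost disappears. The overlap compensates for samples near the frame edges being attenuated.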
> Maybe it's my singing voice. The guys in the lab will be pleased to hear
> I'm testing the pitchtracker again :).

Sure, if we could sing that well, we wouldn't be programming stuff like
this...

Carsten Neubauer
http://www.c14sw.de

in article 1115995137.854029.260950@g43g2000cwa.googlegroups.com,
dt@soundmathtech.com at dt@soundmathtech.com wrote on 05/13/2005 10:38:

> Forget about using FFT, cepstrum, auto/cross-correlation, AMDF/ASDF etc.
>
> Pitch detection problem is solved:

so you have told us.

> http://www.soundmathtech.com/p=ADitch

as Richard pointed out, the page is at:

http://www.soundmathtech.com/pitch

hey Dmitry, i can download and open ordinary .zip files on my Mac, but i
cannot open your multisegmented "download.cgi" file. i don't even know
what the hell it is, when what i want is your pdemo.zip file. can you
please put just the MATLAB files of your demo into a plain old zip file
and put that up on your page? or email it to me?

long ago, i *did* get your paper and i went through it carefully and told
the group that it was a "jazzed up" version of AMDF that actually lost
information due to the Heaviside step function that you put into it. that
loss of information makes the algorithm vulnerable to getting fooled. one
can create two different waveforms that will be treated identically
because of the loss of information due to the use of the Heaviside step
function. that means pitch errors.

i know you're very proud of your algorithm and, given the state of today's
USPTO, i have little doubt that you'll get your patent, but you seriously
need to step back and look at it objectively.

but, now that the semester is over, i am willing to take on your
challenge. if you send me a good MATLAB version of this with no .mex files
(something that i can just run on MATLAB v 5.something), i will try to
create some .wav files that give your algorithm trouble.

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

That's the 3rd or 4th time I've seen this problem with links posted to this
group recently!  (The problem is hidden characters that don't show up in the
text-based URL.)  What is causing that?  Some strange newsreader software?
Some other character encoding?

-- 
Jon Harris
SPAM blocked e-mail address in use.  Replace the ANIMAL with 7 to reply.

"Richard Owlett" <rowlett@atlascomm.net> wrote in message
news:1189gc3qtu145f3@corp.supernews.com...
> dt@soundmathtech.com wrote:
>
> > Forget about using FFT, cepstrum, auto/cross-correlation, AMDF/ASDF
> > etc.
> >
> > Pitch detection problem is solved:
> >
> > http://www.soundmathtech.com/p=ADitch
>
> link as given gives 404 error
> when clicked it goes to
> http://www.soundmathtech.com/p=ADitch
> not
> http://www.soundmathtech.com/pitch
> which works