DSPRelated.com

Pitch detection

Started by altmeyermartin March 21, 2005
In article <BE7E289E.610D%rbj@audioimagination.com>,
robert bristow-johnson  <rbj@audioimagination.com> wrote:
>... the AMDF or ASDF will find the best fit for the period, which is
>influenced by all of the harmonics, and the harmonics greater in amplitude
>will influence the measure more. the reciprocal of that would be called the
>fundamental frequency, but it might not be exactly the same frequency as the
>1st harmonic. as in the case above, if there was zero amplitude at 109.8 (i
>dunno what meaning that precise frequency would have) but a decent amount of
>energy at 220, 330.3, 440.8, 551.5, the AMDF will not measure a period of
>1/109.8, but will be shorter than 1/110 because of the other harmonics.
...
>if you have a good (and short) sound file of
>a note or even just a collection of amplitudes and frequencies that you
>think would fool this, i might want to try it with a MATLAB kludge to see
>if it does.

Your above example might work, with the spectral peak at 220. But I wouldn't use zero dB at 109.8, maybe some amount well under whatever the telco roll-off below 300 Hz is instead.

IMHO. YMMV. -- Ron Nicholson rhn AT nicholson DOT com http://www.nicholson.com/rhn/ #include <canonical.disclaimer> // only my own opinions, etc.
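As a rough stand-in for the "MATLAB kludge" mentioned in the quoted post, here is a minimal Python sketch of an AMDF run on the partials listed there (220, 330.3, 440.8 and 551.5 Hz, with nothing at 109.8 Hz). The sample rate, frame length, search range, equal partial amplitudes and the function names `amdf` and `amdf_pitch` are all assumptions made for illustration; this is not code from either poster.

```python
import math

def amdf(x, lag, frame):
    """Average magnitude difference between x[n] and x[n + lag] over `frame` samples."""
    return sum(abs(x[n] - x[n + lag]) for n in range(frame)) / frame

def amdf_pitch(x, fs, f_min, f_max, frame):
    """Frequency whose integer lag minimises the AMDF within [f_min, f_max]."""
    lag_lo = int(fs / f_max)
    lag_hi = int(fs / f_min)
    best_lag = min(range(lag_lo, lag_hi + 1), key=lambda lag: amdf(x, lag, frame))
    return fs / best_lag

fs = 44100                                # sample rate in Hz (assumed)
partials = [220.0, 330.3, 440.8, 551.5]   # from the post; no energy at 109.8 Hz
frame = 2048                              # samples compared per lag (assumed)
max_lag = int(fs / 90.0)                  # longest lag searched (90 Hz lower limit)

# Equal-amplitude sines, all starting at zero phase (an assumption).
x = [sum(math.sin(2 * math.pi * f * n / fs) for f in partials)
     for n in range(frame + max_lag + 1)]

f0 = amdf_pitch(x, fs, f_min=90.0, f_max=500.0, frame=frame)
print(f"AMDF estimate: {f0:.2f} Hz")
```

With these settings the estimate should land near 110 Hz rather than at the 220 Hz spectral peak, although the integer lag limits the resolution to roughly a quarter of a hertz. The search range stops short of two periods on purpose, since lags at multiples of the period also give deep AMDF nulls.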
in article d3bqle$msd$1@blue.rahul.net, Ronald H. Nicholson Jr. at
rhn@mauve.rahul.net wrote on 04/10/2005 14:18:

> In article <BE7E289E.610D%rbj@audioimagination.com>,
> robert bristow-johnson <rbj@audioimagination.com> wrote:
>> ... the AMDF or ASDF will find the best fit for the period, which is
>> influenced by all of the harmonics, and the harmonics greater in amplitude
>> will influence the measure more. the reciprocal of that would be called the
>> fundamental frequency, but it might not be exactly the same frequency as the
>> 1st harmonic. as in the case above, if there was zero amplitude at 109.8 (i
>> dunno what meaning that precise frequency would have) but a decent amount of
>> energy at 220, 330.3, 440.8, 551.5, the AMDF will not measure a period of
>> 1/109.8, but will be shorter than 1/110 because of the other harmonics.
> ...
>> if you have a good (and short) sound file of
>> a note or even just a collection of amplitudes and frequencies that you
>> think would fool this, i might want to try it with a MATLAB kludge to see
>> if it does.
>
> Your above example might work. With the spectral peak at 220. But
> I wouldn't use zero dB at 109.8, maybe some amount well under whatever
> the telco roll-off below 300 Hz is instead.
you mean zero linear amplitude at 109.8, i believe. 0 dB is a relative measure that can't include 0 linear amplitude. "0 dB" might mean full-scale.

anyway, Ronald, i've been doing stuff about pitch detection and pitch shifting since 1989 and in 1992 some of my work ended up in a pretty high end commercial product and some others since then (if you wanna know who, i'll name-drop in a private e-mail). i have some good experience of what works and some other experience on what doesn't.

i don't have any patents (except one stupid one from Fostex) with my name on it (because all of these companies have been relying on trade-secret, a line i haven't been crossing since AMDF and ASDF are in the literature) but there are some good patents to look at and see what some of my competitors have done. go to the USPTO and check up on "Brian Gibson" of IVL (they have a search facility) and you can see some nice patents on pitch detection and pitch shifting. some of them were patenting the obvious, but that was in his business plan also. there are also some clever little ideas that he scooped me on.

-- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
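The distinction drawn at the top of this post between 0 dB and zero linear amplitude can be sketched in a few lines of Python. The function name `amplitude_to_db` and the choice of full scale = 1.0 as the reference are assumptions for illustration only.

```python
import math

def amplitude_to_db(amplitude, reference=1.0):
    """Level of a linear amplitude relative to `reference`, in dB."""
    if amplitude <= 0.0:
        return float("-inf")   # zero linear amplitude has no finite dB value
    return 20.0 * math.log10(amplitude / reference)

print(amplitude_to_db(1.0))    # amplitude equal to the reference: 0 dB
print(amplitude_to_db(0.5))    # roughly 6 dB below the reference
print(amplitude_to_db(0.0))    # no finite value: reported as -inf
```

So "0 dB" describes a partial at the reference level, while a partial that is truly absent sits at minus infinity on a dB scale.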
In article <BE7EFA90.6132%rbj@audioimagination.com>,
robert bristow-johnson  <rbj@audioimagination.com> wrote:
>i've been doing stuff about pitch detection and pitch
>shifting since 1989 and in 1992 some of my work ended up in a pretty high
>end commercial product and some others since then

Interesting stuff. What is the figure of merit for algorithms used for your pitch detecting/shifting work?

I'm mostly interested in pitch detection for the stringed instrument tuning problem (including very cheap pianos, since that's what I have), where one must deal with missing fundamentals, not only deal with but potentially measure the inharmonicity, as well as measure to a tuning accuracy sufficient to determine the tuning used (Equal temperament versus Just temperament in some key, etc.) and the amount of "stretch" present (with a figure of merit in cents = 2^(1/1200)), while also making a quick and decent probability guess at which note was just played (similar to a very simplified form of the music transcription problem) as a human interface assist in auto-configuring the tuner.

One of the best jobs of tuning my cheap spinet piano was done by a blind gent who was mostly, I think, just listening for beats between various overtones of simultaneously played notes. I couldn't hear them well enough, so I eventually wrote some DSP & visualization software for my Mac and PalmPilot so that I could see them (e.g. 'scope the various harmonics of one string synced to differing harmonics of another string and/or to a calibrated tuning reference, etc.).

IMHO. YMMV. -- Ron Nicholson rhn AT nicholson DOT com http://www.nicholson.com/rhn/ #include <canonical.disclaimer> // only my own opinions, etc.
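To make the cents figure of merit and the idea of inharmonicity concrete, here is a small Python sketch. The cents conversion follows directly from one cent being a frequency ratio of 2^(1/1200). The stiff-string partial formula n * f0 * sqrt(1 + B * n^2) is the usual textbook model and was not given in the post; the coefficient `B`, the 440 Hz and 220 Hz references and the function names are hypothetical values chosen for illustration, not measurements from any piano.

```python
import math

def cents(f, f_ref):
    """Interval from f_ref up to f in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f / f_ref)

def stiff_string_partial(n, f0, b):
    """n-th partial of a stiff string under the textbook model n*f0*sqrt(1 + b*n^2)."""
    return n * f0 * math.sqrt(1.0 + b * n * n)

# Equal-tempered versus just major third above an assumed A4 = 440 Hz.
et_third = 440.0 * 2.0 ** (4.0 / 12.0)
just_third = 440.0 * 5.0 / 4.0
print(f"ET third is {cents(et_third, just_third):+.2f} cents from just")

# How sharp the 2nd partial is relative to the ideal harmonic 2 * f0.
B = 0.0004    # hypothetical inharmonicity coefficient
f0 = 220.0    # hypothetical ideal-string fundamental in Hz
sharpness = cents(stiff_string_partial(2, f0, B), 2 * f0)
print(f"2nd partial is {sharpness:+.2f} cents sharp")
```

The gap between the equal-tempered and just thirds is on the order of a dozen cents, which gives a feel for the accuracy a tuner needs in order to tell temperaments apart. The sharpening of upper partials is what makes octaves tuned beat-free against those partials come out "stretched".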
In article <070420052152265179%x@x.x>, ben  <x@x.x> wrote:
>... -- so what's the difference? the frequency of
>spikes make a pitch right? lots of spikes per second -- high pitched
>sound. sure, the fart sample is a bit rougher and maybe gappier but
>it's still the same situation isn't it? i'm definitely not seeing the
>difference between frequency and pitch but i don't think i'm that
>fussed about the difference (although it is very interesting --
>certainly got me thinking). maybe it is important though, i'm not sure.
Lots of frequencies might sum together to something that is recognized by the human ear/brain as just a single pitch, so there is not a one-to-one relationship. These frequencies might not be exactly harmonically related (e.g. the distance between the sound spikes might not quite be repeating in a local pattern). The recognized pitch may be for a frequency that is not at all present in the sound waveform (you can put a notch filter on the frequency of a low piano key, and you'll still think the pitch was that of that same low key). IMHO. YMMV. -- Ron Nicholson rhn AT nicholson DOT com http://www.nicholson.com/rhn/ #include <canonical.disclaimer> // only my own opinions, etc.
in article d3c7bu$9t5$1@blue.rahul.net, Ronald H. Nicholson Jr. at
rhn@mauve.rahul.net wrote on 04/10/2005 17:55:

> What is the figure of merit for algorithms used
> for your pitch detecting/shifting work?
it's mostly subjective. if the pitch detector output is connected to a simple single-tone synthesizer, how does the output tone sound given the input (from some other instrument)? does it follow the input pitch well, even when it varies? does it make octave errors? what's the pitch acquisition time or throughput delay? other than the degree of vibrato (how fast and how far the pitch deviates) that can be tracked, i don't think i've quantitatively made any figure of merit. some of these issues go away for non-real-time pitch detection. -- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
Ronald H. Nicholson Jr. wrote:
> In article <BE7EFA90.6132%rbj@audioimagination.com>,
> robert bristow-johnson <rbj@audioimagination.com> wrote:
>
>>i've been doing stuff about pitch detection and pitch
>>shifting since 1989 and in 1992 some of my work ended up in a pretty high
>>end commercial product and some others since then
>
> Interesting stuff. What is the figure of merit for algorithms used
> for your pitch detecting/shifting work?
>
> I'm mostly interested in pitch detection for the stringed instrument
> tuning problem (including very cheap pianos, since that's what I have),
> where one must deal with missing fundamentals,
Missing fundamental? In a piano? I doubt it, no matter how cheap.
> not only deal with but
> potentially measure the inharmonicity, as well as measure to a tuning
> accuracy sufficient to determine the tuning used (Equal temperament versus
> Just temperament in some key, etc.) and the amount of "stretch" present
> (with a figure of merit in cents = 2^(1/1200)), while also making a quick
> and decent probability guess at which note was just played (similar to
> a very simplified form of the music transcription problem) as a human
> interface assist in auto-configuring the tuner.
>
> One of the best jobs of tuning my cheap spinet piano was done by a blind
> gent who was mostly, I think, just listening for beats between various
> overtones of simultaneously played notes.
That's how every good tuner does it. It's also the reason for stretch. I wish Igor Kimpnis were still alive. He was a whiz at doing it who could also lucidly explain it all.
> I couldn't hear them well
> enough, so I eventually wrote some DSP & visualization software for
> my Mac and PalmPilot so that I could see them (e.g. 'scope the various
> harmonics of one string synced to differing harmonics of another string
> and/or to a calibrated tuning reference, etc.).
That sounds great. Lots of luck! (I'm sure there are lots of sites like http://www-scf.usc.edu/~chinghuc/pitch_detection_algorithms.htm around. Have you seen some?)

Jerry
-- Engineering is the art of making what you want from things you can get.
In article <Ov-dnWevlbH-csffRVn-oQ@rcn.net>, Jerry Avins  <jya@ieee.org> wrote:
>> I'm mostly interested in pitch detection for the stringed instrument
>> tuning problem (including very cheap pianos, since that's what I have),
>> where one must deal with missing fundamentals,
>
>Missing fundamental? In a piano? I doubt it, no matter how cheap.

I should have added that the microphone used for pitch detection was pretty cheap also. If the low octave fundamental frequencies weren't missing, they seemed pretty well buried in roll-off loss and room noise.

IMHO. YMMV. -- Ron Nicholson rhn AT nicholson DOT com http://www.nicholson.com/rhn/ #include <canonical.disclaimer> // only my own opinions, etc.

Well, it was a long time ago (2000) when I did signals & systems at ANU, and
we had to write an FFT pitch detection thing in MATLAB. Actually, one would
question whether it would work with something other than just phone tones
(e.g. to automatically determine each of the guitar or piano pitches in any
song on a CD, even with more than one note at once and with harmonics etc.).

A suggestion I saw from you is to take the waveform, shift it, and then
subtract it from itself. If the waveform was a sine wave of 5 Hz and you
shifted it by one period (1/5 s), then you would get exactly zero, for
example. So you can see that this is a simple way of determining frequency
components, just by trying different shifts. The resulting values would not
be 'Fourier coefficients'; they would be some other kind of set of
coefficients.
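The 5 Hz example can be checked numerically. This is a minimal Python sketch, assuming a 1000 Hz sample rate and two seconds of signal; the function name `mean_abs_difference` is made up for illustration.

```python
import math

def mean_abs_difference(x, shift):
    """Mean of |x[n] - x[n + shift]| over the samples where both exist."""
    count = len(x) - shift
    return sum(abs(x[n] - x[n + shift]) for n in range(count)) / count

fs = 1000                      # samples per second (assumed)
freq = 5.0                     # the 5 Hz sine from the post
x = [math.sin(2 * math.pi * freq * n / fs) for n in range(2 * fs)]

one_period = int(fs / freq)    # 200 samples = 1/5 s
half_period = one_period // 2  # 100 samples = 1/10 s

print(mean_abs_difference(x, one_period))    # essentially zero
print(mean_abs_difference(x, half_period))   # large: the shifted copy is inverted
```

Shifts of two, three or more whole periods also give essentially zero, so a detector built on this idea has to pick the shortest good shift. Picking a longer one is a common source of octave errors.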

Presumably you would then have to build a template library of coefficients
(within some acceptable kind of variance) that correspond with the expected
response for a particular instrument at a particular pitch. You'd need to
subtract the "guitar-A1" template from it, then subtract the "piano-C3"
template, then subtract the "Drum-Kick" template (each when one thinks it
was played), then see what was left. The order in which you do the
subtracting shouldn't matter if your template sound is good enough (it must
include the coefficients for the resonant frequencies as well), just as it
doesn't matter, when we listen to music as humans, in which order we isolate
the guitar, piano, drum or whatever sound to see what it's doing; that can
be done just by changing our focus.

I'm sure it's probably already been done before though.

Even if the computer can notate perfectly from CD for you, they still need
musicians for the "human aspect". The same goes for engineers etc., since
software can invent things for us (remember the article I sent you about
software which invented the last 10 years' worth of electronic inventions
made by engineers, using a partly "trial and error" algorithm that chose and
connected electronic parts, given known responses (behaviours) for the
parts, to automatically reach a particular system response specification
which corresponded with the discovery having been made). But we still need
something to do, thus we would still train humans.

JP


"robert bristow-johnson" <rbj@audioimagination.com> wrote in message
news:BE7887DE.5F02%rbj@audioimagination.com...
> in article 4252FD0C.BDDBEDA9@mega-nerd.com, Erik de Castro Lopo at
> nospam@mega-nerd.com wrote on 04/05/2005 17:03:
>
> > dt@soundmathtech.com wrote:
> >>
> >> Didn't I tell everybody here that pitch detection problem is solved ?
> >>
> >> It seems that some people are unable to learn, or they are deliberatly
> >> trying to mislead general public asking questions here.
> >>
> >> Go to http://www.soundmathtech.com/pitch for more information.
> >>
> >> Also, US Patent Application at http://www.uspto.gov/patft
> >> (Pub. No. 20030088401)
> >
> > Thats one solution. My guess is that there are at least 10 other
> > possible solutions, at least one of which is better than whatever
> > is covered by that patent.
>
> i'm gonna download the MATLAB stuff and see if i can try it out.
>
> long ago (about 2002 when it came out) i took a good look at the ICASSP
> paper Dmitry published about this method, and concluded that there were many
> elements of the AMDF function in it. it was jazzed up in some ways, but it
> came down to subtracting shifted versions of the waveform from itself,
> applying the magnitude, subtracting that magnitude difference from some
> parameter (that inverts the measure), applying the unit step function,
> adding up a few of these (i thought it was exceedingly small number, like 3
> terms) and then doing some kind of statistical histogram thing and looking
> for a maximum. Dmitry has denied that it's merely a jazzed-up AMDF and has
> challenged me to produce some waveform that fools it (i said that one could
> since the unit step function throws away information and that can be
> exploited to fool the algorithm). i probably should put some time into
> answering that challenge but i don't have the time (it would take hours to
> days to really research the thing and i ain't a grad student anymore).
>
> anyway to Dmitry: i think you underestimate some of the expertise in pitch
> detection in both of these newsgroups. some of us have a great deal of
> experience in it and some theoretical expertise also. the overt confidence
> in your W.C. Fields sales technique might not be warranted.
>
> --
>
> r b-j rbj@audioimagination.com
>
> "Imagination is more important than knowledge."
>... had to write a FFT pitch detection thing in matlab. Actually one would
> question whether it would work with something other than just phone tones
> (eg to automatically determine each of the guitar or piano pitches in any
> song on a CD even if more than one note at once and with harmonics etc)....

The problem with "pitch detection" is that most voiced speech is essentially two-note chords. Is the pitch of a vowel its high or its low component? Most pitch tracking algorithms try to pick the fundamental, the low note, but that is not entirely satisfying: some transitions become awkward when the low component fades out and the high component becomes the low one.

Sincerely, James Salsman
-- www.readsay.com - maker of the ReadSay PROnounce English literacy system 400 MHz PDA included: $499 -- http://www.readsay.com/PROnounce.html