DSPRelated.com
Forums

Pitch Estimation using Autocorrelation

Started by olivers September 7, 2005
Rune Allnor wrote:

   ...

> I know that I perceive a changing pitch when the kettle boils up water, but I am not able to detect any spectral lines in a recording I made. I have, on the other hand, not tried to do time-domain analysis of that signal.
Take broadband noise such as the output of an FM receiver with no input, add a single tone about 10 dB down, and look at its spectrum. Sophisticated tools can find the tone if you know what to look for, but you are unlikely to find it in a spectrogram.
...
Jerry
--
Engineering is the art of making what you want from things you can get.
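A rough way to see why knowing what to look for matters: the sketch below (not from the thread) buries a tone 10 dB below unit-variance white noise and compares the power in the tone's DFT bin with the average power of nearby noise-only bins, once for a short window of the kind a spectrogram uses and once for a long one. The white Gaussian noise model, the two window lengths, the bin-centered tone frequency, and the helper names `bin_power` and `tone_bin_vs_noise_db` are all assumptions made for illustration.

```python
import cmath
import math
import random

def bin_power(x, k):
    """Power in DFT bin k of the real sequence x, normalized by len(x)."""
    n = len(x)
    step = -2j * math.pi * k / n
    acc = sum(x[i] * cmath.exp(step * i) for i in range(n))
    return abs(acc) ** 2 / n

def tone_bin_vs_noise_db(n, rng, tone_db=-10.0):
    """Tone-bin power over mean noise-bin power, in dB, for an n-sample window.

    Assumptions (illustrative only): unit-variance white Gaussian noise,
    and a tone placed exactly on bin n // 8 so there is no leakage.
    """
    k0 = n // 8
    amp = math.sqrt(2.0 * 10.0 ** (tone_db / 10.0))  # tone power = amp**2 / 2
    x = [rng.gauss(0.0, 1.0) + amp * math.cos(2.0 * math.pi * k0 * i / n)
         for i in range(n)]
    # Up to 32 noise-only bins below Nyquist, skipping DC and the tone bin.
    noise_bins = [k for k in range(1, n // 2) if k != k0][:32]
    noise_floor = sum(bin_power(x, k) for k in noise_bins) / len(noise_bins)
    return 10.0 * math.log10(bin_power(x, k0) / noise_floor)

rng = random.Random(2005)                    # fixed seed so the run is repeatable
short_db = tone_bin_vs_noise_db(64, rng)     # spectrogram-sized window
long_db = tone_bin_vs_noise_db(4096, rng)    # long analysis at a known frequency
print(f"64-sample window:   tone bin {short_db:5.1f} dB re noise floor")
print(f"4096-sample window: tone bin {long_db:5.1f} dB re noise floor")
```

Under these assumptions the tone bin's average advantage over the noise floor grows by about 3 dB for every doubling of the window length, which is why a short-window display can leave the tone down among the noise peaks while a long analysis at the right frequency pulls it clear.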
in article dg9ph1$469$1@avnika.corp.mot.com, Chip Wood at
chip.wood@motorola.com wrote on 09/14/2005 14:18:

> BTW, I would NEVER say that resonances determine pitch, you misread me. The fundamental frequency and its harmonics are generated at the source, the resonances of the cavity shape these harmonics into the final spectra.
but i don't think we are misreading you when you are saying that the fundamental frequency (which, if it's there, can be determined by autocorrelation-like algorithms) solely defines the pitch (by use of the log2() function if pitch is measured in octaves). if that is what you're saying, i sorta agree but only in two cases:

1. input is not quasi-periodic. so no fundamental can be determined. then the perceived pitch may be very difficult for a silicon based "machine" (since you don't grant that we are machines) to be determined, if not impossible at the moment. and people may very well disagree as to what the pitch is or even if it has a pitch.

2. input is quasi-periodic, but there are "octave problems" which happen when all of the lower odd harmonics are so weak that the human hears it as one octave higher than it really is mathematically.

other than that, i have a bit of experience in this also (with musical instruments and the sung human voice) and i can tell you that it is a very safe bet that

    pitch, in octaves, relative to pitch at fr = log2(f0/fr)

where f0 is the fundamental frequency of the quasi-periodic signal that has a decent amount of energy in at least some of the lower odd harmonics and fr is the frequency of the reference pitch (like A440) or, for another example (MIDI):

    pitch (measured as MIDI note number) = 12*log2(f0/fr)

where f0 is the fundamental frequency of the note and fr = 8.1758 Hz, so that middle C (261.625 Hz) comes out to be MIDI note number 60.

you can disagree, but you'd be wrong.
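As a minimal sketch of the two steps mentioned above (finding f0 with an autocorrelation-like search, then mapping it through log2), the following estimates the fundamental of a synthetic three-harmonic tone and converts it to octaves and to a MIDI note number. Only the two log2 formulas and the 8.1758 Hz reference come from the post; the test signal, the sample rate, the search band, and the function names are assumptions for illustration, and this is not presented as anyone's production pitch detector.

```python
import math

def autocorr_f0(x, fs, fmin, fmax):
    """Estimate f0 (Hz) as fs / lag, where lag maximizes the autocorrelation
    of x over the lags that correspond to the search band [fmin, fmax]."""
    lag_lo = int(fs / fmax)
    lag_hi = int(fs / fmin)
    best_lag, best_r = lag_lo, float("-inf")
    for lag in range(lag_lo, lag_hi + 1):
        overlap = len(x) - lag
        # Mean lagged product, so longer lags are not penalized.
        r = sum(x[i] * x[i + lag] for i in range(overlap)) / overlap
        if r > best_r:
            best_lag, best_r = lag, r
    return fs / best_lag

def octaves_re(f0, fr=440.0):
    """Pitch in octaves relative to the reference frequency fr."""
    return math.log2(f0 / fr)

def midi_note_number(f0, fr=8.1758):
    """Pitch as a (fractional) MIDI note number, per the formula in the post."""
    return 12.0 * math.log2(f0 / fr)

# Hypothetical test signal: 200 Hz fundamental plus two weaker harmonics.
fs = 8000.0
f_true = 200.0
harmonics = ((1, 1.0), (2, 0.5), (3, 0.25))   # (harmonic number, amplitude)
x = [sum(a * math.sin(2.0 * math.pi * k * f_true * i / fs) for k, a in harmonics)
     for i in range(800)]

f0 = autocorr_f0(x, fs, fmin=150.0, fmax=1000.0)
print(f"estimated f0 = {f0:.1f} Hz")
print(f"octaves re A440 = {octaves_re(f0):.3f}")
print(f"MIDI note number = {midi_note_number(f0):.2f}")
print(f"middle C check = {midi_note_number(261.625):.2f}")  # close to 60
```

The search band here is deliberately narrow enough that only one period of the test signal fits inside it. A periodic signal's autocorrelation also peaks at every multiple of the period, so a wider band needs extra logic to avoid picking a subharmonic.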
> Also, BTW, not to throw my credentials around, but I have a PH.D in Speech Science,
so which is it? are you throwing them around or not?
> taught Speech Science and Musical Acoustics at the University level, and have over 40 years of experience in speech, acoustics, and DSP. Getting something like this wrong at this point in my career rarely happens.
in article dg9oa4$427$1@avnika.corp.mot.com, Chip Wood at chip.wood@motorola.com wrote on 09/14/2005 13:57:
> Not to disagree, but I will.
so which is it? are you disagreeing or not? if so, with what in particular?
> When we psychoacousticians refer to "pitch", it is ONLY measurable by asking a human. True, the human response is close to the even tempered scale, when played on a well tuned piano. Which can be tuned by looking at the readout of a machine or by an expert's human ear. I prefer the ear tuned piano myself.
>
> I will also disagree that a human is a machine. When a machine is programmed or designed correctly and running with no faults, the outcome is 100% predictable for repeated identical inputs.
naw, you can toss in random number generators and fuzzy logic into the programming so kill the exact repeatability.
> The human rarely makes 100% exactly the same response EVER to similar inputs.
if you were to replicate the exact human clone and provide that exact human with exactly the same stimulus, i am not so sure of that. of course that is not in the cards at the present.
> If the black box is not predictable then it is not a machine.
not true at all. not at all.
> BTW, I was trying to be funny about the dissertations. Having written one myself, I know how worthless many of them are.
just not our own. :-) -- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
duh.  i seemed to have missed 2 very important words: "does not".

in article BF4E12B6.A647%rbj@audioimagination.com, robert bristow-johnson at
rbj@audioimagination.com wrote on 09/14/2005 17:47:

> in article dg9ph1$469$1@avnika.corp.mot.com, Chip Wood at chip.wood@motorola.com wrote on 09/14/2005 14:18:
>
>> BTW, I would NEVER say that resonances determine pitch, you misread me. The fundamental frequency and its harmonics are generated at the source, the resonances of the cavity shape these harmonics into the final spectra.
>
> but i don't think we are misreading you when you are saying that the fundamental frequency (which, if it's there, can be determined by autocorrelation-like algorithms) DOES NOT solely define the pitch (by use of the log2() function if pitch is measured in octaves). if that is what you're saying, i sorta agree but only in two cases:
-- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
Jerry Avins wrote:
> Pitch is subjective.
Even perceptual pitch can be considered objective if you consider it, not just as a single number, but as a statistical distribution, given a large enough sample set of reported observations & observers under sufficiently controlled conditions. You might even be able to model the distribution as the signal-to-noise ratio of the human ear-brain system. IMHO. YMMV. -- rhn A.T nicholson d.O.t C-o-M
Chip Wood wrote:
> Not to disagree, but I will. When we psychoacousticians refer to "pitch", it is ONLY measurable by asking a human. True, the human response is close to the even tempered scale, when played on a well tuned piano. Which can be tuned by looking at the readout of a machine or by an expert's human ear. I prefer the ear tuned piano myself.
You contradict yourself. If you could only resolve pitch by roughly equal tempered scale intervals, then the exact tuning and intonation of a piano, often done to hundredths of a semitone (cents), would not make a difference to you. But, in terms of relative pitch (how often does one care about some isolated tone?), humans, with a bit of training, can detect the difference between equal temperament and just intonation, a much finer pitch resolution than a scale interval. IMHO. YMMV. -- rhn A.T nicholson d.O.t C-o-M
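To put rough numbers on the gap between equal temperament and just intonation, the sketch below expresses a few intervals in cents (hundredths of an equal-tempered semitone) and takes the difference. The just ratios 5/4, 3/2, and 6/5 are the usual textbook values rather than anything stated in the post, and the names `cents` and `et_minus_just_cents` are made up for this example.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)

# Assumed just-intonation ratios and their equal-tempered sizes in semitones.
JUST_RATIO = {"minor third": 6 / 5, "major third": 5 / 4, "perfect fifth": 3 / 2}
ET_SEMITONES = {"minor third": 3, "major third": 4, "perfect fifth": 7}

def et_minus_just_cents(name):
    """How many cents the equal-tempered interval is wider than the just one."""
    return 100.0 * ET_SEMITONES[name] - cents(JUST_RATIO[name])

for name in JUST_RATIO:
    print(f"{name:13s} just = {cents(JUST_RATIO[name]):7.2f} cents, "
          f"ET - just = {et_minus_just_cents(name):+6.2f} cents")
```

With these ratios the thirds differ from their equal-tempered versions by roughly 14 to 16 cents and the fifth by about 2 cents, all far smaller than a 100-cent scale step.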
Chip Wood wrote:
> I will also disagree that a human is a machine. When a machine is programmed or designed correctly and running with no faults, the outcome is 100% predictable for repeated identical inputs. The human rarely makes 100% exactly the same response EVER to similar inputs. If the black box is not predictable then it is not a machine.
All machines also have error rates and reliability bounds, and are not 100% predictable. You might think some little DSP card is 100% predictable, but just try to get it certified without modification for a long term space mission, heart pacemaker, enterprise/banking database application, and find out how few "9"'s it's considered to have. IMHO. YMMV. -- rhn A.T. nicholson d.O.t C-o-M
rhnlogic@yahoo.com wrote:
> Jerry Avins wrote:
>
>> Pitch is subjective.
>
> Even perceptual pitch can be considered objective if you consider it, not just as a single number, but as a statistical distribution, given a large enough sample set of reported observations & observers under sufficiently controlled conditions.
>
> You might even be able to model the distribution as the signal-to-noise ratio of the human ear-brain system.
>
> IMHO. YMMV.
Aptly put. However it's modeled, pitch falls into the purview of psychology, not engineering. Both are science, but they are not the same.
Jerry
--
Engineering is the art of making what you want from things you can get.
Rune Allnor wrote:

   ...

> I don't know the English terminology of musical instruments. I am thinking about the wood chip in the mouthpiece of the clarinet.
Here, that's called a "reed", even the ones made of plastic. "Lip" means either edge, margin, or that part of one's face used for kissing and emptying spoons.
>>> It is turbulence of the air blown into the (modern) flute mouthpiece that resonates in the flute.
>>
>> How does a flute's excitation differ from an organ pipe's?
>
> It doesn't, as far as I know. I haven't seen an organ pipe up close, though.
The exciter of an organ pipe is a recorder or pennywhistle. http://www.pennywhistle.com/ Turbulence is avoided as much as possible.
>>> What determines the pitch in all these instruments is the air that resonates inside some cavity.
>>
>> The mouth can be part of that cavity.
>
> Do you have examples? Except for the kazoo?
Baroque trumpet, as I already mentioned. To a smaller extent, most brasses. A flugelhorn is easier to pull that way than a tuba.
...
Jerry
--
Engineering is the art of making what you want from things you can get.
robert bristow-johnson wrote:
"naw, you can toss in random number generators and fuzzy logic into the
programming so kill the exact repeatability."


Fuzzy logic, in itself, is completely deterministic.


-Will Dwinnell
http://will.dwinnell.com

in article 1126745948.837810.18970@z14g2000cwz.googlegroups.com, Predictor
at predictr@bellatlantic.net wrote on 09/14/2005 20:59:

> robert bristow-johnson wrote:
>> naw, you can toss in random number generators and fuzzy logic into the programming so kill the exact repeatability.
>
> Fuzzy logic, in itself, is completely deterministic.
i thought there was an RNG in there and the thresholds required to determine if a state went one way or the other were dependent, to some degree, on such an RNG. if that is not part of the canonical "fuzzy logic", then i retract such usage and will just say that a machine with internal programming designed with some RNG that meaningfully affects its output is not 100% predictable for repeated identical inputs. same technology, different terminology. the end point is the same. -- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
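The distinction drawn in this exchange can be sketched in a few lines: a fuzzy membership evaluation is a pure function of its inputs, while a decision whose threshold is dithered by a random source need not repeat for identical inputs. Everything here (the triangular membership shape, the min-based fuzzy AND, the dither range, and all names) is a hypothetical illustration, not a description of any particular fuzzy-logic system or of anyone's actual design.

```python
import random

def tri(x, lo, peak, hi):
    """Triangular fuzzy membership of x, a value in [0, 1]."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= peak:
        return (x - lo) / (peak - lo)
    return (hi - x) / (hi - peak)

def fuzzy_and(x, y):
    """Fuzzy AND (min) of two memberships: no randomness anywhere."""
    return min(tri(x, 15.0, 25.0, 35.0), tri(y, 40.0, 70.0, 100.0))

def dithered_decisions(x, threshold, n, rng):
    """n hard decisions on the same input x, each with random threshold dither."""
    return [x > threshold + rng.uniform(-0.5, 0.5) for _ in range(n)]

# Deterministic: identical inputs give identical outputs, every time.
a = fuzzy_and(22.0, 60.0)
b = fuzzy_and(22.0, 60.0)
print(a == b)

# Not repeatable: the dither comes from the operating system's entropy source.
rng = random.SystemRandom()
run1 = dithered_decisions(0.0, 0.0, 64, rng)
run2 = dithered_decisions(0.0, 0.0, 64, rng)
print(run1 == run2)  # almost certainly False for an input right at the threshold
```

Swapping `random.SystemRandom()` for a seeded `random.Random(seed)` makes the second part repeatable again, which is one way to see that the unpredictability comes from the random source and not from the decision logic.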