Rune Allnor wrote:
> If it is "obvious" to an expert, true or self-proclaimed, on speech processing that the pitch is encoded in the time-domain autocorrelation function, it is not at all so for the generalist or somebody who has other fields of interest.

Yesterday I checked out some of my books.

It turns out that the model for speech includes a source comprising an impulse train and a formant filter. If I am right in the impression that the period between pulses determines the pitch of the speech, I agree that the pitch of the speech ought to be possible to extract from the time-domain autocorrelation function.

This would be a consequence of the regular pulse pattern of the speech signal, which, as far as I can tell, is all but unique to speech signals.

Rune
Pitch Estimation using Autocorrelation
Started by ●September 7, 2005
Reply by ●September 11, 2005
Reply by ●September 12, 2005
> It turns out that the model for speech includes a source comprising an impulse train and a formant filter. If I am right in the impression that the period between pulses determines the pitch of the speech, I agree that the pitch of the speech ought to be possible to extract from the time-domain autocorrelation function.
>
> This would be a consequence of the regular pulse pattern of the speech signal, that as far as I can tell is all but unique to speech signals.

This is true for voiced speech but not for unvoiced speech, which is characterized by noiselike excitation caused by airflow through a narrow constriction.
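The source-filter picture discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything posted in the thread: the one-pole filter is a crude stand-in for a real formant filter, and the names `synth_voiced`, `synth_unvoiced`, `estimate_period`, the lag range, and the `voicing_threshold` value are all hypothetical choices. At an assumed 8 kHz sampling rate, a period of 80 samples corresponds to a 100 Hz fundamental.

```python
import random


def autocorrelation(x, max_lag):
    """Unnormalized time-domain autocorrelation r[0..max_lag] of x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag, voicing_threshold=0.4):
    """Return the lag (in samples) of the strongest autocorrelation peak in
    [min_lag, max_lag], or None if the frame is silent or the peak is too
    weak relative to r[0] (treated here as unvoiced)."""
    r = autocorrelation(x, max_lag)
    if r[0] <= 0.0:
        return None  # silent frame: nothing to measure
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    if r[best_lag] / r[0] < voicing_threshold:
        return None  # no strong periodicity: assume unvoiced
    return best_lag


def synth_voiced(n, period, pole=0.5):
    """Impulse train (one pulse every `period` samples) driving a one-pole
    filter, a toy substitute for the formant filter."""
    y, state = [], 0.0
    for i in range(n):
        excitation = 1.0 if i % period == 0 else 0.0
        state = excitation + pole * state
        y.append(state)
    return y


def synth_unvoiced(n, pole=0.5, seed=0):
    """White noise driving the same filter: the noiselike excitation case."""
    rng = random.Random(seed)
    y, state = [], 0.0
    for _ in range(n):
        state = rng.gauss(0.0, 1.0) + pole * state
        y.append(state)
    return y


if __name__ == "__main__":
    print(estimate_period(synth_voiced(800, period=80), 20, 200))  # 80
    print(estimate_period(synth_unvoiced(800), 20, 200))
```

For the voiced frame the strongest peak sits at the pulse spacing. For the noise-excited frame no lag in the search range is expected to stand out, so the threshold test should normally reject it; the exact threshold is a tuning choice, not a fixed rule.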
Reply by ●September 13, 2005
Glad you got out your books on speech. Now get them out for musical acoustics.

The exact same paradigm exists for all acoustical instruments, whether voice, violin, trumpet, drum, or piano: a source that is either periodic (usually rich in harmonics; which ones depends on the instrument) or aperiodic (usually whitish noise) drives a cavity resonance. It is not unique to voice; it is the fundamental principle of all sound created by wo/man.

Chip Wood

"Rune Allnor" <allnor@tele.ntnu.no> wrote in message news:1126424159.755462.120720@f14g2000cwb.googlegroups.com...
> Yesterday I checked out some of my books.
>
> It turns out that the model for speech includes a source comprising an impulse train and a formant filter. If I am right in the impression that the period between pulses determines the pitch of the speech, I agree that the pitch of the speech ought to be possible to extract from the time-domain autocorrelation function.
>
> This would be a consequence of the regular pulse pattern of the speech signal, that as far as I can tell is all but unique to speech signals.
>
> Rune
Reply by ●September 13, 2005
I understand that many people outside of the speech/audio processing area get easily confused when a discussion on pitch starts. While that is certainly understandable for ordinary folks and novices, it completely amazes me that some people in academia spend their entire successful careers (in terms of number of published papers) and retire as distinguished professors while still being totally confused about the subject.

And I am not even talking about more complex and less intuitive matters like time-domain vs. short-term vs. frequency-domain analysis techniques and time-frequency resolution vs. the uncertainty principle as applied to signal processing, etc.

For starters I can suggest automatically substituting "pitch" with "fundamental frequency" or, better yet, "fundamental period" (or "glottal period", if you want), wherever you see a discussion related to speech processing. This will greatly reduce the amount of confusion.
Reply by ●September 13, 2005
fizteh89 wrote:
> For starters I can suggest automatically substituting "pitch" with "fundamental frequency" or, better yet, "fundamental period" (or "glottal period", if you want), wherever you see a discussion related to speech processing.
> This will greatly reduce the amount of confusion.

I am not sure that "pitch" and fundamental frequency or period should be considered identical. I prefer to use the term pitch in reference to music or sound perception. But there can be frequency components in an audio signal which are not normally perceived as pitch (masked frequency bands, sub-harmonics, beating, etc.), and vice versa (making a tune by playing back sound samples of a car crash at different sample rates).

IMHO. YMMV.

--
rhn A.T nicholson D.o.T c-O-m
Reply by ●September 13, 2005
fizteh89 wrote:
> I understand that many people outside of speech/audio processing area get easily confused when a discussion on pitch starts.
> While it is certainly understandable for ordinary folks and novices, it completely amazes me that some people in academia spend their entire successful careers (in terms of number of published papers) and retire as distinguished professors while still being totally confused about the subject.
>
> And I am not even talking about more complex and less intuitive matters like time-domain vs. short-term vs. frequency-domain analysis techniques and time-frequency resolution vs. uncertainty principle as applied to signal processing, etc. etc.
>
> For starters I can suggest automatically substituting "pitch" with "fundamental frequency" or, better yet, "fundamental period" (or "glottal period", if you want), wherever you see a discussion related to speech processing.
> This will greatly reduce the amount of confusion.

Einstein reminded us that phenomena should be described as simply as possible, but not more simply than that. It is possible to claim that a falling tree makes no sound if there is no listener, but that complicates descriptions of events. Likewise, if one equates pitch to fundamental frequency, then one must invent "perceived pitch" which is not the same thing. R.B-J.'s example of a second harmonic at 0 dB and a fundamental at -60 is an example of perceived pitch -- simpler to just call it pitch -- being higher than the fundamental; synthetic bass is one where the pitch is lower.

Jerry
--
Engineering is the art of making what you want from things you can get.
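The "synthetic bass" case can be illustrated with a small sketch, offered as an assumption-laden toy rather than a model of hearing. The test signal below contains only harmonics 2, 3, and 4 of a fundamental whose period is 80 samples, so the fundamental component itself is absent; the helper names and the lag range are hypothetical. The strongest autocorrelation peak still falls at the common period of the harmonics, which is longer than the period of the lowest component actually present (40 samples).

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized time-domain autocorrelation r[0..max_lag] of x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def strongest_lag(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] with the largest autocorrelation value."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


def harmonic_signal(n, period, harmonics):
    """Sum of unit-amplitude cosines at the given harmonic numbers of a
    fundamental with the given period (in samples)."""
    return [sum(math.cos(2.0 * math.pi * k * i / period) for k in harmonics)
            for i in range(n)]


PERIOD = 80  # period of the absent fundamental, in samples (assumed value)
x = harmonic_signal(800, PERIOD, harmonics=(2, 3, 4))
lag = strongest_lag(x, 20, 200)
print(lag)  # 80: the common period, not the 40-sample period of harmonic 2
```

An autocorrelation peak picker therefore reports the period of the waveform as a whole. Whether that number agrees with what a listener calls the pitch is exactly the question the posters are arguing about.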
Reply by ●September 13, 2005
Einstein reminded us of many things... One of my favorite quotes is: "A practical profession is a salvation for a man of my type; an academic career compels a young man to scientific production, and only strong characters can resist the temptation of superficial analysis."

You are mixing subjective, psychoacoustic definitions with objective, computer-measurable ones.

Human hearing is a rather peculiar instrument and can be easily fooled, no doubt. (Have you heard about Huggins pitch or Fourcin pitch?) But does it have much to do with tracking the exact period of a constantly changing voice signal in a low-bit-rate vocoder or a pitch-synchronous front-end feature extractor for a speech-recognition application?
Reply by ●September 13, 2005
fizteh89 wrote:
> Einstein reminded us of many things... One of my favorite quotes is:
> "A practical profession is a salvation for a man of my type; an academic career compels a young man to scientific production, and only strong characters can resist the temptation of superficial analysis."
>
> You are mixing subjective, psychoacoustic, definitions with objective, computer-measurable ones.

I'm not mixing them, but disentangling them. Either we have pitch and frequency as synonyms (so requiring perceived pitch as a distinction) or we assign objective, computer-measurable attributes to frequency and subjective and psychoacoustic attributes to pitch, which many people do anyway.

> Human hearing is a rather peculiar instrument and can be easily fooled, no doubt.
> (Have you heard about Huggins pitch or Fourcin pitch?)

I assume that effects that are manifest only binaurally are outside what I think this discussion is about.

> But does it have much to do with tracking exact period of a constantly changing voice signal in a low-bit-rate vocoder or a pitch-synchronous front-end feature extractor for a speech-recognition application?

I don't know. I imagine that what a human can't hear is not important for reproducing speech, but it might be important for distinguishing speakers.

Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by ●September 13, 2005
Hey guys, it is not rocket science.

"Pitch" and "loudness" are terms of perception that are measurable only by asking a human. "Fundamental frequency" (or its inverse, "fundamental period") and "Sound Pressure Level (SPL)" are physical entities measurable by machines. There is much correlation (basically a log function) between each pair, BUT many other elements also factor into their relationships. As I said before, many a Ph.D. dissertation has been written and many full professorships have been attained studying these other relationships.

"Heavy" and "weight" are another pair. "Hot"/"cold" vs. "temperature". "Color" vs. "spectra". This is all Perception 101! And if you want to work or even dabble in the world of humans' response to physical events and don't know S.S. Stevens, you should! He invented psychophysics back in the 1930s.

The good engineer and scientist makes these leaps back and forth between perception and physical almost unconsciously and may occasionally use the wrong term (as I myself have been guilty of) in front of naive listeners, but they are distinctly different terms with very different meanings.

The tree in the forest w/o a human to hear? It had no loudness, but high SPL.

My wife gave me a t-shirt that reads: "If a man makes a statement in the forest and there is no woman to hear, is he still wrong?"

Want to start another discussion? What is sound "quality"?

--
Chip Wood

"Jerry Avins" <jya@ieee.org> wrote in message news:qP-dnUN44fkIgLreRVn-2w@rcn.net...
> fizteh89 wrote:
> > I understand that many people outside of speech/audio processing area get easily confused when a discussion on pitch starts.
> > While it is certainly understandable for ordinary folks and novices, it completely amazes me that some people in academia spend their entire successful careers (in terms of number of published papers) and retire as distinguished professors while still being totally confused about the subject.
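The machine-measurable side of these pairs can be sketched with the standard logarithmic definitions. This only computes the physical quantities; it makes no claim about perceived loudness or pitch, which depend on the other factors mentioned above. The function names are hypothetical, the 20 micropascal reference is the conventional one for SPL in air, and the semitone scale is used simply as a familiar logarithmic frequency scale.

```python
import math

P_REF = 20e-6  # conventional reference pressure for SPL in air, in pascals


def spl_db(p_rms):
    """Sound pressure level in dB re 20 uPa for an RMS pressure in pascals."""
    return 20.0 * math.log10(p_rms / P_REF)


def semitones_between(f, f_ref):
    """Distance from f_ref to f on a logarithmic (equal-tempered) scale."""
    return 12.0 * math.log2(f / f_ref)


print(round(spl_db(1.0), 2))             # 93.98
print(semitones_between(880.0, 440.0))   # 12.0
```

Doubling the frequency always adds 12 semitones and a tenfold increase in pressure always adds 20 dB, which is the "basically a log function" part of the relationship.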