DSPRelated.com
Forums

Pitch detection - for a newbie

Started by EagerToLearn January 5, 2007
Ron N. wrote:

> You might want to add to your reading some books on audio > perception and the physics of musical instruments. Pitch > is different from spectra is different from pure frequency, > especially related to what humans hear. You might also > want to look into some of the literature on voice recognition, > as humans may also infer pitch from transient information > when the spectra is ambiguous, as it often is.
Which brings me to the interesting question for the OP: why, oh why, would you want to go to so much trouble to teach your grandkids to play simple melodies by ear? Get them to *real music classes* --- have them learn to *read* music and play seriously!! :-)

Carlos
On Sat, 06 Jan 2007 14:36:02 -0500, Carlos Moreno
<moreno_at_mochima_dot_com@mailinator.com> wrote:

>Ron N. wrote: > >> You might want to add to your reading some books on audio >> perception and the physics of musical instruments. Pitch >> is different from spectra is different from pure frequency, >> especially related to what humans hear. You might also >> want to look into some of the literature on voice recognition, >> as humans may also infer pitch from transient information >> when the spectra is ambiguous, as it often is. > >Which brings me to the interesting question for the OP: > >why, oh-why, would you want to go over so much trouble to >teach your grandkids to play simple melodies by ear????? >Get them to *real music classes* --- have them learn to >*read* music and play seriously!! > >:-) > >Carlos
Warning: the following is sort of off-topic, so proceed at your own risk!

REAL and SERIOUS are relative depending on your outlook. You can roughly categorize music into two types: 1) Art or Popular Music and 2) Folk or Traditional Music.

Art music is typically studied like any other discipline. It is typically performed by a trained musician, and the audience are typically passive listeners. Folk or traditional music is typically played by untrained musicians, and the audience typically participates, either in song or dance.

Art music is typically learned in schools and doesn't evolve or change. It's written down and the tune is the tune is the tune. Folk music, on the other hand, is passed down from generation to generation. It's an aural tradition. The tune tends to change and evolve with each passing generation. I don't think you can say one is any more REAL or SERIOUS than the other. They are two different animals.

Ear training, the ability to distinguish and identify pitch, is equally important to both types of music. We teach kids their colors, their numbers and their alphabet very early; however, there is no emphasis on teaching them to recognize and identify the notes of the scale. My intention is to provide my grandsons, at an early age, with an ability that will serve them well no matter what type of music they are interested in as they grow up. At some point in the future I hope they will take formal music education. Hopefully they will learn to read music and will be serious about music.

Keep in mind that what music does to the heart and soul is NOT dependent on knowing how to read music or play SERIOUSLY. Some of the most emotional music I have heard has come from people who never knew nothing about music theory, music education or notes on paper; hell, a lot of them never even knew how to read, much less read music.
EagerToLearn wrote:
..
>>why, oh-why, would you want to go over so much trouble to >>teach your grandkids to play simple melodies by ear????? >>Get them to *real music classes* --- have them learn to >>*read* music and play seriously!!
.
> Warning: the following is sort of off-topic, so proceed at your own > risk! > > REAL and SERIOUS are relative depending on your outlook. You can > roughly categorize music into two types 1) Art or Popular Music and 2) > Folk or Traditional Music. ..
..
> > Keep in mind that what music does to the heart and soul is NOT > dependant on knowing how to read music or play SERIOUSLY. Some of the > most emotional music I have heard has come from people who never knew > nothing about music theory, music education or notes on paper, hell a > lot of them never even knew how to read no less read music.
I understood "real" was used to signify "taught by carbon-based lifeforms", rather than by silicon-based ones. Good teaching is a factor of the teacher, much more than of the style. All the same principles apply - one is limited by what one can't do. One must put in the time and focus, period. All scale-based styles require a sense of pitching and intonation (NOT necessarily equal temperament!), rhythm and chords - the same building blocks. And being able to read music does not disqualify one from playing music that does not use notation; it simply enables you to play from sheet music (and to write it) should the need arise (which it might). And people are themselves different in how they learn best - some are visual, others audile, so the one may benefit from teaching via written materials, the other from oral transmission.

And we must beware of hidden moral imperatives - the fact that many musicians achieve greatness without being able to read music does not mean (a) they achieved it ~because~ they didn't read music, or (b) we should arbitrarily discourage others from learning to.

Richard Dobson

fizteh89 wrote:


>>The most common method for the pitch computation is by normalized >>autocorrelation in the time domain. Reason: it is dead simple and good >>enough for most of applications. > > > It is NOT good enough, and everybody knows it...
It is good enough and it is widely used. However, yes, multiples of the pitch period are a problem unless precautions are taken.
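For readers following along, here is a minimal sketch of time-domain normalized autocorrelation with one simple precaution against picking a multiple of the pitch period. It is not code from any poster in this thread: the function names, the `tolerance` parameter and the "shortest strong peak" rule are assumptions chosen for illustration, and real speech would need voicing decisions and smoothing on top.

```python
import math

def normalized_autocorr(frame, lag):
    """Normalized correlation between the frame and itself shifted by `lag` samples."""
    n = len(frame) - lag
    a = frame[:n]
    b = frame[lag:lag + n]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0

def pitch_period(frame, min_lag, max_lag, tolerance=0.9):
    """Estimate the pitch period in samples.

    A periodic signal also correlates well at 2x, 3x, ... the true period.
    Precaution (an assumed heuristic): instead of the global maximum, return
    the shortest lag that is a local peak and scores within `tolerance` of
    the best score found.
    """
    scores = {lag: normalized_autocorr(frame, lag)
              for lag in range(min_lag, max_lag + 1)}
    best = max(scores.values())
    for lag in range(min_lag + 1, max_lag):
        s = scores[lag]
        if s >= tolerance * best and s >= scores[lag - 1] and s >= scores[lag + 1]:
            return lag
    # Fall back to the global maximum if no interior peak qualifies.
    return max(scores, key=scores.get)
```

With a synthetic 200 Hz tone plus two harmonics sampled at 8 kHz, lags 40, 80, 120 and 160 all score close to 1.0, so a plain global maximum could land on any of them; the shortest-peak rule settles on 40 samples.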
> (Otherwise why would they invent all sorts of artificial nonlinear > tricks like center-clipping ?)
Come on, leave the center clipping to the grandpas Rabiner and Schafer. They didn't have anything better than a frequency counter, that's why.
> But again, it depends on your goals. > > >>For my tasks, I prefer frequency domain perceptually optimized methods. >> > > > Frequency domain pitch detection cannot be used for pitch tracking in > highly-nonstationary signals such as speech, and everybody knows it..
You should know that frequency domain pitch detection is indeed highly accurate, and it is one of the most popular methods (although it can be too expensive computationally). If the signal is nonstationary, then the notion of pitch does not make much sense.
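The post does not say which frequency domain method is meant. As one well-known example of the family, here is a rough sketch of the harmonic product spectrum; it is a swapped-in illustration, not the poster's perceptually optimized method. The names `magnitude_spectrum` and `hps_pitch_hz`, the Hann window and the default of three harmonics are all assumptions, and the naive O(N^2) DFT is used only to keep the example free of dependencies.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of a Hann-windowed frame (naive DFT)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc))
    return spectrum

def hps_pitch_hz(frame, sample_rate, num_harmonics=3, min_hz=50.0):
    """Harmonic product spectrum: pick the bin k that maximizes
    mag[k] * mag[2k] * ... * mag[num_harmonics * k]."""
    mag = magnitude_spectrum(frame)
    n = len(frame)
    first_bin = max(1, int(math.ceil(min_hz * n / sample_rate)))
    last_bin = (len(mag) - 1) // num_harmonics
    best_bin, best_val = first_bin, -1.0
    for k in range(first_bin, last_bin + 1):
        product = 1.0
        for h in range(1, num_harmonics + 1):
            product *= mag[k * h]
        if product > best_val:
            best_bin, best_val = k, product
    return best_bin * sample_rate / n
```

The resolution is only one bin (sample_rate / N), which hints at the trade-off argued over in the thread: a longer frame gives finer frequency resolution but assumes the pitch stays put for longer.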
> >>Modesty is the virtue of mediocrits, isn't it? > > > Well, in order to critisize and downplay other people's achievements > you'd better first show some real contribution of your own. Do you have > something to show, Vladimir ?
Yes. We can also compare the 1040s and some parts of the body. Any questions?

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
EagerToLearn wrote:

>>Get them to *real music classes* --- have them learn to >>*read* music and play seriously!! >> >>:-) > > Warning: the following is sort of off-topic, so proceed at your own > risk!
Ok, let's continue with the off-topic yet interesting discussion...

Notice above how I quoted my :-) ... In a sense, despite some touch of seriousness and of being convinced of what I was claiming, I tried to make it clear that there was a little bit of kidding in my comment --- that is, I'm no one to decide or try to dictate, even in the slightest way, what your grandkids do with their lives and what you do about guiding them.

There is maybe a bit of bitterness on my part in that, as a kid, I was kind of good with music in the informal sense (more or less what you say you'll try to do with your grandkids --- at least the way I understand it). But I never had the luck of having any formal training in music --- it was only as an adult (at age 25) that I decided to take it up; I did ok, all things considered, but it was nowhere near what I would have liked for my life, as it was too late for my brain to adapt to it (not that my brain was utterly unable to learn --- again, I think I did very well, given the circumstances).

Formal training in music is wonderful, and I believe (and have read) that it can greatly contribute to the mental development of a kid --- astonishingly enough, it greatly develops the ability to deal with math, and also the artistic and creative skills.

You are absolutely right that there are many aspects to music, and one should not dismiss some of the aspects just because there are other aspects that may appear "better" (for a suitable definition of "better")... But you know, it sounded like an intentionally missed opportunity when I initially read your post (again, with the touch of "kidding, more than seriously asking you to do this-or-that instead").

Either way, I wish you good luck with your project (both the technical part of pitch detection that you were asking about, and the "personal" part --- that is, I hope your grandkids appreciate what you are doing and develop a taste for music --- formal or informal).

Cheers,

Carlos

Jeff Caunter wrote:

> I wonder how close Pitch Detection is to the techniques I use for the > detection of tones in underwater sonar?
Jeff,

The problem with audio is that the signal is rather nonstationary, i.e. it varies from period to period, and only a few periods are available for making the decision about pitch. Also, this decision has to be perceptually optimal.
> Here, I perform short, 50% ovelapped FFTs, and determine detections based > on consistency (or trend) of phase-differences between FFTs, at each bin > (there are as many 'detectors' as there are 'bins of interest' within the > spectrum). Accurate frequency determination is made by observing the mean > phase differences as each bin progresses over time (similar to vocoder). > This results in a very sensitive and accurate tone detector. FFT sizes are > varied according to the duration of the tones i.e. 'pulses' are > accommodated.
And this is the problem. In speech processing, the typical duration of quasi-stationarity of the pitch is on the order of 10 ms, and the desired accuracy of the pitch period determination is about 50 µs.
> In a real sonar, several FFT sizes are employed, working in parallel, to cover a range of pulse durations as well as constant tones, but this may not be necessary in the case of music. I have never tried playing music into one of my sonar processors.
You can probably do it this way; however, a significant amount of post-processing will be required.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
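The phase-difference idea Jeff describes (50% overlapped FFTs, with the frequency refined from how the phase of a bin advances between frames) can be sketched roughly as follows. This is a guess at the general technique, in the style of a phase vocoder, and not Jeff's sonar code: `dft_bin`, `refine_frequency`, the Hann window and the single-bin DFT are all assumptions made to keep the example short.

```python
import cmath
import math

def dft_bin(frame, k):
    """Single DFT bin k of a Hann-windowed frame."""
    n = len(frame)
    return sum(
        x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
        * cmath.exp(-2j * math.pi * k * i / n)
        for i, x in enumerate(frame)
    )

def refine_frequency(signal, start, n, k, sample_rate):
    """Refine the frequency near bin k from the phase advance between two
    frames of length n that overlap by 50% (hop = n // 2)."""
    hop = n // 2
    first = dft_bin(signal[start:start + n], k)
    second = dft_bin(signal[start + hop:start + hop + n], k)
    measured = cmath.phase(second) - cmath.phase(first)
    # Phase advance a tone exactly at the bin centre would show over one hop.
    expected = 2.0 * math.pi * k * hop / n
    deviation = measured - expected
    # Wrap the deviation into [-pi, pi].
    deviation -= 2.0 * math.pi * round(deviation / (2.0 * math.pi))
    # Convert the leftover phase into a fractional bin offset, then to Hz.
    return (k + deviation * n / (2.0 * math.pi * hop)) * sample_rate / n
```

For a steady tone this gives far better accuracy than the bin spacing, but it needs two full frames of a stable signal, which is exactly what the roughly 10 ms of quasi-stationarity in speech makes hard to come by.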
Vladimir Vassilevsky wrote:
>Yes. We can also compare the 1040s and some parts of the body. Any questions?
US 20050035878 ? Didn't work out ? Well, next time do a better prior art search and buy a book "Patent it yourself" (I can give you my copy - an excellent book BTW)
fizteh89 wrote:
> > I am not particularly enthusiastic about your solution. > > Well, you should be if you do speech processing...
always the salesman, no, Dmitry?
> > The most common method for the pitch computation is by normalized > > autocorrelation in the time domain. Reason: it is dead simple and good > > enough for most of applications. > > It is NOT good enough, and everybody knows it...
depending on how autocorrelation is done, there are problems with attack transients and plosives that AMDF might not have to the same extent. also any pitch detection algorithm suffers, to some degree, the bane of the "octave ambiguity". you can have tones that *sound* like some frequency (like 262 Hz for middle-C), yet are mathematically more accurately described as a tone an octave (or more) lower (like 131 Hz). some of this is unsolvable analytically, but perceptual heuristics might be helpful.
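A small sketch of AMDF (average magnitude difference function) may make the octave ambiguity concrete. The names `amdf` and `amdf_period` are made up for illustration, and the 13100 Hz sample rate in the usage note is an assumption chosen only so that 262 Hz and 131 Hz have whole-sample periods (50 and 100 samples).

```python
import math

def amdf(frame, lag):
    """Average magnitude difference between the frame and itself shifted by `lag`."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def amdf_period(frame, min_lag, max_lag):
    """Return the lag (in samples) with the smallest AMDF value."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))

def tone_with_weak_subharmonic(num_samples, sample_rate=13100):
    """A tone dominated by 262 Hz with a component at 131 Hz that is 26 dB weaker."""
    return [math.sin(2 * math.pi * 262 * i / sample_rate)
            + 0.05 * math.sin(2 * math.pi * 131 * i / sample_rate)
            for i in range(num_samples)]
```

Calling `amdf_period(tone_with_weak_subharmonic(400), 25, 150)` returns 100 samples, i.e. 131 Hz: the waveform only repeats exactly every 100 samples, so the "mathematically correct" period is an octave below what the tone would most likely sound like. The dip at lag 50 is nearly as deep, and choosing between the two is where the perceptual heuristics come in.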
> (Otherwise why would they invent all sorts of artificial nonlinear > tricks like center-clipping ?) But again, it depends on your goals.
center-clipping is not always useful. center-clipping destroys information (since it is not an invertible function). the information lost could be used to differentiate one period length from another (usually this is revealed as the "octave problem" but not always).
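To make the "not invertible" point concrete, here is a minimal sketch of a common form of center clipping. The threshold rule (a fixed fraction of the frame's peak) and the name `center_clip` are assumptions; other variants exist, such as three-level clipping.

```python
def center_clip(frame, ratio=0.3):
    """Zero everything within +/- threshold and shift the rest toward zero.

    The threshold is `ratio` times the peak magnitude of the frame.
    """
    threshold = ratio * max(abs(x) for x in frame)
    out = []
    for x in frame:
        if x > threshold:
            out.append(x - threshold)
        elif x < -threshold:
            out.append(x + threshold)
        else:
            out.append(0.0)
    return out
```

The frames `[1.0, 0.2, -0.1, -1.0]` and `[1.0, -0.25, 0.15, -1.0]` differ in their low-level samples but clip to the same output, so whatever those samples said about the period is gone after clipping.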
> > > > For my tasks, I prefer frequency domain perceptually optimized methods. > > > > Frequency domain pitch detection cannot be used for pitch tracking in > highly-nonstationary signals such as speech, and everybody knows it.. > > > Modesty is the virtue of mediocrits, isn't it? > > Well, in order to critisize and downplay other people's achievements > you'd better first show some real contribution of your own. Do you have > something to show, Vladimir ?
i do. and probably so does Vladimir. r b-j
Your point about real vs. computer generated is a good one.  Oh how
lucky the person who could obtain an ear training teacher for their
daily practice sessions!!  For many people, myself included, ear
training is not trivial and takes many months or years of daily
practice to master.  

My experience of learning to play by ear involved a teacher in a group
fiddle class who would play a phrase repeatedly until all the members
of the group had figured out how to play it by trial and error.  He
would then move on to the next phrase and gradually add phrases until
he worked through the whole tune.  This process would take
about an hour for the group to 'get' the whole tune.  From my
observations, it would take students about a year of weekly
sessions like this to gain a fair degree of skill.

Not everyone has access to 'carbon' based instructors.  So why not
utilize a computer's capability to perform the same type of task and use
it daily?


On Sat, 06 Jan 2007 22:40:05 GMT, Richard Dobson
<richarddobson@blueyonder.co.uk> wrote:

>EagerToLearn wrote: >.. >>>why, oh-why, would you want to go over so much trouble to >>>teach your grandkids to play simple melodies by ear????? >>>Get them to *real music classes* --- have them learn to >>>*read* music and play seriously!! >. >> Warning: the following is sort of off-topic, so proceed at your own >> risk! >> >> REAL and SERIOUS are relative depending on your outlook. You can >> roughly categorize music into two types 1) Art or Popular Music and 2) >> Folk or Traditional Music. .. >.. >> >> Keep in mind that what music does to the heart and soul is NOT >> dependant on knowing how to read music or play SERIOUSLY. Some of the >> most emotional music I have heard has come from people who never knew >> nothing about music theory, music education or notes on paper, hell a >> lot of them never even knew how to read no less read music. > >I understood "real" was used to signify "taught by carbon-based >lifeforms", rather than by silicon-based ones. Good teaching is a factor >of the teacher, much more than of the style. All the same principles >apply - one is limited by what one can't do. One must put in the time >and focus, period. All scale-based styles require a sense of pitching >and intonation (NOT necessarily equal-termperament!), rhythm and chords >- the same building-blocks. And being able to read music does not >disqualify one from playing music that does not use notation, it simply >enables you to play from sheet music (and to write it) should the need >arise (which it might). And people are themselves different in how they >learn best - some are visual, other audile, so the one may benefit from >teaching via written materials, the other by oral transmission. > >And we must beware of hidden moral imperatives - the fact that many >muscians achieve greatness without being able to read music does not >mean (a) they achieved it ~because~ they didn't read music, or (b) we >should arbitrarily discorage others from learning to. > > > > >Richard Dobson
Well, so much for answering a newbie's question.  

I guess we're really not welcome here.  Seems like the infighting is
more important than sharing the knowledge.

I did get two references to books that I may or may not need.  Thank
you. I guess after I read the two books I already knew about, I'll be
able to decide whether the references were of any value.






On 7 Jan 2007 21:16:17 -0800, "robert bristow-johnson"
<rbj@audioimagination.com> wrote:

> >fizteh89 wrote: >> > I am not particularly enthusiastic about your solution. >> >> Well, you should be if you do speech processing... > >always the salesman, no, Dmitry? > >> > The most common method for the pitch computation is by normalized >> > autocorrelation in the time domain. Reason: it is dead simple and good >> > enough for most of applications. >> >> It is NOT good enough, and everybody knows it... > >depending on how autocorrelation is done, there are problems with >attack transients and plosives that AMDF might not have to the same >extent. also any pitch detection algorithm suffers, to some degree, >the bane of the "octave ambiguity". you can have tones that *sound* >like some frequency (like 262 Hz for middle-C), yet are mathematically >more accurately described as a tone an octave (or more) lower (like 131 >Hz). some of this is unsolvable analytically, but perceptual >heuristics might be helpful. > >> (Otherwise why would they invent all sorts of artificial nonlinear >> tricks like center-clipping ?) But again, it depends on your goals. > >center-clipping is not always useful. center-clipping destroys >information (since it is not an invertable function). the information >lost could be used to differentiate one period length from another >(usually this is revealed as the "octave problem" but not always). > >> > >> > For my tasks, I prefer frequency domain perceptually optimized methods. >> > >> >> Frequency domain pitch detection cannot be used for pitch tracking in >> highly-nonstationary signals such as speech, and everybody knows it.. >> >> > Modesty is the virtue of mediocrits, isn't it? >> >> Well, in order to critisize and downplay other people's achievements >> you'd better first show some real contribution of your own. Do you have >> something to show, Vladimir ? > >i do. and probably so does Vladimir. > >r b-j