Reply by Vladimir Malakhov ● January 15, 2007
> I can spend time with the boys each week and provide them some sort of
> music experience including ear training but it would be great if they
> had grandpa's special computer that did special things and they could
> play with it even when he wasn't there.
There are many methods for fundamental frequency detection, both amplitude
based and frequency based. But the main problem lies in the human ear: we
hear something different from pure amplitude or pure frequency.
Many psychologists around the world are researching this now.
However, in many real situations the existing methods work well, and
many people have already built software applications on
them. I recommend searching for "WAV to MIDI"
freeware or shareware products. There are many such products,
at least more than 10.
Sorry for my poor English.
Reply by ● January 13, 2007
If the csound code mentioned earlier is not a good fit for you, I
suggest you check out the Snack library or Praat. They both contain C
code for pitch estimation, if I understand correctly. I think the
Snack code is originally from ESPS, so you may also want to check out
the ESPS code to see if it's more convenient to use. You can find
Snack and ESPS, along with a nice speech visualizer named WaveSurfer
which is built on Snack, at http://www.speech.kth.se/software/. You
can find Praat at http://www.fon.hum.uva.nl/praat/. All these links
are in the speech software listings at
Reply by Rick Lyons ● January 13, 2007
On Tue, 09 Jan 2007 10:33:19 GMT, Richard Dobson
>Music is a lot like sex - you could rely on books and machines to
>"learn" all sort of things, but when suddenly presented with a real
>person of the relevant gender for the first time, all that "knowledge"
>will likely evaporate in seconds. On the other hand, if you learn first
>from humans, ~then~ supportive learning resources of all kinds may
>indeed be valuable.
>What about a local choir? Church/school/community/whatever. The best
>start any child can get. But it is important to find someone (not
>necessarily a "professional") who has some idea how to teach.
>Of course this is all rather OT for comp.dsp, who really want to talk
>dirty about the algorithm, and who's got the best one!
Your post is not too OT.
Sex and DSP are related in the sense that the
MATLAB code that I write keeps doing to
me what I was unsuccessful in doing to
my first girlfriend in high school.
> it probably is. the middle initial "A." for the inventor is probably
> for "Andy". i never knew this guy's first name was "Harold". also the
> "Auburn Audio Technologies" must be the original parent corp to Antares
> (hadn't previously heard of that either).
Yeah, I was wondering for quite some time how the ATR-1/Auto Tune ticks and
I started clicking about the patent when I read some of the 'company
history' on the Antares website.
> he patented that??? prior art exists, at least using this same thing
> for AMDF.
I agree about the ACF computation trick, but I've never seen the E(L) >= 2H(L)
criterion before, and I have read a huge number of papers and patents (including
from you and the IVL gang). It's a pretty amazing idea (instead of looking
for Max/Minima in the ACF directly...).
> it's not one MAC?
Well, one MAC to accumulate the new X(n)*X(n+L) and another one to subtract
the old one from the delayline, right?
> this is similar to producing these cross-product terms and running them
> into a moving sum (or moving average) filter. the old terms fall off
> the edge of the delay line and you subtract them out of the sum and add
> in the new term that pops into the delay line. there has to be a
> separate delay line for each lag's cross-product. this is effectively
> rectangular windowing the data in the sum.
Yes, I actually used that 'rectangular window trick' before to detect energy
peaks for transient detection.
> the problem of rectangularly windowing, with the discontinuity, applied
> to the autocorrelation sum is worse than if it is applied to the AMDF
> sum. if your period has an integer multiple that is slightly longer
> than the length of the summation (that delay line), you could have an
> autocorrelation peak that is at a slightly larger lag than where the
> period really is (and choose that peak location as your period). even
> for highly periodic input. you would not get that using AMDF for
> periodic input.
Good to know. It's also interesting that Hildebrand sums up the double
period for the E(L) function. I'm sure there must be a good reason for that.
Of course also for H(L) one could accumulate more than one period...
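To make the relation concrete, here is a minimal Python sketch. It assumes E(L) is the energy summed over 2L samples and H(L) is the sum of X(n)*X(n+L) over L samples. Those definitions are inferred from this discussion, not quoted from the patent, and the tolerance `eps` and all function names are hypothetical. Under these assumptions E(L) - 2H(L) equals the sum of squared differences between two adjacent L-sample segments, so it is never negative and reaches zero only when the segments match, which may be one reason to sum E(L) over the double period.

```python
import math

def energy(x, L):
    """E(L): energy of the first 2*L samples (assumed definition)."""
    return sum(v * v for v in x[:2 * L])

def half_corr(x, L):
    """H(L): sum of x(i) * x(i+L) over the first L samples (assumed definition)."""
    return sum(x[i] * x[i + L] for i in range(L))

def mismatch(x, L):
    """E(L) - 2*H(L), which equals sum over i < L of (x(i) - x(i+L))**2."""
    return energy(x, L) - 2.0 * half_corr(x, L)

def is_period(x, L, eps=1e-3):
    """Accept lag L when the mismatch is a small fraction of the energy.

    eps is a hypothetical tolerance chosen for this illustration.
    """
    e = energy(x, L)
    return e > 0.0 and mismatch(x, L) <= eps * e

# Test tone with a period of exactly 25 samples.
x = [math.sin(2 * math.pi * i / 25) for i in range(100)]
accepted = [L for L in range(10, 51) if is_period(x, L)]
```

With this tone, `accepted` contains the lag 25 and also 50, so a real tracker would still need a rule for preferring the shortest acceptable lag.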
BTW: That Pitch Tracker must be amazingly good if that pitch shifter of his
is really a WSSNONANK (Wavelength Synchronized Splicing, No Overlap, NO Add,
No kidding :-).
If he really does no overlap/X-fade at all for the splicing, his pitch
tracker must be REALLY good. I fed even polyphonic signals into Auto Tune
and didn't hear any artifacts.
Reply by robert bristow-johnson ● January 10, 2007
> Having read many publications about pitch tracking I pretty much fell from
> my chair when I read US patent No. 5,973,252. If I'm not mistaken, this
> patent describes the Auto Tune algorithm by Antares Audio Tech.
it probably is. the middle initial "A." for the inventor is probably
for "Andy". i never knew this guy's first name was "Harold". also the
"Auburn Audio Technologies" must be the original parent corp to Antares
(hadn't previously heard of that either).
> This paper is definitely worth reading! The Pitch Tracking Method here is
> basically ACF but with a couple of very interesting specialties.
> 1. The ACF is computed with a nice trick, requires only two MACs per lag
> (per Sample)
it's not one MAC?
> 2. For every lag the accumulated energy is subtracted from the ACF
he patented that??? prior art exists, at least using this same thing
for AMDF.
this is similar to producing these cross-product terms and running them
into a moving sum (or moving average) filter. the old terms fall off
the edge of the delay line and you subtract them out of the sum and add
in the new term that pops into the delay line. there has to be a
separate delay line for each lag's cross-product. this is effectively
rectangular windowing the data in the sum.
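The moving-sum idea described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the class name `RunningAutocorr` and its parameters are hypothetical, and each lag keeps its own delay line of cross-products so that the oldest term can be subtracted when a new one arrives.

```python
from collections import deque

class RunningAutocorr:
    """Rectangular-window autocorrelation, updated once per input sample."""

    def __init__(self, max_lag, window):
        self.max_lag = max_lag
        # Most recent max_lag + 1 input samples, newest last.
        self.history = deque([0.0] * (max_lag + 1), maxlen=max_lag + 1)
        # One delay line of cross-products per lag, zero-filled at start.
        self.products = [deque([0.0] * window, maxlen=window)
                         for _ in range(max_lag + 1)]
        self.sums = [0.0] * (max_lag + 1)

    def push(self, sample):
        """Feed one sample; return the current sums for lags 0..max_lag."""
        self.history.append(sample)
        for lag in range(self.max_lag + 1):
            new = sample * self.history[-1 - lag]  # x(n) * x(n - lag)
            old = self.products[lag][0]            # term about to fall off the edge
            self.products[lag].append(new)         # maxlen discards `old`
            self.sums[lag] += new - old            # add the new term, subtract the old
        return self.sums
```

Storing the products costs memory for every lag. Recomputing the old product from stored samples instead would trade that memory for a second multiply, which may be where the count of two MACs comes from. With floating-point input the running sums can also drift slowly, so a long-running version would probably want to recompute them from scratch now and then.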
the problem of rectangularly windowing, with the discontinuity, applied
to the autocorrelation sum is worse than if it is applied to the AMDF
sum. if your period has an integer multiple that is slightly longer
than the length of the summation (that delay line), you could have an
autocorrelation peak that is at a slightly larger lag than where the
period really is (and choose that peak location as your period). even
for highly periodic input. you would not get that using AMDF for
periodic input.
> Any thoughts on this one?
them's are mine.
> (true achievements are those that bring more benefits than recognition)
unless your name is George W. Bush.
Reply by robert bristow-johnson ● January 10, 2007
> Vladimir Vassilevsky wrote:
> > 1. What is new in your method compared to AMDF (which was used in the
> > LPC10 vocoder designed in 80x) ?
> Problems with elementary math ?
you don't see the hole you're digging for yourself? some people don't
need others to like them (it's not me, but i can sorta understand
that), but most people *do* desire others' respect. you're not getting
it that way, Dmitry.
> Here is "periodicity histogram":
> hist(k)=sum H(r - |x(i) - x(i+k)|) (where H is Heaviside, or unit
> step, function)
a function of |x(i) - x(i+k)| .
> Here is AMDF function:
> amdf(k)=sum |x(i) - x(i+k)|
another function of |x(i) - x(i+k)| .
and as long as we're not being too picky about the limits on the
> and here is autocorrelation:
> corr(k) = sum x(i) * x(i+k)
corr(k) = sum x(i) * x(i+k)
        = 1/2 sum |x(i)|^2 + 1/2 sum |x(i+k)|^2 - 1/2 sum |x(i) - x(i+k)|^2
still another function of |x(i) - x(i+k)| .
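Expanding |x(i) - x(i+k)|^2 term by term gives the exact form of this relation, in which the energy of the lagged segment appears next to the energy of the unlagged one. The two energy sums coincide only when the summation limits are treated loosely. A small Python check follows; the function names and the test signal are made up for illustration.

```python
def corr(x, k, n):
    """corr(k) = sum of x(i) * x(i+k) for i = 0 .. n-1."""
    return sum(x[i] * x[i + k] for i in range(n))

def corr_from_differences(x, k, n):
    """Same quantity, rebuilt from two energies and the squared difference."""
    energy_here = sum(x[i] ** 2 for i in range(n))
    energy_lagged = sum(x[i + k] ** 2 for i in range(n))
    squared_diff = sum((x[i] - x[i + k]) ** 2 for i in range(n))
    return 0.5 * energy_here + 0.5 * energy_lagged - 0.5 * squared_diff

# Arbitrary integer-valued signal (hypothetical data, exact in floating point).
x = [float((7 * i) % 11 - 5) for i in range(64)]
gaps = [abs(corr(x, k, 32) - corr_from_differences(x, k, 32)) for k in range(20)]
```

Every entry of `gaps` should be zero up to rounding, whatever the lag.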
> Why don't you start calling AMDF function an autocorrelation and vice
> versa, cause they sorta look the same to you ?
dunno what Vladimir is saying about it, but there is a common theme to
all of these algorithms: find out how similar x(i) is to x(i+k) for a
given value (lag) of k. for values of k where x(i) and x(i+k) are very
similar waveforms, |x(i) - x(i+k)| is low, amdf(k) is low, corr(k) is
high, and hist(k) is high.
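A short Python sketch of that common theme, using the three definitions quoted above on a made-up test tone. The signal, the window length `N`, the lag range and the threshold `R` are all assumptions chosen for illustration, and H(0) is taken as 1.

```python
import math

def amdf(x, k, n):
    """amdf(k) = sum of |x(i) - x(i+k)| for i = 0 .. n-1."""
    return sum(abs(x[i] - x[i + k]) for i in range(n))

def corr(x, k, n):
    """corr(k) = sum of x(i) * x(i+k) for i = 0 .. n-1."""
    return sum(x[i] * x[i + k] for i in range(n))

def hist(x, k, n, r):
    """hist(k) = sum of H(r - |x(i) - x(i+k)|), taking H(0) = 1."""
    return sum(1 for i in range(n) if abs(x[i] - x[i + k]) <= r)

PERIOD = 40   # samples per period of the test tone
N = 80        # summation length: two full periods
R = 0.05      # histogram threshold (hypothetical value)

# Fundamental plus a second harmonic, so the period is exactly 40 samples.
x = [math.sin(2 * math.pi * i / PERIOD) + 0.5 * math.sin(4 * math.pi * i / PERIOD)
     for i in range(160)]

lags = range(20, 61)
best_amdf = min(lags, key=lambda k: amdf(x, k, N))
best_corr = max(lags, key=lambda k: corr(x, k, N))
best_hist = max(lags, key=lambda k: hist(x, k, N, R))
```

On this clean tone all three pick the same lag, 40. The lag range deliberately stops short of 80, where each function would score just as well.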
> And BTW, you CANNOT reduce "periodicity histogram" to AMDF. Period.
your "periodicity histogram", hist(k), is a function of this difference
signal |x(i) - x(i+k)| just as amdf(k) or corr(k) is. it's true, you
cannot "reduce" hist(k) to amdf(k) because hist(k) is applying a
non-linear function to |x(i) - x(i+k)| before summing (this non-linear
function, H(.) also destroys information because it is not one-to-one
or invertible and also requires a threshold parameter, r, that somehow
has to be meaningfully determined). but the principles of all three
are the same: postulate a lag, k, and see how similar the lagged
waveform is to the original by subtracting the lagged waveform from the
original. negative errors count the same as positive errors. add up
the errors (or some function of the errors) and the lag that gives you
the least sum of error (as reflected through such function before
summing) is a good guess for the period since |x(i) - x(i+k)| would be
small if k is a period. all three algs must still worry about the
"octave problem" since a lag of 2 periods is expected to be as good as
a lag of 1 period and might be mathematically better (perhaps because
of inaudible sub-harmonics so the "2 periods" are really the "true"
period), even though the waveform still *sounds* like it should be just
the 1 period. that's where different pitch detection algorithms have
their salient properties or features.
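The octave problem is easy to reproduce. In the Python sketch below, the test tone, the size of the sub-harmonic (0.001 of the fundamental) and the summation length are hypothetical choices for illustration; AMDF stands in for any of the three functions.

```python
import math

def amdf(x, k, n):
    """amdf(k) = sum of |x(i) - x(i+k)| for i = 0 .. n-1."""
    return sum(abs(x[i] - x[i + k]) for i in range(n))

P = 50    # samples per period of the audible tone
N = 200   # summation length

def tone(i, sub):
    """Sine of period P plus a sub-harmonic of period 2*P scaled by `sub`."""
    return math.sin(2 * math.pi * i / P) + sub * math.sin(math.pi * i / P)

clean = [tone(i, 0.0) for i in range(400)]
tainted = [tone(i, 0.001) for i in range(400)]

clean_one, clean_two = amdf(clean, P, N), amdf(clean, 2 * P, N)
tainted_one, tainted_two = amdf(tainted, P, N), amdf(tainted, 2 * P, N)
```

For the clean tone both lags give an error of essentially zero, so nothing in the math prefers one period over two. For the tainted tone the error at lag 2*P is still essentially zero while the error at lag P is not, so a naive "least error" rule would report the lower octave even though the sub-harmonic is far too weak to hear.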
Reply by JoergW ● January 10, 2007
I'd like to bring this discussion back to the subject (if I may):
Having read many publications about pitch tracking I pretty much fell from
my chair when I read US patent No. 5,973,252. If I'm not mistaken, this
patent describes the Auto Tune algorithm by Antares Audio Tech.
This paper is definitely worth reading! The Pitch Tracking Method here is
basically ACF but with a couple of very interesting specialties.
1. The ACF is computed with a nice trick, requires only two MACs per lag
(per Sample)
2. For every lag the accumulated energy is subtracted from the ACF
I haven't tried this method but I'd imagine that this must work much better
(more reliable) than any other time-domain solution.
Any thoughts on this one?
(true achievements are those that bring more benefits than recognition)
"robert bristow-johnson" <email@example.com> wrote in message
> fizteh89 wrote:
> > > I am not particularly enthusiastic about your solution.
> > Well, you should be if you do speech processing...
> always the salesman, no, Dmitry?
> > > The most common method for the pitch computation is by normalized
> > > autocorrelation in the time domain. Reason: it is dead simple and good
> > > enough for most of applications.
> > It is NOT good enough, and everybody knows it...
> depending on how autocorrelation is done, there are problems with
> attack transients and plosives that AMDF might not have to the same
> extent. also any pitch detection algorithm suffers, to some degree,
> the bane of the "octave ambiguity". you can have tones that *sound*
> like some frequency (like 262 Hz for middle-C), yet are mathematically
> more accurately described as a tone an octave (or more) lower (like 131
> Hz). some of this is unsolvable analytically, but perceptual
> heuristics might be helpful.
> > (Otherwise why would they invent all sorts of artificial nonlinear
> > tricks like center-clipping ?) But again, it depends on your goals.
> center-clipping is not always useful. center-clipping destroys
> information (since it is not an invertible function). the information
> lost could be used to differentiate one period length from another
> (usually this is revealed as the "octave problem" but not always).
> > >
> > > For my tasks, I prefer frequency domain perceptually optimized
> > >
> > Frequency domain pitch detection cannot be used for pitch tracking in
> > highly-nonstationary signals such as speech, and everybody knows it..
> > > Modesty is the virtue of mediocrities, isn't it?
> > Well, in order to criticize and downplay other people's achievements
> > you'd better first show some real contribution of your own. Do you have
> > something to show, Vladimir ?
> i do. and probably so does Vladimir.
> r b-j
Reply by fizteh89 ● January 10, 2007
Vladimir Vassilevsky wrote:
> 1. What is new in your method compared to AMDF (which was used in the
> LPC10 vocoder designed in 80x) ?
Problems with elementary math ?
Here is "periodicity histogram":
hist(k)=sum H(r - |x(i) - x(i+k)|) (where H is Heaviside, or unit
step, function)
Here is AMDF function:
amdf(k)=sum |x(i) - x(i+k)|
and here is autocorrelation:
corr(k) = sum x(i) * x(i+k)
Why don't you start calling AMDF function an autocorrelation and vice
versa, cause they sorta look the same to you ?
And BTW, you CANNOT reduce "periodicity histogram" to AMDF. Period.
Reply by EagerToLearn ● January 10, 2007
Sorry, I just noticed that this topic had moved.
I too share some of your regrets about my early childhood music
experience. Until my teen age years the only music we had in our
house was a 78 RPM player with a half dozen 78s from the war years. My
sister and I would sit around the player and play a song titled
"Praise the Lord and Pass the Ammunition".
When you look at some of the most successful musicians you typically
see a long history of music in their early years. I regret not having
that myself and have taken it as my special project to provide the
boys ( 2 1/2 yrs and 6 mo) with good early music experiences. The
oldest and I spent a year at a program called Wiggleworms at the Old
Town School of Folk Music in Chicago, which is an early childhood music
program that meets weekly. We had a wonderful time and I wouldn't trade
the relationship we've developed during that period for anything.
I am just starting the same with the younger one. I hope our
experience will be as good as with the older one. When the time comes,
I will certainly lobby with their parents, for formal music education
for both of them. I am even considering Suzuki at 4 or 5. I am aware
of the cognitive benefits of early music and that is part of my
motivation. They both listen to Mozart (certainly not by their own
I wonder if 'play seriously' isn't an oxymoron. :-) I agree that
formal music education is wonderful. However, I believe that music
has to be first and foremost PLAYED, not studied or learned or
performed. It has to touch your heart and soul and only then can you
be a real musician. Thus my preference for folk and traditional music
(people music) over art and popular music (commercial music). As I
get older it seems harder and harder to preserve the traditional. I
guess I have a little bitterness too when I see traditional music
being replaced by commercial music.
As an adult I learned to play fiddle (I'm careful not to say violin)
from a teacher who taught strictly by ear. After playing self-taught
guitar for 35+ years, the experience gave me a whole new perspective
on music. My sense of timing is better, my sense of harmony is
better, I can sing better and given sufficient time, I can learn to
play just about any fiddle tune I hear. I don't need a teacher to
learn a new fiddle tune.
My teacher had an uncanny sense of just how much of a phrase to
present at a time. He was also able to identify that point in time
when most of the people in the group class 'got' the melody and he could
move on to the next phrase. Using him as a model, I think it would be
a fun challenge to try to computerize that process. (That of course is
what I'm good at - the computer part not the music part)
I can spend time with the boys each week and provide them some sort of
music experience including ear training but it would be great if they
had grandpa's special computer that did special things and they could
play with it even when he wasn't there.
On Sat, 06 Jan 2007 18:12:20 -0500, Carlos Moreno
>>>Get them to *real music classes* --- have them learn to
>>>*read* music and play seriously!!
>> Warning: the following is sort of off-topic, so proceed at your own
>Ok, let's continue with the off-topic yet interesting discussion...
>Notice above, how I quoted my :-) ... In a sense, despite some
>touch of seriousness and of being convinced of what I was claiming,
>I tried to make it clear that there was a little bit of kidding in
>my comment --- that is, I'm no one to decide or try to dictate even
>in the slightest way, what your grand kids do with their lives and
>what you do about guiding them.
>There is maybe a bit of bitterness from my part in that, as a kid,
>I was kind of good with music in the informal sense (more or less
>what you say you'll try to do with your grand kids --- at least the
>way I understand it). But I never had the luck of having any formal
>training in music --- it was only as an adult (at age 25) that I
>decided to take up on it; I did ok, all things considered, but it
>was nowhere nearly enough I would have liked for my life, as it was
>too late for my brain to adapt to it (not that my brain was utterly
>unable to learn --- again, I think I did very good, given the
>Formal training in music is wonderful, and I believe (and read) that
>it can greatly contribute to the mental development of a kid ---
>astonishingly enough, it greatly develops the ability to deal with
>math, and also the artistic and creative skills.
>You are absolutely right that there are many aspects to music, and
>one should not dismiss some of the aspects just because there are
>other aspects that may appear "better" (for a suitable definition
>of "better")... But you know, it sounded like an intentionally
>missed opportunity, when I initially read your post (again, with
>the touch of "kidding, more than seriously asking you to do
>Either way, I wish you good luck with your project (both the
>technical part of pitch detection that you were asking about, and
>the "personal" part --- that is, hope your grand kids appreciate
>what you are doing and develop a taste for music --- formal or