comp.dsp | finding fundamental frequency (pitch)

Hi,

We're working on a project dealing with South Indian
music signals.
We're currently stuck on finding the fundamental
frequency of the signal.
What we've done so far:
1) Segmented the signal and stored the segments in an array.
2) Computed the DFT of each segment and stored it.
3) Computed the cross-correlation between adjacent elements
of the DFT values and stored it.
4) Taken the maximum value of the cross-correlated
values.

How do we get the fundamental frequency from this
value?
We're total newbies to DSP, so please help us!
Thanks
Aishwarya

Reply by mnentwig ● November 3, 2007

I have an example program on my webpage that might get you started.
However, be aware that audio pitch detection is a science of its own. 

http://www.elisanet.fi/mnentwig/webroot/FFT_peaksearch_audio_example/index.html

To give one example, the ear may be tricked into hearing a fundamental
that isn't there: If I filter the fundamental away from a piano note,
chances are that my ear will still hear it as the original note. 
There are many older threads on this topic.
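
A rough sketch of the missing-fundamental case, under made-up assumptions (a 44 kHz sample rate, a "note" built only from harmonics 2, 3 and 4 of 220 Hz, and hypothetical helper names). There is no energy at 220 Hz, yet the waveform still repeats every 1/220 s, which a simple lag comparison shows:

```python
import math

FS = 44000   # assumed sample rate, chosen so 1/220 s is exactly 200 samples
F0 = 220.0   # the "missing" fundamental

def missing_fundamental(n_samples):
    """Sum of harmonics 2, 3 and 4 of F0; no component at F0 itself."""
    return [sum(math.sin(2 * math.pi * k * F0 * n / FS) for k in (2, 3, 4))
            for n in range(n_samples)]

def mean_abs_diff(x, lag, length):
    """Average |x[n] - x[n + lag]| over `length` samples."""
    return sum(abs(x[n] - x[n + lag]) for n in range(length)) / length

x = missing_fundamental(1000)
period_220 = round(FS / F0)         # 200 samples
period_440 = round(FS / (2 * F0))   # 100 samples

d220 = mean_abs_diff(x, period_220, 400)   # essentially zero: the signal repeats
d440 = mean_abs_diff(x, period_440, 400)   # clearly non-zero: the 660 Hz partial flips sign
print(d220, d440)
```

A spectral peak search on this signal would only ever report 440, 660 or 880 Hz, while the period says 220 Hz.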

Cheers

Markus

Reply by robert bristow-johnson ● November 3, 2007

On Nov 3, 11:53 am, "mnentwig" <mnent...@elisanet.fi> wrote:
> I have an example program on my webpage that might get you started.
> However, be aware that audio pitch detection is a science of its own.
>
> http://www.elisanet.fi/mnentwig/webroot/FFT_peaksearch_audio_example/...
>
> To give one example, the ear may be tricked into hearing a fundamental
> that isn't there: If I filter the fundamental away from a piano note,
> chances are that my ear will still hear it as the original note.
> There are many older threads on this topic.
>

i'm still of the opinion that the old AMDF (Average Magnitude
Difference Function) or a variant (like ASDF with a window and perhaps
a filter on the difference signal) is the method that makes the fewest
assumptions.  it only assumes some notion of periodicity and looks for
the best period, given some error cost weighting applied to the
difference signal (absolute value and squared are but two possible
choices).  you look for minimums in that and try to wisely choose (and
stick with) the right minimum.  that takes a little "expert systems"
or AIish thinking in the alg.
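
A minimal sketch of the AMDF idea described above, not anyone's production algorithm. The sample rate, the test tone, the lag range and the "shortest local minimum that is nearly as deep as the global one" rule are all assumptions made for illustration; the last one stands in, very crudely, for the "wisely choose" step.

```python
import math

FS = 44000  # assumed sample rate; 220 Hz is then exactly 200 samples per period

def harmonic_tone(f0, n_samples):
    """Test tone: fundamental plus two weaker harmonics (made-up amplitudes)."""
    return [sum(a * math.sin(2 * math.pi * k * f0 * n / FS)
                for k, a in ((1, 1.0), (2, 0.5), (3, 0.25)))
            for n in range(n_samples)]

def amdf(x, min_lag, max_lag, window):
    """Average magnitude difference for each trial lag in [min_lag, max_lag]."""
    return {lag: sum(abs(x[n] - x[n + lag]) for n in range(window)) / window
            for lag in range(min_lag, max_lag + 1)}

def pick_period(d, tolerance=0.1):
    """Shortest lag that is a local minimum and nearly as deep as the global one."""
    lowest, highest = min(d.values()), max(d.values())
    threshold = lowest + tolerance * (highest - lowest)
    for lag in sorted(d)[1:-1]:
        if d[lag] <= threshold and d[lag] <= d[lag - 1] and d[lag] <= d[lag + 1]:
            return lag
    return min(d, key=d.get)  # fall back to the global minimum

x = harmonic_tone(220.0, 1200)
d = amdf(x, 40, 450, 600)      # trial periods from 40 to 450 samples
period = pick_period(d)
print(period, FS / period)
```

The lag range here also covers 400 samples (110 Hz), which is an equally deep minimum; preferring the shortest deep minimum is what keeps this toy example from reporting the octave below.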

r b-j

Reply by Vladimir Vassilevsky ● November 3, 2007

cyberaishu wrote:

> HI,
> 
> We're working on a project dealing with South Indian
> music signals.
> We're currently stuck on finding the fundamental
> frequency of the signal.

1. What is your definition of "fundamental frequency" ? What exactly are 
you looking for?

2. Are you sure there is such fundamental frequency in your signal? 
There very well could be none.

3. Dmitry Teres claims that he invented the ultimate pitch detector. 
Search the archives of this newsgroup.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Reply by robert bristow-johnson ● November 3, 2007

On Nov 3, 5:19 pm, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
wrote:
> cyberaishu wrote:
> > HI,
>
> > We're working on a project dealing with South Indian
> > music signals.
> > We're currently stuck on finding the fundamental
> > frequency of the signal.
>
> 1. What is your definition of "fundamental frequency" ? What exactly are
> you looking for?
>
> 2. Are you sure there is such fundamental frequency in your signal?
> There very well could be none.
>
> 3. Dmitry Teres claims that he invented the ultimate pitch detector.
> Search the archives of this newsgroup.

i read Dmitry's paper when he first announced it to this group,
and, not counting his alternative method that made some use of SVD
(singular value decomposition), which he did not describe in sufficient
detail for me to understand, his published method is a sorta weird
twist of AMDF, with some non-invertible non-linear operation (a step
function) applied to intermediate data and using a histogram as an
additional method of summing errors (or goodness of fit) for the
different trial periods.  it's still a method that compares
(subtracts) the assumed quasi-periodic signal to a delayed copy of
itself for some number of samples.  these delays correspond to
various trial periods, and the method then decides which trial period
is the best pick (the one that results in the smallest difference function).

i couldn't get his MATLAB program to work for me (and am still willing
to, if i can get it to work on Octave, i don't have a current
implementation of MATLAB), so i dunno how well it works and will not
repeat what i've heard about that (i want to judge for myself).  but,
from what i read in his paper, it's another form of the AMDF (with a
significantly souped-up means of adding up the score), even though
Dmitry had not agreed with me about that assessment of the algorithm.

r b-j

Reply by Vladimir Vassilevsky ● November 3, 2007

robert bristow-johnson wrote:

>>3. Dmitry Teres claims that he invented the ultimate pitch detector.
>>Search the archives of this newsgroup.
> 
> 
> i have read Dmitry's paper when he first announced it to this group,
> and, not counting his alternative method that made some use of SVD
> (singular value decomposition) which he did not describe in sufficient
> detail for me to understand, his published method is a sorta weird
> twist of AMDF, with some non-invertible non-linear operation (a step
> function) applied to intermediate data and using a histogram as an
> additional method of summing errors (or goodness of fit) for the
> different trial periods.  it's still a method that compares
> (subtracts) the assumed quasi-periodic signal to a delayed copy of
> itself for some number of samples.  these delays corresponding to
> various trial periods and then deciding on which trial period is the
> best pick (that results in the smallest difference function).

I read through his patent and got the same impression. What I didn't 
understand is if, how, why and when his method is supposed to be 
superior to the well-known approaches and how big the advantage is.

IMO the optimal way of pitch detection depends on the application; 
there can't be a universal approach. If the goal is the best perceived 
quality (speech coding, speed/pitch change, etc.) then the best solution 
is a closed-loop search near the possible candidates. And the 
candidates can be sorted out by either method; there is not much of a 
difference.

> i couldn't get his MATLAB program to work for me (and am still willing
> to, if i can get it to work on Octave, i don't have a current
> implementation of MATLAB),

Fie. Matlab is for stupidents; real men do their 2+2=4 without it.

> so i dunno how well it works and will not
> repeat what i've heard about that (i want to judge for myself).   but,
> from what i read in his paper, it's another form of the AMDF (with a
> significantly souped-up means of adding up the score), even though
> Dmitry had not agreed with me about that assessment of the algorithm.

I think your assessment is right. Fortunately, Dmitry didn't fall into 
fractal wavelet fuzzy genetic neural crap pseudoscience...

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Reply by robert bristow-johnson ● November 3, 2007

On Nov 3, 7:35 pm, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
wrote:
> robert bristow-johnson wrote:
> >>3. Dmitry Teres claims that he invented the ultimate pitch detector.
> >>Search the archives of this newsgroup.
>
> > i have read Dmitry's paper when he first announced it to this group,
> > and, not counting his alternative method that made some use of SVD
> > (singular value decomposition) which he did not describe in sufficient
> > detail for me to understand, his published method is a sorta weird
> > twist of AMDF, with some non-invertible non-linear operation (a step
> > function) applied to intermediate data and using a histogram as an
> > additional method of summing errors (or goodness of fit) for the
> > different trial periods.  it's still a method that compares
> > (subtracts) the assumed quasi-periodic signal to a delayed copy of
> > itself for some number of samples.  these delays corresponding to
> > various trial periods and then deciding on which trial period is the
> > best pick (that results in the smallest difference function).
>
> I read through his patent and got the same impression.

i think his paper (ICASSP or similar conference) is on his site.  if
not, maybe i can find my copy of it laying around and send it to you.

> What I didn't
> understand is if, how, why and when his method is supposed to be
> superior to the well known approaches and how big is the advantage.

Dmitry really kicked into "salesman mode" with all the confidence and
bluster associated with it, which made it hard for me to take his alg
as seriously as he does.

> IMO the optimal way of the pitch detection depends on the application;
> there can't be the universal approach.

in the case where we're detecting the audible pitch of harmonic or
quasi-periodic notes, which is the case for monophonic sounds coming
from a very large class of pitched musical instruments, i think there
*can* be a universal approach, that can get better and better, as we
work out most of these known issues (most notably, transient problems
and the "octave problem"), and if we can tolerate some latency in the
pitch detection.

in the case where this quasi-periodic nature is not a given (some bells,
transients from note attacks, and percussive sounds that can be
described as short and sorta staccato bursts of filtered noise), then
it's likely that a completely different algorithm (not based on the
delay-difference signal), perhaps some kinda peak-picking in the
windowed spectra, may have to be used.  in that sense i agree with
you.
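
A small sketch of what "peak-picking in the windowed spectra" could look like, assuming a Hann window, a naive DFT (to stay dependency-free) and parabolic interpolation on the log magnitude. All function names and the 1010 Hz test tone are made up for illustration. Note that this reports the strongest partial, which is not necessarily the fundamental.

```python
import cmath
import math

def hann(n_points):
    """Periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points) for n in range(n_points)]

def magnitude_spectrum(x):
    """Naive O(N^2) DFT magnitude for bins 0..N/2 (fine for a short demo frame)."""
    n_points = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                    for n in range(n_points)))
            for k in range(n_points // 2 + 1)]

def peak_frequency(x, fs):
    """Frequency of the largest spectral peak, refined by parabolic interpolation."""
    window = hann(len(x))
    mag = magnitude_spectrum([xi * wi for xi, wi in zip(x, window)])
    k = max(range(1, len(mag) - 1), key=lambda i: mag[i])
    a, b, c = (math.log(mag[i]) for i in (k - 1, k, k + 1))
    offset = 0.5 * (a - c) / (a - 2 * b + c)   # vertex of the parabola through 3 points
    return (k + offset) * fs / len(x)

fs = 8000
frame = [math.sin(2 * math.pi * 1010 * n / fs) for n in range(256)]
estimate = peak_frequency(frame, fs)
print(estimate)   # close to 1010 Hz although the bin spacing is 31.25 Hz
```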

> If the goal is the best perceived
> quality (speech coding, speed/pitch change, etc.) then the best solution
> is the closed loop search near the possible candidates. And the
> candidates can be sorted out by either method; there is not much of a
> difference.

well, not much, if the candidates all come from examining the
difference signal.  maybe a little.

but the candidate picking is the big deal.  that's still like alchemy,
very AI-ish.  that's still where the patents and trade-secrets lie.
that's where some pitch-detection algs sound better than other pitch-
detection algs.  what do you do when no candidate looks very good
(during transients or other times the input is not sufficiently quasi-
periodic)?

or the "octave problem" (lotsa different candidates all look about
equally good)?  for the case when a 440 Hz tone has a very small
amplitude (like down by 70 dB) 220 Hz tone (or some other sub-harmonic)
added to it: is it A440 (or midi note 69) or A220 (midi note 57)?  how
would we hear such a pitch?  at what threshold do you stop ignoring
the sub-harmonic?
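
For scale, the numbers in that example can be worked out with the usual equal-tempered MIDI mapping (A440 = note 69, 12 notes per octave) and the amplitude-dB definition. The function names are just for illustration:

```python
import math

def midi_note(freq_hz, a4_hz=440.0):
    """Equal-tempered MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    return 69 + 12 * math.log2(freq_hz / a4_hz)

def db_to_amplitude(db):
    """Amplitude ratio corresponding to a level in dB."""
    return 10 ** (db / 20)

print(midi_note(440.0), midi_note(220.0))   # 69.0 57.0
print(db_to_amplitude(-70))                 # about 0.000316 of the 440 Hz amplitude
```

So the two candidates are exactly one octave (12 notes) apart, and the sub-harmonic in question is roughly 1/3000 of the main tone's amplitude.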

if you hear a synthesizer oscillator hooked to a pitch detector having
such a problem, the pitch of the oscillator will jump up and down from
one possible harmonic to another, during a single note, and will sound
like dog excrement.  Tuvan throat singers can really kill a pitch-
detector with the octave problem, but other singers, singing some note
but starting out with their mouth cavity tuned to the 2nd or 3rd
harmonic and backing off from that can also cause it.

how do you get this mindless pitch-detection algorithm that looks at,
what initially appears to be a 440 Hz waveform, but ends up as a 220
Hz waveform (without glissando from one note to the other), to say at
the outset, that it was 220 Hz?  and if you bias the threshold to
choose 1/220 second as the period over the nearly equally good 1/440
second period candidate, what are you gonna do if this super small 220
Hz subharmonic gets weaker (fades to silence)?  your pitch-detector
will say it's A220 when the person listening to the note thinks it's
A440.  *that* is the octave problem.

> > i couldn't get his MATLAB program to work for me (and am still willing
> > to, if i can get it to work on Octave, i don't have a current
> > implementation of MATLAB),
>
> Fie. Matlab is for stupidents; real men do their 2+2=4 without it.

naw, MATLAB (or Octave) can be useful.  do you actually design filters
or look at FFT data or such with your own C code?  i used to do that
in the early nineties (i even wrote a few papers with graphics
generated with my own C code), but eventually got a little lazy using
MATLAB.  i recognized its usefulness to the extent that i was very
unhappy that The MathWorks (and the inventor of MATLAB, Cleve Moler)
could see no benefit to extending the language (in a backward
compatible manner) so that we could define the base or origin to the
indices of every dimension in an array.  they are very stubborn about
it, and i think foolishly so.  their resistance comes from arrogance,
obstinacy, and lack of vision (and NIH, the "not-invented-here"
syndrome), not because of a defensible technical reason.

personally i wish that i knew C++ a little better (or a decent OOP
language, which i've been told Smalltalk is s'posed to be).  a real
nice set of classes for representing matrices (and arrays), complex
numbers, matrices with complex elements, and such (along with methods
performing the operations that we find handy in MATLAB including
display functions), would be better and more portable to
implementations.  that is, code you write for concept development and
testing could slip right into a build of a real application or
embedded target.

> > so i dunno how well it works and will not
> > repeat what i've heard about that (i want to judge for myself).   but,
> > from what i read in his paper, it's another form of the AMDF (with a
> > significantly souped-up means of adding up the score), even though
> > Dmitry had not agreed with me about that assessment of the algorithm.
>
> I think your assessment is right. Fortunately, Dmitry didn't fall into
> fractal wavelet fuzzy genetic neural crap pseudoscience...

as far as i could tell, his histograms would hit peaks at the period
(and integer multiples of the period, so Dmitry's alg still has the
"octave problem") of a periodic or quasi-periodic function.

r b-j

Reply by Ron N. ● November 4, 2007

On Nov 3, 1:37 pm, robert bristow-johnson <r...@audioimagination.com>
wrote:
> On Nov 3, 11:53 am, "mnentwig" <mnent...@elisanet.fi> wrote:
> > To give one example, the ear may be tricked into hearing a fundamental
> > that isn't there: If I filter the fundamental away from a piano note,
> > chances are that my ear will still hear it as the original note.
> > There are many older threads on this topic.
>
> i'm still of the opinion that the old AMDF (Average Magnitude
> Difference Function) or a variant (like ASDF with a window and perhaps
> a filter on the difference signal) is the method that makes the fewest
> assumptions.

Pitch transcription is usually measured against what
a trained human musician would decide.  Is how a human
ear processes music more like an AMDF search, or more
like overlapping filter banks fed into some sort of
pattern matching process?

As for the octave problem, my guess is that the human
ear/brain doesn't really solve it.  It may guess based
on the transient preceding the sustained periodicity,
and go with that decision, even if slightly wrong.

IMHO. YMMV.
--
rhn A.T nicholson d.0.t C-o-M

Reply by Vladimir Vassilevsky ● November 4, 2007


robert bristow-johnson wrote:

>>>>3. Dmitry Teres claims that he invented the ultimate pitch detector.
>>
>>>i have read Dmitry's paper when he first announced it to this group,
>>>and, not counting his alternative method that made some use of SVD
>>>(singular value decomposition) which he did not describe in sufficient
>>>detail for me to understand, his published method is a sorta weird
>>>twist of AMDF

>>I read through his patent and got the same impression.
> 
> i think his paper (ICASSP or similar conference) is on his site.  if
> not, maybe i can find my copy of it laying around and send it to you.

I couldn't find the original paper. Can you please send it to me?

>>If the goal is the best perceived
>>quality (speech coding, speed/pitch change, etc.) then the best solution
>>is the closed loop search near the possible candidates. And the
>>candidates can be sorted out by either method; there is not much of a
>>difference.
> 
> 
> but the candidate picking is the big deal.  that's still like alchemy,
> very AI-ish.  that's still where the patents and trade-secrets lie.
> that's where some pitch-detection algs sound better than other pitch-
> detection algs.

I would start with a quantitative definition of what "better" 
means, i.e. what the goal is. The error in the time or frequency 
domain can be weighted against a psychoacoustic model; the best pitch 
value is the one which minimizes the error.

>  what do you do when no candidate looks very good
> (during transients or other times the input is not sufficiently quasi-
> periodic)?

Probably the model of the signal is oversimplified, so it doesn't fit 
reality. It is a known phenomenon that if the model doesn't match, 
then the maximum likelihood solution is unstable, since it jumps on 
random features.

> or the "octave problem" (lotsa different candidates all look about
> equally good)?

Pick the candidate which makes for the least weighted error.

>  for the case when a 440 Hz tone has a very small
> amplitude (like down by 70 dB) 220 Hz tone (or some other sub-harmonic)
> added to it: is it A440 (or midi note 69) or A220 (midi note 57)?  how
> would we hear such a pitch?  at what threshold do you stop ignoring
> the sub-harmonic?

I would start from the most likely candidate and its nearest neighbors. 
It is very unlikely that the far harmonics or subharmonics will produce 
the minimum weighted error.

> if you hear a synthesizer oscillator hooked to a pitch detector having
> such a problem, the pitch of the oscillator will jump up and down from
> one possible harmonic to another, during a single note, and will sound
> like dog excrement.  Tuvan throat singers can really kill a pitch-
> detector with the octave problem, but other singers, singing some note
> but starting out with their mouth cavity tuned to the 2nd or 3rd
> harmonic and backing off from that can also cause it.

The model of the signal should include the hidden Markov chain to 
predict the variations.
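
One possible reading of this suggestion, sketched under heavy assumptions: treat each frame's candidate periods as states, give every candidate a per-frame cost (e.g. its difference-function value), charge a penalty for jumping between periods, and let a Viterbi search pick the cheapest path. This is a plain dynamic-programming search over a hand-made Markov chain, not a trained hidden Markov model; the costs, the penalty and the function name are invented for illustration.

```python
import math

def viterbi_track(frames, jump_penalty=1.0):
    """frames: one list of (period, cost) candidates per frame.

    Returns the period sequence minimizing the sum of candidate costs plus
    jump_penalty * |log2(period ratio)| for each frame-to-frame transition.
    """
    best = [cost for _, cost in frames[0]]
    back = []
    for prev, cur in zip(frames, frames[1:]):
        new_best, pointers = [], []
        for period, cost in cur:
            totals = [best[i] + jump_penalty * abs(math.log2(period / p))
                      for i, (p, _) in enumerate(prev)]
            i_min = min(range(len(totals)), key=totals.__getitem__)
            new_best.append(totals[i_min] + cost)
            pointers.append(i_min)
        best = new_best
        back.append(pointers)
    # Backtrack from the cheapest final state.
    state = min(range(len(best)), key=best.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [frames[t][s][0] for t, s in enumerate(path)]

# Made-up costs: picking the cheapest candidate frame by frame would flip octaves.
frames = [
    [(100, 0.10), (200, 0.12)],
    [(100, 0.12), (200, 0.10)],
    [(100, 0.10), (200, 0.11)],
    [(100, 0.40), (200, 0.05)],
]
greedy = [min(f, key=lambda c: c[1])[0] for f in frames]
track = viterbi_track(frames)
print(greedy)   # [100, 200, 100, 200]
print(track)    # [200, 200, 200, 200]
```

The smoothed track needs the later frames before it can commit to the first one, so it buys stability with latency.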


> how do you get this mindless pitch-detection algorithm that looks at,
> what initially appears to be a 440 Hz waveform, but ends up as a 220
> Hz waveform (without glissando from one note to the other), to say at
> the outset, that it was 220 Hz?

A mindless algorithm just supplies a set of candidates, i.e. defines the 
area for the deep search.


>>Fie. Matlab is for stupidents; real men do their 2+2=4 without it.
> naw, MATLAB (or Octave) can be useful.  do you actually design filters

Yes, indeed. Actually, only doing things by your own hand 
provides an understanding of the subject and develops the skill.
Look at this newsgroup: the dummies are not asking how to do the 
2+2=4, they are asking how to get everything already available in Matlab.

> or look at FFT data or such with your own C code?

I do the plots by importing the data to Excel.

>  i used to do that
> in the early nineties (i even wrote a few papers with graphics
> generated with my own C code), but eventually got a little lazy using
> MATLAB.  the usefulness i recognized to the extent that i was very
> unhappy that The Math Works (and the inventor of MATLAB, Cleve Moler)
> could see no benefit to extending the language (in a backward
> compatible manner) so that we could define the base or origin to the
> indices of every dimension in an array.  they are very stubborn about
> it, and i think foolishly so.  their resistance comes from arrogance,
> obstinacy, and lack of vision (and NIH, the "not-invented-here"
> syndrome), not because of a defensible technical reason.

MatLab is too heavy to change anything. Any modification will inevitably 
introduce some incompatibility, and the crowds of helpless idiots will 
be running around and screaming. Mr. Moler doesn't want it to happen.


> personally i wish that i knew C++ a little better (or a decent OOP
> from which i've been told that Smalltalk is s'posed to be) and a real
> nice set of classes for representing matrices (and arrays), complex
> numbers, matrices with complex elements, and such (along with methods
> performing the operations that we find handy in MATLAB including
> display functions), would be better and more portable to
> implementations.  that is code you write for concept development and
> testing could slip right into a build of a real application or
> embedded target.

Matlab advertized the ability to generate the C code for TMS320x from 
the Matlab source. I don't know how well it works in reality although I 
have some doubts about it.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Reply by robert bristow-johnson ● November 4, 2007

On Nov 4, 3:08 pm, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
wrote:
> robert bristow-johnson wrote:
>
...
>
> > i think his paper (ICASSP or similar conference) is on his site.  if
> > not, maybe i can find my copy of it laying around and send it to you.
>
> I couldn't find the original paper. Can you please send it to me.
>

yup, it looks like http://www.soundmathtech.com/ is outa business. so
i can't find it on the web anywhere either.  here is the thread where
he first announced (as far as i can tell):

http://groups.google.com/group/comp.speech.research/browse_frm/thread/47c863a624aac888/1897d8af3742992a#1897d8af3742992a

and you can see my initial response.

but Vlad, i can't find a copy here.  i have a copy on my computer at
my home which is 4 hours away from where i am now (i work in the
Boston area, but my family is in Vermont).  so it will take a while
(and i hope i don't forget).

> >>If the goal is the best perceived
> >>quality (speech coding, speed/pitch change, etc.) then the best solution
> >>is the closed loop search near the possible candidates. And the
> >>candidates can be sorted out by either method; there is not much of a
> >>difference.
>
> > but the candidate picking is the big deal.  that's still like alchemy,
> > very AI-ish.  that's still where the patents and trade-secrets lie.
> > that's where some pitch-detection algs sound better than other pitch-
> > detection algs.
>
> I would start with a quantitative definition of what "better"
> means, i.e. what the goal is. The error in the time or frequency
> domain can be weighted against a psychoacoustic model; the best pitch
> value is the one which minimizes the error.

the problem is, we don't know precisely what the psychoacoustic model
is.  we do not know precisely how humans judge the pitch of a sound
(if they judge it even *has* pitch).  normally there is a pretty high
correlation of perceived pitch to the (highest possible) fundamental
frequency if the note is a quasi-periodic function of time.  but even
though mathematically it may be a 220 Hz tone, i'll bet that if you
add a 220 Hz tone, with amplitude reduced by 70 dB, to a 440 Hz tone,
everyone (but some pitch detectors) will say the note is clearly
A440.  but mathematically it is a 220 Hz tone.  so at what threshold
do we say that attenuated odd harmonics don't count?

> >  what do you do when no candidate looks very good
> > (during transients or other times the input is not sufficiently quasi-
> > periodic)?
>
> Probably the model of the signal is oversimplified, so it doesn't fit
> reality. It is a known phenomenon that if the model doesn't match,
> then the maximum likelihood solution is unstable, since it jumps on
> random features.
>
> > or the "octave problem" (lotsa different candidates all look about
> > equally good)?
>
> Pick the candidate which makes for the least weighted error.

what if that candidate is the wrong octave (as people perceive the
pitch)?

> >  for the case when a 440 Hz tone has a very small
> > amplitude (like down by 70 dB) 220 Hz tone (or some other sub-harmonic)
> > added to it: is it A440 (or midi note 69) or A220 (midi note 57)?  how
> > would we hear such a pitch?  at what threshold do you stop ignoring
> > the sub-harmonic?
>
> I would start from the most likely candidate and its nearest neighbors.
> It is very unlikely that the far harmonics or subharmonics will produce
> the minimum weighted error.

if you add a synchronous 220 Hz tone (of very low amplitude) to a 440
Hz tone, *any* mathematical measure of the candidates will show the
1/220 period to be better than the candidate at 440.
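
A quick numerical check of that claim for one particular measure, the average squared difference (ASDF). The 44 kHz sample rate is an assumption chosen so that both candidate periods are whole numbers of samples, and the helper names are made up:

```python
import math

FS = 44000                    # assumed rate: 440 Hz -> 100 samples, 220 Hz -> 200 samples
SUB_GAIN = 10 ** (-70 / 20)   # 220 Hz component 70 dB below the 440 Hz tone

def tone_with_subharmonic(n_samples):
    """440 Hz sine plus a synchronous 220 Hz sine at -70 dB."""
    return [math.sin(2 * math.pi * 440 * n / FS)
            + SUB_GAIN * math.sin(2 * math.pi * 220 * n / FS)
            for n in range(n_samples)]

def asdf(x, lag, window):
    """Average squared difference between x and a copy delayed by `lag`."""
    return sum((x[n] - x[n + lag]) ** 2 for n in range(window)) / window

x = tone_with_subharmonic(800)
d_440 = asdf(x, 100, 400)   # trial period 1/440 s
d_220 = asdf(x, 200, 400)   # trial period 1/220 s
print(d_440, d_220)
```

The 1/440 s candidate leaves a residual of about 2e-7 (twice the squared sub-harmonic gain), while the 1/220 s candidate is zero up to rounding error. A detector that simply takes the smallest value therefore reports 220 Hz, however faint the sub-harmonic is.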

> > if you hear a synthesizer oscillator hooked to a pitch detector having
> > such a problem, the pitch of the oscillator will jump up and down from
> > one possible harmonic to another, during a single note, and will sound
> > like dog excrement.  Tuvan throat singers can really kill a pitch-
> > detector with the octave problem, but other singers, singing some note
> > but starting out with their mouth cavity tuned to the 2nd or 3rd
> > harmonic and backing off from that can also cause it.
>
> The model of the signal should include the hidden Markov chain to
> predict the variations.
>
> > how do you get this mindless pitch-detection algorithm that looks at,
> > what initially appears to be a 440 Hz waveform, but ends up as a 220
> > Hz waveform (without glissando from one note to the other), to say at
> > the outset, that it was 220 Hz?
>
> A mindless algorithm just supplies a set of candidates, i.e. defines the
> area for the deep search.
>
> >>Fie. Matlab is for stupidents; real men do their 2+2=4 without it.
> > naw, MATLAB (or Octave) can be useful.  do you actually design filters
>
> Yes, indeed. Actually, only doing things by your own hand
> provides an understanding of the subject and develops the skill.
> Look at this newsgroup: the dummies are not asking how to do the
> 2+2=4, they are asking how to get everything already available in Matlab.
>
> > or look at FFT data or such with your own C code?
>
> I do the plots by importing the data to Excel.
>
> >  i used to do that
> > in the early nineties (i even wrote a few papers with graphics
> > generated with my own C code), but eventually got a little lazy using
> > MATLAB.  the usefulness i recognized to the extent that i was very
> > unhappy that The Math Works (and the inventor of MATLAB, Cleve Moler)
> > could see no benefit to extending the language (in a backward
> > compatible manner) so that we could define the base or origin to the
> > indices of every dimension in an array.  they are very stubborn about
> > it, and i think foolishly so.  their resistance comes from arrogance,
> > obstinacy, and lack of vision (and NIH, the "not-invented-here"
> > syndrome), not because of a defensible technical reason.
>
> MatLab is too heavy to change anything. Any modification will inevitably
> introduce some incompatibility,

no.  not true.  you can make mods that are guaranteed to be backward
compatible.  now if a user changes their default array (that is always
1-origin when it is first created) to an array with some other base or
origin, then that code could not have existed in the days preceding
the mod.  there is no problem with creating another structure inside a
MATLAB variable that defines the origin for every dimension (usually
it's two dimensions).
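
The idea can be illustrated outside MATLAB with a toy 1-D container that stores its index origin next to the data. This is only a Python sketch with a hypothetical class name, not a proposal for how MATLAB would implement it; the default origin of 1 mimics MATLAB so that code which never touches the origin behaves as before.

```python
class OriginArray:
    """1-D array whose first element has index `origin` (illustrative only)."""

    def __init__(self, values, origin=1):
        self._values = list(values)
        self.origin = origin          # index of the first stored element

    def _position(self, index):
        position = index - self.origin
        if not 0 <= position < len(self._values):
            raise IndexError(f"index {index} outside "
                             f"[{self.origin}, {self.origin + len(self._values) - 1}]")
        return position

    def __getitem__(self, index):
        return self._values[self._position(index)]

    def __setitem__(self, index, value):
        self._values[self._position(index)] = value

    def __len__(self):
        return len(self._values)

legacy = OriginArray([10, 20, 30])                   # default origin 1, as in MATLAB
taps = OriginArray([0.25, 0.5, 0.25], origin=-1)     # non-causal FIR taps h[-1..1]
print(legacy[1], taps[-1], taps[0], taps[1])
```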

> and the crowds of helpless idiots will
> be running around and screaming. Mr. Moler doesn't want it to happen.

some of us have been screaming here.

r b-j
