
Analyzing Cross Spectrum for Pitch.

Started by MatthewA May 17, 2016
I had absolutely no clue I'd set off such an antagonistic firestorm over something as esoteric as pitch detection/estimation.

Anyway thanks for the links.  I'm still having a hell of a time, but hey.
> I had absolutely no clue I'd set off such an antagonistic firestorm over
> something as esoteric as pitch detection/estimation.
>
> Anyway thanks for the links. I'm still having a hell of a time, but hey.

Hope you have enjoyed the show. It's really rather a fundamental topic in relation to the Fourier Transform, not esoteric at all, even though some of the responses may seem so.

I've got some more things to say, but I've been kind of busy. Stay tuned (Pun intended).

Please don't be intimidated into not asking any more questions.

Out of curiosity, in your original post, when you said "second harmonic", did you mean a tone of twice the frequency or one of three times the frequency of your base tone?

Ced
---------------------------------------
Posted through http://www.DSPRelated.com
On Wed, 25 May 2016 14:33:10 -0700 (PDT), MatthewA
<matthewaudio@gmail.com> wrote:

> I had absolutely no clue I'd set off such an antagonistic firestorm over
> something as esoteric as pitch detection/estimation.
>
> Anyway thanks for the links. I'm still having a hell of a time, but hey.

It's like that around here sometimes. There's always a tradeoff between tolerating misinformation and cluttering things up with debates. Sometimes the debates are pretty educational, though.

If you're still struggling with the interpolation, this link might help. There are a lot of techniques to interpolate pitch from the DFT samples around the peak, and many are very accurate. You can often trade complexity for accuracy or bias reduction.

http://www.ingelec.uns.edu.ar/pds2803/Materiales/Articulos/AnalisisFrecuencial/04205098.pdf

Many of the two- or three-sample interpolation algorithms work on the assumption that there is only one tone present. The presence of other tones (e.g., harmonics) may degrade performance, and the further away the other tone is, the less degradation may be present.
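For readers following along, here is a minimal sketch of one widely used three-sample estimator that works on the complex bin values around the peak. It is not claimed to be the exact method of the linked article. The test frequency, the amplitudes, and the function name `three_sample_peak` are all made up for illustration, and the signal is a complex tone with no window applied.

```python
import numpy as np

def three_sample_peak(x):
    """Estimate a tone's frequency, in DFT bins, from the peak bin and its two neighbours."""
    X = np.fft.fft(x)
    N = len(x)
    k = int(np.argmax(np.abs(X)))              # coarse estimate: largest-magnitude bin
    Xm, X0, Xp = X[(k - 1) % N], X[k], X[(k + 1) % N]
    # Fractional-bin correction computed from the complex bin values (no window applied).
    delta = np.real((Xm - Xp) / (2 * X0 - Xm - Xp))
    return k + delta

N = 64
n = np.arange(N)
f_true = 10.3                                   # hypothetical test frequency, in bins
tone = np.exp(2j * np.pi * f_true * n / N)
harmonic = 0.5 * np.exp(2j * np.pi * 2 * f_true * n / N)   # a second tone at 2x the frequency

f_alone = three_sample_peak(tone)
f_with_harmonic = three_sample_peak(tone + harmonic)
print(f"single tone:   {f_alone:.4f} bins (true {f_true})")
print(f"plus harmonic: {f_with_harmonic:.4f} bins (true {f_true})")
```

Comparing the two printed values gives a rough feel for how much a second tone disturbs the single-tone assumption in this particular setup. Other frequencies, amplitudes, or windows will behave differently.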
On Wednesday, May 25, 2016 at 5:33:14 PM UTC-4, MatthewA wrote:
> I had absolutely no clue I'd set off such an antagonistic firestorm over
> something as esoteric as pitch detection/estimation.

might be useful to know if the problem you're solving is that of "pitch detection", which is finding the fundamental frequency of a periodic or nearly periodic tone. or is it the problem of identifying the frequencies of sinusoidal components in a signal, usually called "sinusoidal modeling".

it's all a friendly slugfest we have at comp.dsp . no moderator here, so the civility you see is all authentic. none of it is forced.

r b-j
>Cedron <103185@DSPRelated> wrote:
>
>>What is clearly missing from your replies is any reference to the equation
>>I derived already being known. If you have any reference to this
>>equation, you should still post it.
>
>>If anybody else has one, it would be nice if you posted as well.
>
>I think that would be nice too.
Well, I found a site with the equation, sort of:

http://flylib.com/books/en/2.729.1.40/1/

Equation 3-43

To recap:

The Sinc function is the result of applying the continuous FT on a rectangular continuous signal centered at the origin.

The Dirichlet kernel is the result of applying a DFT on a rectangular discrete signal centered at the origin, with a sampling window the width of the rectangle.

The General Form of the Dirichlet kernel is the result of applying the DFT on a rectangular signal with an arbitrary offset of the rectangle within the sampling window and an arbitrary width. This is the equation in the cited link. You can easily find web pages on the others.

In Jacobsen's final version (so far) of his PDF write-up, he used the general form of the equation with the signal filling the entire sampling window and the indexing being the standard 0 to N-1. With these settings, he was able to reproduce my results.

My results were achieved by deriving the formula for bin values for a complex pure tone. I actually did this the first time last year. What was new this time was deriving the magnitude function and converting it into a trigonometric form (the Sine functions), rather than the exponential form.

These two approaches may seem very different, but really they aren't. Maybe conceptually, but not mathematically.

The first thing to realize is that the complex signal case is much simpler than the real signal case. In the complex signal case, the drop-off in magnitude away from the main lobe is independent of the phase and of the frequency bin. This is not true for the real case.

Therefore, instead of thinking of the rectangle signal that fills the entire window as some sort of convolution window, it can be considered a pure tone with frequency zero. Centering the "convolution function" on the frequency of interest can be considered a frequency shift.

Therefore, the magnitude equation I derived can properly be called the magnitude of a special case of the generalized form of the Dirichlet kernel, though I prefer to call it the magnitude of the bin values of a complex pure tone.
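As a numerical check on the description above, the sketch below compares the DFT bin magnitudes of a complex pure tone against the ratio-of-sines form |sin(pi (f - k)) / sin(pi (f - k) / N)|. This assumes NumPy's unnormalized forward DFT, a tone that fills the whole window with indexing 0 to N-1, and a non-integer frequency f in bins. The scaling in the post's own derivation may differ by a constant factor. The names `dirichlet_magnitude` and `complex_tone` and the test values are hypothetical.

```python
import numpy as np

def dirichlet_magnitude(f, k, N):
    """Bin magnitudes of an unnormalized N-point DFT of a unit complex tone at f bins.

    Valid as written only when f is not an integer (otherwise 0/0 occurs at k == f).
    """
    d = f - k
    return np.abs(np.sin(np.pi * d) / np.sin(np.pi * d / N))

def complex_tone(f, phase, N):
    """Unit-amplitude complex tone with frequency f (in bins) and the given phase."""
    n = np.arange(N)
    return np.exp(1j * (2 * np.pi * f * n / N + phase))

N = 32
f = 10.3                       # hypothetical test frequency, in bins
k = np.arange(N)

mag_formula = dirichlet_magnitude(f, k, N)
mag_phase_a = np.abs(np.fft.fft(complex_tone(f, 0.0, N)))
mag_phase_b = np.abs(np.fft.fft(complex_tone(f, 0.7, N)))

print("max |DFT - formula|:     ", np.max(np.abs(mag_phase_a - mag_formula)))
print("max change with phase:   ", np.max(np.abs(mag_phase_a - mag_phase_b)))
```

Both printed differences should sit at floating-point noise level, which matches the statement that the magnitudes of a complex tone do not depend on its phase. Replacing the complex tone with a real cosine makes the second difference nonzero.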
>I also think that both mathematically exact derivations,
>and approximate derivations, are widely used in signal processing
>and in every other branch of science, and any scientist
>needs to know the difference between these two, and not
>try to lump them together as the same thing.
Well said.
>I therefore think it is rather spurious to try to dismiss the
>role of exact derivations, just because approximations occur
>elsewhere.
>
>Steve

It's a qualitative distinction, even if it isn't a quantitative one.

Ced
MatthewA wrote:
> I had absolutely no clue I'd set off such an antagonistic firestorm
> over something as esoteric as pitch detection/estimation.

There's no real antagonism here. Just bringing that up in the interest of calibration.

> Anyway thanks for the links. I'm still having a hell of a time, but
> hey.

:)

--
Les Cargill
> Hope you have enjoyed the show.
Absolutely.
> Stay tuned
Heh!
> Out of curiosity, in your original post, when you said "second harmonic",
> did you mean a tone of twice the frequency or one of three times the
> frequency of your base tone?

> might be useful to know if the problem you're solving is that of "pitch
> detection" or is it the problem of identifying the frequencies of
> sinusoidal components in signal.

First off, I used sum(kf)/sum(f) with a 10 bin margin for accuracy and it's working fine.

Now, I should have been a bit more specific. I have a plugin written that takes the peak frequencies of tabla (a drum where the player, more often than not, omits the ƒ1 (mode 0,1)). I'm trying to write an analysis algorithm that can extract the first few peaks while they are playing. The problem is that, when the fundamental is omitted, the second harmonic (2f) gets loud as if it's the fundamental.

What's curious to me is that this causes the fundamental to nearly disappear from the cross spectrum despite the rest of the harmonics being ƒ1 away from each other. I thought the cross spectrum was for finding the most common distance between peaks but I'm not very smart at this. I can *see* that the distance between the peaks is the fundamental, but writing the algorithm has stumped me.

Here's two images, the top is with the ƒ1 omitted (na), the bottom is with ƒ1 (tun):
http://imgur.com/a/o4BD1
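To make the "distance between the peaks" idea concrete, here is a rough sketch. It does not reproduce the cross spectrum code from the post. It assumes that sum(kf)/sum(f) means a magnitude-weighted centroid of bin indices around each peak, and it takes the median spacing between neighbouring peaks as the fundamental estimate. The test signal is synthetic (harmonics 2 to 5 of 200 Hz with the fundamental left out), not tabla data, and the sample rate, FFT size, threshold, and function names are all made-up values.

```python
import numpy as np

def centroid_peaks(mag, margin, threshold):
    """Find local maxima above threshold and refine each one with a
    magnitude-weighted centroid, sum(k * m) / sum(m), over +/- margin bins."""
    peaks = []
    for k in range(margin, len(mag) - margin):
        window = mag[k - margin:k + margin + 1]
        if mag[k] >= threshold and mag[k] == window.max():
            bins = np.arange(k - margin, k + margin + 1)
            peaks.append(np.sum(bins * window) / np.sum(window))
    return np.array(peaks)

def spacing_estimate(peak_freqs):
    """Median distance between neighbouring peaks, used as a fundamental estimate."""
    return float(np.median(np.diff(peak_freqs)))

fs = 8000.0                     # hypothetical sample rate, Hz
N = 4096                        # hypothetical FFT size
t = np.arange(N) / fs
f1 = 200.0                      # fundamental that is absent from the signal
x = sum(np.cos(2 * np.pi * h * f1 * t) for h in (2, 3, 4, 5))

mag = np.abs(np.fft.rfft(x * np.hanning(N)))
peak_bins = centroid_peaks(mag, margin=10, threshold=0.25 * mag.max())
peak_hz = peak_bins * fs / N
f1_est = spacing_estimate(peak_hz)

print("peaks (Hz):", np.round(peak_hz, 1))
print("lowest peak (Hz):", round(float(peak_hz[0]), 1))
print("median spacing (Hz):", round(f1_est, 1))
```

In this synthetic case the lowest peak sits near 2 x f1 while the spacing between peaks stays near f1, which is the situation described in the post. Real recordings with inharmonic partials or noisy peaks would need a more robust spacing measure than a plain median.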
Also, I wanted to add a huge thank you to all of you for all of this.  Cedron, that was very succinct.
On Thu, 26 May 2016 10:27:17 -0700 (PDT), MatthewA
<matthewaudio@gmail.com> wrote:

>> Hope you have enjoyed the show.
>Absolutely.
>
>> Stay tuned
>Heh!
>
>> Out of curiosity, in your original post, when you said "second harmonic",
>> did you mean a tone of twice the frequency or one of three times the
>> frequency of your base tone?
>
>> might be useful to know if the problem you're solving is that of "pitch
>> detection" or is it the problem of identifying the frequencies of
>> sinusoidal components in signal.
>
>First off, I used sum(kf)/sum(f) with a 10 bin margin for accuracy and it's
>working fine.
>
>Now,
>
>I should have been a bit more specific. I have a plugin written that takes
>the peak frequencies of tabla (a drum where the player, more often than
>not, omits the ƒ1 (mode 0,1) ) I'm trying to write an analysis algorithm
>that can extract the first few peaks while they are playing. The problem
>is that, when the fundamental is omitted, the second harmonic (2f) gets
>loud as if it's the fundamental.
>
>What's curious to me is that this causes the fundamental to nearly
>disappear from the cross spectrum despite the rest of the harmonics being
>ƒ1 away from each other. I thought the cross spectrum was for finding the
>most common distance between peaks but I'm not very smart at this. I can
>*see* that the distance between the peaks is the fundamental, but writing
>the algorithm has stumped me.
>
>Here's two images, the top is with the ƒ1 omitted (na), the bottom is
>with ƒ1 (tun)
>http://imgur.com/a/o4BD1
I'm not clear on how you're getting the cross spectrum, but you should be able to improve the peak location estimate with one of the three-sample peak interpolating techniques. You should, however, calibrate the results and see how much the other harmonics reduce the reliability before putting too much weight on it. The distances between the peaks and the height of the peaks above the surrounding energy seem adequate for getting a decent amount of improvement with an interpolator, though.
On Thursday, May 26, 2016 at 1:38:24 PM UTC-4, Eric Jacobsen wrote:
> I'm not clear on how you're getting the cross spectrum, but you should
> be able to improve the peak location estimate with one of the
> three-sample peak interpolating techniques. You should, however,
> calibrate the results and see how much the other harmonics reduce the
> reliability before putting too much weight on it. The distances
> between the peaks and the height of the peaks above the surrounding
> energy seem adequate for getting a decent amount of improvement with
> an interpolator, though.
Not sure if this is understandable to a nonmaxer but here's my cross spectrum code. http://imgur.com/wTiXmLA I'll definitely look farther into the interpolation.