Reply by MatthewA June 29, 2016
Quick update just so everyone knows how totally stupid this was: Tabla is completely inharmonic, making pitch detection using multiple peaks virtually useless.

I was fooled because A: you use a finger to get higher overtones, similar to a guitar string, and B: peaks 3, 4 and 5 are harmonic.

DUH. But glad it led to a good discussion; now the problem has become clearer yet more dense.
Reply by June 1, 2016
On Monday, May 23, 2016 at 1:31:30 PM UTC-7, robert bristow-johnson wrote:
> On Sunday, May 22, 2016 at 10:28:21 PM UTC-4, Eric Jacobsen wrote:
(snip)
> this premise needs to be clarified, or else i will say it's not true.
>
> Eric, i *know* we have (in this newsgroup) been over this time and again.
> it's repeating, but maybe not a periodic repeating argument we have here at comp.dsp.
>
> the type of signal being transformed (discrete or continuous, finite or infinite length,
> periodic or aperiodic) decides which of the "any Fourier Transform[s]" are to be used.
> if the signal is uniformly-sampled discrete-time, then it's the DTFT if the length is
> infinite (and it's hard to program a computer to get a job done with an infinite
> number of computations) and it's the DFT if the length is finite.
>
> now, again our bone of contention is that i fascistically insist that when you pass
> that finite-length discrete-time data to the DFT, the DFT periodically extends it and
> we know in the past 2 decades, that you don't seem to agree with that.
I learned about Fourier series pretty early. I still remember my college physics TA explaining the Fourier transform as the limit of the Fourier series as the period goes to infinity, and me having no idea what he meant. Then when I finally understood, it seemed so obvious. (At 9:00 on a Monday morning, and not being awake yet, might have been partly why it wasn't so obvious.) But yes, there are two kinds of signals you can use with Fourier transforms, periodic or infinite length, the latter being the limit of the former. Note, for example, that Nyquist sampling depends on either an infinite time or periodic time, but is usually close enough for a finite number of points, if the number of points is reasonably large.
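The periodic-extension point quoted above can be checked numerically. The following is only a minimal sketch, assuming NumPy; `idft_at` is a hypothetical helper name, and the test signal is arbitrary. It shows that the inverse-DFT synthesis sum, evaluated at indices outside 0..N-1, returns the original samples repeated with period N. It does not settle the interpretive argument in the thread; it only illustrates the property being argued about.

```python
import numpy as np

def idft_at(X, n):
    """Evaluate the inverse-DFT synthesis sum at an arbitrary integer index n."""
    N = len(X)
    k = np.arange(N)
    return np.sum(X * np.exp(2j * np.pi * k * n / N)) / N

# Arbitrary finite-length test data (not from the thread).
x = np.array([1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5])
N = len(x)
X = np.fft.fft(x)

# Indices 0..N-1 reproduce the data; N..2N-1 and -N..-1 reproduce it again.
inside = np.array([idft_at(X, n) for n in range(N)])
outside = np.array([idft_at(X, n) for n in range(N, 2 * N)])
negative = np.array([idft_at(X, n) for n in range(-N, 0)])

print(np.allclose(inside.real, x), np.allclose(outside.real, x), np.allclose(negative.real, x))
```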
Reply by Randy Yates May 28, 2016
MatthewA <matthewaudio@gmail.com> writes:

>> Hope you have enjoyed the show.
> Absolutely.
>
>> Stay tuned
> Heh!
>
>> Out of curiosity, in your original post, when you said "second harmonic",
>> did you mean a tone of twice the frequency or one of three times the
>> frequency of your base tone?
>
>> might be useful to know if the problem you're solving is that of
>> "pitch detection" or is it the problem of identifying the
>> frequencies of sinusoidal components in signal.
>
> First off, I used sum(kf)/sum(f) with a 10 bin margin for accuracy and it's working fine.
>
> Now,
>
> I should have been a bit more specific. I have a plugin written that
> takes the peak frequencies of tabla (a drum where the player, more
> often than not, omits the ƒ1 (mode 0,1) ) I'm trying to write an
> analysis algorithm that can extract the first few peaks while they are
> playing. The problem is that, when the fundamental is omitted, the
> second harmonic (2f) gets loud as if it's the fundamental.
> [...]
Matthew,

I have a basic question: The cross-spectrum operates on two inputs, right? For a tabla drum, what are the two inputs?
--
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
Reply by MatthewA May 27, 2016
On Thursday, May 26, 2016 at 6:03:13 PM UTC-4, Cedron wrote:
> >Not sure if this is understandable to a nonmaxer but here's my cross
> >spectrum code. http://imgur.com/wTiXmLA
> >
> >I'll definitely look farther into the interpolation.
>
> Here are some observations about the images you posted.
>
> First, most of what we discussed in this thread is not very relevant to
> your situation. Hope you learned a little bit anyway, I think Jacobsen
> did.
did very much
> Second, that looks like an awfully big DFT if all we are seeing is the
> first eighth. What is your N and sampling rate?
huge. 2^14 or so at 44100. Thankfully I only plan on doing it once every 10 seconds at low priority. The idea being if the drum changes pitch, I catch it.
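For reference, the bin spacing and window length implied by those numbers are easy to work out. This is a small sketch that takes N as exactly 2^14 and the rate as exactly 44100 Hz, although the post only says "or so".

```python
# Assumed values: the post says "2^14 or so at 44100".
fs = 44100.0   # sampling rate in Hz
N = 2 ** 14    # DFT length in samples

bin_spacing_hz = fs / N   # distance between adjacent DFT bins
window_seconds = N / fs   # duration of audio covered by one DFT frame

print(f"bin spacing: {bin_spacing_hz:.3f} Hz, window: {window_seconds:.3f} s")
```

Under those assumptions each bin is a little under 3 Hz wide and one frame spans a bit more than a third of a second.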
> Third, those drums are rather tonal. Typically, drums are the least
> tonal.
Tabla is very tonal https://www.youtube.com/watch?v=xbDofgD04dc
> Fourth, approximations should be adequate for your task.
word.
> Fifth, looking at your flow chart, it looks like you are doing a lot of
> extra calculations that aren't gaining you much. You should be able to
> accomplish what you want from just the DFT. If you can identify at least
> two or three peaks in a row, you can index them and build a data set of
> (index number, estimated frequency), then do a standard linear regression
> best fit on that data. The slope of the line should then be the
> fundamental frequency value you are seeking.
Yes, I thought the cross spectrum would give me a peak at the most common distance between peaks in the DFT, but no luck. So the spectrum seems like the proper domain to be doing the analysis.
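The regression Cedron describes in the fifth point could be sketched roughly as follows. This is an illustration only, assuming NumPy; the function name `fundamental_from_peaks` and the peak frequencies are made up for the example and are not measurements from the tabla recordings. It also assumes the partials really are evenly spaced, which does not hold for an inharmonic instrument.

```python
import numpy as np

def fundamental_from_peaks(indices, peak_freqs_hz):
    """Least-squares line through (index, frequency) pairs.

    Returns (slope, intercept); the slope is the estimated peak spacing,
    i.e. the fundamental if the peaks are consecutive harmonics.
    """
    idx = np.asarray(indices, dtype=float)
    freqs = np.asarray(peak_freqs_hz, dtype=float)
    slope, intercept = np.polyfit(idx, freqs, 1)
    return slope, intercept

# Hypothetical peak estimates: harmonics 2..5 of a roughly 220 Hz tone,
# with the fundamental itself missing and small estimation errors added.
freqs = [440.8, 659.1, 880.5, 1099.6]
f0, offset = fundamental_from_peaks([2, 3, 4, 5], freqs)

# Numbering the same peaks 1..4 changes the intercept but not the slope.
f0_shifted, _ = fundamental_from_peaks([1, 2, 3, 4], freqs)

print(f"estimated fundamental: {f0:.2f} Hz")
```

Because only the slope is used, the peaks just need to be consecutive; it does not matter whether the first detected peak is labeled 1 or 2, which is convenient when the fundamental is absent.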
> Sixth, it will still be a challenge to mechanically figure out when a drum
> hit occurs so you can frame the window.
This isn't really necessary due to the application. I've essentially knocked out the transients from the analysis signal, which gives me very clear peaks.
> Good luck. Please let us know how it turns out.
will do. Thanks again.
Reply by Cedron May 26, 2016
>Not sure if this is understandable to a nonmaxer but here's my cross
>spectrum code. http://imgur.com/wTiXmLA
>
>I'll definitely look farther into the interpolation.
Here are some observations about the images you posted.

First, most of what we discussed in this thread is not very relevant to your situation. Hope you learned a little bit anyway, I think Jacobsen did.

Second, that looks like an awfully big DFT if all we are seeing is the first eighth. What is your N and sampling rate?

Third, those drums are rather tonal. Typically, drums are the least tonal.

Fourth, approximations should be adequate for your task.

Fifth, looking at your flow chart, it looks like you are doing a lot of extra calculations that aren't gaining you much. You should be able to accomplish what you want from just the DFT. If you can identify at least two or three peaks in a row, you can index them and build a data set of (index number, estimated frequency), then do a standard linear regression best fit on that data. The slope of the line should then be the fundamental frequency value you are seeking.

Sixth, it will still be a challenge to mechanically figure out when a drum hit occurs so you can frame the window.

Good luck. Please let us know how it turns out.

Ced
---------------------------------------
Posted through http://www.DSPRelated.com
Reply by MatthewA May 26, 2016
On Thursday, May 26, 2016 at 1:38:24 PM UTC-4, Eric Jacobsen wrote:
> On Thu, 26 May 2016 10:27:17 -0700 (PDT), MatthewA
> <matthewaudio@gmail.com> wrote:
>
> >> Hope you have enjoyed the show.
> >Absolutely.
> >
> >> Stay tuned
> >Heh!
> >
> >> Out of curiosity, in your original post, when you said "second harmonic",
> >> did you mean a tone of twice the frequency or one of three times the
> >> frequency of your base tone?
> >
> >> might be useful to know if the problem you're solving is that of "pitch
> >> detection" or is it the problem of identifying the frequencies of
> >> sinusoidal components in signal.
> >
> >First off, I used sum(kf)/sum(f) with a 10 bin margin for accuracy and it's
> >working fine.
> >
> >Now,
> >
> >I should have been a bit more specific. I have a plugin written that takes
> >the peak frequencies of tabla (a drum where the player, more often than not,
> >omits the ƒ1 (mode 0,1) ) I'm trying to write an analysis algorithm that
> >can extract the first few peaks while they are playing. The problem is
> >that, when the fundamental is omitted, the second harmonic (2f) gets loud
> >as if it's the fundamental.
> >
> >What's curious to me is that this causes the fundamental to nearly
> >disappear from the cross spectrum despite the rest of the harmonics being
> >ƒ1 away from each other. I thought the cross spectrum was for finding the
> >most common distance between peaks but I'm not very smart at this. I can
> >*see* that the distance between the peaks is the fundamental, but writing
> >the algorithm has stumped me.
> >
> >Here's two images, the top is with the ƒ1 omitted (na), the bottom is
> >with ƒ1 (tun)
> >http://imgur.com/a/o4BD1
>
> I'm not clear on how you're getting the cross spectrum, but you should
> be able to improve the peak location estimate with one of the
> three-sample peak interpolating techniques. You should, however,
> calibrate the results and see how much the other harmonics reduce the
> reliability before putting too much weight on it. The distances
> between the peaks and the height of the peaks above the surrounding
> energy seem adequate for getting a decent amount of improvement with
> an interpolator, though.
Not sure if this is understandable to a non-Maxer, but here's my cross spectrum code: http://imgur.com/wTiXmLA

I'll definitely look further into the interpolation.
Reply by Eric Jacobsen May 26, 2016
On Thu, 26 May 2016 10:27:17 -0700 (PDT), MatthewA
<matthewaudio@gmail.com> wrote:

>> Hope you have enjoyed the show.
>Absolutely.
>
>> Stay tuned
>Heh!
>
>> Out of curiosity, in your original post, when you said "second harmonic",
>> did you mean a tone of twice the frequency or one of three times the
>> frequency of your base tone?
>
>> might be useful to know if the problem you're solving is that of "pitch
>> detection" or is it the problem of identifying the frequencies of
>> sinusoidal components in signal.
>
>First off, I used sum(kf)/sum(f) with a 10 bin margin for accuracy and it's
>working fine.
>
>Now,
>
>I should have been a bit more specific. I have a plugin written that takes
>the peak frequencies of tabla (a drum where the player, more often than not,
>omits the ƒ1 (mode 0,1) ) I'm trying to write an analysis algorithm that
>can extract the first few peaks while they are playing. The problem is
>that, when the fundamental is omitted, the second harmonic (2f) gets loud
>as if it's the fundamental.
>
>What's curious to me is that this causes the fundamental to nearly
>disappear from the cross spectrum despite the rest of the harmonics being
>ƒ1 away from each other. I thought the cross spectrum was for finding the
>most common distance between peaks but I'm not very smart at this. I can
>*see* that the distance between the peaks is the fundamental, but writing
>the algorithm has stumped me.
>
>Here's two images, the top is with the ƒ1 omitted (na), the bottom is
>with ƒ1 (tun)
>http://imgur.com/a/o4BD1
I'm not clear on how you're getting the cross spectrum, but you should be able to improve the peak location estimate with one of the three-sample peak interpolating techniques. You should, however, calibrate the results and see how much the other harmonics reduce the reliability before putting too much weight on it. The distances between the peaks and the height of the peaks above the surrounding energy seem adequate for getting a decent amount of improvement with an interpolator, though.
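The post does not say which three-sample interpolator is meant. One common member of that family is parabolic interpolation on the log magnitude of the peak bin and its two neighbors; the sketch below assumes that choice, along with NumPy, a Hann window, and a synthetic single sinusoid standing in for a real recording. The name `parabolic_peak` is hypothetical.

```python
import numpy as np

def parabolic_peak(mag, k):
    """Refine a peak at bin k by fitting a parabola to the log magnitude
    at bins k-1, k, k+1. Returns the peak location as a fractional bin."""
    a, b, c = np.log(mag[k - 1]), np.log(mag[k]), np.log(mag[k + 1])
    delta = 0.5 * (a - c) / (a - 2.0 * b + c)
    return k + delta

# Synthetic test tone whose frequency falls between two bins.
fs = 44100.0
N = 2 ** 14
true_freq = 442.5
t = np.arange(N) / fs
x = np.sin(2 * np.pi * true_freq * t) * np.hanning(N)

mag = np.abs(np.fft.rfft(x))
k = int(np.argmax(mag))              # coarse estimate: nearest bin
coarse_freq = k * fs / N
est_freq = parabolic_peak(mag, k) * fs / N

print(f"coarse: {coarse_freq:.2f} Hz, interpolated: {est_freq:.2f} Hz")
```

As the post cautions, nearby harmonics and noise can bias such an estimate, so a clean single-tone test like this one only shows the best case.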
Reply by MatthewA May 26, 2016
Also, I wanted to add a huge thank you to all of you for all of this.  Cedron, that was very succinct.
Reply by MatthewA May 26, 2016
> Hope you have enjoyed the show.
Absolutely.
> Stay tuned
Heh!
> Out of curiosity, in your original post, when you said "second harmonic",
> did you mean a tone of twice the frequency or one of three times the
> frequency of your base tone?

> might be useful to know if the problem you're solving is that of "pitch detection" or is it the problem of identifying the frequencies of sinusoidal components in signal.
First off, I used sum(kf)/sum(f) with a 10 bin margin for accuracy and it's working fine.

Now, I should have been a bit more specific. I have a plugin written that takes the peak frequencies of tabla (a drum where the player, more often than not, omits the ƒ1 (mode 0,1)). I'm trying to write an analysis algorithm that can extract the first few peaks while they are playing. The problem is that, when the fundamental is omitted, the second harmonic (2f) gets loud as if it's the fundamental.

What's curious to me is that this causes the fundamental to nearly disappear from the cross spectrum despite the rest of the harmonics being ƒ1 away from each other. I thought the cross spectrum was for finding the most common distance between peaks, but I'm not very smart at this. I can *see* that the distance between the peaks is the fundamental, but writing the algorithm has stumped me.

Here are two images; the top is with the ƒ1 omitted (na), the bottom is with ƒ1 (tun):
http://imgur.com/a/o4BD1
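The sum(kf)/sum(f) estimate mentioned at the top of this post reads like a magnitude-weighted centroid of the bins around a peak. Here is a minimal sketch of that reading, assuming NumPy; it takes k as the bin index, f as the magnitude at that bin, and the "10 bin margin" as ±10 bins around the peak. The name `centroid_peak` and the toy spectrum are hypothetical, and the actual plugin may differ.

```python
import numpy as np

def centroid_peak(mag, k_peak, margin=10):
    """Magnitude-weighted centroid of the bins within +/- margin of k_peak.

    Returns a fractional bin index; multiply by fs / N to get Hz.
    """
    lo = max(k_peak - margin, 0)
    hi = min(k_peak + margin + 1, len(mag))
    k = np.arange(lo, hi)
    weights = mag[lo:hi]
    return np.sum(k * weights) / np.sum(weights)

# Toy spectrum: equal energy in bins 50 and 51, nothing elsewhere,
# so the centroid should land halfway between them.
mag = np.zeros(100)
mag[50], mag[51] = 1.0, 1.0
est_bin = centroid_peak(mag, 50)

print(est_bin)
```

A centroid like this is pulled toward any other energy inside the margin, so it works best when neighboring peaks are more than the margin apart.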
Reply by Les Cargill May 26, 2016
MatthewA wrote:
> I had absolutely no clue I'd set off such an antagonistic firestorm
> over something as esoteric as pitch detection/estimation.
>
There's no real antagonism here. Just bringing that up in the interest of calibration.
> Anyway thanks for the links. I'm still having a hell of a time, but
> hey.
>
:)

--
Les Cargill