DSPRelated.com
Forums

Naive pitch detection.

Started by Les Cargill January 18, 2014
Eric Jacobsen wrote:
> On Sun, 19 Jan 2014 12:14:50 -0600, Les Cargill
> <lcargill99@comcast.com> wrote:
>
>> robert bristow-johnson wrote:
>>> On 1/19/14 12:28 AM, Les Cargill wrote:
>>>> robert bristow-johnson wrote:
>>>>> On 1/18/14 7:15 PM, Les Cargill wrote:
>>>>>> Eric Jacobsen wrote:
>>>>>>> On Sat, 18 Jan 2014 15:14:01 -0600, Les Cargill
>>>>>>> <lcargill99@comcast.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> So I cut .wav files down to chunks of a certain window
>>>>>>>> size/length. Read them in; trigger on some RMS leel,
>>>>>>>> just make an FFT of them, compare the magnitude at the
>>>>>>>> center frequency for notes of the chromatic scale, and
>>>>>>>> this seems to be able to "parse" chords reasonably
>>>>>>>> well. You get spurious notes ( probably harmonics or
>>>>>>>> something *like* IM products ).
>>>>>>>>
>>>>>>>> I would have thought this to be unreliable at best. It
>>>>>>>> works too well; I must be "cheating" somehow.
>>>>>>>>
>>>>>>>> It would be quite slow; my window size is basically
>>>>>>>> 100 msec for now. That's okay for what I'm using this
>>>>>>>> for.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> -- Les Cargill
>>>>>>>
>>>>>>> What's an "RMS leel",
>>>>>>
>>>>>> Bads typyngs - should be "level".
>>>>>>
>>>>>> For a set of samples of length L ( say , 100 msec / 10 )
>>>>>> calculate the RMS of those samples using the classic
>>>>>> sqrt(SUM(samp[0]...samp[k])).
>>>>>>
>>>>>> If said RMS is above some magic threshold, there's signal
>>>>>> there. I can do this because the input signal has as much
>>>>>> as -70dB noise floor so the threshold choice is pretty
>>>>>> easy.
>>>>>>
>>>>>> Works basically like a guitar stompbox "gate" or a squelch
>>>>>> control on a CB. Not as a downward expander; as a "gate".
>>>>>>
>>>>>>> and how do you determine "center frequency"?
>>>>>>>
>>>>>>
>>>>>> There's a Wikipedia entry for musical pitches that says what
>>>>>> frequency each one is centered on. You convert that to the
>>>>>> index within the FFT result, based on the FFT size.
>>>>>>
>>>>>> Calculate the magnitude of the FFT bucket at that index,
>>>>>> and presto. If it's above a different threshold, then that
>>>>>> note gets a "1" for "present".
>>>>>>
>>>>>> For some reason, I wouldn't think this would work at all.
>>>>>
>>>>> so you're adding the magnitude (or squared)
>>>>
>>>> Magnitude - sqrt((x*x)+(j*j)) where one FFT bucket is the pair
>>>> {x,j}
>>>>
>>>>> of adjacent FFT bins that sorta lie under each semitone
>>>>> range? or is it just the FFT bin that is closest to the
>>>>> center of the semitone range?
>>>>
>>>> Just the one bin. Yes, I was surprised, too.
>>>>
>>>>
>>>
>>> so you're not adding up anything, and then it doesn't really
>>> matter if it's magnitude or magnitude-squared because you can
>>> just change the threshold value. so the sqrt() is not necessary
>>> for your alg.
>>>
>>
>> True enough.
>>
>>> so, at the higher notes, there are more bins living between the
>>> single bins you're using to represent notes. what if the
>>> instrument is detuned?
>>
>> I am thinking I want to encourage proper tuning.
>>
>>> is it possible to get a peak in between those single bins and
>>> that the value that "leaks" or "bleeds" into the bins is too
>>> small to reach the threshold?
>>>
>>
>> (assumption warning!) I think it doesn't matter - you'll get
>> "bleed" from adjacent buckets enough in the present windowing
>> strategy that a little bit of detuning seems to not hurt.
>>
>> Obviously, I need more data, and I'll address that. If I need to
>> sum a small set of buckets, it's relatively cheap. Right now, the
>> evaluation method for this assumption is too slow and I don't have
>> enough trials. I have rtAudio downloaded and will build something to
>> do this much closer to real time.
>>
>>> to where it reached the threshold. and then how do you keep from
>>> recognizing harmonics as note?
>>>
>>
>> Good question. I am pretty sure I've already seen that. I think
>> that's less a concern. It may also be that IM products can be
>> detected and filtered. Dunno yet.
>>
>> Here's what I think I am doing - I play pedal steel, and while
>> there are instructional materials, the "copedents" of pedal steel
>> guitars vary enough that the meaning of the levers and pedals is
>> inconsistent.
>>
>> Copedents are basically "which lever does what to which string?"
>>
>> A guy named Mickey Adams put up a bevy of instructional videos
>> outlining certain patterns. These are of immense value, but it's
>> still hard "parsing" what he's doing from a video.
>>
>> I am more or less "building" a steel right now, and have written
>> some stuff to evaluate the impact of one "pull" on a given
>> copedent. Basically, how many chords does this pull make available
>> that were not there without it? Which strings you hit on steel is
>> called a "grip", and some grips are more ... canonical than
>> others.
>>
>> I've also written stuff that, given a sequence of notes and a cost
>> function for each move, helps plan the next move for a given "song"
>> or exercise. For a given steel-state, you can assign a cost to
>> hitting a different string, a knee lever change, a floor pedal
>> change or moving the bar.
>>
>> What I'd like to do is be able to have a MIDI file that describes a
>> song or exercise, then have it play one chord or note at a time on
>> a MIDI instrument ( perhaps a VST host thing later on ) and wait
>> for you to play that set of notes, then move on.
>>
>> -- Les Cargill
>
> That's very cool. If you get it working well you might consider
> talking to some harpists to see if it would be worth adapting to
> harps, as I think they have similar pedal configuration issues.
>
I might. Problem there is room tone. Harps tend to have fewer pedals and a lot more strings.

But it's cool to see somebody else who recognizes that the pedal steel is the rural/industrial cousin to the harp!

Harpists are totally hardcore. It's a very physically demanding instrument. http://www.youtube.com/watch?v=I_ImURf8KUE

Steel players are merely Aspie or OCD :). Look at Narvel Blackstock for what the "promotion" path for steel players is. If anybody ever had a "steel player name" it's Narvel.

> The fundamental/harmonic power ratios for harps are probably less
> favorable than in a steel, though, which might be interesting to
> investigate as well.
>
> I take it you're ignoring slides, too... ;)
>

You mean slide Spanish guitar ( as in Duane Allman ), dobro/resonator ( as in Jerry Douglas ) or lap steel? So many heresies...

You could do this with lap or console ( lap with legs, usually multiple necks ala Eddie Rivers of Asleep At The Wheel ), but the real value added is disentangling all the relationships between pedals/knees/grips and bar position.

The hard part will be figuring out whether to just build this for myself, open source it, or try to establish a commercial product with a "forum" or repository of songs.

>
> Eric Jacobsen Anchor Hill Communications http://www.anchorhill.com
>
-- Les Cargill
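The scheme described in this post (an RMS gate, then one FFT bin per chromatic note, then a threshold on that bin) can be sketched roughly as follows. This is an illustration under stated assumptions, not Les's code. RMS is taken here in its usual sense, the square root of the mean of the squared samples. All function names, the gate level, and the bin threshold are made up for the example. Notes are identified by MIDI number in equal temperament with A4 = 440 Hz. To stay dependency-free, the sketch evaluates the DFT at just the bins it needs rather than running a full FFT; the value is the same one an FFT would place in that bin. Following the magnitude versus magnitude-squared remark in the thread, it thresholds the squared magnitude and skips the sqrt().

```python
import cmath
import math


def rms(samples):
    """Root-mean-square level: sqrt of the mean of the squared samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def note_to_hz(midi_note):
    """Equal-tempered pitch, assuming A4 (MIDI note 69) = 440 Hz."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


def bin_power(samples, k):
    """Normalized squared magnitude of DFT bin k.

    Computed directly for one bin; an FFT would give the same bin value.
    Scaled so a full-scale sine centered on the bin reads about 1.0.
    """
    n = len(samples)
    step = -2j * math.pi * k / n
    acc = sum(s * cmath.exp(step * i) for i, s in enumerate(samples))
    return (acc.real ** 2 + acc.imag ** 2) / (n / 2.0) ** 2


def detect_notes(samples, sample_rate, midi_notes,
                 gate_rms=0.01, power_threshold=0.01):
    """Return the notes whose nearest bin exceeds power_threshold.

    gate_rms and power_threshold are illustrative guesses, not values
    from the thread.
    """
    if not samples or rms(samples) < gate_rms:
        return []                      # gate closed: treat as silence
    n = len(samples)
    found = []
    for note in midi_notes:
        k = round(note_to_hz(note) * n / sample_rate)  # nearest bin index
        if bin_power(samples, k) >= power_threshold:   # no sqrt needed
            found.append(note)
    return found


if __name__ == "__main__":
    rate, n = 8000, 800                # 100 ms window at 8 kHz
    chord = [69, 73, 76]               # A4, C#5, E5 as pure sines
    samples = [sum(0.3 * math.sin(2 * math.pi * note_to_hz(m) * i / rate)
                   for m in chord) for i in range(n)]
    print(detect_notes(samples, rate, range(60, 85)))
```

The test signal is three clean sine waves, so it says nothing about the harmonics or IM-like products mentioned in the thread; a real string would put energy at multiples of each fundamental, and those can land on other notes' bins.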
On Sun, 19 Jan 2014 12:53:46 -0600, Les Cargill
<lcargill99@comcast.com> wrote:

>Eric Jacobsen wrote:
>> That's very cool. If you get it working well you might consider
>> talking to some harpists to see if it would be worth adapting to
>> harps, as I think they have similar pedal configuration issues.
>>
>
>I might. Problem there is room tone. Harps tend to have fewer pedals
>and a lot more strings.
>
>But it's cool to see somebody else who recognizes that the pedal
>steel is the rural/industrial cousin to the harp!
>
>Harpists are totally hardcore. It's a very physically demanding
>instrument. http://www.youtube.com/watch?v=I_ImURf8KUE
When I was in grad school my wife played in the local orchestra. Frequently I'd take papers to grade or research or something to do and pick a primo seat in the auditorium during rehearsal and listen while I worked on whatever it was that I'd brought along that day. My favorite live performance ever was a harp quartet that visited for a performance. Truly amazing and very inspiring.

>Steel players are merely Aspie or OCD :). Look at Narvel Blackstock
>for what the "promotion" path for steel players is. If anybody
>ever had a "steel player name" it's Narvel.
>
>
>> The fundamental/harmonic power ratios for harps is probably less
>> favorable than in a steel, though, which might be interesting to
>> investigate as well.
>>
>> I take it you're ignoring slides, too... ;)
>>
>
>You mean slide Spanish guitar ( as in Duane Allman ), dobro/resonator
>( as in Jerry Douglas ) or lap steel? So many heresies...
>
>You could do this with lap or console ( lap with legs, usually multiple
>necks ala Eddie Rivers of Asleep At The Wheel ) , but the real value
>added is disentangling all the relationships between pedals/knees/grips
>and bar position.
>
>The hard part will be figuring out whether to just build this for
>myself, open source it, or try to establish a commercial product with a
>"forum" or repository of songs.
These days you can do both open source and commercial product if you really want to. You'll hurt a little of your own sales but potentially gain free developers/maintenance.

>
>>
>> Eric Jacobsen Anchor Hill Communications http://www.anchorhill.com
>>
>
>--
>Les Cargill
Eric Jacobsen Anchor Hill Communications http://www.anchorhill.com
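On the detuning and bin-spacing question raised earlier in the thread, a little arithmetic helps show where the single-bin approach has room to work. The sketch below assumes the FFT length equals the window length (no zero padding), so the bin spacing is simply one over the window duration; the helper names are hypothetical. It compares that spacing with the gap between adjacent equal-tempered semitones and estimates how many cents sharp a note must be to drift half a bin away from its nominal frequency.

```python
import math

SEMITONE_RATIO = 2.0 ** (1.0 / 12.0)   # equal temperament


def bin_spacing_hz(window_seconds):
    """Bin spacing when the FFT length equals the window length."""
    return 1.0 / window_seconds


def semitone_gap_hz(freq_hz):
    """Distance in Hz from a note up to the next semitone."""
    return freq_hz * (SEMITONE_RATIO - 1.0)


def lowest_resolvable_hz(window_seconds):
    """Frequency below which adjacent semitones are closer than one bin."""
    return bin_spacing_hz(window_seconds) / (SEMITONE_RATIO - 1.0)


def cents_for_half_bin(freq_hz, window_seconds):
    """Sharpening, in cents, that moves a tone half a bin upward."""
    half_bin = bin_spacing_hz(window_seconds) / 2.0
    return 1200.0 * math.log2((freq_hz + half_bin) / freq_hz)


if __name__ == "__main__":
    window = 0.1                       # the 100 ms window from the thread
    print(bin_spacing_hz(window))      # bin spacing in Hz
    print(semitone_gap_hz(440.0))      # gap from A4 to A#4 in Hz
    print(lowest_resolvable_hz(window))
    print(cents_for_half_bin(440.0, window))
```

With a 100 ms window the bins are 10 Hz apart, so semitones below roughly 168 Hz sit less than one bin apart and cannot be told apart by a single-bin test, while around A4 a note has to be nearly 20 cents sharp before it drifts half a bin. That is consistent with the observation that a little detuning does not seem to hurt at mid-range pitches, though it does not by itself address harmonics.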