DSPRelated.com
Forums

Clipped before FFT

Started by Robert Scott May 10, 2013
I have been experimenting with some time-domain pre-processing to
improve the reliability of an FFT application.  The application is
extracting the RPM of an engine from the sound it makes.  Prior to
these experiments I was using a straight FFT of blocks of time-series
sound data.  An engine generally makes a sound rich in harmonics, so
the FFT should have evenly spaced peaks.  When the peaks are found,
the fundamental frequency can be inferred and the RPM can be
calculated from that.  However the real-world sounds made by an engine
are far from ideal.  There are complex sounds not strictly related to
the fundamental of the engine rotation.  The result is an FFT whose
graph is visually quite a mess.  Often I do see identifiable peaks
corresponding to harmonics of the engine rotation, but just as often
it is not even clear to visual inspection of the graph which peaks are
relevant and which are junk.

So here is the experiment.  When looking at the graph of the time
series of this sound, it seems periodic extremes in the + and -
direction are more predictable than the junk in between.  So I decided
to clip the time series before applying the FFT.  Actually, I did more
than clip the data.  I established an upper threshold (A) and a lower
threshold (B) set at 70% of the running maximum and minimum time
series values.  Then for each time series value, y, I replaced y by
+100 if it was > A, by -100 if it was < B, and 0 otherwise.  So the
time series was forced into a tri-level version of what it was before.
Then applying the same analysis as before, the results of extracting
peaks in the frequency domain appeared a little better when using the
clipped data.
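
For readers who want to try the tri-level idea, here is a minimal Python sketch of the pre-processing described above. It is not the poster's code: the function names are made up for illustration, NumPy is assumed, and the thresholds are taken from the maximum and minimum of the whole block rather than from a running maximum and minimum. It also assumes the block swings both above and below zero.

```python
import numpy as np

def three_level_clip(y, fraction=0.7):
    """Map each sample to +100, -100, or 0.

    Assumption: thresholds A and B are a fraction of the block maximum and
    minimum (the post uses running values instead).
    """
    y = np.asarray(y, dtype=float)
    upper = fraction * y.max()   # threshold A
    lower = fraction * y.min()   # threshold B
    out = np.zeros_like(y)
    out[y > upper] = 100.0
    out[y < lower] = -100.0
    return out

def magnitude_spectrum(x, sample_rate):
    """Hann-windowed magnitude spectrum of one block of real samples."""
    windowed = np.asarray(x, dtype=float) * np.hanning(len(x))
    mags = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    return freqs, mags
```

Calling `magnitude_spectrum(three_level_clip(block), sample_rate)` and `magnitude_spectrum(block, sample_rate)` on the same block gives the two spectra to compare.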

What I want to know is if this method is part of some established and
studied theory.  If so, I would like some pointers to previous similar
work to see if others have come up with even better implementations of
this sort of pre-processing in applications like this.

Robert Scott
Hopkins, MN

On 5/10/2013 6:17 PM, Robert Scott wrote:
> I have been experimenting with some time-domain pre-processing to
> improve the reliability of an FFT application.  The application is
> extracting the RPM of an engine from the sound it makes.
[...]
> +100 if it was > A, by -100 if it was < B, and 0 otherwise.  So the
> time series was forced into a tri-level version of what it was before.
> Then applying the same analysis as before, the results of extracting
> peaks in the frequency domain appeared a little better when using the
> clipped data.
>
> What I want to know is if this method is part of some established and
> studied theory.  If so, I would like some pointers to previous similar
> work to see if others have come up with even better implementations of
> this sort of pre-processing in applications like this.

Non-linear centre clipping used to be very common in pitch detection
algorithms. Rabiner's book on speech processing has a chapter about that.

However, those methods are largely out of fashion, as normalized
autocorrelation generally gives better results.

Vladimir Vassilevsky
DSP and Mixed Signal Designs
www.abvolt.com
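
As a rough illustration of the normalized autocorrelation mentioned in this reply, the sketch below picks the lag with the highest normalized correlation inside a search range. The function name, the search-range parameters `f_min` and `f_max`, and the brute-force loop are all assumptions made for illustration, not something taken from the post. It needs `f_max` below the sample rate and a block longer than `sample_rate / f_min` samples.

```python
import numpy as np

def normalized_autocorr_f0(x, sample_rate, f_min, f_max):
    """Return (f0 estimate in Hz, peak normalized correlation).

    The lag search range is derived from f_min and f_max, which are
    hypothetical parameters for this sketch.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove any constant offset
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_score = lag_min, -np.inf
    for lag in range(lag_min, lag_max + 1):
        a = x[:-lag]                      # signal
        b = x[lag:]                       # signal delayed by `lag` samples
        denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
        score = np.dot(a, b) / denom if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag, best_score
```

The score lies between -1 and 1, so it can double as a crude confidence measure: a block with no clear periodicity gives a low peak score.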

Robert Scott <no-one@notreal.invalid> wrote:
> I have been experimenting with some time-domain pre-processing to
> improve the reliability of an FFT application.  The application is
> extracting the RPM of an engine from the sound it makes.  Prior to
> these experiments I was using straight FFT of blocks of time-series
> sound data.  An engine generally makes a sound rich in harmonics, so
> the FFT should have evenly spaced peaks.

Seems to me that you could easily have a different periodic signal
that is high in amplitude. Say a squeaky alternator pulley, for example.

I have seen emission inspection systems that measure the engine RPM
by plugging into the cigarette lighter (presumably some harmonics go
into the engine electrical system), and also ones that use a vibration
sensor that goes on the hood. (So, mechanically coupled, not air
coupled, vibrations.)

-- glen

On May 10, 7:49 pm, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
> On 5/10/2013 6:17 PM, Robert Scott wrote:
>
> > I have been experimenting with some time-domain pre-processing to
> > improve the reliability of an FFT application.  The application is
> > extracting the RPM of an engine from the sound it makes.
>
> [...]
>
> > +100 if it was > A, by -100 if it was < B, and 0 otherwise.  So the
> > time series was forced into a tri-level version of what it was before.
> > Then applying the same analysis as before, the results of extracting
> > peaks in the frequency domain appeared a little better when using the
> > clipped data.
> >
> > What I want to know is if this method is part of some established and
> > studied theory.  If so, I would like some pointers to previous similar
> > work to see if others have come up with even better implementations of
> > this sort of pre-processing in applications like this.
>
> Non-linear centre clipping used to be very common in pitch detection
> algorithms. Rabiner's book on speech processing has a chapter about that.
>
> However, those methods are largely out of fashion; as normalized
> autocorrelation generally makes for better results.

i thought that they used center clipping with autocorrelation, not
FFT.  don't entirely remember.

i didn't like any clipping or dead zone or nonlinearity that is not
1-to-1, because, if there is a flat dead zone in there and it's not
invertible, you can fool the pitch detector with a strong second and
fourth harmonic where the dead zone is hiding what is different in the
first half and the second half.

but as Dmitry will remember, as old fashioned as it sounds, i am a
partisan for something like AMDF but it's squared (so it's "ASDF"),
and i like offsetting both copies by t0-T/2 and t0+T/2 to get the data
centered at t0, no matter what T is.  and it's good to block DC.  and
Rabiner & Schafer (i thought it was) point out that, for a very wide
window, the autocorrelation is just like the ASDF, but upside-down and
biased upward by the autocorrelation at zero lag, which is a measure
of power of the waveform.

dunno what works best for an engine, but estimating f0 for a musical
note is, for my money, best done with something that only assumes the
waveform has a sufficient amount of periodicity in it.  pretty much
the same for speech (have to track pitch changes pretty fast).  might
work for something else.

r b-j
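
The squared-difference measure described in this reply can be sketched as follows. This is only one reading of it: the two windows are centred at t0 - T/2 and t0 + T/2 (rounded to whole samples), and the candidate period with the smallest average squared difference wins. All names and parameters are hypothetical. Expanding the square shows the relationship mentioned above: for a wide window, ASDF(T) is approximately 2*(R(0) - R(T)), where R is the autocorrelation.

```python
import numpy as np

def asdf(x, center, lag, half_window):
    """Average squared difference between two windows of x, one centred
    near center - lag/2 and the other exactly `lag` samples later."""
    left = center - lag // 2
    right = left + lag
    if left - half_window < 0 or right + half_window >= len(x):
        raise ValueError("window runs off the ends of the data")
    a = x[left - half_window: left + half_window + 1]
    b = x[right - half_window: right + half_window + 1]
    # A constant offset cancels in a - b; slow drift would still need
    # a separate DC-blocking filter, which this sketch leaves out.
    return np.mean((a - b) ** 2)

def asdf_period(x, center, lag_min, lag_max, half_window):
    """Return (lag in samples with the smallest ASDF, all ASDF values)."""
    x = np.asarray(x, dtype=float)
    lags = np.arange(lag_min, lag_max + 1)
    values = np.array([asdf(x, center, int(lag), half_window) for lag in lags])
    return lags[np.argmin(values)], values
```

The estimated frequency is then the sample rate divided by the returned lag. Because integer multiples of the true period also give a near-zero ASDF, the lag range should be kept below twice the shortest expected period, or the first deep minimum should be preferred over the global one.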

Robert Scott <no-...@notreal.invalid> wrote:
> time series was forced into a tri-level version of what it was before
> ..... What I want to know is if this method is part of some established
> and studied theory.

Clipping would seem to generate harmonics that aren't really there.
And your mention of large peaks could indicate a balance problem.

There's a lot of vibration info on the web - people are often very
interested if their million dollar equipment is about to break or burn
up, so analysis of vibrations tends to be very important.

A few sites:

http://www.plant-maintenance.com/maintenance_articles_vibration.shtml
http://www.unitechinc.com/pdf/IntroductiontoTimeWaveformAnalysis.pdf
http://www.reliabilityweb.com/fa/vibration.htm
http://commtest.com/media/downloads/Beginner_Guide__Machine_Vibration.pdf

There's a lot more out there, and it can be very specific to
particular types of machinery (engines, generators, pumps, etc.) and
the type of analysis (acoustic, mechanical, etc.).

Kevin McGee
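
The point about clipping creating harmonics is easy to check numerically. The sketch below is an illustration only: it uses plain hard clipping of a pure 50 Hz sine rather than the tri-level scheme from the original post, and the helper name and test values are made up. It compares the spectrum at three times the fundamental before and after clipping.

```python
import numpy as np

def harmonic_ratio(x, sample_rate, f0, harmonic):
    """Magnitude at harmonic * f0 relative to the magnitude at f0,
    read from the nearest FFT bins (no window applied)."""
    mags = np.abs(np.fft.rfft(x))
    bin_width = sample_rate / len(x)
    k1 = int(round(f0 / bin_width))
    kh = int(round(harmonic * f0 / bin_width))
    return mags[kh] / mags[k1]

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate   # one second of samples
sine = np.sin(2 * np.pi * 50 * t)          # pure tone, no harmonics
clipped = np.clip(sine, -0.5, 0.5)         # symmetric hard clipping

print("3rd harmonic, original:", harmonic_ratio(sine, sample_rate, 50, 3))
print("3rd harmonic, clipped: ", harmonic_ratio(clipped, sample_rate, 50, 3))
```

For a signal that already has real harmonics at the same frequencies, as an engine does, the added energy lands on top of the existing peaks, which may be part of why the clipped spectrum can look cleaner.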

On 5/10/2013 11:16 PM, robert bristow-johnson wrote:
> On May 10, 7:49 pm, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
>> Non-linear centre clipping used to be very common in pitch detection
>> algorithms. Rabiner's book on speech processing has a chapter about that.
>>
>> However, those methods are largely out of fashion; as normalized
>> autocorrelation generally makes for better results.
>>
>
> i thought that they used center clipping with autocorrelation, not
> FFT.  don't entirely remember.

No FFT or autocorrelation; it was a lot simpler than that. For pitch
detection, they used just a frequency counter preceded by a center clip
nonlinearity. The idea was to cut everything, leaving only the strongest
components.

> i didn't like any clipping or dead zone or nonlinearity that is not
> 1-to-1, because, if there is a flat dead zone in there and it's not
> invertible, you can fool the pitch detector with a strong second and
> fourth harmonic where the dead zone is hiding what is different in the
> first half and the second half.

Their goal was getting a sensible result with a minimal amount of
processing. That was a pre-DSP method.

> but as Dmitry will remember, as old fashioned as it sounds, i am a
> partisan for something like AMDF but it's squared (so it's "ASDF"),
> and i like offsetting both copies by t0-T/2 and t0+T/2 to get the data
> centered at t0, no matter what T is.  and it's good to block DC.

AFAIR studies demonstrated nearly identical performance for 1st, 2nd,
and 3rd order correlation (= difference) methods for voice pitch
detection, with the 2nd order method being slightly better.

> dunno what works best for an engine, but estimating f0 for a musical
> note is, for my money, best done with something that only assumes the
> waveform has a sufficient amount of periodicity in it.  pretty much
> the same for speech (have to track pitch changes pretty fast).  might
> work for something else.

BTW, I was surprised to see from tach data how irregular and nonuniform
the rotation rate of a conventional piston engine is.

Vladimir Vassilevsky
DSP and Mixed Signal Designs
www.abvolt.com
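
The pre-DSP scheme described in this reply (a center clipper feeding a frequency counter) might look roughly like this in software. It is a sketch under assumptions: the clipper subtracts the threshold from whatever exceeds it, the "counter" counts upward entries into the positive region over one block, and the function names and threshold are hypothetical.

```python
import numpy as np

def center_clip(x, threshold):
    """Zero everything within +/-threshold and shift the rest toward zero."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - threshold, 0.0)

def count_frequency(x, sample_rate):
    """Frequency counter: upward entries into the positive region per second."""
    positive = np.asarray(x) > 0
    rising = np.count_nonzero(positive[1:] & ~positive[:-1])
    return rising * sample_rate / len(positive)
```

A typical call would be `count_frequency(center_clip(block, 0.5 * np.abs(block).max()), sample_rate)`. The resolution is only one count per block, so short blocks give coarse estimates, and low-level junk between the main peaks is ignored only as long as it stays below the threshold.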

On May 11, 1:32 am, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
>
> BTW, I was surprised to see from tach data how irregular and nonuniform
> is rotation rate of conventional piston engine.
>

even for an engine that is not sick and with the accelerator pedal
held at a fixed position?

well the 4 cylinders are not positioned in a circle (like those old
airplane engines), so there might be something in the 1/4th
sub-harmonic.

otherwise, the mechanics should be the same for every piston firing,
no?

r b-j

On 11.05.2013 01:17, Robert Scott wrote:
> I have been experimenting with some time-domain pre-processing to
> improve the reliability of an FFT application.  The application is
> extracting the RPM of an engine from the sound it makes.  Prior to
> these experiments I was using straight FFT of blocks of time-series
> sound data.  An engine generally makes a sound rich in harmonics, so
> the FFT should have evenly spaced peaks.  When the peaks are found,
> the fundamental frequency can be inferred and the RPM can be
> calculated from that.  However the real-world sounds made by an engine
> are far from ideal.  There are complex sounds not strictly related to
> the fundamental of the engine rotation.  The result is an FFT whose
> graph looks visually quite a mess.  Often I do see identifiable peaks
> corresponding to harmonics of the engine rotation, but just as often
> it is not even clear to visual inspection of the graph which peaks are
> relevant and which are junk.
>
> So here is the experiment.  When looking at the graph of the time
> series of this sound, it seems periodic extremes in the + and -
> direction are more predictable than the junk in between.  So I decided
> to clip the time series before applying the FFT.  Actually, I did more
> than clip the data.  I established an upper threshold (A) and a lower
> threshold (B) set at 70% of the running maximum and minimum time
> series values.  Then for each time series value, y, I replaced y by
> +100 if it was > A, by -100 if it was < B, and 0 otherwise.  So the
> time series was forced into a tri-level version of what it was before.
> Then applying the same analysis as before, the results of extracting
> peaks in the frequency domain appeared a little better when using the
> clipped data.
>
> What I want to know is if this method is part of some established and
> studied theory.  If so, I would like some pointers to previous similar
> work to see if others have come up with even better implementations of
> this sort of pre-processing in applications like this.
>
> Robert Scott
> Hopkins, MN
>
Hello,

as others already mentioned, clipping in the time domain will generate
unwanted peaks in the frequency domain. If your application is offline
and performance is not an issue, I would rather suggest using long FFT
windows with heavy overlap and normalizing each FFT result in the
frequency domain.

Best regards,
Sebastian
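
One possible reading of this suggestion is sketched below: long Hann-windowed frames with heavy overlap, each magnitude spectrum scaled by its own peak, then averaged. The averaging step, the peak normalization, and the parameter names are assumptions for illustration; the reply does not say how to normalize or how to combine the frames.

```python
import numpy as np

def normalized_average_spectrum(x, sample_rate, frame_len, hop):
    """Average of per-frame magnitude spectra, each scaled to a peak of 1.

    A hop much smaller than frame_len gives heavy overlap
    (for example, hop = frame_len // 8 is 87.5% overlap).
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    total = np.zeros(frame_len // 2 + 1)
    count = 0
    for start in range(0, len(x) - frame_len + 1, hop):
        mags = np.abs(np.fft.rfft(x[start:start + frame_len] * window))
        peak = mags.max()
        if peak > 0:                 # skip silent frames
            total += mags / peak     # normalize in the frequency domain
            count += 1
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, total / max(count, 1)
```

Normalizing per frame keeps one loud frame from dominating the average, at the cost of also boosting frames that contain mostly noise. Long frames only help while the RPM stays roughly constant over the frame.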

I'll go out on a limb here and say this...

If you can't judge the RPM by listening to the sound by ear, then no
algorithm has a chance.

I don't mean getting an actual number by ear. I mean that if your ear
can't separate the desired RPM signal from the noise, then no algorithm
will be able to either.

So listen to the sound first, and you may decide you need a better
pickup point to start with.

The ear/brain is very good.

Mark

On 5/11/2013 1:59 AM, robert bristow-johnson wrote:
> On May 11, 1:32 am, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
>>
>> BTW, I was surprised to see from tach data how irregular and nonuniform
>> is rotation rate of conventional piston engine.
>>
>
> even for an engine that is not sick and with the accelerator pedal
> held at a fixed position?
>
> well the 4 cylinders are not positioned in a circle (like those old
> airplane engines), so there might be something in the 1/4th
> sub-harmonic.
>
> otherwise, the mechanics should be the same for every piston firing,
> no?

Every firing is different.

Vladimir Vassilevsky
DSP and Mixed Signal Designs
www.abvolt.com