I realized I should also clarify. The waveform will consistently
start at a different phase than the one from another radio, will
progress through a particular number of cycles, and will damp at a
very similar ratio per cycle. Yet even after I decimate by 8 from the
44.1 kHz WAV file (the frequencies in play are below 5 kHz), the MSE
still loves to approach 0 (probably by design, yet they can't control
the individual components and how they contribute to the onset wave).
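A minimal Python/SciPy sketch of the decimate-by-8 step described above (the 1 kHz test tone is an illustrative stand-in, not data from the thread). One caveat worth noting: 44100 / 8 gives a new rate of 5512.5 Hz, so the new Nyquist is about 2756 Hz, and content between that and 5 kHz is removed by the anti-aliasing filter.

```python
# Anti-alias filter and decimate a 44.1 kHz recording by 8, as in the
# post above. scipy.signal.decimate applies a zero-phase IIR low-pass
# filter by default before downsampling.
import numpy as np
from scipy.signal import decimate

fs = 44100
n = 15435                                 # about 350 ms at 44.1 kHz
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 1000 * t)          # illustrative 1 kHz tone
y = decimate(x, 8)                        # filtered, then downsampled by 8
print(len(x), len(y))                     # length shrinks by a factor of ~8
```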
On Nov 2, 10:12 am, jleg...@proxime.net wrote:
> Well, I ended up using cohere() in Octave, and it compares exactly
> what you mentioned.  The issue is, when I look at two waveforms that I
> know are "different", i.e., the initial onset waveform starts at a
> different point in the cycle than on the other (one starts at about
> 90 degrees, the other at about 240 degrees).  Essentially, we are
> trying to fingerprint some transmitters, and the visual waveforms are
> indeed unique per radio, but the MSE between them approaches 0 (to the
> point that it's nearly equal to the MSE between two waveforms from the
> same radio on some samples).  The false positive rate is a little on
> the high side.  Is it acceptable to take sliding differentials on the
> waveform with sufficient overlap and use that as another data point?
>
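A minimal sketch of the sliding-differential idea asked about above, in Python/NumPy rather than Octave; the window length (256) and hop (64) are illustrative assumptions, not values from the thread.

```python
# Compare first differences of overlapping windows instead of raw
# samples; differencing emphasizes local shape rather than absolute
# level, which may help separate near-identical onset waveforms.
import numpy as np

def windowed_diff_features(x, win=256, hop=64):
    """Stack first differences of overlapping windows (hop < win => overlap)."""
    d = np.diff(np.asarray(x, dtype=float))   # the sliding differential
    frames = [d[i:i + win] for i in range(0, len(d) - win + 1, hop)]
    return np.vstack(frames)

def diff_mse(x, y, win=256, hop=64):
    """MSE between the differential features of two recordings."""
    fx = windowed_diff_features(x, win, hop)
    fy = windowed_diff_features(y, win, hop)
    n = min(len(fx), len(fy))                 # compare the common frames
    return np.mean((fx[:n] - fy[:n]) ** 2)
```

This score is meant as the "another data point" the post asks about, alongside (not instead of) the spectral MSE.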
> On Oct 31, 12:57 am, jleg...@proxime.net wrote:
>
> > Excellent.  Thanks!  I'll be progressing on this over the next few
> > weeks as a side project.
>
> > On Oct 31, 1:14 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
>
> > > On Oct 30, 11:32 pm, jleg...@proxime.net wrote:
>
> > > > Hi, I'm trying to address a similar, yet still different, problem.  I
> > > > will have hundreds of recordings consisting of about 350 ms worth of
> > > > data.  The phase between them will be different, since the beginning
> > > > of the recording uses a carrier-operated squelch trigger, buffering
> > > > the 100 ms before the trigger in addition to the 250 ms after the
> > > > trigger.  As such, the actual beginning of the recording could be off
> > > > by up to 30 or 40 ms.  The waveforms, if viewed on a scope, are nearly
> > > > identical if coming from the same source.  The waveform from a
> > > > different source will be visually different, and have a different
> > > > "fingerprint".
>
> > > > Would using what you describe below be able to address my scenario?
>
> > > > Thanks in advance,
> > > > Jason
>
> > > > On Oct 8, 10:48 am, Le Chaud Lapin <jaibudu...@gmail.com> wrote:
>
> > > > > On Oct 8, 10:15 am, Jerry Avins <j...@ieee.org> wrote:
>
> > > > > > kieran wrote:
> > > > > > > Hello,
> > > > > > > I am trying to compare two similar audio files (WAV). From what I have
> > > > > > > read, I need to sample both audio files at certain frequencies, run
> > > > > > > these through an FFT, and then compare the results. Can anyone advise me
> > > > > > > whether this is the correct approach, and also describe the steps I need
> > > > > > > to take to get to the stage where I can compare the files?
>
> > > > > > WAV files contain sampled data (at any of a variety of rates). What
> > > > > > would sampling them involve?
>
> > > > > I think that is what he is trying to figure out. :)
>
> > > > > > What does it mean to compare similar sounds? Can you define similarity
> > > > > > with software?
>
> > > > > Again, I think he is asking for help from someone to do that for
> > > > > him :)
>
> > > > > To OP:
>
> > > > > 1. Do whatever is necessary to convert the .wav files to their
> > > > > discrete-time signals:
>
> > > > > http://www.sonicspot.com/guide/wavefiles.html
>
> > > > > 2. Time-warping might or might not be necessary, depending on the
> > > > > difference between the two sample rates:
>
> > > > > http://en.wikipedia.org/wiki/Dynamic_time_warping
>
> > > > > 3. After time warping, truncate both signals so that their durations
> > > > > are equivalent.
>
> > > > > 4. Compute the normalized energy spectral density (ESD) from the DFTs
> > > > > of the two signals:
>
> > > > > http://en.wikipedia.org/wiki/Power_spectrum
>
> > > > > 5. Compute the mean squared error (MSE) between the normalized ESDs
> > > > > of the two signals:
>
> > > > > http://en.wikipedia.org/wiki/Mean_squared_error
>
> > > > > The MSE between the normalized ESDs of two signals is a good metric
> > > > > of closeness. If you have, say, 10 .wav files, and 2 of them are
> > > > > nearly the same but the others are not, the two that are close should
> > > > > have a relatively low MSE. Two perfectly identical signals will
> > > > > obviously have an MSE of zero. Ideally, two "equivalent" signals with
> > > > > different time scales (20-second human talking versus 5-second
> > > > > chipmunk), different energies (soft-spoken human versus yelling
> > > > > chipmunk), and different phases (sampling began at a slightly
> > > > > different instant relative to the continuous-time input) should still
> > > > > have an MSE of zero, but quantization errors inherent in DSP will
> > > > > yield an MSE slightly greater than zero.
>
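The recipe above, minus the time-warping step and assuming two mono signals already at the same sample rate, can be sketched in Python/NumPy; none of this code is from the thread:

```python
# Steps 3-5 above: truncate to equal length, compute normalized ESDs
# via the DFT, and take the MSE between them.
import numpy as np

def normalized_esd(x):
    """Energy spectral density |X[k]|^2, scaled to unit total energy."""
    esd = np.abs(np.fft.rfft(x)) ** 2
    return esd / esd.sum()

def esd_mse(x, y):
    """MSE between the normalized ESDs of two signals."""
    n = min(len(x), len(y))               # truncate to equal durations
    ex = normalized_esd(np.asarray(x[:n], dtype=float))
    ey = normalized_esd(np.asarray(y[:n], dtype=float))
    return np.mean((ex - ey) ** 2)
```

As the post says, identical signals give an MSE of exactly zero, and because the ESD discards phase, the score is insensitive to where sampling began.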
> > > Hi,
>
> > > I just took a look at the cepstral method for the first time, and it
> > > seems that the results would be better, as indicated by other
> > > posters.  It makes sense, as it takes into account the logarithmic
> > > nature of "similarity" of two utterances, whereas the straight MMSE
> > > method does not.
>
> > > Still, the MMSE method, with normalization, is a good place to start,
> > > as it is the Swiss-army knife of signal estimation.  In fact, it
> > > appears that the cepstral method uses the same concept of MMSE, but in
> > > a different domain, that domain being the PSD of a signal that is the
> > > log of the PSD of the original signal.  That kind of makes sense, as
> > > hearing/speech sensitivity is physiologically logarithmic anyway.
>
> > > On a related note, one can regard the cepstral method as one of a
> > > class of algorithms where the MMSE technique is applied not to the
> > > PSD of the signal itself, but to some transformation thereof.
>
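A minimal Python/NumPy sketch of the cepstral-domain variant described above: take the log of the power spectrum, transform again, and apply MMSE in that domain. The number of retained coefficients (40) is an illustrative assumption, not a value from the thread.

```python
# Real cepstrum: inverse FFT of the log power spectrum, compared by MSE
# over the first few coefficients.
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """Inverse FFT of log |X[k]|^2; eps guards against log(0)."""
    psd = np.abs(np.fft.fft(x)) ** 2
    return np.real(np.fft.ifft(np.log(psd + eps)))

def cepstral_mse(x, y, n_coeff=40):
    """MMSE over the first n_coeff cepstral coefficients."""
    cx = real_cepstrum(np.asarray(x, dtype=float))[:n_coeff]
    cy = real_cepstrum(np.asarray(y, dtype=float))[:n_coeff]
    return np.mean((cx - cy) ** 2)
```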
> > > So the answer is yes, you should get some positive results, but the
> > > cepstral method should definitely be investigated to see just how
> > > much better it is.
>
> > > -Le Chaud Lapin-