comp.dsp | Compare two audio signals

Hi, I am working on a concept where I want to decide whether two audio
files are identical in content to some extent. For example, if two
different people say the same words, I have to report them as equal.
I have done some googling, but I am not able to work out what is really
needed for my project. I don't know where or how to start. How can
I compare audio files in that manner? Please help me out with any ideas
about it. What is the correct approach for my concept?

Reply by Clay ● October 27, 2010

On Oct 27, 8:47 am, nagarajan karunakaran <prassa...@gmail.com> wrote:
> Hi, i am working on a concept where i want to compare two audio file
> (content of audio file)is identical to some extent.For ex if it's two
> different people saying the same words .I have to say they are equal.
> I have done goggling but i not able to predict what is really needed
> for my project.  I don't know ,where to start and how to start.How can
> i compare audio files in that manner.Please help me out with any idea
> about it.What is the correct approach for my concept.

WOW! Can you narrow down what you want? There is a big difference
between seeing if two different people are saying the same words,
seeing if a musician is playing the same tune as a symphony, and
comparing two different recordings of a bell.

I'm assuming that since you have to do this as a project, you are
doing this alone. So see if you can winnow down your requirements to
something solvable by one person in a reasonable time period.

Clay

Reply by Tim Wescott ● October 27, 2010

On 10/27/2010 05:47 AM, nagarajan karunakaran wrote:
> Hi, i am working on a concept where i want to compare two audio file
> (content of audio file)is identical to some extent.For ex if it's two
> different people saying the same words .I have to say they are equal.
> I have done goggling but i not able to predict what is really needed
> for my project.  I don't know ,where to start and how to start.How can
> i compare audio files in that manner.Please help me out with any idea
> about it.What is the correct approach for my concept.

Picking out two different recordings that have the same snippet patched 
in should be moderately easy.
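
For that first case (the same snippet patched into two recordings), a
normalized cross-correlation search is one plausible starting point. The
sketch below is only an illustration, not something from this thread: the
function name find_snippet is made up, the signals are plain lists of
samples, and it assumes both recordings share a sample rate and that the
snippet was not time-stretched or heavily filtered.

```python
import math


def find_snippet(recording, snippet):
    """Slide `snippet` along `recording` and return (best_offset, best_score).

    The score is the normalized cross-correlation at that offset, in [-1, 1].
    A score near 1 means the window has the same shape as the snippet,
    regardless of gain or DC offset.
    """
    n = len(snippet)
    if n == 0 or n > len(recording):
        raise ValueError("snippet must be non-empty and no longer than recording")

    # Remove the mean so a constant offset does not affect the match.
    s_mean = sum(snippet) / n
    s = [x - s_mean for x in snippet]
    s_norm = math.sqrt(sum(x * x for x in s))

    best_offset, best_score = 0, -1.0
    for offset in range(len(recording) - n + 1):
        window = recording[offset:offset + n]
        w_mean = sum(window) / n
        w = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w))
        if s_norm == 0.0 or w_norm == 0.0:
            continue  # a flat window or snippet carries no shape to compare
        score = sum(a * b for a, b in zip(s, w)) / (s_norm * w_norm)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score
```

This brute-force loop costs O(N*M) operations; for real recordings an
FFT-based correlation would be the usual choice. Running the search for
the same snippet in each recording and checking that both scores are close
to 1 is one way to decide that the recordings share that material.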

Picking out two different recordings, made in two different places, of 
the same sound source may be possible, but you'd have all sorts of 
complications because of different acoustics causing different echoes and 
reverberations.

But two _different_ people saying the same word?  Oh man -- that's a 
task that humans don't always get right; getting a machine that could do 
it reliably would require a team of people for a good amount of time. 
I'm not even sure it's been done, but if it has, it's being done by a 
well-connected researcher who's good at writing grant proposals and at 
getting work out of his grad students.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html

Reply by Vladimir Vassilevsky ● October 27, 2010


nagarajan karunakaran wrote:
> Hi, i am working on a concept where i want to compare two audio file
> (content of audio file)is identical to some extent.

Compute waterfall spectrograms, measure the normalized distance between 
them. Your professor will be more than happy.
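
As a rough illustration of that suggestion, here is a minimal sketch in
plain Python. Everything in it is an assumption made for the example: the
function names, the frame length and hop size, the Hann window, and the
choice of Euclidean distance between unit-normalized magnitude
spectrograms. It also assumes the two signals are already time-aligned and
sampled at the same rate, which real recordings usually are not.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram as a list of frames.

    Each frame holds frame_len // 2 + 1 DFT magnitudes of a Hann-windowed
    slice of `signal`. The DFT is computed naively to keep the example
    dependency-free; an FFT library would be used in practice.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def normalized_distance(spec_a, spec_b):
    """Euclidean distance between two spectrograms scaled to unit energy.

    0 means identical spectral shape; sqrt(2) is the maximum for
    magnitude (non-negative) data. Extra frames in the longer
    spectrogram are ignored.
    """
    n_frames = min(len(spec_a), len(spec_b))
    a = [v for frame in spec_a[:n_frames] for v in frame]
    b = [v for frame in spec_b[:n_frames] for v in frame]
    norm_a = math.sqrt(sum(v * v for v in a))
    norm_b = math.sqrt(sum(v * v for v in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot normalize an all-zero spectrogram")
    return math.sqrt(sum((x / norm_a - y / norm_b) ** 2 for x, y in zip(a, b)))
```

Normalizing to unit energy makes the measure insensitive to overall
loudness, so a quiet copy of a signal scores close to 0 against the
original. Picking the threshold below which two files count as "the same"
is left to experiment.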

> For ex if it's two
> different people saying the same words .I have to say they are equal.

Convert speech to text, compare the texts.

> I have done goggling but i not able to predict what is really needed
> for my project.   I don't know ,where to start and how to start.
> How can i compare audio files in that manner.Please help me out with any idea
> about it.What is the correct approach for my concept.

To begin with, learn the basics. Fourier, Z-transform, FIR and IIR 
filters, etc.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Reply by tjc ● October 27, 2010

At a broad level, you're looking at a classification problem here.  Whether
you're dealing with recorded speech, music, bird noises, or whatever else,
you need to define some set of signals which you're looking for, such as
words / phrases, songs, instruments, etc.  Once you have established your
"library" of possible signals, then it comes down to being able to take a
particular audio file and determine which of the library entries it is most
similar to.  If two audio files match the same library entry, you can say
that the two signals are a match.
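
The matching logic described above can be sketched in a few lines,
assuming each audio file has already been reduced to a fixed-length
feature vector. That reduction is the hard part and is not shown. The
names classify and same_content, the dictionary layout of the library,
and the nearest-neighbor rule with Euclidean distance are all assumptions
made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, library):
    """Return the label of the library entry closest to `features`.

    `library` maps a label (e.g. a word) to a template feature vector.
    """
    return min(library, key=lambda label: euclidean(features, library[label]))


def same_content(features_a, features_b, library):
    """Two signals match if they land on the same library entry."""
    return classify(features_a, library) == classify(features_b, library)
```

A real system would probably also reject inputs that are far from every
library entry instead of always returning the nearest one.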

The hard part (as previous posters have pointed out) is the classification
step.  Speech to text is a non-trivial problem, as is recognizing music,
musical instruments, etc.  If you can narrow your library down to a small
class of signals (say, single spoken words) and develop a good classifier,
then you're well on your way.  It's going to be tough otherwise.

--Tom



>Hi, i am working on a concept where i want to compare two audio file
>(content of audio file)is identical to some extent.For ex if it's two
>different people saying the same words .I have to say they are equal.
>I have done goggling but i not able to predict what is really needed
>for my project.  I don't know ,where to start and how to start.How can
>i compare audio files in that manner.Please help me out with any idea
>about it.What is the correct approach for my concept.
>

Reply by glen herrmannsfeldt ● October 27, 2010

Tim Wescott <tim@seemywebsite.com> wrote:
(snip)

> Picking out two different recordings, made in two different places, of 
> the same sound source may be possible, but you'd have all sorts of 
> complications because of different acoustics causing different echos and 
> reverberations.

> But two _different_ people saying the same word?  Oh man -- that's a 
> task that humans don't always get right; getting a machine that could do 
> it reliably would require a team of people for a good amount of time. 
> I'm not even sure it's been done, but if it is it's being done by a 
> well-connected researcher who's good a writing grant proposals and at 
> getting work out of his grad students.

It seems that it is good enough for companies to (try to) use it.

More and more phone response systems, such as banks and airlines,
are using it.  I usually find it easier to put in the account
number or flight number using the keypad, but they expect one
to "say" the account or flight number.  Sometimes it gets it right,
other times not.

I remember one about 30 years ago that would do one-digit math
problems, and ask for the answer.  Even with only ten choices,
it got it wrong fairly often.  

-- glen

Reply by Bryan ● October 27, 2010

On Oct 27, 5:47 am, nagarajan karunakaran <prassa...@gmail.com> wrote:
> Hi, i am working on a concept where i want to compare two audio file
> (content of audio file)is identical to some extent.For ex if it's two
> different people saying the same words .I have to say they are equal.
> I have done goggling but i not able to predict what is really needed
> for my project.  I don't know ,where to start and how to start.How can
> i compare audio files in that manner.Please help me out with any idea
> about it.What is the correct approach for my concept.

As Vlad pointed out, there appears to be a large knowledge gap you need
to fill if you plan on going it alone. Speech recognition (which is
essentially what you're proposing) is typically approached in a
variety of ways. After the topics Vlad suggested, I propose you look
into the following:

Cross-correlation
Dynamic time warping
Linear predictive coding
Formants
Markov model
Laplacian distribution

If you've never seen any of those, you may want to reconsider the
scope of your project.
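
To give a feel for one item on that list, here is a minimal sketch of
dynamic time warping, which compares two sequences while allowing one to
be locally stretched or compressed in time relative to the other. It is
an illustration under stated assumptions, not a recipe from this thread:
the function name dtw_distance is made up, and each frame is a single
number, whereas a speech system would compare per-frame feature vectors
(for example, coefficients from linear predictive coding).

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j], where
    each step may advance in `a`, in `b`, or in both.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local frame-to-frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A contour and the same contour spoken at half speed give a cost of 0,
while a plain sample-by-sample comparison would report a large difference.
That is why the technique shows up in small-vocabulary word matching.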

Reply by Fred Marshall ● October 27, 2010

On 10/27/2010 5:47 AM, nagarajan karunakaran wrote:
> Hi, i am working on a concept where i want to compare two audio file
> (content of audio file)is identical to some extent.For ex if it's two
> different people saying the same words .I have to say they are equal.
> I have done goggling but i not able to predict what is really needed
> for my project.  I don't know ,where to start and how to start.How can
> i compare audio files in that manner.Please help me out with any idea
> about it.What is the correct approach for my concept.

Even programs that do this, like Dragon NaturallySpeaking, have to be 
"trained". But here you didn't say anything like "the same two people", 
nor did you mention a training step in the process.  So, I think this 
may be too ambitious for words!  :-)

Or, maybe talk to the NSA.

Fred

Reply by Tim Wescott ● October 27, 2010

On 10/27/2010 11:26 AM, glen herrmannsfeldt wrote:
> Tim Wescott<tim@seemywebsite.com>  wrote:
> (snip)
>
>> Picking out two different recordings, made in two different places, of
>> the same sound source may be possible, but you'd have all sorts of
>> complications because of different acoustics causing different echos and
>> reverberations.
>
>> But two _different_ people saying the same word?  Oh man -- that's a
>> task that humans don't always get right; getting a machine that could do
>> it reliably would require a team of people for a good amount of time.
>> I'm not even sure it's been done, but if it is it's being done by a
>> well-connected researcher who's good a writing grant proposals and at
>> getting work out of his grad students.
>
> It seems that it is good enough for companies to (try to) use it.
>
> More and more phone response systems, such as banks and airlines,
> are using it.  I usually find it easier to put in the account
> number or flight number using the keypad, but they expect one
> to "say" the account or flight number.  Sometimes it gets it right,
> other times not.
>
> I remember one about 30 years agot that would do one digit math
> problems, and ask for the answer.  Even with only ten choices,
> it got it wrong fairly often.

Speaker-independent recognition of just a few words in one language is 
heaps more reliable than speaker-independent recognition of any random 
utterance in any arbitrary language.


Reply by HardySpicer ● October 27, 2010

On Oct 28, 1:47 am, nagarajan karunakaran <prassa...@gmail.com> wrote:
> Hi, i am working on a concept where i want to compare two audio file
> (content of audio file)is identical to some extent.For ex if it's two
> different people saying the same words .I have to say they are equal.
> I have done goggling but i not able to predict what is really needed
> for my project.  I don't know ,where to start and how to start.How can
> i compare audio files in that manner.Please help me out with any idea
> about it.What is the correct approach for my concept.

Like "kill the president", I assume, or "let's bomb Americans"!
Yes, could have lots of applications.


hardy
