
Compare two audio signals

Started by nagarajan karunakaran October 27, 2010
Hi, I am working on a concept where I want to check whether two audio files
(the content of the audio files) are identical to some extent. For example,
if two different people say the same words, I have to say the files are equal.
I have done some googling, but I am not able to work out what is really needed
for my project. I don't know where to start or how to start. How can
I compare audio files in that manner? Please help me out with any ideas.
What is the correct approach for my concept?
WOW! Can you narrow down what you want? There is a big difference between seeing if two different people are saying the same words, seeing if a musician is playing the same tune as a symphony, and comparing two different recordings of a bell. I'm assuming that since you have to do this as a project, you are doing this alone. So see if you can winnow down your requirements to something solvable by one person in a reasonable time period.

Clay
Picking out two different recordings that have the same snippet patched in should be moderately easy.

Picking out two different recordings, made in two different places, of the same sound source may be possible, but you'd have all sorts of complications because of different acoustics causing different echoes and reverberations.

But two _different_ people saying the same word? Oh man -- that's a task that humans don't always get right; getting a machine that could do it reliably would require a team of people for a good amount of time. I'm not even sure it's been done, but if it has, it's being done by a well-connected researcher who's good at writing grant proposals and at getting work out of his grad students.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
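For the "same snippet patched in" case, one common approach is normalized cross-correlation: slide the snippet along the recording and look for an offset where the score approaches 1. The following is only a sketch under stated assumptions: the function name `find_snippet` is hypothetical, the signals are synthetic noise, the snippet is assumed to be copied sample-for-sample up to a gain change, and the brute-force loop stands in for the FFT-based correlation a real implementation would likely use.

```python
import math
import random


def find_snippet(recording, snippet):
    """Return (offset, score) of the best match of `snippet` in `recording`.

    The score is the normalized cross-correlation, in [-1, 1].
    """
    m = len(snippet)
    snip_mean = sum(snippet) / m
    snip = [s - snip_mean for s in snippet]
    snip_norm = math.sqrt(sum(s * s for s in snip))
    best_offset, best_score = -1, -1.0
    for offset in range(len(recording) - m + 1):
        seg = recording[offset:offset + m]
        seg_mean = sum(seg) / m
        seg_c = [x - seg_mean for x in seg]
        seg_norm = math.sqrt(sum(x * x for x in seg_c))
        if seg_norm == 0 or snip_norm == 0:
            continue  # a flat segment carries no shape to correlate
        score = sum(a * b for a, b in zip(seg_c, snip)) / (seg_norm * snip_norm)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


# Synthetic demo (assumption): noise, then the snippet at 0.8x gain, then noise.
rng = random.Random(0)
snippet = [rng.uniform(-1, 1) for _ in range(50)]
recording = ([rng.uniform(-1, 1) for _ in range(200)]
             + [0.8 * s for s in snippet]
             + [rng.uniform(-1, 1) for _ in range(100)])
offset, score = find_snippet(recording, snippet)
print(offset, round(score, 6))
```

Because the score is normalized, a change in recording level does not affect it. Different room acoustics or a different speaker would, which is why the harder cases above are not covered by this sketch.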

nagarajan karunakaran wrote:
> Hi, I am working on a concept where I want to check whether two audio files
> (the content of the audio files) are identical to some extent.
Compute waterfall spectrograms, measure the normalized distance between them. Your professor will be more than happy.
> For example, if two different people say the same words, I have to say
> the files are equal.
Convert speech to text, compare the texts.
> I have done some googling, but I am not able to work out what is really needed
> for my project. I don't know where to start or how to start.
> How can I compare audio files in that manner? Please help me out with any ideas.
> What is the correct approach for my concept?
To begin with, learn the basics: Fourier, Z-transform, FIR and IIR filters, etc.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
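The spectrogram suggestion above can be sketched roughly as follows. This is an illustration, not a definitive implementation: the function names, the frame length of 64, the hop of 32, the Hann window, and the synthetic test tones are all assumptions, and the naive DFT is used only to keep the example dependency-free (an FFT routine would normally be used). "Normalized" is taken here to mean that each spectrogram is scaled to unit Euclidean norm before the distance is computed.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of DFT bin magnitudes per frame."""
    # A Hann window reduces leakage between neighbouring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    bins = frame_len // 2 + 1  # keep non-negative frequencies only
    # Naive DFT basis; an FFT would be used in practice.
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_len)
              for n in range(frame_len)] for k in range(bins)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([abs(sum(x * w for x, w in zip(frame, row)))
                       for row in basis])
    return frames


def normalized_distance(spec_a, spec_b):
    """Euclidean distance between two spectrograms scaled to unit norm."""
    n_frames = min(len(spec_a), len(spec_b))  # compare the overlapping part
    a = [v for frame in spec_a[:n_frames] for v in frame]
    b = [v for frame in spec_b[:n_frames] for v in frame]
    norm_a = math.sqrt(sum(v * v for v in a)) or 1.0
    norm_b = math.sqrt(sum(v * v for v in b)) or 1.0
    return math.sqrt(sum((x / norm_a - y / norm_b) ** 2
                         for x, y in zip(a, b)))


def tone(freq_hz, fs, n=512):
    """Synthetic sine tone used only for the demo."""
    return [math.sin(2 * math.pi * freq_hz * t / fs) for t in range(n)]


fs = 8000
loud = tone(1000, fs)
quiet = [0.5 * x for x in loud]   # same content, lower level
other = tone(3000, fs)            # different content
d_same = normalized_distance(spectrogram(loud), spectrogram(quiet))
d_diff = normalized_distance(spectrogram(loud), spectrogram(other))
print(d_same, d_diff)
```

For non-negative magnitudes the distance lies between 0 (same spectral shape) and sqrt(2) (no overlap). A measure like this is sensitive to timing and pitch differences, so on its own it suits near-identical recordings rather than two different speakers.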
At a broad level, you're looking at a classification problem here.  Whether
you're dealing with recorded speech, music, bird noises, or whatever else,
you need to define some set of signals which you're looking for, such as
words / phrases, songs, instruments, etc.  Once you have established your
"library" of possible signals, then it comes down to being able to take a
particular audio file and determine which of the library entries it is most
similar to.  If two audio files match the same library entry, you can say
that the two signals are a match.

The hard part (as previous posters have pointed out) is the classification
step.  Speech to text is a non-trivial problem, as is recognizing music,
musical instruments, etc.  If you can narrow your library down to a small
class of signals (say, single spoken words) and develop a good classifier,
then you're well on your way.  It's going to be tough otherwise.

--Tom
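The library idea above might look something like this once each audio file has been reduced to a feature vector. Everything here is hypothetical: the two-dimensional features, the labels, and the function names are placeholders, and a nearest-neighbour rule is only one of many possible classifiers. How to extract good features is the hard part and is not shown.

```python
import math


def nearest_entry(features, library):
    """Return the label of the library entry closest to `features`."""
    return min(library, key=lambda label: math.dist(features, library[label]))


def same_class(features_a, features_b, library):
    """Two signals 'match' if they map to the same library entry."""
    return nearest_entry(features_a, library) == nearest_entry(features_b, library)


# Hypothetical library of two single-word classes with made-up features.
library = {"yes": [1.0, 0.0], "no": [0.0, 1.0]}

print(same_class([0.9, 0.2], [0.8, 0.1], library))  # both nearest to "yes"
print(same_class([0.9, 0.2], [0.1, 0.7], library))  # "yes" versus "no"
```

Keeping the library small, as suggested above, limits how many entries a new signal can be confused with.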



Tim Wescott <tim@seemywebsite.com> wrote:
(snip)

> Picking out two different recordings, made in two different places, of
> the same sound source may be possible, but you'd have all sorts of
> complications because of different acoustics causing different echoes and
> reverberations.

> But two _different_ people saying the same word? Oh man -- that's a
> task that humans don't always get right; getting a machine that could do
> it reliably would require a team of people for a good amount of time.
> I'm not even sure it's been done, but if it has, it's being done by a
> well-connected researcher who's good at writing grant proposals and at
> getting work out of his grad students.
It seems that it is good enough for companies to (try to) use it.

More and more phone response systems, such as those of banks and airlines, are using it. I usually find it easier to put in the account number or flight number using the keypad, but they expect one to "say" the account or flight number. Sometimes it gets it right, other times not.

I remember one about 30 years ago that would do one-digit math problems, and ask for the answer. Even with only ten choices, it got it wrong fairly often.

-- glen
As Vlad pointed out, there appears to be a large knowledge gap you need to fill if you plan on going this alone. Typically, speech recognition (which is essentially what you're proposing) is approached in a variety of ways. After the topics Vlad suggested, I propose you look into the following:

Cross-correlation
Dynamic time warping
Linear predictive coding
Formants
Markov model
Laplacian distribution

If you've never seen any of those, you may want to reconsider the scope of your project.
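Of the topics listed above, dynamic time warping is perhaps the quickest to try. It aligns two sequences that contain the same shape at different speeds. The sketch below is a textbook-style version on plain number sequences; the function name and toy data are assumptions, and for speech each element would more likely be a per-frame feature vector with a vector distance in place of `abs`.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a, advance in b, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# Toy demo (assumption): the same contour "spoken" slowly and quickly.
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
fast = [0, 1, 2, 1, 0]
other = [2, 1, 0, 1, 2]

print(dtw_distance(slow, fast))   # same shape at a different speed
print(dtw_distance(fast, other))  # different shape
```

The warping absorbs differences in speaking rate only. Differences in pitch and timbre between speakers have to be handled by the choice of features.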
Even programs that do this, like Dragon NaturallySpeaking, have to be "trained". But here you didn't say anything like "the same two people", nor did you mention a training step in the process. So, I think this may be too ambitious for words! :-) Or, maybe talk to the NSA.

Fred
On 10/27/2010 11:26 AM, glen herrmannsfeldt wrote:
> (snip)
>
> It seems that it is good enough for companies to (try to) use it.
>
> More and more phone response systems, such as banks and airlines,
> are using it. I usually find it easier to put in the account
> number or flight number using the keypad, but they expect one
> to "say" the account or flight number. Sometimes it gets it right,
> other times not.
>
> I remember one about 30 years ago that would do one digit math
> problems, and ask for the answer. Even with only ten choices,
> it got it wrong fairly often.
Speaker-independent recognition of just a few words in one language is heaps more reliable than speaker-independent recognition of any random utterance in any arbitrary language.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Like "kill the president", I assume, or "let's bomb Americans"! Yes, it could have lots of applications.

hardy