I have built a word recognition system that takes the speech
samples, passes them through filter banks, and then decimates
the outputs. The result is a two-dimensional template for each
word. The templates are time-averaged, but only during the
training phase.

When the actual speech sample comes in, its template is
computed and compared with each averaged reference template;
the one with the minimum error is the closest match.
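
The matching step described above can be sketched roughly as follows. This is only an illustration, not the actual system: it assumes the filter bank and decimation have already produced equally sized templates (bands x frames), it uses mean squared error as the error measure, and the names `average_templates`, `template_error` and `closest_word` are made up for the example.

```python
def average_templates(templates):
    """Element-wise mean of equally sized 2-D templates (bands x frames)."""
    n = len(templates)
    rows, cols = len(templates[0]), len(templates[0][0])
    return [[sum(t[r][c] for t in templates) / n for c in range(cols)]
            for r in range(rows)]


def template_error(a, b):
    """Mean squared error between two equally sized templates."""
    total = sum((x - y) ** 2
                for row_a, row_b in zip(a, b)
                for x, y in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


def closest_word(sample, references):
    """Return the label whose reference template has the minimum error."""
    return min(references, key=lambda w: template_error(sample, references[w]))


# Toy training data: two utterances per word, 2 bands x 2 frames each.
yes_training = [[[1.0, 0.0], [0.0, 1.0]], [[0.8, 0.2], [0.2, 0.8]]]
no_training = [[[0.0, 1.0], [1.0, 0.0]], [[0.2, 0.8], [0.8, 0.2]]]

# Averaging happens only here, in the training phase.
references = {
    "yes": average_templates(yes_training),
    "no": average_templates(no_training),
}

print(closest_word([[0.9, 0.1], [0.1, 0.9]], references))
```

Utterances of different lengths would first need time alignment or resampling to a fixed number of frames, which this sketch leaves out.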

MATLAB and C code are available for sale. Search for arithos designs
to get in touch.

Regards
Bharat

On Oct 28, 9:47 am, Tim Wescott <t...@seemywebsite.com> wrote:
> On 10/27/2010 11:26 AM, glen herrmannsfeldt wrote:
>
>
>
> > Tim Wescott<t...@seemywebsite.com> wrote:
> > (snip)
>
> >> Picking out two different recordings, made in two different places, of
> >> the same sound source may be possible, but you'd have all sorts of
> >> complications because of different acoustics causing different echoes and
> >> reverberations.
>
> >> But two _different_ people saying the same word?  Oh man -- that's a
> >> task that humans don't always get right; getting a machine that could do
> >> it reliably would require a team of people for a good amount of time.
> >> I'm not even sure it's been done, but if it is it's being done by a
> >> well-connected researcher who's good at writing grant proposals and at
> >> getting work out of his grad students.
>
> > It seems that it is good enough for companies to (try to) use it.
>
> > More and more phone response systems, such as banks and airlines,
> > are using it.  I usually find it easier to put in the account
> > number or flight number using the keypad, but they expect one
> > to "say" the account or flight number.  Sometimes it gets it right,
> > other times not.
>
> > I remember one about 30 years ago that would do one-digit math
> > problems, and ask for the answer.  Even with only ten choices,
> > it got it wrong fairly often.
>
> Speaker-independent recognition of just a few words in one language is
> heaps more reliable than speaker-independent recognition of any random
> utterance in any arbitrary language.
>
> --
>
For a fixed vocabulary it could be made to work, e.g.
president, bomb, explode, meeting, kill and the like.

On Oct 28, 1:47 am, nagarajan karunakaran <prassa...@gmail.com> wrote:
> Hi, I am working on a concept where I want to check whether two audio
> files (the content of the audio files) are identical to some extent.
> For example, if two different people say the same words, I have to
> say the files are equal. I have done some googling, but I am not able
> to work out what is really needed for my project. I don't know where
> or how to start. How can I compare audio files in that manner? Please
> help me out with any ideas. What is the correct approach for my concept?

Tricky. You need speech recognition and then compare the text (as the
vampyre has already pointed out).
Speech recognition needs training to be good, so such a system can
never be accurate unless the speakers are people who have used the
equipment beforehand.
Hard to do with total strangers. You can rest, Osama..

Hardy

On Oct 28, 1:47 am, nagarajan karunakaran <prassa...@gmail.com> wrote:
> Hi, I am working on a concept where I want to check whether two audio
> files (the content of the audio files) are identical to some extent.
> For example, if two different people say the same words, I have to
> say the files are equal. I have done some googling, but I am not able
> to work out what is really needed for my project. I don't know where
> or how to start. How can I compare audio files in that manner? Please
> help me out with any ideas. What is the correct approach for my concept?

Like "kill the president", I assume, or "let's bomb Americans"!
Yes, could have lots of applications.


hardy

On 10/27/2010 11:26 AM, glen herrmannsfeldt wrote:
> Tim Wescott<tim@seemywebsite.com>  wrote:
> (snip)
>
>> Picking out two different recordings, made in two different places, of
>> the same sound source may be possible, but you'd have all sorts of
>> complications because of different acoustics causing different echoes and
>> reverberations.
>
>> But two _different_ people saying the same word?  Oh man -- that's a
>> task that humans don't always get right; getting a machine that could do
>> it reliably would require a team of people for a good amount of time.
>> I'm not even sure it's been done, but if it is it's being done by a
>> well-connected researcher who's good at writing grant proposals and at
>> getting work out of his grad students.
>
> It seems that it is good enough for companies to (try to) use it.
>
> More and more phone response systems, such as banks and airlines,
> are using it.  I usually find it easier to put in the account
> number or flight number using the keypad, but they expect one
> to "say" the account or flight number.  Sometimes it gets it right,
> other times not.
>
> I remember one about 30 years ago that would do one-digit math
> problems, and ask for the answer.  Even with only ten choices,
> it got it wrong fairly often.

Speaker-independent recognition of just a few words in one language is
heaps more reliable than speaker-independent recognition of any random
utterance in any arbitrary language.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html

On 10/27/2010 5:47 AM, nagarajan karunakaran wrote:
> Hi, I am working on a concept where I want to check whether two audio
> files (the content of the audio files) are identical to some extent.
> For example, if two different people say the same words, I have to
> say the files are equal. I have done some googling, but I am not able
> to work out what is really needed for my project. I don't know where
> or how to start. How can I compare audio files in that manner? Please
> help me out with any ideas. What is the correct approach for my concept?

Even programs that do this, like Dragon NaturallySpeaking, have to be
"trained". But you didn't say anything like "the same two people", nor
did you mention a training step in the process.  So, I think this may be
too ambitious for words!  :-)

Or, maybe talk to the NSA.

Fred

On Oct 27, 5:47 am, nagarajan karunakaran <prassa...@gmail.com> wrote:
> Hi, I am working on a concept where I want to check whether two audio
> files (the content of the audio files) are identical to some extent.
> For example, if two different people say the same words, I have to
> say the files are equal. I have done some googling, but I am not able
> to work out what is really needed for my project. I don't know where
> or how to start. How can I compare audio files in that manner? Please
> help me out with any ideas. What is the correct approach for my concept?

As Vlad pointed out, there appears to be a large knowledge gap you need
to fill if you plan on going this alone. Speech recognition (which is
essentially what you're proposing) is typically approached in a
variety of ways. After the topics Vlad suggested, I propose you look
into the following:

Cross-correlation
Dynamic time warping
Linear predictive coding
Formants
Markov model
Laplacian distribution

If you've never seen any of those, you may want to reconsider the
scope of your project.
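
To give a flavour of one item on the list, here is a minimal sketch of dynamic time warping. It is an illustration under simplifying assumptions: each utterance is reduced to a sequence of single numbers (one per frame), whereas a real system would compare feature vectors, and the function name `dtw_distance` is made up for the example.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of numbers.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j], where
    each step may advance in a, in b, or in both sequences.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # advance in a only
                                     cost[i][j - 1],      # advance in b only
                                     cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


# The same contour spoken more slowly (one frame repeated) still aligns.
fast = [1.0, 2.0, 3.0]
slow = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(fast, slow))
print(dtw_distance([0.0, 0.0], [1.0, 1.0]))
```

The point of the warping is that two utterances of the same word rarely have the same duration, so a plain sample-by-sample difference would penalise them unfairly.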

Tim Wescott <tim@seemywebsite.com> wrote:
(snip)

> Picking out two different recordings, made in two different places, of 
> the same sound source may be possible, but you'd have all sorts of 
> complications because of different acoustics causing different echoes and
> reverberations.

> But two _different_ people saying the same word?  Oh man -- that's a 
> task that humans don't always get right; getting a machine that could do 
> it reliably would require a team of people for a good amount of time. 
> I'm not even sure it's been done, but if it is it's being done by a 
> well-connected researcher who's good at writing grant proposals and at
> getting work out of his grad students.

It seems that it is good enough for companies to (try to) use it.

More and more phone response systems, such as banks and airlines,
are using it.  I usually find it easier to put in the account
number or flight number using the keypad, but they expect one
to "say" the account or flight number.  Sometimes it gets it right,
other times not.

I remember one about 30 years ago that would do one-digit math
problems, and ask for the answer.  Even with only ten choices,
it got it wrong fairly often.  

-- glen

At a broad level, you're looking at a classification problem here.  Whether
you're dealing with recorded speech, music, bird noises, or whatever else,
you need to define some set of signals which you're looking for, such as
words / phrases, songs, instruments, etc.  Once you have established your
"library" of possible signals, then it comes down to being able to take a
particular audio file and determine which of the library entries it is most
similar to.  If two audio files match the same library entry, you can say
that the two signals are a match.

The hard part (as previous posters have pointed out) is the classification
step.  Speech to text is a non-trivial problem, as is recognizing music,
musical instruments, etc.  If you can narrow your library down to a small
class of signals (say, single spoken words) and develop a good classifier,
then you're well on your way.  It's going to be tough otherwise.
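
A rough sketch of the library idea, under heavy assumptions: each audio file has already been reduced to a short feature vector by some front end that is not shown, similarity is plain Euclidean distance, and the names `classify`, `same_content` and `max_distance` are made up for illustration.

```python
import math


def classify(features, library, max_distance=1.0):
    """Return the label of the nearest library entry, or None if nothing
    in the library is within max_distance of the input."""
    best_label, best_dist = None, math.inf
    for label, reference in library.items():
        dist = math.dist(features, reference)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_distance else None


def same_content(features_a, features_b, library, max_distance=1.0):
    """Two files match if both map to the same (non-rejected) entry."""
    label_a = classify(features_a, library, max_distance)
    label_b = classify(features_b, library, max_distance)
    return label_a is not None and label_a == label_b


# Toy library with two entries and two-element feature vectors.
library = {"yes": [1.0, 0.0], "no": [0.0, 1.0]}

print(same_content([0.9, 0.1], [0.8, 0.0], library))  # both near "yes"
print(same_content([0.9, 0.1], [0.1, 0.9], library))  # "yes" versus "no"
print(classify([5.0, 5.0], library))                  # too far from everything
```

The rejection threshold matters: without it, every input is forced onto some library entry, and two unrelated sounds could be declared a match.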

--Tom



>Hi, I am working on a concept where I want to check whether two audio
>files (the content of the audio files) are identical to some extent.
>For example, if two different people say the same words, I have to
>say the files are equal. I have done some googling, but I am not able
>to work out what is really needed for my project. I don't know where
>or how to start. How can I compare audio files in that manner? Please
>help me out with any ideas. What is the correct approach for my concept?
>


nagarajan karunakaran wrote:
> Hi, I am working on a concept where I want to check whether two audio
> files (the content of the audio files) are identical to some extent.

Compute waterfall spectrograms, measure the normalized distance between
them. Your professor will be more than happy.
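
A minimal sketch of that suggestion, with several assumptions: both signals have the same length and sample rate, the spectrogram is a plain unwindowed short-time DFT written out by hand, and "normalized" is taken to mean scaling each spectrogram to unit energy before taking the Euclidean distance. The names `spectrogram` and `normalized_distance` are made up for the example; in practice an FFT routine such as `numpy.fft.rfft` and a window function would be used.

```python
import cmath
import math


def spectrogram(signal, frame_len=8, hop=4):
    """Magnitude short-time DFT: one list of bin magnitudes per frame."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        mags = []
        for k in range(frame_len // 2 + 1):
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(frame))
            mags.append(abs(s))
        frames.append(mags)
    return frames


def normalized_distance(spec_a, spec_b):
    """Euclidean distance after scaling each spectrogram to unit norm."""
    a = [v for frame in spec_a for v in frame]
    b = [v for frame in spec_b for v in frame]
    norm_a = math.sqrt(sum(v * v for v in a)) or 1.0  # guard against silence
    norm_b = math.sqrt(sum(v * v for v in b)) or 1.0
    return math.sqrt(sum((x / norm_a - y / norm_b) ** 2
                         for x, y in zip(a, b)))


tone = [math.sin(2 * math.pi * n / 8) for n in range(32)]
louder = [3.0 * x for x in tone]                             # same content
other = [math.sin(2 * math.pi * 3 * n / 8) for n in range(32)]  # other pitch

print(normalized_distance(spectrogram(tone), spectrogram(louder)))
print(normalized_distance(spectrogram(tone), spectrogram(other)))
```

The normalization makes the measure insensitive to overall loudness, so the louder copy comes out at essentially zero distance while the tone at a different frequency does not.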

> For example, if two different people say the same words, I have to
> say the files are equal.

Convert speech to text, compare the texts.

> I have done some googling, but I am not able to work out what is really
> needed for my project. I don't know where or how to start.
> How can I compare audio files in that manner? Please help me out with
> any ideas. What is the correct approach for my concept?

To begin with, learn the basics. Fourier, Z-transform, FIR and IIR 
filters, etc.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com