Hi everybody,
I am new to this group, and I can see that many of the members know a lot
about speaker identification. I am a graduate student doing my final-year
project on speaker recognition. The task is to perform speaker recognition on
movie clips. I have no previous experience with speech processing.
I have been able to remove silence, environmental sounds, etc. from the audio
signal to a satisfactory extent. The next steps are MFCC calculation and
classification. I have tried hard to get this working, but the MFCC vectors I
get do not seem to be correct (though I am not sure). The vectors for different
speakers are not distinguishably different, and those belonging to the same
speaker are not even sufficiently similar.
Has anybody run into this problem? Any suggestions are very welcome.
I read somewhere that we should perform some kind of cepstral normalization on
the MFCC vectors, but I don't know how. Can anybody help, please?
Or is it the case that we cannot determine the similarity or dissimilarity of
MFCC vectors just by looking at them?
Anybody with experience with MFCC, please help.
Thanks in advance
jasine
mfcc calculation--please help
Started by ●November 8, 2005
Reply by ●November 21, 2005
Hi Jasine,
Please visit www.etsi.org. Source files for Mel cepstrum calculation are
available there. I cannot give you the exact name and URL, but if you search
for "Distributed Speech Recognition Frontend" you will find a floating-point C
code base.
Please note that normalizing the variance of the cepstrum is meant for speaker-independent recognition. In fact, normalization eliminates most of the speaker-dependent features, rendering the vectors useless for your purpose. To normalize, you subtract the mean of each cepstral coefficient and divide by its standard deviation (the square root of its variance).
Mel coefficients do produce substantial differences for different speakers. I advise you to plot the vectors and look at the visual difference to convince yourself of this. It will also help you judge the feasibility of using Mel coefficients.
This is all I can say on the topic, owing to my unfamiliarity with speaker recognition. Best of luck.
Regards
~rAGU
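For anyone wondering what the normalization step looks like in practice, here is a minimal sketch of cepstral mean and variance normalization over a single utterance, in plain Python. It assumes the MFCC vectors are already computed and stored as a list of equal-length lists, one per frame. The function name `cmvn` and its parameters are made up for illustration and do not come from the ETSI code. Setting `normalize_variance=False` gives plain cepstral mean subtraction, which leaves the per-coefficient spread untouched.

```python
import math


def cmvn(frames, normalize_variance=True, eps=1e-10):
    """Cepstral mean (and optionally variance) normalization of one utterance.

    frames: list of MFCC vectors (lists of floats), one vector per frame.
    Returns a new list of vectors; the input is not modified.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])

    # Per-coefficient mean over all frames of the utterance.
    means = [sum(f[k] for f in frames) / n for k in range(dim)]

    if normalize_variance:
        # Per-coefficient standard deviation (population form, divide by n).
        stds = [
            math.sqrt(sum((f[k] - means[k]) ** 2 for f in frames) / n)
            for k in range(dim)
        ]
    else:
        # Mean subtraction only: leave the scale of each coefficient as is.
        stds = [1.0] * dim

    # eps guards against division by zero for a constant coefficient.
    return [
        [(f[k] - means[k]) / max(stds[k], eps) for k in range(dim)]
        for f in frames
    ]


if __name__ == "__main__":
    toy = [[1.0, 10.0], [3.0, 14.0]]  # two frames, two coefficients (toy data)
    print(cmvn(toy))
    print(cmvn(toy, normalize_variance=False))
```

Whether to apply the variance step for speaker recognition is a judgment call. Comparing results with and without it on your own clips is probably the safest way to decide.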