mfcc calculation--please help

Started by jasi...@yahoo.com November 8, 2005
Hi everybody,
I am new to this group and see that many of its members know a lot about speaker identification. I am a graduate student, now doing my final-year project on speaker recognition. The problem is to perform speaker recognition on movie clips. I have no previous experience with speech processing.
I have been able to remove silence, environmental sounds, etc. from the audio signal to a satisfactory extent. The next step is MFCC calculation and classification. I have tried hard to get this working, but the MFCC vectors I get do not seem to be correct (I am not actually sure). The vectors for different speakers do not seem distinguishably different, and those belonging to the same speaker do not even seem sufficiently similar.

Has anybody run into this problem before? Any suggestions are very welcome.

I read somewhere that we should perform some cepstral normalization on the MFCC vectors, but I don't know how. Can anybody help, please?

Or is it the case that we cannot judge the similarity or dissimilarity of MFCC vectors just by looking at them?

Anybody with experience with MFCCs, please help.

Thanks in advance
jasine


Hi Jasine,

Please visit www.etsi.org. There are source files available there for mel-cepstrum calculation. Although I cannot give you the exact name and URL, searching for "Distributed Speech Recognition Front-end" will lead you to a floating-point C code base.

Please note that normalizing the variance of the cepstrum is meant for speaker-independent recognition. In fact, normalization eliminates most of the speaker-dependent features, rendering the vectors useless for your purpose. To normalize, you divide each cepstral coefficient by its standard deviation (the square root of its variance), usually after subtracting its mean.

Mel coefficients do produce substantial differences for different speakers. I advise you to plot the vectors and look at the visual differences to convince yourself of this. It will also help you judge the feasibility of using mel coefficients.

This is the only thing I can say on the topic, owing to my unfamiliarity with speaker recognition. Best of luck.

Regards
~rAGU
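
For reference, the cepstral normalization discussed in this thread can be sketched as below. This is only a minimal illustration, not the ETSI front-end code: the function name `cmvn`, its parameters, and the choice to normalize over a whole utterance are assumptions made for the example. It subtracts the per-coefficient mean and, optionally, divides by the per-coefficient standard deviation, which is the usual form of variance normalization.

```python
import math

def cmvn(frames, variance_norm=True, eps=1e-10):
    """Per-utterance cepstral mean (and optional variance) normalization.

    frames: list of MFCC vectors, one list of floats per frame
            (hypothetical input layout chosen for this sketch).
    Returns a new list of normalized vectors; the input is not modified.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])

    # Mean of each cepstral coefficient over all frames of the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]

    # Cepstral mean normalization: subtract the per-coefficient mean.
    centered = [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]
    if not variance_norm:
        return centered

    # Standard deviation of each coefficient (population form, divide by N).
    stds = [
        math.sqrt(sum(c[k] ** 2 for c in centered) / n_frames)
        for k in range(n_coeffs)
    ]

    # Variance normalization: divide by the standard deviation.
    # eps guards against division by zero for a constant coefficient.
    return [[c[k] / (stds[k] + eps) for k in range(n_coeffs)] for c in centered]


# Two toy "frames" with two coefficients each (made-up numbers).
toy = [[1.0, 10.0], [3.0, 30.0]]
mean_only = cmvn(toy, variance_norm=False)
mean_and_var = cmvn(toy)
```

Calling `cmvn(frames, variance_norm=False)` applies mean subtraction only, which makes it easy to compare plots of the raw, mean-normalized, and variance-normalized vectors and see how much speaker-dependent variation each step leaves in place.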