Hi All -
I am trying to write a simple speaker-independent speech recognition program in
C, and I am wondering if using MFCC to recognize the words is the way to go.
I plan to provide the program with a codebook of approximately 20 words spoken
by the same speaker, then input a sample of another speaker saying one of these
words, and the program will identify which of the words in the codebook was
spoken. Can I just compute the 12 mel-cepstral coefficients of a speech sample
containing an entire recorded word, or should I break it into small segments
first? (If so, what segment length works well?) Would comparing the mel-cepstral
coefficients of the input word to those of each codebook word, and picking the
codebook word with the smallest sum of squared differences, be a viable method
for speech recognition? I tried this with LPC coefficients earlier, but that
seemed to work only for speaker-dependent speech recognition.
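To make the comparison step concrete, this is roughly what I have in mind. It is only a sketch, assuming each word has already been reduced to one vector of 12 coefficients stored back to back in a flat array. The names N_COEFFS, squared_distance and nearest_word are my own placeholders, and the MFCC computation itself is not shown.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#define N_COEFFS 12  /* assumed: one vector of 12 mel-cepstral coefficients per word */

/* Sum of squared differences between two coefficient vectors of length n. */
double squared_distance(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Return the index of the codebook word closest to the input.
 * codebook holds n_words vectors of N_COEFFS values each, stored contiguously. */
size_t nearest_word(const double *codebook, size_t n_words, const double *input)
{
    assert(n_words > 0);
    size_t best = 0;
    double best_dist = DBL_MAX;
    for (size_t w = 0; w < n_words; w++) {
        double d = squared_distance(codebook + w * N_COEFFS, input, N_COEFFS);
        if (d < best_dist) {
            best_dist = d;
            best = w;
        }
    }
    return best;
}
```

With a 20-word codebook I would call nearest_word(codebook, 20, input) and take the returned index as the recognized word.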
Finally, does anyone know of a simple program in C to calculate the mel-cepstral
coefficients, given a speech sample?
I am a beginner to speech processing, so any help would be greatly appreciated.
Thank you very much,