I am working on a small program for recognizing the spoken alphabet. The idea is to take a sound sample, perform an FFT on the waveform, and then compare the peaks of the magnitude spectrum in the frequency domain. Is this how it is normally done? If I limit myself to the letters of the alphabet and the digits, there should not be that many different combinations to tell apart.

I have tried testing this, but I am having some trouble with the coding part. I have a sample and an FFT implementation that passes some test vectors. I run the FFT on the sample and calculate each magnitude as sqrt(a^2 + b^2), where a and b are the real and imaginary parts of each output value. But which magnitude belongs to which frequency?

I also checked the idea by looking at the curve in GoldWave: when I say "A" and then somebody else says "A", the two curves look quite similar.

Do you think this will work, or is it the wrong approach? Is there a better way? The main goal is to recognize the alphabet and the digits with a very low failure rate.
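On the question of which magnitude belongs to which frequency: under the usual convention, for an N-point FFT of a signal sampled at fs Hz, output index k corresponds to k * fs / N Hz, and for real-valued input only indices 0 to N/2 carry independent information (the rest mirror them). Below is a minimal Python sketch of that mapping. The names `dft`, `bin_frequencies`, and `peak_frequency` are made up for illustration, the plain DFT only stands in for whatever FFT code is actually used, and the 8000 Hz sample rate and 1000 Hz test tone are assumed values, not taken from the real sample.

```python
import cmath
import math


def dft(samples):
    """Naive O(N^2) discrete Fourier transform (stand-in for a real FFT)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bin_frequencies(n, sample_rate):
    """Frequency in Hz for bins 0..n/2 of an n-point transform."""
    return [k * sample_rate / n for k in range(n // 2 + 1)]


def peak_frequency(samples, sample_rate):
    """Frequency of the largest-magnitude bin, ignoring the DC bin (k = 0)."""
    n = len(samples)
    spectrum = dft(samples)
    # abs() of a complex number is sqrt(a^2 + b^2)
    magnitudes = [abs(spectrum[k]) for k in range(n // 2 + 1)]
    k_peak = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return k_peak * sample_rate / n


# Assumed test signal: a 1000 Hz sine sampled at 8000 Hz, 64 samples long.
sample_rate = 8000
n = 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n)]

print(bin_frequencies(n, sample_rate)[:4])  # first few bin centre frequencies
print(peak_frequency(tone, sample_rate))
```

With these assumed numbers the bins are 125 Hz apart, so the test tone falls exactly on bin 8. A tone that falls between two bins spreads its energy over neighbouring bins, which is why longer frames or a window function are often used in practice.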
Best regards, and thanks in advance.