Technical discussions about the implementation and research of speech recognition algorithms.
Hi All,

I recently found that computing MFCCs from the power spectrum instead of the magnitude spectrum gives better results. It seems to me that this is generally true for speech recognition, but I do not know how to explain the difference in performance. The fact that the power spectrum is the square of the magnitude spectrum may have something to do with it, but I don't know what exactly the reason could be. I looked online for published papers that explain this difference but could not find any. Any explanation would be highly appreciated!

Regards,
metty
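To make the comparison concrete, here is a minimal sketch of the two variants in question. It is not taken from any particular toolkit: the function names, the 8 kHz sample rate, the 8 mel filters, the 5 cepstral coefficients, and the synthetic test frame are all assumptions chosen for illustration. The only difference between the two paths is whether the DFT magnitudes are squared before the mel filterbank is applied.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the one-sided DFT of a real frame (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (one weight list per filter)."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies in Hz, evenly spaced in mel.
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for f in bin_hz:
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))  # rising slope
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))  # falling slope
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def log_mel_energies(frame, sample_rate, n_filters, use_power):
    """Log of the mel filterbank outputs, from |X| or |X|^2 depending on use_power."""
    mags = dft_magnitudes(frame)
    spec = [m * m for m in mags] if use_power else mags
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    floor = 1e-12  # avoids log(0) if a filter output is zero
    return [math.log(max(sum(w * s for w, s in zip(ws, spec)), floor)) for ws in bank]


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def mfcc(frame, sample_rate=8000, n_filters=8, n_coeffs=5, use_power=True):
    return dct2(log_mel_energies(frame, sample_rate, n_filters, use_power), n_coeffs)


# Synthetic 64-sample frame: two sinusoids under a Hamming window (illustrative only).
frame = [
    (0.54 - 0.46 * math.cos(2 * math.pi * t / 63))
    * (math.sin(2 * math.pi * 440 * t / 8000) + 0.3 * math.sin(2 * math.pi * 1800 * t / 8000))
    for t in range(64)
]

mfcc_power = mfcc(frame, use_power=True)   # MFCCs from the power spectrum
mfcc_mag = mfcc(frame, use_power=False)    # MFCCs from the magnitude spectrum
print(mfcc_power)
print(mfcc_mag)
```

One thing the sketch makes visible: for a single bin, log(|X|^2) is exactly 2 log|X|, but the filterbank sums several bins before the log is taken, so the power-based features are in general not just a scaled copy of the magnitude-based ones. Whether that accounts for the difference in recognition performance is not something this sketch can answer.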