Hi,

I am trying to extract MFCC coefficients from an audio feed with a variable sample rate.

1. The input sample rate is dynamic and not known up front (an arbitrary media file), though it is fixed per session and doesn't change over time.
2. All input feeds are downsampled to a predefined sampling rate (say 8 kHz or 16 kHz).
3. Resampling is done in the following way:
   a. A 2nd-order lowpass filter with a cutoff frequency of ~72% of the Nyquist frequency of the destination sampling rate.
   b. Direct linear interpolation of the source signal to the destination signal (at the destination sampling rate).

Test scenario
- Sample the exact same signal at 44.1 kHz, 22.05 kHz and 8 kHz.
- Downsample the signals sampled at 44.1 kHz and 22.05 kHz to 8 kHz in the manner described above.
- Extract MFCC coefficients for the signals from the three different sample rates.

Potential problem
The resulting feature vectors may show a certain degree of similarity, but it may not be enough for our needs, because the lowpass filter suppresses high frequencies differently for different source sampling rates. Assuming a cutoff frequency of 3.6 kHz:
- The frequency range that will be suppressed for a 44.1 kHz signal is 3.6 kHz up to its Nyquist frequency (22.05 kHz).
- The frequency range that will be suppressed for a 22.05 kHz signal is 3.6 kHz up to its Nyquist frequency (11.025 kHz).
With that in mind, a component at 3.7 kHz will be attenuated differently in a signal sampled at 44.1 kHz than in the same signal sampled at 22.05 kHz.

An alternative solution
Part of generating MFCC coefficients is calculating an FFT over a sliding window in time. Will cutting the high frequencies in the frequency domain, before passing the spectrum through the Mel filters, achieve better performance? Is there any need to cut off frequencies above the highest frequency to which a Mel filter is applied?
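To make the alternative concrete, here is a minimal sketch (NumPy only) of the idea: skip the time-domain resampling, take the FFT at the native sample rate, and apply Mel filters that stop at a fixed upper frequency. Everything named here is an assumption for illustration and not part of any existing pipeline: the 20 ms frame, the 20 filters, the 100 Hz lower edge, the 3.6 kHz upper edge, the periodic Hann window and the synthetic test tones. The 6 kHz tone is left out of the 8 kHz version, standing in for an ideal anti-aliasing filter at capture time.

```python
import numpy as np

F_MAX_HZ = 3600.0   # assumed upper band edge (the cutoff quoted in the post)
FRAME_SEC = 0.020   # assumed frame length; 20 ms is a whole number of samples at all three rates
N_FILTERS = 20      # assumed number of Mel filters

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(bin_hz, f_min=100.0, f_max=F_MAX_HZ, n_filters=N_FILTERS):
    """Triangular filters with Mel-spaced edges, evaluated at the given bin frequencies."""
    edges = mel_to_hz(np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2))
    lo, ctr, hi = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    rising = (bin_hz[None, :] - lo) / (ctr - lo)
    falling = (hi - bin_hz[None, :]) / (hi - ctr)
    # Negative values (bins outside a triangle, including all bins above f_max) become 0.
    return np.clip(np.minimum(rising, falling), 0.0, None)

def mel_energies(frame, fs):
    """Mel band energies of one frame, computed at the frame's native sample rate."""
    n = len(frame)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(n) / n)  # periodic Hann
    # Dividing by the window sum makes the spectrum scale independent of frame length.
    spectrum = np.fft.rfft(frame * window) / window.sum()
    bin_hz = np.arange(n // 2 + 1) * fs / n
    return mel_filterbank(bin_hz) @ (np.abs(spectrum) ** 2)

def make_frame(fs):
    """Synthetic test signal: tones at 1 kHz and 3 kHz, plus 6 kHz where the rate allows it."""
    n = int(round(FRAME_SEC * fs))
    t = np.arange(n) / fs
    x = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)
    if fs / 2 > 6000:
        x = x + 0.8 * np.sin(2 * np.pi * 6000 * t)
    return x

energies = {fs: mel_energies(make_frame(fs), fs) for fs in (44100, 22050, 8000)}
for fs, e in energies.items():
    print(fs, np.round(e[:5], 6))
```

Because the frame length is fixed in seconds, the FFT bins fall on the same frequencies (50 Hz apart) at every sample rate, and every bin above the last filter's upper edge gets zero weight. In this sketch the 6 kHz tone therefore never reaches the Mel energies, without any explicit cut in the frequency domain. Whether that holds up on real recordings, where the capture chains differ, is exactly what I would like to confirm.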
Resampling audio
Started by ●March 2, 2009