DSPRelated.com
Forums

Resampling audio

Started by Nadav March 2, 2009
Hi,

I am trying to extract MFCC coefficients from an audio feed with a
variable sample rate.
1. The input sample rate is dynamic and not known in advance (an
arbitrary media file), though it is fixed per session and doesn't
change over time.
2. All input feeds are downsampled to a predefined sampling rate
(say 8 kHz or 16 kHz).
3. Resampling is done in the following way:
   a. A 2nd-order lowpass filter with a cutoff frequency of ~72% of the
Nyquist frequency of the destination sampling rate.
   b. Direct linear interpolation of the source signal to the
destination signal (at the destination sampling rate).
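
For readers who want to reproduce the resampling steps above, here is a minimal sketch in plain Python. The post does not say which 2nd-order filter design is used, so the Butterworth-style biquad below (the widely used audio-EQ-cookbook form with Q = 1/sqrt(2)) is an assumption, and every function name is hypothetical.

```python
import math

def biquad_lowpass_coeffs(cutoff_hz, fs_hz, q=1 / math.sqrt(2)):
    """2nd-order lowpass coefficients (assumed design: cookbook biquad, Butterworth Q)."""
    w0 = 2 * math.pi * cutoff_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = [(1 - cos_w0) / 2 / a0, (1 - cos_w0) / a0, (1 - cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a

def biquad_filter(x, b, a):
    """Run the biquad over x (direct form I, zero initial state)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def linear_resample(x, fs_in, fs_out):
    """Read x at the output sample instants by linear interpolation."""
    if not x:
        return []
    n_out = int((len(x) - 1) * fs_out / fs_in) + 1
    out = []
    for m in range(n_out):
        pos = m * fs_in / fs_out          # position in input samples
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1 - frac) * x[i] + frac * nxt)
    return out

def downsample(x, fs_in, fs_out, cutoff_ratio=0.72):
    """Step a (lowpass at ~72% of the destination Nyquist), then step b."""
    cutoff_hz = cutoff_ratio * fs_out / 2
    b, a = biquad_lowpass_coeffs(cutoff_hz, fs_in)
    return linear_resample(biquad_filter(x, b, a), fs_in, fs_out)
```

A call such as downsample(samples, 44100, 8000) would take a list of floats at 44.1 kHz to 8 kHz. Note that the filter runs at the source rate, so its coefficients differ for every input rate.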
Test scenario
- Sample the exact same signal at 44.1 kHz, 22.05 kHz and 8 kHz.
- Downsample the signals sampled at 44.1 kHz and 22.05 kHz to 8 kHz in
the manner described above.
- Extract MFCC coefficients for the signals sampled at the three
different sample rates.
Potential problem
The resulting feature vectors might be similar to some degree, but
perhaps not similar enough for our needs, because the lowpass filter
will suppress high frequencies differently for different sampling
rates.
Assuming a cutoff frequency of 3.6 kHz:
  - The frequency range that will be suppressed for a 44.1 kHz signal
is 3.6 kHz up to its Nyquist frequency (22.05 kHz).
  - The frequency range that will be suppressed for a 22.05 kHz signal
is 3.6 kHz up to its Nyquist frequency (11.025 kHz).
With that in mind, a component at 3.7 kHz in a signal sampled at
44.1 kHz will be suppressed differently than the same component in
the signal sampled at 22.05 kHz.
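
The size of that difference can be checked numerically. The sketch below evaluates the gain of a 2nd-order lowpass with a 3.6 kHz cutoff at both source rates. The filter design (a cookbook-style biquad with Butterworth Q) is an assumption, since the post does not specify one, and the function names are hypothetical.

```python
import cmath
import math

def lowpass_biquad(cutoff_hz, fs_hz, q=1 / math.sqrt(2)):
    """Assumed design: 2nd-order cookbook lowpass with Butterworth Q."""
    w0 = 2 * math.pi * cutoff_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    b = [(1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2]
    a = [1 + alpha, -2 * cos_w0, 1 - alpha]
    return b, a

def gain_db(b, a, freq_hz, fs_hz):
    """Magnitude response in dB at freq_hz for a filter running at fs_hz."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq_hz / fs_hz)   # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))

# Compare the same 3.6 kHz cutoff at the two source rates from the post.
for fs in (44100, 22050):
    b, a = lowpass_biquad(3600, fs)
    for f in (3600, 3700, 10000):
        print(f"fs={fs} Hz, f={f} Hz: {gain_db(b, a, f, fs):.2f} dB")
```

With this particular design the two rates agree at the cutoff itself and differ only slightly at 3.7 kHz, but the gap widens toward the lower Nyquist frequency, so the two downsampled signals do not see the same filter. A different filter design would give different numbers.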

An alternative solution
Part of generating MFCC coefficients is computing an FFT over a sliding
window in time.
Would cutting the high frequencies in the frequency domain, before
passing the spectrum through the mel filters, give better results?
Is there any need to cut off frequencies above the highest frequency
to which a mel filter is applied?
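
To make the alternative concrete, here is a rough sketch of a triangular mel filterbank whose band edges are fixed in Hz and then sampled at the FFT bin frequencies of whatever rate the input has. The mel formula (2595 * log10(1 + f/700)), the FFT sizes and the function names are all assumptions for illustration, not taken from the post.

```python
import math

def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs_hz, f_min_hz, f_max_hz):
    """Triangular filters with edges in Hz, evaluated at each FFT bin frequency."""
    mel_lo, mel_hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale
    edges = [mel_to_hz(mel_lo + (mel_hi - mel_lo) * k / (n_filters + 1))
             for k in range(n_filters + 2)]
    bin_hz = [k * fs_hz / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)                     # bin is outside this filter
        bank.append(row)
    return bank

# Same band edges (0 to 3.6 kHz) for each source rate; only the bin grid changes.
banks = {fs: mel_filterbank(20, n_fft, fs, 0.0, 3600.0)
         for fs, n_fft in ((44100, 1024), (22050, 512), (8000, 256))}
```

In this sketch every bin above f_max_hz gets zero weight in every filter, so the spectrum above the top mel filter never reaches the MFCC stage. Whether that gives better results than time-domain resampling is the open question here. The analysis window would also need the same duration at every rate for the features to be comparable.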