FFmpeg Downsampling audio changes the spectral content

Started by Zeus101 5 years ago | 5 replies | latest reply 5 years ago | 543 views


I am new to the forum. Please pardon me if I am not following the forum rules.

I am downsampling an audio file from 32 kHz to 8 kHz using ffmpeg. The log-frequency spectrogram of the original 32 kHz file shows no frequency content above 2048 Hz, but after downsampling to 8 kHz the spectrogram shows content above 2048 Hz. Is this shift expected because of the downsampling?

[Image: spectrogram of the original 32 kHz audio]

[Image: spectrogram of the downsampled 8 kHz audio]

Reply by nelsona, November 30, 2019

Hello Zeus,

Just to be clear:

You say you're downsampling from 32 kHz to 8 kHz, and that your original 32 kHz audio has no frequency content above 2.048 kHz, correct?

If so, after downsampling you should have no frequency content above half the new sample rate (the Nyquist frequency). So if you downsample to 8 kHz, you should have no frequency content above 4 kHz.
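To make that arithmetic concrete, here is a minimal sketch. The function names `nyquist_hz` and `alias_hz` are made up for illustration and are not part of FFmpeg or any library. It computes the highest frequency a given sample rate can represent, and where a tone would fold to if it were sampled with no anti-aliasing filter at all.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2.0


def alias_hz(freq_hz, sample_rate_hz):
    """Apparent frequency of a real tone sampled WITHOUT an anti-alias filter.

    Frequencies above Nyquist fold back into the 0..Nyquist band.
    """
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(nyquist_hz(32000))     # 16000.0
print(nyquist_hz(8000))      # 4000.0
print(alias_hz(2048, 8000))  # 2048 (below 4 kHz, so it stays where it is)
print(alias_hz(5000, 8000))  # 3000 (above 4 kHz, so it folds back)
```

A proper resampler low-pass filters before dropping samples, so folding like the last line should only show up if that filter is missing or inadequate.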

If you're seeing something between 2.048 kHz and 4 kHz, my guess is that it was added by FFmpeg to avoid audio errors which may occur during the downsampling process.

For instance, Audacity (a free audio editor/DAW) will add noise to an audio track when downsampling or reducing the bit depth (I think it happens in both situations, but it may only occur in one, and I can't remember which off the top of my head).

My guess is that the added frequency content you're seeing is the noise that was added to your audio track.

An easy way to find out for certain is to take a test tone (e.g. a 2 kHz tone sampled at 32 kHz) and downsample it to 8 kHz using FFmpeg. Once it is downsampled, open the resulting *.wav file in an audio editor and compare it to your original audio file.

If you zoom in very closely on the audio waveform, you should see small differences between the two (this should be the added noise).

By the way, these differences shouldn't be caused by the downsampling itself. Rather, you should notice that the test tone looks more jagged in the downsampled version than in the original. This jaggedness is due to the noise that was added.
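The test-tone experiment can also be sketched offline in plain Python. To be clear about the assumptions: this does not call FFmpeg and is not FFmpeg's resampler. It uses a simple Hamming-windowed sinc low-pass filter followed by keeping every 4th sample, as a stand-in for a well-behaved 32 kHz to 8 kHz conversion. All function names, the 3.5 kHz cutoff, and the 101-tap length are illustrative choices. If the conversion is done properly, the energy of a 2 kHz tone should stay at 2 kHz and very little should appear elsewhere, for example at 3 kHz.

```python
import math


def sine(freq_hz, rate_hz, n):
    """n samples of a unit-amplitude sine at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]


def lowpass_taps(cutoff_hz, rate_hz, num_taps=101):
    """Hamming-windowed sinc low-pass FIR, normalised to unity gain at DC."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / rate_hz  # cutoff in cycles per sample
    taps = []
    for i in range(num_taps):
        x = i - mid
        h = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        w = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(h * w)
    total = sum(taps)
    return [t / total for t in taps]


def decimate(x, factor, taps):
    """Low-pass filter x, then keep every `factor`-th filtered sample."""
    out = []
    for n in range(len(taps) - 1, len(x), factor):
        out.append(sum(taps[k] * x[n - k] for k in range(len(taps))))
    return out


def tone_power(x, freq_hz, rate_hz):
    """Power of x at a single frequency (one-bin DFT, normalised by length)."""
    re = sum(v * math.cos(2 * math.pi * freq_hz * i / rate_hz) for i, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq_hz * i / rate_hz) for i, v in enumerate(x))
    return (re * re + im * im) / len(x) ** 2


tone_32k = sine(2000, 32000, 3200)                      # 0.1 s of a 2 kHz tone
tone_8k = decimate(tone_32k, 4, lowpass_taps(3500, 32000))

print("power at 2 kHz:", tone_power(tone_8k, 2000, 8000))
print("power at 3 kHz:", tone_power(tone_8k, 3000, 8000))
```

If the real FFmpeg output shows noticeably more energy away from the tone than a sketch like this does, that would point toward something added during conversion. If it does not, the spectrogram display is the more likely suspect.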


Reply by jbrower, November 30, 2019


When you mention "Audacity will add noise to an audio track when downsampling", are you referring to dither? I have found a lot of online discussion on Audacity's use of dither, both pro and con. I just wanted to make sure I'm researching the right thing. Thanks.

Reply by nelsona, November 30, 2019

Hello Jeff,

You're absolutely correct, the noise Audacity added on a downsample/bit-depth drop is due to dithering.


Reply by chalil, November 30, 2019

Hi, you may want to check your spectrogram settings, mainly the y-axis setting. This kind of display can even occur when you open the stream with a different sampling rate setting.
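As a rough illustration of that last point: if a tool interprets the samples at a rate other than the one they were actually recorded at, every frequency label scales by the ratio of the two rates. The helper name `displayed_hz` is hypothetical, used only for this sketch.

```python
def displayed_hz(true_hz, actual_rate_hz, assumed_rate_hz):
    """Frequency a tool would label a tone with if it assumes the wrong rate."""
    return true_hz * assumed_rate_hz / actual_rate_hz


# A 1 kHz tone in an 8 kHz file, opened as if it were 32 kHz audio:
print(displayed_hz(1000, 8000, 32000))  # 4000.0

# Same tone opened with the correct rate:
print(displayed_hz(1000, 8000, 8000))   # 1000.0
```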

Hope it helps.


Reply by jbrower, November 30, 2019


First, your displays do not match your description. The first one (orig-spec_65570.png) should show 0 to 16 kHz if the sampling rate was actually 32 kHz. The second one (8k-spec_96191.png) should show 0 to 4 kHz if the sampling rate was actually 8 kHz. Are you cutting off part of the frequency axis when creating your images?

Second, if I ignore your frequency axis labels, there is no noise added. The "upper area" (approximately the 8th of 8 divisions) in the 1st plot exactly matches its corresponding frequency axis area (the 6th of 8 divisions) in the 2nd plot.

I have a feeling that if downsampling was in fact performed, it was done right, but your frequency axis labeling and display are causing confusion.
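One way to check the first point, whether each file really has the sampling rate you think it has, is to read the rate from the WAV header and derive the frequency span a full spectrogram should cover. This is a minimal sketch using Python's standard `wave` module. The helper names and the generated demo files are placeholders, not the original poster's files, and it assumes uncompressed PCM WAV.

```python
import os
import tempfile
import wave


def wav_rate_and_span(path):
    """Return (sample rate in Hz, top of the displayable frequency axis in Hz)."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
    return rate, rate / 2.0


def write_silence(path, rate_hz, seconds=1):
    """Write a mono 16-bit PCM WAV of silence, only to have a file to inspect."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate_hz)
        wf.writeframes(b"\x00\x00" * (rate_hz * seconds))


with tempfile.TemporaryDirectory() as tmp:
    for rate in (32000, 8000):
        path = os.path.join(tmp, f"demo_{rate}.wav")
        write_silence(path, rate)
        print(wav_rate_and_span(path))
# (32000, 16000.0)
# (8000, 4000.0)
```

Pointing `wav_rate_and_span` at the real original and downsampled files would show whether the axis limits in the two images agree with the rates stored in the files.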