# How to decimate/nterpolate in frequency domain and recover a proper causal time signal ?

Started by November 20, 2006
```Hi everyone !

I'm new in this group and quite intermediate in DSP applications. I'm
facing the following problem in the framework of my study.

I've got some 4 seconds recorded sounds (44100Hz, 16bits) for which I'd
like to modify the FFT spectrum. Actually, I'd like to replace at a
constant rate (e.g. 3 values every four values, from the DC component)
some initial values with "new" values issued from the linear
interpolation between the remaining values, and get back the
corresponding modified sound, thus keeping the same sampling frequency,
i.e. 44100Hz.
I've thought about the following steps for the process :
- first, applying FFT to get the original spectrum.
- then, retaining 1 value every 4 values from sample 0 to
sample N/2, amounting in a way to frequency decimation,
- Interpolating linearly the real and imaginary parts of the spectrum
to recover values at the decimated frequency bins.
- reconstituting the complete spectrum [0-Fe[ by flipping complex
numbers 1 to N/2 of the vector obtained at the previous step, taking
their conjugate and concatenating them with the above vector,
- finally, recompute the time signal by processing an ifft on the
previous sequence and keeping the real part of the result.

I've tried this process but it seems it doesn't work : the obtained
time signal looks really distorted, probably owing to a phenomenon of
time aliasing.
especially in the time domain (and problems of frequency aliasing), but
I've found nothing about equivalent processes in the frequency domain.
I'm really getting stuck on this problem; can anyone help me to detect
where my reasonning is erroneous and indicate me how to process in a
proper way to obtain the expected result ?

Scoubi.

```
```scoubi wrote:
> Hi everyone !
>
> I'm new in this group and quite intermediate in DSP applications. I'm
> facing the following problem in the framework of my study.
>
> I've got some 4 seconds recorded sounds (44100Hz, 16bits) for which I'd
> like to modify the FFT spectrum. Actually, I'd like to replace at a
> constant rate (e.g. 3 values every four values, from the DC component)
> some initial values with "new" values issued from the linear
> interpolation between the remaining values, and get back the
> corresponding modified sound, thus keeping the same sampling frequency,
> i.e. 44100Hz.
> I've thought about the following steps for the process :
> - first, applying FFT to get the original spectrum.
> - then, retaining 1 value every 4 values from sample 0 to
> sample N/2, amounting in a way to frequency decimation,
> - Interpolating linearly the real and imaginary parts of the spectrum
> to recover values at the decimated frequency bins.
> - reconstituting the complete spectrum [0-Fe[ by flipping complex
> numbers 1 to N/2 of the vector obtained at the previous step, taking
> their conjugate and concatenating them with the above vector,
> - finally, recompute the time signal by processing an ifft on the
>  previous sequence and keeping the real part of the result.
>
> I've tried this process but it seems it doesn't work : the obtained
> time signal looks really distorted, probably owing to a phenomenon of
> time aliasing.
> especially in the time domain (and problems of frequency aliasing), but
> I've found nothing about equivalent processes in the frequency domain.
> I'm really getting stuck on this problem; can anyone help me to detect
> where my reasonning is erroneous and indicate me how to process in a
> proper way to obtain the expected result ?
>
>
> Scoubi.

It would help if you would state what the expected result is. What are
you trying to do?

John

```
```
> > Scoubi.
>
> It would help if you would state what the expected result is. What are
> you trying to do?
>
> John

Sorry,

The expected result is to obtain by IFFT a "properly" modified sound,
reflecting the mentioned modifications processed on the FFT spectum. By
"properly", I mean I had expected to get a sound distorted because of
these modifications, keeping nevertheless the structure of the original
sound. But, when testing on a musical sound 4s in length, the modified
sound I got has lost the structure of the original sound (time
aliasing-like phenomena apparently occur); the latter doesn't mean
anything at all, the temporal wrapping of the signal looks like a sinc
shape whose period seems to be especially linked to the decimation
factor.

Does this help ?
If not, don't hesitate to tell me.

Thanks,
Scoubi.

```
```> The expected result is to obtain by IFFT a "properly" modified sound,
> reflecting the mentioned modifications processed on the FFT spectum. By
> "properly", I mean I had expected to get a sound distorted because of
> these modifications, keeping nevertheless the structure of the original
> sound. But, when testing on a musical sound 4s in length, the modified
> sound I got has lost the structure of the original sound (time
> aliasing-like phenomena apparently occur); the latter doesn't mean
> anything at all, the temporal wrapping of the signal looks like a sinc
> shape whose period seems to be especially linked to the decimation
> factor.

It sounds like you may not be using enough frequency samples when doing
your IFFT. Do you perhaps have a programming error that is preventing
you from using the frequency samples that you interpolated? It sounds
as if your reconstruction of the signal is using fewer input points to
the IFFT than you got out of your original FFT. The dual of the
sampling theorem here applies; just as sampling too infrequently in the
time domain gives you aliasing in the frequency domain, sampling too
infrequently in the frequency domain gives you aliasing in the time
domain.

Jason

```
```scoubi wrote:
>
>>> Scoubi.
>> It would help if you would state what the expected result is. What are
>> you trying to do?
>>
>> John
>
> Sorry,
>
> The expected result is to obtain by IFFT a "properly" modified sound,
> reflecting the mentioned modifications processed on the FFT spectum. By
> "properly", I mean I had expected to get a sound distorted because of
> these modifications, keeping nevertheless the structure of the original
> sound. But, when testing on a musical sound 4s in length, the modified
> sound I got has lost the structure of the original sound (time
> aliasing-like phenomena apparently occur); the latter doesn't mean
> anything at all, the temporal wrapping of the signal looks like a sinc
> shape whose period seems to be especially linked to the decimation
> factor.
>
> Does this help ?
> If not, don't hesitate to tell me.
>
> Thanks,
> Scoubi.
>

Time domain decimation causes squeezing and replication in the frequency
domain. If the original signal was sufficiently oversampled, one can use
a LPF to eliminate the frequency domain replicas and clean up the time
signal. If it wasn't oversampled enough, the signal can't be cleaned up
because the replicas have overlapped each other.

Frequency domain decimation causes squeezing and replication in the time
domain. Your bin interpolator is the dual of the LPF mentioned above. It
seems to me that the questions to ask are do the time replicas overlap
is the filter good enough. Using a bigger FFT with some zero padding
might be necessary.

John
```
```> It sounds like you may not be using enough frequency samples when doing
> your IFFT. Do you perhaps have a programming error that is preventing
> you from using the frequency samples that you interpolated? It sounds
> as if your reconstruction of the signal is using fewer input points to
> the IFFT than you got out of your original FFT. The dual of the
> sampling theorem here applies; just as sampling too infrequently in the
> time domain gives you aliasing in the frequency domain, sampling too
> infrequently in the frequency domain gives you aliasing in the time
> domain.
>
> Jason

Hi jason,

I 've just had a look to the matlab routine I used for the process
> It sounds like you may not be using enough frequency samples when doing
Yet, before running the iFFT, the reconstructed complex spectrum has
the same number of bins (i.e. 176400) as the original one. I don't know
if the matter really stems from that fact; can you confirm me this at
the sight of the file .m mentioned below?

Scoubi.
--------------------------------------------
file .m

%%Reading of the original stereo music sample

%%FFT Calculation
FFTY=fft(Y);

%%Constant values
%Sampling Frequency
fs=44100;

%Time duration
length=4;

%Number of bins
N=length*fs;

%Frequency vector (frequencies until Nyquist frequency, included)
vecfnyquist=[0:1/length:N/(2*length)];

%%Spectrum decimation (for instance 1 value retained every four values,
from DC component)
FFTYdecim=FFTY(1:4:N/2+1,:);
ReFFTYdecim=real(FFTYdecim);
ImFFTYdecim=imag(FFTYdecim);

%%Linear interpolation of real and imaginary parts of the spectrum
within the range
%%[0;Fnyquist]
%%(i.e. replacement of the decimated values with the values issued from
the linear interp.
%%processed between the remaining values)

for i=1:2;
ReFFTYreshape(:,i)=interp1(vecfnyquist(1:4:end),ReFFTYdecim(:,i),vecfnyquist);
ImFFTYreshape(:,i)=interp1(vecfnyquist(1:4:end),ImFFTYdecim(:,i),vecfnyquist);
end;

%%Reconstruction of the modified complex spectrum within the range
[0;Fnyquist] from its %%real and  imaginary parts
FFTYreshape=ReFFTYreshape+i*ImFFTYreshape;

%%Reconstruction of the entire complex spectrum within the range [0;fs[
FFTYcomplete=cat(1,FFTYreshape,fliplr(conj(FFTYreshape(2:end-1,:))')')

%%Calculation of the corresponding modified signal by IFFT
Ymodif=real(ifft(FFTYcomplete));

%.Wav writing of the modified signal
wavwrite(Ymodif/(1.001*max(max(Ymodif))),fs,16,'ModifiedSample.wav');

```
```John Sampson a =E9crit :

> Time domain decimation causes squeezing and replication in the frequency
> domain. If the original signal was sufficiently oversampled, one can use
> a LPF to eliminate the frequency domain replicas and clean up the time
> signal. If it wasn't oversampled enough, the signal can't be cleaned up
> because the replicas have overlapped each other.
>
> Frequency domain decimation causes squeezing and replication in the time
> domain. Your bin interpolator is the dual of the LPF mentioned above. It
> seems to me that the questions to ask are do the time replicas overlap
> is the filter good enough. Using a bigger FFT with some zero padding
> might be necessary.
>
> John

I'm not sure to have understood what you meant by :
> It seems to me that the questions to ask are do the time replicas overlap
> is the filter good enough
Can you give me some precisions ?

Intuitively, I've added a number of zeros equal to three times the
initial number of bins (3*176400), for a decimation factor equal to 4.
I have implemented it in the M file (mentioned in my previous post) in
the following way:

FFTYcomplete=3Dcat(1,FFTYreshape,fliplr(conj(FFTYreshape(2:end-1,:))')',zer=
os(529200,2))

but it seems the problem remains: the signal is longer, but it is still
wrapped in a sinc-like shape. Do you see any other mistake ?

Scoubi.

```
```Now that I think about this some more I realize you're not doing
anything wrong. What you're doing (decimating in the frequency domain
followed by linear interpolation between the samples) is equivalent to
doing the following:

1. Setting 3 out of every 4 frequency samples to zero, but leaving them
in the vector (this differentiates this operation from decimation,
which would actually throw the samples away). This is actually
identical to a decimation step followed by an upsampling step (I'm
using upsampling to refer to the initial step in interpolation,
inserting extra zeros as needed). This performs the frequency domain
expansion/contraction operations that John talked about, so your
sampling rate could become an issue here.
2. Convolving the resulting vector with a triangle function; this is
the actual interpolation. If you don't see how this works, draw it out
and you'll see.

Therefore, convolution of the frequency domain samples with the
triangle function yields a multiplication in the time domain between
your original signal and the inverse transform of the triangle
function, which happens to be a sinc^2. That's what gives you the
envelope that you see, which, as you stated previously, is related to
the decimation factor (the decimation factor affects the length of the
triangle function, which affects the mainlobe width of the sinc).

What exactly are you trying to accomplish?

Jason

```
```> What exactly are you trying to accomplish?

Actually, I'm trying to make a parallel between the numerical and
experimental parts of my work that deals with the sound quality of
structures.
In short, I'm going in a future stage to calculate with a
vibro-acoustic tool (in the frequency domain) some sounds radiated by a
structure, from the knowledge of its vibratory velocities. But, owing
to computation limitations, I will not be able to calculate the
frequency responses (complex spectra of the noises) with a fine
frequency step (at best maybe 1 or 2 Hz), I will then have to linearly
interpolate the spectra to create additional "fictive" values and then
get a sufficient number of values (i.e. 176400) so that, when going
back to the time domain by IFFT, I can get a sound sample 4s in length
of quite good quality, i.e. involving a sampling rate compatible (i.e.
44100 Hz)  with the purpose of subjective evaluation I wanna run.
(the same as the ones I will calculate), I'd like to collect some
useful information about the value of the maximum frequency step I
could later use in simulations without significantly modyfing the
perceptual trends I would get from an evaluation of the recorded
noises, modified following the procedure we are discussing (and thus
involving a first stage of "decimation" to simulate the future coarse
frequency step used under the vibro-acoustic tool).

Sorry if I have dwelt too much, it may be a little boring or confuse !
:-))
But this is the why of the how :-)

Otherwise, I've started to implement what you mentioned in your post, I
think you hold the good way, you exactly described what I had observed.
Could you just advise me about some elements with which I do not feel
very comfortable, so that I can converge more quickly to the solution.

Notably:
>Convolving the resulting vector with a triangle function; this is
> the actual interpolation. If you don't see how this works, draw it out
> and you'll see.
I don't manage to obtain the results of the interpolation I previously
observed, when now using the convolution of the "decimated" spectrum
(in the way you specified) with the triangular function, what should be
the parameters of the triangle function (width, delay, sequence
length...) so that it could work, e.g.  for a decimation factor equal
to 4 ? Should I  perform the convolution for the whole spectrum (i.e.
[0;Fe[) or the partial spectrum ([0; Fe/2], then reconstruction), on
the complex values or separately on real and imaginary parts ?

> Therefore, convolution of the frequency domain samples with the
> triangle function yields a multiplication in the time domain between
> your original signal and the inverse transform of the triangle
> function, which happens to be a sinc^2. That's what gives you the
> envelope that you see, which, as you stated previously, is related to
> the decimation factor (the decimation factor affects the length of the
> triangle function, which affects the mainlobe width of the sinc).
>
To cancel afterwards the effects of multiplication in the time domain
of the signal by the function sinc^2, would the appropriate correction
consist in dividing the signal, that I've got by performing the IFFT on
the transformed spectrum (e.g. interpolated using the convolution
process), by this same function sinc^2 ? Nan values may occur for zeros
values of the function, don't they ? Or must this correction be applied
in another more appropriate way ?

Once again thank you very much for your help and your patience,

Scoubi.

```
```> I don't manage to obtain the results of the interpolation I previously
> observed, when now using the convolution of the "decimated" spectrum
> (in the way you specified) with the triangular function, what should be
> the parameters of the triangle function (width, delay, sequence
> length...) so that it could work, e.g.  for a decimation factor equal
> to 4 ? Should I  perform the convolution for the whole spectrum (i.e.
> [0;Fe[) or the partial spectrum ([0; Fe/2], then reconstruction), on
> the complex values or separately on real and imaginary parts ?
> I don't manage to obtain the results of the interpolation I previously
> observed, when now using the convolution of the "decimated" spectrum
> (in the way you specified) with the triangular function, what should be
> the parameters of the triangle function (width, delay, sequence
> length...) so that it could work,

For this sort of interpolation, you want your triangle waveform to ramp
from zero to one in 4 samples, so that when the triangle is at its
sample rate) are zeroed out. So, the vector would look like:

t = [0 0.25 0.5 0.75 1 0.75 0.5 0.25 0];

I haven't worked the math out fully, but you should be able to get the
expected result by performing the convolution on the entire spectrum.
It might be easier to do if you use fftshift() to put the frequencies
in a more intuitive order beforehand (you will have to undo it before
you ifft() though). If I get a chance, I'll try to work up this example
in MATLAB.

The zeros on the end are just for clarity and obviously can be omitted.
As you pointed out, dividing the sinc^2 out at the end isn't really an
option; you'll run into numerical stability issues when dividing by
numbers close to zero. You might try a "smarter" interpolation method;
since ideal interpolation uses sinc functions, you might use an