
Need help for some simple questions

Started by doggie January 1, 2006
Hi everyone, I'm quite new to DSP and MATLAB and have some questions here.
Hope you guys can help. Any advice would be appreciated.

1) clearSpeech = wavread('clear_car_modified');                           
      
clearSpeech = clearSpeech(1:100);
y=fft(clearSpeech);
%soundsc(clearSpeech);
Y=real(ifft(y));
%soundsc(Y)

The problem is that I can hear the clear speech, but when I use soundsc on Y, I hear
nothing. Shouldn't I hear clearSpeech when I play soundsc(Y)?

2) I'm experimenting with a weiner (or wiener) filter.

Is it better to formulate the algorithm in the frequency domain or time
domain and why?

Using
    y(k) = \sum_i w_i(k) x(k-i)
    e(k) = d(k) - y(k)
    w_i(k+1) = w_i(k) + 2\mu e(k) x(k-i)

or

Using
    H(f) = S(f)^2 / (S(f)^2 + N(f)^2)



3) weiner filtering requires clean speech. Does that mean we have to do
spectral subtraction to estimate the clean speech before weiner filtering?
If so, wouldn't weiner filtering be more computationally intensive than
spectral subtraction?

4) What is the purpose of the overlap add method? Why are we doing it?


Thanks. 



doggie said the following on 01/01/2006 18:18:
> Hi everyone, I'm quite new to DSP and matlab and have some questions here.
> Hope you guys can help. Any advice would be appreciated.
>
> 1) clearSpeech = wavread('clear_car_modified');
>
> clearSpeech = clearSpeech(1:100);
> y=fft(clearSpeech);
> %soundsc(clearSpeech);
> Y=real(ifft(y));
> %soundsc(Y)
>
> problem is I can hear the clear speech but when I use soundsc on Y, I hear
> nothing. Shouldn't I hear clearspeech when I play soundsc(Y)?
That code works fine when I try it (look at (clearSpeech - Y) to show that they're the same except for rounding errors). Although if you're only listening to 100 samples, it's not going to sound very impressive.
> 4) What is the purpose of the overlap add method? Why are we doing it?
I assume you're talking about overlap-and-add when using an FFT on successive blocks of a stream of data?

If you just take the FFT of a block of data, you are effectively multiplying the original time-domain data by a rectangular window, which is equivalent to convolution by a sinc pulse in the frequency domain, i.e.:

    FFT{x[n].rect(n/N)} <==> (X[k] # sinc(kN))

[Where rect(n/N) represents a rectangular waveform of length N (FFT block length), X[k] = DFT{x[n]}, and using # to represent convolution. The scaling factors are undoubtedly wrong, so don't quote me on this, but hopefully you get the idea.]

This leads to "spectral leakage" - peaks in the FFT data are smeared. To reduce this effect, each block is typically multiplied by another window shape (e.g. Hamming, triangular, etc.) before the FFT is applied, i.e. we now do:

    FFT{x[n].w[n]}

The convolution problem above still occurs; however the Fourier Transform of each of these windows has lower sidelobes than that of the rectangular window, so the spectral leakage effect is greatly reduced.

However, when you perform an IFFT on each block and string them back together in an attempt to recreate the original time-domain waveform, there's now a periodic amplitude modulation (we've effectively multiplied x[n] by a periodic version of w[n]) - not a good thing.

To avoid this, you typically overlap successive FFTs by 50% (other values will work too) in the time-domain - when the IFFT outputs are added back together in a similar fashion, the amplitude modulation is eliminated. This works because we choose w[n] so that:

    w[n] + w[n - N/2] = 1

[NOTE: the maths here is *not* meant to be MATLAB notation.]

--
Oli
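The condition w[n] + w[n - N/2] = 1 can be checked numerically. The following is only a sketch: the block length of 256, the choice of 10 blocks, and the hand-written periodic Hann window are assumptions made for illustration, not something taken from the posts above.

```matlab
% Sketch: check that 50%-overlapped periodic Hann windows sum to 1.
N   = 256;                                % block length (assumed)
hop = N/2;                                % 50% overlap
n   = (0:N-1)';
w   = 0.5*(1 - cos(2*pi*n/N));            % periodic Hann, written out by hand

numBlocks = 10;                           % number of overlapped blocks (assumed)
total = zeros(hop*(numBlocks+1), 1);      % running sum of the shifted windows
for b = 0:numBlocks-1
    idx = b*hop + (1:N);                  % samples covered by block b
    total(idx) = total(idx) + w;
end

% Away from the first and last half block, two windows always overlap.
interior     = total(hop+1 : end-hop);
maxDeviation = max(abs(interior - 1));    % how far the sum strays from 1
```

Only the first and last half block are covered by a single window, which is why the ends of a reconstructed signal are normally ignored or padded.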
doggie wrote:
> Hi everyone, I'm quite new to DSP and matlab and have some questions here. > Hope you guys can help. Any advice would be appreciated.
Lots of the people here do teaching for a living, and recognize a verbatim homework assignment when they see it. No one here does the homework for students.

Why don't you have an attempt at the assignment first, and come back with the questions you have particular problems with? You will learn a lot more by doing that, and you will avoid grumpy responses like this one.

Rune
Hi, this is really not homework or any assignment. It's just that most textbooks I've read just say overlap add etc. without explaining in more detail. The rest are just my thoughts. Maybe these questions are too trivial for you guys here, so you guys thought it's some sort of homework. :) No offence, just to clarify.

By the way, thanks Oli, I got what you mean. The overlap add will remove the effect of the window if we ensure that w(n) + w(n - N/2) = 1. I did a few examples on paper, and now the concept is clearer. Many thanks.
doggie wrote:
> Hi, this is really not homework or any assignment.
Yes, now that I read your post a lot more carefully I see that the first and third questions are your own. Questions 2 and 4 could easily be homework questions, though. Sorry.
> But it's just that most textbooks I've read just say overlap add etc.
> without explaining in more detail. The rest are just my thoughts. Maybe
> these questions are too trivial for you guys here, so you guys thought
> it's some sort of homework. :)
You would be surprised to see what questions spark the most intense (on topic) discussions here. Most of them could be asked in the first semester of the first intro course in DSP.

It's the tendency of students to post homework here without having a go at things themselves first that tends to make a mess of things. It is not very often one sees a post with the questions sorted out like yours, let alone with a question phrased like your question 4, that is not homework.

Again, my mistake.
> no offence,
None taken.
> just to clarify.
Likewise.

By the way, the filter you are playing with is a Wiener filter; note the capital "W" and that the "i" comes before the "e". The filter is named after Norbert Wiener, who did just about enough for DSP to earn the honor of people spelling his name correctly.

Rune
Thanks for your advice. I do agree that we learn better by making mistakes and understanding them. I'm not really asking for full answers, but some prompting or direction would really help. :)
doggie wrote:
...
> 4) What is the purpose of the overlap add method? Why are we doing it?
To avoid circular convolution, also called time-domain aliasing. Spectral leakage of the FFT has nothing to do with this.

Indeed, if you perform frame windowing and frequency-domain filtering (as in FFT / multiply / IFFT), then windowing with anything but a rectangular window will result in a periodic modulation of the filter kernel, with the period equal to the overlap length.

Regards,
Andor
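A minimal sketch of that use of overlap-add, FFT-based filtering of a long signal in rectangular blocks, is given below. The test signal, the 16-tap moving-average filter, and the block and FFT sizes are all assumed values chosen for illustration. The last three lines repeat one block without zero-padding to show the wrap-around that the method is designed to avoid.

```matlab
% Sketch: overlap-add FFT filtering compared against direct convolution.
x = sin(2*pi*0.01*(0:999)) + 0.5*cos(2*pi*0.2*(0:999));   % test input (assumed)
h = ones(1,16)/16;                        % 16-tap moving average (assumed)

L    = 128;                               % input samples per block
M    = length(h);
Nfft = 256;                               % must be at least L + M - 1
H    = fft(h, Nfft);                      % zero-padded filter spectrum

yOla = zeros(1, length(x) + M - 1);
for s = 1:L:length(x)
    blk = x(s : min(s+L-1, length(x)));   % rectangular block, no taper
    yb  = real(ifft(fft(blk, Nfft) .* H));% linear convolution of this block
    len = length(blk) + M - 1;            % output length for this block
    yOla(s : s+len-1) = yOla(s : s+len-1) + yb(1:len);    % tails overlap and add
end

yDirect  = conv(x, h);
olaError = max(abs(yOla - yDirect));      % expected to be at rounding level

% Same first block without zero-padding: the tail wraps around to the start.
blk0      = x(1:L);
yCirc     = real(ifft(fft(blk0) .* fft(h, L)));
wrapError = max(abs(yCirc(1:M-1) - yDirect(1:M-1)));
```

Each block's output is M - 1 samples longer than its input, and it is those tails that overlap the start of the next block and get added.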
OK, I've been grumpy enough for one day. I'll try to help out.

doggie wrote:
> Hi everyone, I'm quite new to DSP and matlab and have some questions here.
> Hope you guys can help. Any advice would be appreciated.
>
> 1) clearSpeech = wavread('clear_car_modified');
>
> clearSpeech = clearSpeech(1:100);
> y=fft(clearSpeech);
> %soundsc(clearSpeech);
> Y=real(ifft(y));
> %soundsc(Y)
>
> problem is I can hear the clear speech but when I use soundsc on Y, I hear
> nothing. Shouldn't I hear clearspeech when I play soundsc(Y)?
Yes, you should. From the above code, Y==clearSpeech to within numerical precision. Try

delta=max(abs(Y-clearSpeech));

delta/max(abs(Y)) should be on the order of 10e-12 or so. If this really is the case, you have an issue with your sound card. Do you need to scale and round it to integers yourself?
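The check can be tried without the original WAV file. In this sketch a one-second 440 Hz tone at an assumed 8 kHz sample rate stands in for clearSpeech; neither value comes from the original post.

```matlab
% Sketch: FFT followed by IFFT should return the input to within rounding.
fs = 8000;                                % assumed sample rate, Hz
t  = (0:fs-1)'/fs;                        % one second of sample times
clearSpeech = 0.5*sin(2*pi*440*t);        % stand-in for the WAV data

y = fft(clearSpeech);
Y = real(ifft(y));

delta    = max(abs(Y - clearSpeech));
relError = delta / max(abs(Y));           % expected to be tiny

% A 100-sample slice lasts only 100/fs seconds, too short to hear as speech.
sliceDuration = 100/fs;
% soundsc(Y, fs);                         % uncomment to listen to the full second
```

At this assumed rate, clearSpeech(1:100) covers 12.5 ms, so a click or apparent silence from soundsc is what one would expect even when Y matches the input.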
> 2) I'm experimenting with weiner or (wiener) filter,
>
> Is it better to formulate the algorithm in the frequency domain or time
> domain and why?
>
> Using
>     y(k) = \sum_i w_i(k) x(k-i)
>     e(k) = d(k) - y(k)
>     w_i(k+1) = w_i(k) + 2\mu e(k) x(k-i)
>
> or
>
> Using
>     H(f) = S(f)^2 / (S(f)^2 + N(f)^2)
It depends on the data. Do you have access to the spectrum estimates S(f)^2 and N(f)^2? If so, how reliable are they? Do you need to update the algorithm as you go? If so, how often?

As a general rule of thumb, you do the filter design and analysis (i.e., find out how the filter works or specify what you want the filter to do) in the frequency domain, and the implementation (actually computing the numbers) in the time domain.
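For the frequency-domain form, here is a rough sketch of how the gain would be computed and applied once the two spectrum estimates exist. The shapes of Ps and Pn below are invented purely for illustration (a low-pass signal spectrum and flat noise), and the noisy frame is a random placeholder.

```matlab
% Sketch: per-bin Wiener gain from assumed signal and noise power spectra.
Nfft = 256;
k    = (0:Nfft-1)';
% Assumed power spectra: signal concentrated at low frequencies, flat noise.
Ps = 10 ./ (1 + (min(k, Nfft-k)/8).^2);   % symmetric in k, as for a real signal
Pn = 0.5 * ones(Nfft, 1);

H = Ps ./ (Ps + Pn);                      % gain between 0 and 1 in every bin

% Apply to one noisy frame: scale each FFT bin, then return to the time domain.
noisyFrame = randn(Nfft, 1);              % placeholder frame
cleanEst   = real(ifft(H .* fft(noisyFrame)));
```

The gain is close to 1 where the assumed signal power dominates and close to 0 where the noise dominates, so everything depends on how good the two estimates are.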
> 3) weiner filtering requires clean speech.
Does it? What do you mean by that? How does a Wiener filter differ from an ordinary filter?
> Does that mean we have to do spectral subtraction to estimate the clean
> speech before weiner filtering? If so, wouldn't weiner filtering be more
> computationally intensive than spectral subtraction?
Spectrum subtraction requires complete knowledge of the magnitude and phase of both signal and noise. Can you get that? Filtering requires estimates of relative bandwidths and magnitudes. Slightly easier to come by. Computational costs are irrelevant if you can't get your hands on the required input data.
> 4) What is the purpose of the overlap add method? Why are we doing it?
The overlap-add and overlap-save methods are used when you want to work in frequency domain with very long data sequences. Instead of computing the DFT of one huge data set, you chop it up in shorter batches, do your computations in frequency domain, and then splice the whole thing back together in time domain.

Rune
Hi, I noticed that both the time and frequency domain implementations need clean speech. So the easiest way I can think of is to use an energy detection algorithm that treats the first few frames as noise and computes the mean noise energy. Subsequent frames above this threshold will be treated as signal, and those below will be treated as noise, with the mean noise power updated on each noise frame.

After that, the clean speech is obtained by subtracting the mean noise from the noisy speech, and hence we are able to compute the Wiener equations. These are just some of my thoughts. However, I have tried listening to the effect of simple spectral subtraction. It isn't very good, and I guess it's because the spectral estimates are not very accurate. Thus, if we obtain the (not so accurate) clean speech through spectral subtraction and use it subsequently for Wiener filtering, it won't really do us any good, will it?

As for the overlap method, I tried to check whether my interpretation is correct using some simple MATLAB code I've written myself. I feel that the main objective is to minimise the effect of the window when we do block processing.

y = ones(1,3000);                 % our input signal
window = hanning(256);
k = 1;
for i = 1:128:(length(y)-256)     % frames taken with 50% overlap
    r(k,:) = (window').*y(i:(i+255));
    k = k+1;
end
%=================================================
% assuming we did something to their fft spectrum here
%=================================================
% now we want to reconstruct the time domain signal without overlap add,
% using every second frame so that the blocks sit end to end
g = 1;
for b = 1:256:(length(y)-256)
    Y(1,b:b+255) = r(g,:);
    g = g+2;
end
% plot(Y): this will show us our signal amplitude modulated by the window,
% which is undesirable

% Let's try overlap add to get rid of the windowing effect
h = 1;
Z = zeros(1,3000);
for m = 1:128:(length(y)-256)
    Z(1,m:m+255) = Z(1,m:m+255) + r(h,:);
    h = h+1;
end
plot(Z) % now we can see that the effect of the window has been minimised
% pls ignore both the ends of the plot.
% i know it's very amateurish code. :) Any comments are welcome
% Thanks for all the advice
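The noise-tracking idea described in this post could be sketched roughly as follows. Everything here is an assumption for illustration: the frame length, the number of leading noise frames, the factor of 2 in the threshold, the smoothing constant alpha, and the synthetic data (a tone switched on at frame 6 over low-level random noise). It is not a tested speech enhancer.

```matlab
% Sketch only: frame sizes, threshold factor and smoothing constant are
% assumed values, and the data is synthetic.
N = 256;  numFrames = 20;  numNoiseFrames = 5;
tone   = sin(2*pi*16*(0:N-1)'/N);               % lands exactly on FFT bin 17
frames = 0.1*randn(N, numFrames);               % background noise in every frame
frames(:, 6:end) = frames(:, 6:end) + repmat(tone, 1, numFrames-5);

P  = abs(fft(frames)).^2;                       % power spectrum of each column
Pn = mean(P(:, 1:numNoiseFrames), 2);           % initial noise estimate
thresh = 2*sum(Pn);                             % energy threshold (assumed factor 2)
alpha  = 0.9;                                   % noise smoothing (assumed)

isSpeech = false(1, numFrames);
Ps = zeros(N, numFrames);                       % estimated clean power spectra
H  = zeros(N, numFrames);                       % Wiener-style gains
for m = 1:numFrames
    if sum(P(:,m)) > thresh
        isSpeech(m) = true;                     % treat as signal, freeze Pn
    else
        Pn = alpha*Pn + (1-alpha)*P(:,m);       % update the noise estimate
    end
    Ps(:,m) = max(P(:,m) - Pn, 0);              % subtraction, clipped at zero
    H(:,m)  = Ps(:,m) ./ (Ps(:,m) + Pn);        % gain built from the estimates
end
```

Because the gain is built from the subtracted spectrum, any error in the noise estimate carries straight through to the filter, which is the concern raised in the post.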
doggie wrote:
> Hi, I noticed that both the time and frequency domain implementations need
> clean speech. So the easiest way I can think of is to use an energy
> detection algorithm that treats the first few frames as noise and computes
> the mean noise energy. Subsequent frames above this threshold will be
> treated as signal, and those below will be treated as noise, with the mean
> noise power updated on each noise frame.
>
> After that, the clean speech is obtained by subtracting the mean noise from
> the noisy speech, and hence we are able to compute the Wiener equations.
> These are just some of my thoughts. However, I have tried listening to the
> effect of simple spectral subtraction. It isn't very good, and I guess it's
> because the spectral estimates are not very accurate. Thus, if we obtain
> the (not so accurate) clean speech through spectral subtraction and use it
> subsequently for Wiener filtering, it won't really do us any good, will
> it?
One of the problems with the Wiener filter is that you need the autocovariance of both the useful signal (is that what you mean by "clean"?) and the noisy signal. One way of doing that is to have a noise-free reference data set measured in advance, and one set of noisy data measured in the field, during installation of the filter. This requirement seriously affects where and how the Wiener filter can be used.

One might try spectrum subtraction instead if the covariance of the noise is available, but...
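The time-domain update from question 2 shows the same dependence: it cannot run without a reference d(k). The sketch below uses that update to identify a hypothetical 4-tap system. The taps in hTrue, the step size mu, and the white-noise input are all assumptions, and d(k) is generated noise-free, which is exactly the luxury that is usually missing in the field.

```matlab
% Sketch: LMS identification of an unknown FIR system from x(k) and d(k).
numTaps = 4;  mu = 0.01;                  % both assumed
hTrue = [0.5; -0.3; 0.2; 0.1];            % hypothetical system that shapes d(k)
K = 5000;                                 % number of samples (assumed)
x = randn(K, 1);                          % white input
d = filter(hTrue, 1, x);                  % reference ("desired") signal

w = zeros(numTaps, 1);
e = zeros(K, 1);
for k = numTaps:K
    xk   = x(k:-1:k-numTaps+1);           % [x(k); x(k-1); ...; x(k-numTaps+1)]
    y    = w' * xk;                       % y(k) = sum_i w_i(k) x(k-i)
    e(k) = d(k) - y;                      % e(k) = d(k) - y(k)
    w    = w + 2*mu*e(k)*xk;              % w_i(k+1) = w_i(k) + 2 mu e(k) x(k-i)
end
weightError = max(abs(w - hTrue));        % distance from the true taps
```

With a clean reference the weights settle on hTrue; with only a noisy or estimated reference they settle on whatever that reference implies.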
> As for the overlap method, I tried to check whether my interpretation is
> correct using some simple MATLAB code I've written myself. I feel that the
> main objective is to minimise the effect of the window when we do block
> processing.
Yep, that's the purpose of the overlap-add method.

Rune