Forums

Noise variance estimate

Started by DWT February 3, 2009
I'm trying to implement a simple denoising module based on wavelet
filter-banks (DWT) with VAD. The noise is white, but I'm going to use other
noises as well (car, babble, pink, ...).

1) How can one estimate the noise variance on each subband?

2) Would it be better to build the VAD on the basis of the decomposition, or
as a stand-alone module (energy feature)? The main problem is that I'm not
sure how to apply the standard frame-by-frame (frame = 30 ms, overlap = 15 ms)
energy calculation to the DWT decomposition, since each level contains half
as many samples as the previous one.
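For concreteness, here is the kind of per-subband estimate I've seen suggested for question 1: the robust MAD estimator applied to the detail coefficients at each level (the hand-rolled Haar transform is just to keep the sketch self-contained; for white Gaussian noise an orthonormal DWT preserves the variance in every subband, so each estimate should come out near the input variance):

```python
import numpy as np

def haar_step(x):
    """One level of an orthonormal Haar DWT: approximation and detail."""
    x = x[:len(x) // 2 * 2]          # drop a trailing odd sample
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

rng = np.random.default_rng(0)
x = rng.standard_normal(8192)        # white Gaussian noise, variance 1

variances = []
a = x
for level in range(1, 4):            # 3-level decomposition
    a, d = haar_step(a)
    # Robust estimate from the detail coefficients:
    # sigma ~= median(|d|) / 0.6745, exact for Gaussian noise.
    sigma = np.median(np.abs(d)) / 0.6745
    variances.append(sigma ** 2)
    print(f"level {level}: {len(d)} coeffs, est. noise variance {sigma**2:.2f}")
```

The median makes the estimate insensitive to the few large coefficients that carry the signal, which is why it is usually taken from the finest detail level.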

Thanks for taking the time reading my questions :)

On Feb 4, 12:53 am, "DWT" <zwfilte...@yahoo.com> wrote:
> 1) How can one estimate the noise variance on each subband?
Sum of the squares divided by the number of points! You can also do this
recursively - there are a number of methods.

Hardy
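For example, a recursive form of that same sum-of-squares average (a sketch; the names are mine):

```python
import random

def running_power():
    """Recursive mean of squares: after n samples this equals
    sum(x^2) / n, i.e. the variance of zero-mean noise."""
    n = 0
    p = 0.0
    def update(x):
        nonlocal n, p
        n += 1
        p += (x * x - p) / n     # recursive form of sum-of-squares / n
        return p
    return update

random.seed(1)
est = running_power()
v = 0.0
for _ in range(100_000):
    v = est(random.gauss(0.0, 2.0))   # white noise with variance 4
print(round(v, 2))                    # close to the true variance, 4.0
```

For nonstationary noise (car interiors, say), replacing the 1/n step with a fixed forgetting factor lets the estimate track slow changes instead of averaging over the whole recording.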
> Sum of the squares divided by the number of points! You can also do
> this recursively - there are a number of methods.
>
> Hardy

Hardy,

Thanks for the prompt reply. Is it possible to estimate the noise variance in
frames containing 'speech' data, or only in the 'non-speech/silence' frames?

Serge
On Feb 4, 8:33 pm, "DWT" <zwfilte...@yahoo.com> wrote:
> Is it possible to estimate noise variance in the frames with 'speech' data
> or only in the 'non-speech/silence' frames?
Normally noise would be present on its own when speech is absent. Therefore
you would need a voice activity detector.

Hardy
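A minimal sketch of that idea: flag a frame as speech when its energy exceeds a margin above the tracked noise power, and update the noise estimate only on the non-speech frames (the margin and smoothing constants here are hypothetical tuning values, just for illustration):

```python
import numpy as np

def vad_noise_tracker(frames, margin=3.0, alpha=0.05):
    """Energy-based VAD with noise tracking.

    A frame is 'speech' when its power exceeds `margin` times the
    tracked noise power; the noise estimate is smoothed only over
    frames flagged as non-speech."""
    noise_pow = np.mean(frames[0] ** 2)   # assumes the recording starts in silence
    flags = []
    for f in frames:
        p = np.mean(f ** 2)
        is_speech = p > margin * noise_pow
        if not is_speech:
            noise_pow = (1 - alpha) * noise_pow + alpha * p
        flags.append(bool(is_speech))
    return flags, noise_pow

# Toy signal at 8 kHz: 1 s of noise, a louder tone burst, then noise again;
# 30 ms frames (240 samples), no overlap to keep the sketch simple.
rng = np.random.default_rng(3)
sig = rng.standard_normal(16000) * 0.1
sig[8000:12000] += np.sin(2 * np.pi * 200 * np.arange(4000) / 8000.0)
frames = sig[: len(sig) // 240 * 240].reshape(-1, 240)
flags, noise_pow = vad_noise_tracker(frames)
```

The same decision could just as well be made per subband on the DWT coefficients; the only change is that the frame length in coefficients halves at each level.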
On Feb 4, 8:33 am, "DWT" <zwfilte...@yahoo.com> wrote:
> Is it possible to estimate noise variance in the frames with 'speech' data
> or only in the 'non-speech/silence' frames?
What you need is a model, not ad hoc procedures.

illywhacker;
On 3 Feb, 12:53, "DWT" <zwfilte...@yahoo.com> wrote:
> I'm trying to implement a simple denoising module based on wavelet
> filter-banks (DWT) with VAD. Noise is white, but I'm going to use other
> noises as well (car, babble, pink, ...)
You first of all need to contemplate very carefully what you mean by 'noise'.
In DSP the term is usually understood as a signal with certain statistical
properties, as with AWGN and pink noise. However, in day-to-day language
'noise' also means 'unwanted' or 'uninteresting' sound, which is interpreted
subjectively according to the context: if your main interest is to talk with
John, Jack's speaking on the phone in the corner of the room is 'noise', even
if Jack is calling somebody to make a deal on your behalf. If you and Jack
were alone in the room, the same phone call of Jack's would be very
interesting to you.

So you need to discriminate between the 'signal of interest', which usually
has one set of statistical properties; 'noise' in the DSP sense, which has
other statistical properties; and 'interfering sources', which have the same
kind of statistical properties as the signal of interest but are of no
interest to you.

It's usually not too difficult to get at least an impression of how much AWGN
is present. You just have to look around your formulas and find the error
terms here and there. These are usually related to noise. They are related to
numerical inaccuracy and model mismatch as well, so be careful not to
over-interpret such factors, but you can get the overall impression.

The difficult part is how to handle interfering sources, like the 'cars' and
'babble' you mention above. (I assume that by 'babble' you mean people
talking in the background.) These sources are very different, so it might be
possible to separate the sound of a car engine from a speech signal, but you
will have problems separating background babble from a speech signal, as the
only difference between them is which speaker you are interested in listening
to. Similarly, speech ought to be separable from steady-state engine noise,
but the sounds from two different engines or motors might not be separable.

Rune