comp.dsp | Question regarding the validity of a SNR calculation

Hello,

I have a question regarding the computation of SNR.

The case I am working on is a speech processing algorithm which
suppresses background noise. The input signal is a noisy speech
sequence defined as x[k]=s[k]+n[k] where s is clean speech and
n is quasi-stationary pink noise.

I would like to know how well the algorithm performs.

The SNR of the input signal is calculated as:

SNR=10*log10(var(s)/var(n))

where var() stands for variance.
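
As a minimal sketch of the input SNR formula above, assuming NumPy and using synthetic stand-ins (a sine for the clean speech, white rather than pink noise; the sample rate, tone frequency, and the helper name snr_db are illustrative choices, not from the post):

```python
import numpy as np

def snr_db(signal, noise):
    """SNR in dB computed as 10*log10(var(signal)/var(noise))."""
    return 10.0 * np.log10(np.var(signal) / np.var(noise))

rng = np.random.default_rng(0)
k = np.arange(16000)
s = np.sin(2 * np.pi * 200 * k / 8000)   # stand-in for clean speech s[k]
n = 0.1 * rng.standard_normal(k.size)    # stand-in for noise n[k] (white, not pink)
x = s + n                                # noisy input x[k] = s[k] + n[k]

print(f"input SNR: {snr_db(s, n):.2f} dB")
```

Because the ratio uses variances, this only equals a power ratio when both components have zero mean.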

To obtain an estimate of how well the algorithm performs,
I also need to calculate the SNR of the output.

I do this by setting x[k]=n[k]. In that way I obtain the
processed background noise Yn[k].

I then set x[k]=s[k]+n[k] and obtain the estimated
clean speech Ys[k].

Then I calculate:

SNR=10*log10(var(Ys)/var(Yn))


Any comments to this approach?


Cheers...

Reply by PeteS ● September 24, 2006

John wrote:
> Hello,
>
> I have a question regarding the computation of SNR.
>
> The case I am working on is a speech processing algorithm which
> suppresses background noise. The input signal is a noisy speech
> sequence defined as x[k]=s[k]+n[k] where s is clean speech and
> n is quasi-stationary pink noise.
>
> I would like to know how well the algorithm performs.
>
> The SNR of the input signal is calculated as:
>
> SNR=10*log10(var(s)/var(n))
>
> where var() stands for variance.
>
> To obtain an estimate of how well the algorithm performs,
> I also need to calculate the SNR of the output.
>
> I do this by setting x[k]=n[k]. In that way I obtain the
> processed background noise Yn[k].
>
> I then set x[k]=s[k]+n[k] and obtain the estimated
> clean speech Ys[k].
>
> Then I calculate:
>
> SNR=10*log10(var(Ys)/var(Yn))
>
>
> Any comments to this approach?
>
>
> Cheers...

I have a practical comment. The 10log10 statements will only be true if
you are measuring power. If you are measuring voltage (far more common)
or even current then you'll need to use 20log10 [x]

(I know that is only strictly true if the load resistance for each element
[the signal source and the signal + noise source] is the same, but in
this case it should work perfectly.)

Cheers

PeteS

Reply by John ● September 24, 2006

> I have a practical comment. The 10log10 statements will only be true if
> you are measuring power. If you are measuring voltage (far more common)
> or even current then you'll need to use 20log10 [x]
>

Thanks for replying :-)

Is variance not the same as power? The power spectrum shows the distribution
of variance over a set of frequencies. Right?

As far as I remember you use a multiplication factor of 20 when you are
measuring the ratio of squared amplitude, that is :

SNR= 20 log10 (S_amplitude^2 / N_amplitude^2)

Correct me if I am mistaken :-)

Reply by Mark Borgerding ● September 24, 2006

John wrote:
>> I have a practical comment. The 10log10 statements will only be true if
>> you are measuring power. If you are measuring voltage (far more common)
>> or even current then you'll need to use 20log10 [x]
>>
> 
> Thanks for replying :-)
> 
> Is variance not the same as power? The power spectrum shows the distribution
> of variance over a set of frequencies. Right?

If the average value is zero, yes.  Otherwise the power is greater than
the variance.
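
A small numeric check of this point, assuming NumPy and a made-up four-sample signal with a nonzero mean: mean power equals variance plus the squared mean.

```python
import numpy as np

x = np.array([1.0, 3.0, 1.0, 3.0])   # illustrative signal with mean 2

power = np.mean(x ** 2)       # mean-square value: 5.0
variance = np.var(x)          # spread around the mean: 1.0
mean_sq = np.mean(x) ** 2     # DC contribution: 4.0

print(power, variance + mean_sq)   # both are 5.0
```

Subtracting the mean before measuring makes the two quantities coincide.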

--
Mark Borgerding

Reply by John ● September 24, 2006

> If the average value is zero, yes.  Otherwise the power is greater than
> the variance.

Well, in this case the average value is zero (speech signals).

Thanks for replying :-)

Reply by PeteS ● September 24, 2006

John wrote:
> > I have a practical comment. The 10log10 statements will only be true if
> > you are measuring power. If you are measuring voltage (far more common)
> > or even current then you'll need to use 20log10 [x]
> >
>
> Thanks for replying :-)
>
> Is variance not the same as power? The power spectrum shows the distribution
> of variance over a set of frequencies. Right?
>
> As far as I remember you use a multiplication factor of 20 when you are
> measuring the ratio of squared amplitude, that is :
>
> SNR= 20 log10 (S_amplitude^2 / N_amplitude^2)
>
> Correct me if I am mistaken :-)

Well, I would usually take 10 log10(SPwr/NPwr) or 20 log10(S Amplitude/N
Amplitude), which is the same as 10 log10(S Amplitude^2/N Amplitude^2),
assuming equal resistances in the power system.

This follows from the basic identity log a^2 = 2 log a, of course; more
generally, log a^x = x log a.
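
To make the identity concrete, here is a short check, assuming NumPy and two made-up RMS amplitudes. It also shows what happens if the factor 20 is applied to a ratio of squared amplitudes.

```python
import numpy as np

s_amp, n_amp = 2.0, 0.5   # hypothetical RMS amplitudes (amplitude ratio of 4)

db_from_amplitude = 20 * np.log10(s_amp / n_amp)          # about 12.04 dB
db_from_power = 10 * np.log10(s_amp ** 2 / n_amp ** 2)    # same value
db_doubled = 20 * np.log10(s_amp ** 2 / n_amp ** 2)       # about 24.08 dB, twice too large

print(db_from_amplitude, db_from_power, db_doubled)
```

So a variance (power) ratio goes with the factor 10, and an amplitude ratio goes with the factor 20.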

Cheers

PeteS

Reply by this-email-address-is-invalid ● September 24, 2006

> To obtain an estimate of how well the algorithm performs,
> I also need to calculate the SNR of the output.
>
> I do this by setting x[k]=n[k]. In that way I obtain the
> processed background noise Yn[k].
>
> I then set x[k]=s[k]+n[k] and obtain the estimated
> clean speech Ys[k].
>
> Then I calculate:
>
> SNR=10*log10(var(Ys)/var(Yn))
>
> Any comments to this approach?

I don't have experience in this sort of quantitative evaluation of
noise reduction algorithms, but here's my take on this.

With this calculation method, it seems to me an algorithm could get an
arbitrarily high "SNR" just by detecting the presence of speech and
boosting overall output volume when speech is present.
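
A toy illustration of that concern, assuming NumPy. The function boost_when_loud is a hypothetical stand-in for such an algorithm (it is not anyone's real noise suppressor), and the signals are synthetic: it multiplies the whole input by a gain when the input variance is above a threshold, which here happens only when the speech stand-in is present.

```python
import numpy as np

def snr_db(signal, noise):
    return 10.0 * np.log10(np.var(signal) / np.var(noise))

def boost_when_loud(x, gain, threshold=0.05):
    """Hypothetical 'algorithm': apply gain only if the input looks like speech."""
    return gain * x if np.var(x) > threshold else x

rng = np.random.default_rng(0)
k = np.arange(16000)
s = np.sin(2 * np.pi * 200 * k / 8000)   # stand-in for clean speech
n = 0.1 * rng.standard_normal(k.size)    # stand-in for background noise

for gain in (1.0, 10.0, 100.0):
    Yn = boost_when_loud(n, gain)        # noise-only run: below threshold, untouched
    Ys = boost_when_loud(s + n, gain)    # speech-plus-noise run: boosted by gain
    print(f"gain {gain:6.1f}: output 'SNR' = {snr_db(Ys, Yn):.2f} dB")
```

Each tenfold increase in gain adds 20 dB to the reported figure, although the proportion of noise inside Ys never changes.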

I suggest that you find the distance between Ys[k] and s[k] instead.
For example, calculate the rms difference between them.  If you have
the time to get into it, a more perceptual distance measure might be
better than rms.  Unfortunately I can't think of any references offhand
for perceptual distance measures but I believe various people have
developed code for them to aid in the evaluation of speech compression
or noise reduction methods.  (One idea might be to apply an A-weighting
filter to the signals before computing the rms distance.)
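
A minimal sketch of the rms distance, assuming NumPy and assuming the two signals are already time-aligned and equally long (the helper name rms_distance is illustrative). No A-weighting or other perceptual weighting is applied here.

```python
import numpy as np

def rms_distance(estimate, reference):
    """Root-mean-square difference between an estimate and its reference."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.sqrt(np.mean((estimate - reference) ** 2))

s = np.sin(2 * np.pi * 200 * np.arange(8000) / 8000)   # stand-in for clean speech
print(rms_distance(s, s))          # 0.0 for a perfect estimate
print(rms_distance(10 * s, s))     # large: boosting the output is penalized
```

Unlike the variance ratio, this measure gets worse when the output is simply made louder, but it is also sensitive to any delay or fixed gain the algorithm introduces.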

(I am assuming your work is aimed at helping human listeners rather
than, e.g., improving computer speech recognition accuracy.  Even for
the latter, A-weighting is not unreasonable.)

I think the best measurement of all is tests with human listeners
(e.g., having listeners rate the quality of the processed and
unprocessed output to obtain Mean Opinion Scores, or testing
intelligibility if improving intelligibility is what you are after).
But that may be time-consuming or expensive.

By the way I have some noise reduction code linked at
http://www.icsi.berkeley.edu/Speech/papers/gelbart-ms/pointers/
which you might find interesting for comparison. 

Good luck,
David

Reply by this-email-address-is-invalid ● September 24, 2006

this-email-address-is-invalid wrote:

> I suggest that you find the distance between Ys[k] and s[k] instead.
> For example, calculate the rms difference between them.  If you have
> the time to get into it, a more perceptual distance measure might be
> better than rms.  Unfortunately I can't think of any references offhand
> for perceptual distance measures but I believe various people have
> developed code for them to aid in the evaluation of speech compression
> or noise reduction methods.  (One idea might be to apply an A-weighting
> filter to the signals before computing the rms distance.)
>

For distance calculation, Bryan Pellom's Objective Speech Quality
Assessment toolkit at http://cslr.colorado.edu/rspl/rspl_software.html
might be of interest.

Reply by John ● September 24, 2006

Thanks for all the links :-)

I appreciate it.

The reason why I asked about how SNR should be calculated is that
it seems "wrong" to me to calculate the error signal as the difference
between the original clean speech signal and the estimated speech
signal. While it _is_ a true error signal, it doesn't make any sense
to calculate SNR using this kind of error signal as a reference in
this context. Why?

First of all, the SNR of the input is calculated as:

10log10(var(s)/var(n))

If we assume that the algorithm A operates on the input signal in
a close-to-linear way and the input signal is defined as x=s+n, then

A(x)=A(s)+A(n)

The SNR of the output should then - in the name of consistency - be
calculated as

10log10(var(A(s))/var(A(n)))
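
For the case where A really is linear, the calculation can be sketched as follows, assuming NumPy. Here A is a hypothetical stand-in (a 5-tap moving average), and the speech and noise are synthetic (a sine and white noise), so the numbers say nothing about a real noise suppressor.

```python
import numpy as np

def snr_db(signal, noise):
    return 10.0 * np.log10(np.var(signal) / np.var(noise))

def A(x):
    """Hypothetical stand-in for the algorithm: 5-tap moving average (linear)."""
    return np.convolve(x, np.ones(5) / 5, mode="same")

rng = np.random.default_rng(0)
k = np.arange(16000)
s = np.sin(2 * np.pi * 200 * k / 8000)   # stand-in for clean speech
n = 0.5 * rng.standard_normal(k.size)    # stand-in for noise (white, not pink)

snr_in = snr_db(s, n)
snr_out = snr_db(A(s), A(n))             # components processed separately
improvement = snr_out - snr_in

print(f"in {snr_in:.2f} dB, out {snr_out:.2f} dB, gain {improvement:.2f} dB")
```

The separate processing is only meaningful because A(s + n) equals A(s) + A(n) for this filter; an adaptive suppressor generally behaves differently on n alone than on s + n.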

A(n) is obtained by sending the noise component for a given SNR input
signal through the algorithm; that is setting x=n for a given SNR.

I haven't thought in detail about the validity of this approach and this
is why I am posting the question. But intuitively it seems like the right
approach.

To verify the "linear" properties of the algorithm I tried to set x=s+n
for a given SNR and saved the output O1 of the algorithm.

I then set x=n for the same SNR and saved that output O2.

I then played O1-O2 and it definitely sounds better than O1 so I guess
that implies that O2 _is_ the remaining noise component after processing.
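
A numeric version of this check might look like the sketch below, assuming NumPy. It measures how far O1 - O2 is from A(s). The two "algorithms" are hypothetical stand-ins chosen for contrast: a moving average (linear) and a simple sample-wise noise gate (nonlinear).

```python
import numpy as np

def linear_filter(x):
    """3-tap moving average: linear, so superposition holds."""
    return np.convolve(x, np.ones(3) / 3, mode="same")

def noise_gate(x, threshold=0.2):
    """Zeroes samples with small magnitude: nonlinear."""
    return np.where(np.abs(x) > threshold, x, 0.0)

def superposition_error(algorithm, s, n):
    """RMS of (O1 - O2) - A(s), where O1 = A(s + n) and O2 = A(n)."""
    o1 = algorithm(s + n)
    o2 = algorithm(n)
    return np.sqrt(np.mean((o1 - o2 - algorithm(s)) ** 2))

rng = np.random.default_rng(0)
k = np.arange(16000)
s = np.sin(2 * np.pi * 200 * k / 8000)   # stand-in for clean speech
n = 0.1 * rng.standard_normal(k.size)    # stand-in for noise

print("linear filter:", superposition_error(linear_filter, s, n))
print("noise gate:   ", superposition_error(noise_gate, s, n))
```

An error near machine precision supports treating O2 as the noise component of O1; a clearly nonzero error means the noise-only run does not represent what the algorithm does to the noise when speech is present.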

I don't know if I make any sense, so I hope some experts out there can
correct me if my approach is not valid.

Thank you.

Reply by this-email-address-is-invalid ● September 24, 2006

John wrote:

> The reason why I asked about how SNR should be calculated is that
> it seems "wrong" to me to calculate the error signal as the difference
> between the original clean speech signal and the estimated speech
> signal. While it _is_ a true error signal, it doesn't make any sense
> to calculate SNR using this kind of error signal as a reference in
> this context.

I was suggesting to use a distance measure between the original clean
speech signal and the estimated speech signal as the quality measure
instead of using SNR, not as a step in SNR calculation.  Sorry if that
wasn't clear.
