Technical discussions related to Audio Signal Processing (digital effects, acoustics, noise reduction, musical signal processing, etc).
Hi all,

I would really appreciate your ideas on something.

I am building a speech recognition system that uses a mic array to capture the sound. The incoming signal is not only speech; it is also affected by noise and room reverberation. I need to see the effect of reverberation alone, so I mixed the speech signal from a close-talking microphone (so no noise or reverberation there) with some noise I cut out of the mic array recording. So, if there is a difference in how the system deals with the mic array signal and the close-talking+noise signal, it has to be due to the reverberation.

My question is simpler than that. I need to have the same signal-to-noise ratio in both signals. Even though the added noise in the close-talking+noise signal is as loud as in the mic array signal, the speech in the mic array signal is quieter, so just adding the noise is not quite right.

Do you know how I can calculate the signal-to-noise ratio in both signals, or what software to use to do this? I was thinking I could just take the average dB value of the whole mic array signal, then the average level of the noise, and subtract it from the whole in order to get the level of the speech signal alone. Does that make sense? What software do you suggest using?

Thank you in advance,
Best regards,
Erida
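A minimal sketch of the estimate described above, assuming the speech and the noise are uncorrelated and that a noise-only stretch of the recording is available. The subtraction is done on mean powers in the linear domain rather than on dB values. The function names and the synthetic tone standing in for speech are illustrative assumptions, not part of the original setup.

```python
import math
import random

def mean_power(samples):
    """Mean square (average power) of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)

def estimate_snr_db(speech_plus_noise, noise_only):
    """SNR estimate in dB, assuming speech and noise are uncorrelated.

    The noise power is subtracted from the total power in the linear
    domain; only the final ratio is converted to dB.
    """
    p_total = mean_power(speech_plus_noise)
    p_noise = mean_power(noise_only)
    p_speech = p_total - p_noise
    if p_speech <= 0:
        raise ValueError("noise power is not below total power")
    return 10.0 * math.log10(p_speech / p_noise)

# Synthetic check: a 200 Hz tone stands in for speech (an assumption).
rng = random.Random(0)
fs = 16000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]
noise_a = [rng.gauss(0.0, 0.05) for _ in range(fs)]  # noise under the "speech"
noise_b = [rng.gauss(0.0, 0.05) for _ in range(fs)]  # separate noise-only stretch
mixed = [s + n for s, n in zip(tone, noise_a)]

true_snr_db = 10.0 * math.log10(mean_power(tone) / mean_power(noise_a))
est_snr_db = estimate_snr_db(mixed, noise_b)
print(f"true {true_snr_db:.2f} dB, estimated {est_snr_db:.2f} dB")
```

The estimate is only as good as the uncorrelated assumption. For a reverberant recording, the reverberant tail is correlated with the speech, so it ends up counted in the "speech" power here.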
On Sunday 13 August 2006 01:05, c...@yahoo.co.uk wrote:
> Hi all,

Hi Erida,

> The signal coming in is not only speech but it's also
> affected by noise and room reverberation. I need to see what's the effect
> of reverberation alone.

From this point of view, noise can be treated as a desired signal component, since you're only interested in (signal+reverb)/(signal), so it can be completely ignored.

> so I mixed the speech signal from a close-talking
> microphone (so no noise and reverberation there) with some noise I cut out
> from the mic array recording.

Replace speech by a sine wave. This will make things easy.
If you are able to measure and/or calculate your desired results, then go a step further and sweep the frequency and/or phase of your sine wave.
Finally, extend the source to an impulse. This will lead to a quite handy model of your reverberation.
Basically, this is nothing other than applying the FFT concept.
I guess that this helps to solve your issue.

> My question is simpler than that.
>
> I need to have the same noise to signal ratio in both signals. Because even
> though the added noise in the close-talking+noise signal is as loud as in
> the mic array signal, the speech in the mic array signal is quieter. So,
> just adding the noise is not quite right.

I'm not sure if I fully understand, so be patient if my answer isn't helpful:

Reverberation isn't independent of the signal that causes it. Instead, its phase is strongly related to the original signal's phase.
Adding signal levels in dB works only if the signal components are unrelated. Therefore it's not allowed in your case: your math will always lead to wrong results.

Example 1:
Two sound sources (sine waves, same frequency, same phase, same amplitude, same location). Switch one on and off. Which relation do you get?

Example 2:
Now a phase difference of 180°. Which relation do you get now?

Example 3:
Same sources, but different locations (one meter apart). Which relation do you get now? Where?

Bernhard
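The first two examples can be checked numerically. The sketch below sums two equal-amplitude sine waves of the same frequency and reports the level of the sum relative to one source alone. The function name, sample count, and the extra 90° case are assumptions added for illustration.

```python
import math

def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def level_change_db(phase_deg, n=48000, cycles=100):
    """Level of (a + b) relative to a alone, in dB.

    a and b are equal-amplitude sine waves of the same frequency;
    b is shifted by phase_deg degrees.
    """
    phase = math.radians(phase_deg)
    a = [math.sin(2 * math.pi * cycles * k / n) for k in range(n)]
    b = [math.sin(2 * math.pi * cycles * k / n + phase) for k in range(n)]
    total = rms([x + y for x, y in zip(a, b)])
    if total < 1e-9:
        # Treat numerical residue as complete cancellation.
        return float("-inf")
    return 20.0 * math.log10(total / rms(a))

in_phase = level_change_db(0)      # Example 1: about +6.02 dB
quadrature = level_change_db(90)   # about +3.01 dB, same as a power sum
anti_phase = level_change_db(180)  # Example 2: -inf, complete cancellation
print(in_phase, quadrature, anti_phase)
```

Only the 90° case matches the +3 dB that a power sum of unrelated components would predict. For Example 3, the phase difference at a listening point follows from the path-length difference to the two sources, so the result depends on where it is measured.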
Hi BVL,

Thanks for your reply. I am sorry, my first post was probably poorly worded. What I am trying to do is not add reverberation, but take reverberation out of the equation for a while. In fact, I am trying to add the correct amount of noise.

My ultimate goal is to see how reverberation *alone* degrades the system's performance. My idea was to make up a new signal by mixing the speech from the c-t mic with some noise from the distant mics. Then I would put the new wavs through the system and compare the recognition results with those of the mic array recordings. Of course, I needed the same SNR in both conditions.

So, here is what I did. My approach was to apply a window of 0.2 sec to my wav files in order to get the spectral envelope. Then I got the RMS in dB for some specified vowels and noise regions in each file. I did that for both the c-t and the mic array recordings. I estimated the SNR (by dividing the RMS value of the speech/vowels by the RMS value of the noise) and then figured out how much gain I should apply to the mic array noise before adding it to the c-t mic wavs, in order to match the mic array SNR.

It seemed quite a simple and reasonable approach. But I lack any deep knowledge of signal processing, and this became evident once more. To my distress, I got even worse recognition results than the ones for the mic array condition.

Now I am thinking of adding white Gaussian noise instead of the mic array noise (the mic array noise contains reverb as well, so this might be a more significant issue than I had previously assumed). What is your opinion on that? Still, I am not sure how to measure the SNR of my original c-t recordings.

Any suggestions? Really, I'd be grateful!
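The gain step described above can be sketched as follows, assuming the SNR is defined as the ratio of speech RMS to noise RMS expressed in dB. The function names, the 10 dB target, and the synthetic tone standing in for c-t speech are hypothetical. White Gaussian noise is used here because it is easy to generate; a recorded noise segment would be scaled the same way.

```python
import math
import random

def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def noise_gain_for_snr(speech, noise, target_snr_db):
    """Linear gain for `noise` so that rms(speech) / rms(gain * noise),
    expressed in dB, equals target_snr_db."""
    current_snr_db = 20.0 * math.log10(rms(speech) / rms(noise))
    return 10.0 ** ((current_snr_db - target_snr_db) / 20.0)

def mix_at_snr(speech, noise, target_snr_db):
    """Add the scaled noise to the speech, sample by sample."""
    gain = noise_gain_for_snr(speech, noise, target_snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]

# Synthetic stand-ins (assumptions): a 150 Hz tone for the c-t speech,
# unit-variance white Gaussian noise for the added noise.
rng = random.Random(1)
fs = 16000
speech = [0.3 * math.sin(2 * math.pi * 150 * n / fs) for n in range(fs)]
white = [rng.gauss(0.0, 1.0) for _ in range(fs)]

gain = noise_gain_for_snr(speech, white, 10.0)
scaled = [gain * n for n in white]
achieved_snr_db = 20.0 * math.log10(rms(speech) / rms(scaled))
mixed = mix_at_snr(speech, white, 10.0)
print(f"gain {gain:.4f}, achieved SNR {achieved_snr_db:.2f} dB")
```

For the two conditions to be comparable, the speech RMS should be measured over the same kind of regions in both (for example the same vowels). An RMS taken over a whole file that includes pauses gives a lower speech level than one taken over vowels only.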
Hi Erida,
I think that should not be a difficult task.
Have you tried using Cool Edit?
It has an option for adding reverberation.
BVL
Hi Erida,
Irrespective of the algorithm you are using:
1. Noise power will be the same across all mics, by default.
2. Signal power will vary across mics (max at the c-t mic).
I don't understand why you are trying to get the same SNR across all mics (or are you not?).
Your goal is to see the effect of reverberation. I guess what you need is something like an AEC (acoustic echo canceller):
switch it ON in one case and OFF in the other to see the effect of reverberation.
Just a thought:
you could try the same experiment in a closed space (a room) and in an open space, and compare the results.
> Still, I am not sure how to measure the SNR of my original c-t recordings.
As long as the signal is considerably stronger than the noise, the approach you are using will
give you the correct SNR to a large extent.
BVL
Hi,
There are quite a number of ways to do it. One fine method would be to use Cool Edit, as suggested earlier. Or, if you are planning to write software, collect a sample of the noise signal (without any speech) or of pure speech (without noise), together with the mixed signal. You can then write a program to calculate the power spectral density in each case, or a simple MSE, and find the ratio, which should solve your problem.
regards
Bob