comp.dsp | subtraction of two speech signals

I am working on a project on adaptive filtering using the TMSC6713.
I have to calculate the SNR at the output. What I did is this: first, I fed
only the speech signal to the adaptive filter and recorded the output. Second,
I fed both the speech signal and the wind noise to the adaptive filter and
recorded the output.

So now I have 2 signals:
1. Clear speech signal
2. Speech signal + noise (very small amount)

I want to get just the noise from these two signals. I tried direct
subtraction in MATLAB, but it didn't work. Can anyone help me out?


---------------------------------------
Posted through http://www.DSPRelated.com

Reply by ●December 21, 2015

> 
> So now I have 2 signals.
> 1. Clear speech signal 
> 2.Speech signal+noise(very  small amount)
> 
> I want to get just the noise from this two signal.I tried direct
> subtraction on matlab but it didn't work?Can anyone help me out??
> 
> 

For that to work, the two copies of the desired speech signal must have exactly the same gain and phase as each other.
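
One way to check this condition is to estimate the delay and gain between the two recordings before subtracting. The following is only a minimal sketch, assuming a pure integer-sample delay and a single scalar gain (a real recording chain may also add fractional delay or filtering, which this does not handle). The function names and the `max_lag` parameter are hypothetical, chosen for illustration.

```python
def best_lag(ref, mix, max_lag):
    """Return the integer lag at which mix[n + lag] best matches ref[n].

    Uses the peak of the raw cross-correlation over lags -max_lag..max_lag.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(mix):
                score += r * mix[m]
        if score > best_score:
            best, best_score = lag, score
    return best


def subtract_aligned(ref, mix, max_lag=32):
    """Align ref to mix, fit one least-squares gain, and return the residual.

    Assumption: mix is roughly gain * delayed(ref) + noise.
    """
    lag = best_lag(ref, mix, max_lag)
    # Pair up the samples that overlap after shifting by the estimated lag.
    pairs = [(ref[n], mix[n + lag])
             for n in range(len(ref)) if 0 <= n + lag < len(mix)]
    num = sum(r * m for r, m in pairs)
    den = sum(r * r for r, _ in pairs)
    gain = num / den if den else 0.0
    residual = [m - gain * r for r, m in pairs]
    return lag, gain, residual
```

If the residual still sounds like speech after this, the mismatch is probably more than a delay and a gain, which is the situation the later replies address.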

M

Reply by ●December 21, 2015

On Monday, December 21, 2015 at 12:38:16 PM UTC+13, abhi5491 wrote:
> I am working on project on Adaptive filter using TMSC6713.
> I have to calculate the snr ratio at the output.SO what i did is ,I gave
> only the speech signal to the adaptive filter and recorded the output,The
> second thing I did is I gave both speech signal and the wind noise to the
> adaptive filter and recorded the output.
> 
> So now I have 2 signals.
> 1. Clear speech signal 
> 2.Speech signal+noise(very  small amount)
> 
> I want to get just the noise from this two signal.I tried direct
> subtraction on matlab but it didn't work?Can anyone help me out??

The only way to approximate the SNR is to find the noise power when the speech is not present. You can then find the signal-plus-noise power, which gives you (S+N)/N, and hence S/N = (S+N)/N - 1 if the speech and noise are uncorrelated. Everything here is non-stationary, but you may still be able to make a good estimate.
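
As a rough sketch of that estimate, assuming you have already picked out one segment that contains only noise and one that contains speech plus noise (how to pick them is not covered here), and assuming speech and noise are uncorrelated so their powers add. The function names are hypothetical.

```python
import math


def mean_power(x):
    """Average power of a block of samples."""
    return sum(v * v for v in x) / len(x)


def snr_db(noise_only, speech_plus_noise):
    """Estimate SNR in dB from a noise-only and a speech-plus-noise segment."""
    p_n = mean_power(noise_only)
    if p_n == 0.0:
        raise ValueError("noise-only segment has zero power")
    ratio = mean_power(speech_plus_noise) / p_n   # this is (S+N)/N
    snr = ratio - 1.0                             # S/N if S and N are uncorrelated
    # A ratio at or below 1 means no measurable speech power in this segment.
    return 10.0 * math.log10(snr) if snr > 0.0 else float("-inf")
```

Because the signals are non-stationary, averaging the result over several segment pairs is likely to give a steadier figure than a single pair.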

Reply by ●December 21, 2015

Step 1. Record the sample-by-sample  history of the coefficients of your adaptive filter when adapting to signal plus noise. 

Step 2. Apply this coefficient history to another copy of the adaptive filter and pass the noise signal only through this filter. Note this filter is not really adapting but just providing a time-varying filter function. 

Step 3. Subtract the filtered noise from the filtered signal plus noise. 

Bob

Reply by ●December 21, 2015

Sorry, I just realized you want the noise signal, not the speech, so use the same technique but apply the speech signal to the slave filter and then subtract the filtered speech.
Note that you could also apply this technique on the fly without storing the coefficient history, if you're careful about the order of operations (it would be easy to be off by 1 sample).
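
The master/slave idea can be sketched as follows. This is only an illustration under stated assumptions: the thread does not say what structure the adaptive filter on the DSP has, so the sketch assumes a plain LMS one-step predictor, and the names `lms_master`, `slave_filter`, `taps` and `mu` are hypothetical. The point it shows is that, once the coefficient history is fixed, the filter is linear at each sample, so the speech and noise contributions to the output can be separated exactly.

```python
def _past(x, n, taps):
    """The taps most recent samples before index n (zero-padded at the start)."""
    return [x[n - 1 - k] if n - 1 - k >= 0 else 0.0 for k in range(taps)]


def lms_master(x, taps=4, mu=0.01):
    """Adapt on x; return the filter output and the per-sample coefficients.

    history[n] holds the coefficients that produced output sample n, i.e. the
    values BEFORE the update at sample n (this avoids the off-by-one trap).
    """
    w = [0.0] * taps
    y, history = [], []
    for n in range(len(x)):
        past = _past(x, n, taps)
        history.append(list(w))
        out = sum(wk * pk for wk, pk in zip(w, past))
        err = x[n] - out
        w = [wk + mu * err * pk for wk, pk in zip(w, past)]
        y.append(out)
    return y, history


def slave_filter(x, history):
    """Run x through the recorded time-varying coefficients (no adaptation)."""
    taps = len(history[0])
    return [sum(wk * pk for wk, pk in zip(w, _past(x, n, taps)))
            for n, w in enumerate(history)]


def noise_at_output(mix, speech):
    """Output noise = master output on (speech + noise) minus slave output on speech."""
    y_mix, history = lms_master(mix)
    y_speech = slave_filter(speech, history)
    return [a - b for a, b in zip(y_mix, y_speech)]
```

This needs the clean speech to be sample-aligned with the speech inside the mixture, so it fits best when both are generated from the same stored input file.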

Bob

Reply by ●January 6, 2016

On Sunday, December 20, 2015 at 5:38:16 PM UTC-6, abhi5491 wrote:
> I am working on project on Adaptive filter using TMSC6713.
> I have to calculate the snr ratio at the output.SO what i did is ,I gave
> only the speech signal to the adaptive filter and recorded the output,The
> second thing I did is I gave both speech signal and the wind noise to the
> adaptive filter and recorded the output.
> 
> So now I have 2 signals.
> 1. Clear speech signal 
> 2.Speech signal+noise(very  small amount)

I don't know enough about the TMSC6713 to directly answer THAT question, but the general question underlying this is how to reverse-mix sound, where one of the mixing components is already given (up to slight alterations in frequency and time).

You actually only need the sound with noise, not a separate sound file. But I'll answer the direct question first before addressing this.

Take the spectrographs of the two and subtract them. This requires doing a kind of "motion estimation" on one to determine what portion of that one best fits over what corresponding portion of the other.

To properly handle phase, the spectrographs should ideally be generated using the phase to do frequency and time relocation (I made a reference to this a while back:

Estimating and Interpreting the Instantaneous Frequency of a Signal
Part 1: Fundamentals
Boualem Boashash, Senior Member, IEEE
Proceedings of the IEEE, Volume 80, No. 4, April 1992, pp. 520-538.

which is on line as a PDF, via Google search.)

Subtract the amplitude spectra once they have been lined up, and (the hard part) convert back to sound.
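
The subtraction step alone can be sketched like this. It is a simplified illustration, not the routine used for the demos: it assumes the two recordings are already lined up, uses a plain Hann-windowed naive DFT rather than the relocated spectrograph described above, and does not attempt the conversion back to sound. The names `magnitude_frames`, `subtract_magnitudes`, `frame` and `hop` are hypothetical.

```python
import cmath
import math


def magnitude_frames(x, frame=16, hop=8):
    """Amplitude spectrograph via a naive DFT: rows are times, columns are bins."""
    win = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame) for n in range(frame)]
    rows = []
    for start in range(0, len(x) - frame + 1, hop):
        seg = [x[start + n] * win[n] for n in range(frame)]
        rows.append([
            abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame)
                    for n in range(frame)))
            for k in range(frame // 2 + 1)
        ])
    return rows


def subtract_magnitudes(mix, ref):
    """Subtract ref amplitudes from mix amplitudes, clipping negatives to zero."""
    return [[max(m - r, 0.0) for m, r in zip(mrow, rrow)]
            for mrow, rrow in zip(mix, ref)]
```

Clipping at zero is a design choice made here because an amplitude cannot be negative; it is also what introduces some of the artifacts typical of this kind of subtraction.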

The more direct way is to do median-subtraction of the noise and/or factor analysis(!) on the spectrograph.
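
The post does not spell out what "median-subtraction" means in detail, so the following is one plausible reading, offered as an assumption: for each frequency bin, take the median amplitude over time as the steady noise floor and subtract it, clipping at zero. The function name and the `spec[q][p]` layout (time q, frequency p) are hypothetical.

```python
from statistics import median


def median_subtract(spec):
    """Remove a per-frequency noise floor from an amplitude spectrograph.

    spec[q][p] is the amplitude at time index q and frequency index p.
    Returns (cleaned spectrograph, estimated floor per frequency).
    """
    n_freq = len(spec[0])
    # The median over time is robust to short bursts of speech in a bin.
    floor = [median(row[p] for row in spec) for p in range(n_freq)]
    cleaned = [[max(row[p] - floor[p], 0.0) for p in range(n_freq)]
               for row in spec]
    return cleaned, floor
```

This works only when the wanted sound occupies any given bin less than half the time, since otherwise the median tracks the sound rather than the noise.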

These were two early experiments based indirectly on this general idea:

Cheyenne's Song:
https://www.youtube.com/watch?v=mE2_xjq9aG8&feature=youtu.be

Mixed in with a spectrographically-derived sound.
I used median-subtraction to filter out the noise from the spectrograph and re-keyed the sound in 1-2 harmonic.

Outer Limits Intro (unmixed):
https://www.youtube.com/watch?v=Qok1QLMhrxc

The famous intro sequence to the 1960s version of The Outer Limits. Voice and beacon are separated. Simple factor analysis (singular value decomposition) ... combined with median-subtraction on the spectrograph.

Factor analysis should properly be done as convex programming rather than least squares programming. One simple way (not the best) is to subtract out one factor at a time from a spectrograph by the following convex programming problem:
   minimize sum_{q,p} |A(q,p) - L(q) R(p)|^2,
   where the amplitudes A(q,p) are non-negative
   subject to the conditions
   L(q) >= 0, for all spectrograph times q
   R(p) >= 0, for all spectrograph frequencies p
   L(q) R(p) <= A(q,p) for all q, p
   sum_p R(p)^2 = 1 to normalize the R vector, for convenience.
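
For experimenting with the problem stated above, here is a small alternating heuristic. It is a sketch under assumptions, not DeLayer.c and not a proper solver: because of the product L(q) R(p), the problem is not jointly convex in L and R as far as I can tell, but with one vector held fixed each entry of the other has a closed-form answer (the least-squares value, capped so the product never exceeds A). Alternating these steps gives a feasible point, with no guarantee of a global optimum. The function name and `iters` parameter are hypothetical.

```python
import math


def rank1_under_fit(A, iters=50):
    """Find L, R >= 0 with L[q] * R[p] <= A[q][p] and sum(R^2) = 1.

    A[q][p] >= 0 is an amplitude spectrograph (q = time, p = frequency).
    """
    nq, nf = len(A), len(A[0])
    # Start R from the column sums, scaled to unit Euclidean norm.
    R = [sum(A[q][p] for q in range(nq)) for p in range(nf)]
    norm = math.sqrt(sum(r * r for r in R))
    L = [0.0] * nq
    if norm == 0.0:
        return L, R                      # all-zero input: nothing to extract
    R = [r / norm for r in R]
    for _ in range(iters):
        # L-step: least-squares value for fixed R, capped by the constraint.
        r_energy = sum(r * r for r in R)
        for q in range(nq):
            ls = sum(A[q][p] * R[p] for p in range(nf)) / r_energy
            cap = min(A[q][p] / R[p] for p in range(nf) if R[p] > 0.0)
            L[q] = min(ls, cap)
        l_energy = sum(l * l for l in L)
        if l_energy == 0.0:
            break                        # factor collapsed to zero
        # R-step: same idea with the roles of L and R swapped.
        for p in range(nf):
            ls = sum(A[q][p] * L[q] for q in range(nq)) / l_energy
            cap = min(A[q][p] / L[q] for q in range(nq) if L[q] > 0.0)
            R[p] = min(ls, cap)
        norm = math.sqrt(sum(r * r for r in R))
        if norm == 0.0:
            break                        # degenerate case: R is left unnormalized
        # Rescale so sum(R^2) = 1 while leaving every product L[q]*R[p] unchanged.
        R = [r / norm for r in R]
        L = [l * norm for l in L]
    return L, R
```

To peel off several factors one at a time, subtract L[q] * R[p] from A and call the function again on the remainder, which stays non-negative because of the cap.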

I'm going to try and implement this in the near future and experiment with it. Right now, all I have is a routine (DeLayer.c, which is freely available; send me an e-mail note if you want this and the other routines I've mentioned elsewhere in other articles recently) which solves the least squares problem
   sum_{q,p} |A(q,p) - sum_f L_f(q) R_f(p)|^2
   performing a factor rotation in f-space on (L_f) and (R_f)
   to make it maximally sparse.
This doesn't quite work since the convexity constraints are not satisfied. But it still produces somewhat workable results.

Reply by ●January 14, 2016

On Wednesday, January 6, 2016 at 7:29:33 PM UTC-6, federat...@netzero.com wrote:
> I don't know enough about the TMSC6713 to directly answer THAT question, but the general question underlying this is how to reverse-mix sound, where one of the mixing components is already given (up to slight alterations in frequency and time).
> 
> Take the spectrographs of the two and subtract them. This requires  doing a kind of "motion estimation" on one to determine what portion of that one best fits over what corresponding portion of the other.

I tried this. It works perfectly -- subject only to the limitations of the spectrograph -> sound conversion.

> To properly handle phase, the spectrographs should ideally be generated using the phase to do frequency and time relocation with (I made a reference to this a while back:

Since I'm not doing time-relocation, graph->sound has a small auditorium acoustic, due to the time localization. For this reason, however, no "location matching" was required between the two spectra.

As it turns out, it's even possible to use this simple subtraction method to take out regular background sound components (e.g. tones, drums). None of this really occurred to me before your query.

The algorithm I'm using for reverse conversion works with the amplitude spectrograph (even without relocation) with no DSP transforms required. (Only the sound -> spectrograph direction requires an FFT.) A quadratic fit to find peaks takes the place of frequency relocation; a (function-space derived) distance measure mates segments from each time slice with the adjacent ones, and cubic fitting cross-fades phases and amplitudes.
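
The quadratic peak fit mentioned here is commonly done with three-point parabolic interpolation around the largest bin. The sketch below shows that standard formula; whether it matches the exact fit used in the routine described above is an assumption, and the function name is hypothetical.

```python
def parabolic_peak(mags, k):
    """Refine a spectral peak at bin k by fitting a parabola through 3 points.

    mags is one time slice of the amplitude spectrograph and k is the index of
    a local maximum with 0 < k < len(mags) - 1.
    Returns (fractional bin position, interpolated peak height).
    """
    a, b, c = mags[k - 1], mags[k], mags[k + 1]
    denom = a - 2.0 * b + c
    if denom == 0.0:
        return float(k), b               # flat neighbourhood: keep the bin centre
    delta = 0.5 * (a - c) / denom        # offset from bin k, within +/- 0.5 at a peak
    return k + delta, b - 0.25 * (a - c) * delta
```

Multiplying the fractional bin position by (sample rate / FFT size) gives the frequency estimate. The fit is often applied to log amplitudes instead, since a windowed peak is closer to a parabola on that scale.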

I'll see if I can post a demo on YouTube.

Reply by ●January 28, 2016

2016 January 14 16:52:33:
> I'll see if I can post a demo on YouTube.

Here it is:
	Experiment in Reverse Mixing and Sound Morphing
	(Galadriel versus Leia)
	https://www.youtube.com/watch?v=Sl1SwkiIo30
where I separate out voice only then music only;
the reference videos are in the description below.

This also does sound morphing: when Galadriel goes postal. :)

2016 January 6 19:29:33 me:
>> ... the general question underlying this is how to reverse-mix sound,
>> where one of the mixing components is already given (up to slight
>> alterations in frequency and time).
>> ... Take the spectrographs of the two and subtract them.

2016 January 14 16:52:33:
> I tried this. It works perfectly -- subject only to the limitations
> of the spectrograph -> sound conversion.

>> ... To properly handle phase,
>> the spectrographs should ideally be generated using the phase to do
>> frequency and time relocation with

In the demo I put a "scalograph" in the lower left of the ORIGINAL sound,
and, in the lower right, the spectrograph the sound was produced FROM
(with a little remixing of the original at the end).

The original video is
	Princess Rap Battle: Galadriel versus Leia
	https://www.youtube.com/watch?v=RL52R7m8b7w

	Princess Rap Battle: Galadriel versus Leia (Karaoke)
	https://www.youtube.com/watch?v=c0TE_xEtRjY

the karaoke serving as the reference, as mentioned above.
Voice-only comes straight out from subtracting their spectrographs.

The scalograph goes (27.5-1760 Hz) by octaves with a 0.2 second window.
The PHASE is color coded (red = 0 degrees), the amplitude by brightness.
So around the 80 Hz area you'll see about 16 candy-stripes for each line.
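
The figures above can be checked with a little arithmetic, reading one candy-stripe as one full cycle of phase within the window (that reading is my assumption). The variable names are only for illustration.

```python
import math

f_low, f_high = 27.5, 1760.0     # scalograph range quoted above, in Hz
window_s = 0.2                   # window length quoted above, in seconds

# Number of octaves spanned by the scalograph.
octaves = math.log2(f_high / f_low)

# Phase cycles of an 80 Hz component that fit inside one window.
cycles_at_80 = 80.0 * window_s
```

So the range covers 6 octaves, and an 80 Hz line goes through about 16 phase cycles per window, which agrees with the stripe count given.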

The lines on the scalograph are the individual sound components.
Though I could have done sound separation with this, I didn't.
The concentration onto lines is what results from doing frequency relocation.

Reply by ●January 28, 2016

2016 January 14 16:52:33 me:
> I'll see if I can post a demo on YouTube.

The other method I described -- suitable for noise-removal -- was by median-subtraction filtering straight off the spectrograph; as demonstrated here:
(You need to hear this on headphones; the bass is deep and rumbling!)

Cheyenne's Song
https://www.youtube.com/watch?v=mE2_xjq9aG8

The noise in the beginning was from a straight-out recording (all 6 voices are me, unedited). The noise disappears and then I added in a second set of 6 voices 1 octave up to make a 12-voice combined male/female chorus, before it reverts to the original unedited form at the end.

BTW, the poster in the static video was originally slated to be a photo -> 3D -> video converted project; which I deferred to later ... sorta like what I did with this sneak Independence Day II trailer

Flyover of a nice scene (with a twist)
https://www.youtube.com/watch?v=z_Dm-76SFCA

over the area of the Grand Canyon where Will Smith had his dogfight in the original movie. The sound here, too, is spectrographically produced, the radio or tornado-alert voice (mine) coming about by doing a bandpass on the spectrograph itself; the rumble standing in marked contrast. You DEFINITELY need headphones for this one too.
