Reply by maury March 18, 2011
On Mar 16, 12:59 pm, John McDermick <johnthedsp...@gmail.com> wrote:
> > Do it adaptively, and you don't have to estimate the delay at all.
>
> If I do not delay the signal which is fed to the filter, then the delay
> will manifest itself in the filter tap values instead... so the delay
> compensation is going to be there no matter what... right?
>
> Also, you say "do it adaptively"... is there a non-adaptive approach,
> and how well would that work?
Sorry for the _delay_ :) I was out of town.

In your steps 3 - 5 you state:

3. Find delay between buffered speaker signal and buffered microphone signal
4. Use estimated delay to time-align buffered microphone signal and buffered speaker signal
5. Calculate the amplitude spectrum of the time-aligned (delayed) speaker signal

That is NOT an adaptive algorithm. If done adaptively, you don't need to
know the delay. The canceller's filter will determine it automatically
and place the impulse response estimate accordingly. In fact, the filter
will give you the estimate of the impulse response, place the delay
appropriately, and perform the subtraction, which your description won't
do.

For the acoustic echo cancellation problem, the two biggest challenges
you should have are the inherent non-linearity of the impulse response
and the non-stationarity of the impulse response. The delay estimate and
signal subtraction should be trivial, and performed as part of the
adaptive process. In fact, the impulse response estimate, even if
extremely poor due to a bad model, is trivially generated by the echo
canceller filter.

Maurice Givens
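To make that concrete, here is a minimal sketch of such an adaptive
canceller in Python, using NLMS as the update rule. NLMS is one common
choice; the thread doesn't name an algorithm, and the tap count, step
size, and regularization constant below are illustrative assumptions:

    import numpy as np

    def nlms_echo_canceller(x, y, num_taps=512, mu=0.5, eps=1e-6):
        # x: far-end (speaker) signal; y: microphone signal (echo plus
        # any near-end speech); x and y assumed the same length.
        # Returns the echo-cancelled output e = y - y_hat and the tap
        # estimate h. Any bulk delay in the echo path simply appears
        # as leading near-zero taps in h: no explicit delay estimate.
        h = np.zeros(num_taps)          # adaptive FIR: echo-path estimate
        x_buf = np.zeros(num_taps)      # most recent speaker samples
        e = np.zeros(len(y))
        for n in range(len(y)):
            x_buf = np.roll(x_buf, 1)   # shift history
            x_buf[0] = x[n]             # newest speaker sample
            y_hat = h @ x_buf           # predicted echo
            e[n] = y[n] - y_hat         # cancelled output
            norm = x_buf @ x_buf + eps  # regularized input power
            h += (mu / norm) * e[n] * x_buf   # NLMS tap update
        return e, h

Because the taps themselves absorb the bulk delay, steps 3 and 4 of the
original recipe disappear, exactly as described above.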
Reply by Tim Wescott March 16, 2011
On 03/16/2011 01:46 PM, John McDermick wrote:
>> So the microphone output is not then being fed to the speaker? This
>> isn't an active feedback cancellation, but just canceling a signal from
>> a known source (the speaker drive signal)?
>
> Here is the scenario... depending on how you want to look at it:
>
> A: Far-End person
> B: Near-End person
>
> A is talking to B. B has A on speaker-phone. B is bragging about the
> great software he implemented on his new phone.
>
> A's speech is played out through the loudspeaker on B's phone. The
> microphone on B's phone picks up the signal from the loudspeaker and
> feeds it back to A.
>
> A thinks it's annoying to hear a delayed version of his speech, so he
> asks B to come up with an AEC.
>
> I am B... except I don't like to brag! :o)
You can't admit much delay into your echo canceller, then. You need to
figure out, on-line, the coupling between your phone's speaker and its
microphone, and cancel that signal in real time, or at least without
introducing too much time lag into the conversation.

This is a solved problem -- every speaker phone in the world does it to
one extent or another. Cheesy speaker phones do this by making the
connection simplex: when you're talking above a certain volume, the
speaker is cut off and the microphone switched in; then when you're
quiet, the speaker is turned on and the microphone switched off. Really
nice conference room phones (like that three-legged spider from US
Robotics) give you a virtual connection to the other side.

I think you want to search on "active echo cancellation".

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
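For illustration only, the simplex switching described above might look
like the following sketch; the frame size, threshold, and function name
are assumptions, and a real phone would add hysteresis and smoothing so
the gate doesn't chatter between states:

    import numpy as np

    def half_duplex_gate(mic, spk, frame=160, thresh=1e-4):
        # Crude simplex speakerphone: per frame, whichever side is
        # louder wins and the other side is muted.
        mic_out, spk_out = mic.copy(), spk.copy()
        for i in range(0, len(mic), frame):
            m = np.mean(mic[i:i+frame] ** 2)   # near-end frame power
            s = np.mean(spk[i:i+frame] ** 2)   # far-end frame power
            if s >= m and s > thresh:
                mic_out[i:i+frame] = 0.0       # far end talking: cut mic
            elif m > thresh:
                spk_out[i:i+frame] = 0.0       # near end talking: cut speaker
        return mic_out, spk_out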
Reply by John McDermick March 16, 2011
> So the microphone output is not then being fed to the speaker? This
> isn't an active feedback cancellation, but just canceling a signal from
> a known source (the speaker drive signal)?
Here is the scenario... depending on how you want to look at it:

A: Far-End person
B: Near-End person

A is talking to B. B has A on speaker-phone. B is bragging about the
great software he implemented on his new phone.

A's speech is played out through the loudspeaker on B's phone. The
microphone on B's phone picks up the signal from the loudspeaker and
feeds it back to A.

A thinks it's annoying to hear a delayed version of his speech, so he
asks B to come up with an AEC.

I am B... except I don't like to brag! :o)
Reply by Tim Wescott March 16, 2011
On 03/16/2011 12:33 PM, John McDermick wrote:
>> So what are you really trying to do, anyway? Are you doing echo
>> cancellation on-line from speaker to microphone to speaker? Or are you
>> trying to (e.g.) separate a singer's voice signal from a background track?
>
> I am trying to cancel out the signal which the microphone picks up from
> the near-end loudspeaker in real time, without degrading any near-end
> speech which is also picked up by the microphone.
So the microphone output is not then being fed to the speaker? This
isn't an active feedback cancellation, but just canceling a signal from
a known source (the speaker drive signal)? If that's the case, then some
sort of after-the-fact block algorithm may well work, and will probably
be more efficient on most processors.
>> In signal processing, everything is happening simultaneously
>
> Isn't that only true to the extent that the processes are independent
> and truly parallel? Ultimately, if you only have one processor, things
> happen in sequence and based on priorities. A block processing
> algorithm is just supposed to pick up an input block, do some
> processing, and then deliver an output. How and when the algorithm is
> called is determined by the framework it operates in (like the O/S or
> higher layers)... right?
Down inside the processor, yes, code is being executed in sequence (or
perhaps pipelined and done in parallel in ways that are independent of
the intent of the programmer, and by design don't change the effective
sequence of execution). But from the point of view of a filter, all the
computations going from one sample to the next happen in the same epoch.

You are correct about what a block processing algorithm is supposed to
do -- but if you want to process multiple blocks, then you have to do
some significant work to stitch them together.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
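One standard form of that stitching is overlap-save fast convolution,
sketched below; the block size and function name are my own choices, not
something from the thread:

    import numpy as np

    def overlap_save_fir(x, h, block=1024):
        # Block-wise FIR whose output matches continuous sample-by-sample
        # convolution: carry the last len(h)-1 inputs across blocks and
        # discard the circularly-aliased prefix of each FFT output.
        m = len(h)
        nfft = block + m - 1
        H = np.fft.rfft(h, nfft)
        hist = np.zeros(m - 1)            # input history across blocks
        out = np.zeros(len(x))
        for i in range(0, len(x), block):
            chunk = x[i:i+block]
            data = np.concatenate([hist, chunk])
            seg = np.pad(data, (0, nfft - len(data)))
            y = np.fft.irfft(np.fft.rfft(seg) * H, nfft)
            out[i:i+len(chunk)] = y[m-1:m-1+len(chunk)]
            hist = data[len(chunk):]      # last m-1 inputs, carried forward
        return out                        # equals np.convolve(x, h)[:len(x)]

The carried-over history is exactly the "significant work" in question:
drop it, and each block's first len(h)-1 output samples are wrong.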
Reply by John McDermick March 16, 2011
> So what are you really trying to do, anyway? Are you doing echo
> cancellation on-line from speaker to microphone to speaker? Or are you
> trying to (e.g.) separate a singer's voice signal from a background track?
I am trying to cancel out the signal which the microphone picks up from
the near-end loudspeaker in real time, without degrading any near-end
speech which is also picked up by the microphone.
> In signal processing, everything is happening simultaneously
Isn't that only true to the extent that the processes are independent
and truly parallel? Ultimately, if you only have one processor, things
happen in sequence and based on priorities. A block processing algorithm
is just supposed to pick up an input block, do some processing, and then
deliver an output. How and when the algorithm is called is determined by
the framework it operates in (like the O/S or higher layers)... right?
Reply by Tim Wescott March 16, 2011
On 03/16/2011 10:55 AM, John McDermick wrote:
>> You don't need to estimate delays unless there are unknown delays in
>> your system -- known delays can just be folded into h(tau).
>
> I guess that will work... I just thought it would be more logical to
> align the two signals first before estimating the transfer function.
>
> Can you elaborate on the sentence "particularly if you're feeding the
> speaker from the microphone"? I am not sure why sequential thinking
> wouldn't work well for real-time signal processing. Instructions are
> executed sequentially and samples are acquired sequentially... so I am
> not sure I understand what you mean...
>
> The acoustic echo canceller I am about to implement is done in
> software, so that explains the software-engineerish approach.
>
> Is there a reason why you wrote "continuously"? It confuses me a little
> bit because samples are acquired in discrete time... maybe I am just
> too literal...
You use the language of block processing, and you have an if-then
construct in your algorithm (step 2). In signal processing, everything
is happening simultaneously; if you're not used to thinking in that way
then your code doesn't work well (I've repaired a lot of such code).

Yes, you do end up realizing your signal processing algorithm in
software, but if you don't want to get tied in knots, your innermost
algorithm should read:

  FOR EACH SAMPLE
    do this
    do that
    do the other thing
    etc.

Inside the "for each sample" block you can then go hog-wild with all
sorts of "softwareish" stuff, as long as you realize that, from the
outside of that block, your signal processing should look like a system
that operates on an input and a state, and coughs up an updated state
and an output.

If you're doing the processing in blocks, then you either need to do
each block as a stand-alone effort, or you need to have some way of
stitching the blocks together when you're done. In either case, you're
not really doing things in real time anymore, or at least you are
introducing a delay of at least one block time into your processing.

So what are you really trying to do, anyway? Are you doing echo
cancellation on-line from speaker to microphone to speaker? Or are you
trying to (e.g.) separate a singer's voice signal from a background
track?

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
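That "input and state in, updated state and output out" shape might look
like the following sketch, with a plain fixed FIR standing in for the
"do this / do that" body (the names and tap values are hypothetical):

    from collections import deque

    def make_state(num_taps):
        # All memory between samples lives here: the last num_taps inputs.
        return deque([0.0] * num_taps, maxlen=num_taps)

    def process_sample(x_n, state, h):
        # One epoch: consume one input sample, update the state, emit
        # one output sample. No block boundaries, no if-then branches.
        state.appendleft(x_n)
        return sum(hk * xk for hk, xk in zip(h, state))

    # Usage sketch:
    # h = [0.5, 0.25, 0.125]
    # state = make_state(len(h))
    # output = [process_sample(x_n, state, h) for x_n in samples]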
Reply by John McDermick March 16, 2011
> > Do it adaptively, and you don't have to estimate the delay at all.
If I do not delay the signal which is fed to the filter, then the delay
will manifest itself in the filter tap values instead... so the delay
compensation is going to be there no matter what... right?

Also, you say "do it adaptively"... is there a non-adaptive approach,
and how well would that work?
Reply by John McDermick March 16, 2011
> You don't need to estimate delays unless there are unknown delays in
> your system -- known delays can just be folded into h(tau).
I guess that will work... I just thought it would be more logical to
align the two signals first before estimating the transfer function.

Can you elaborate on the sentence "particularly if you're feeding the
speaker from the microphone"? I am not sure why sequential thinking
wouldn't work well for real-time signal processing. Instructions are
executed sequentially and samples are acquired sequentially... so I am
not sure I understand what you mean...

The acoustic echo canceller I am about to implement is done in software,
so that explains the software-engineerish approach.

Is there a reason why you wrote "continuously"? It confuses me a little
bit because samples are acquired in discrete time... maybe I am just too
literal...

Thank you.
Reply by maury March 16, 2011
On Mar 16, 12:08 pm, Tim Wescott <t...@seemywebsite.com> wrote:
> On 03/16/2011 09:49 AM, John McDermick wrote:
>
> > Hello,
> >
> > If I model the microphone signal, y, as the output, y_hat, of an FIR
> > filter H(z), and the input to that FIR filter is the speaker signal x,
> > are the following then the principles of acoustic echo cancellation:
> >
> > Initialize H(z) to an all-pass filter.
> > Initialize estimated delay to 0.
> >
> > 0. Buffer block of microphone data
> > 1. Buffer block of speaker data
> > 2. If near-end speaker is talking, go to 8.
> > 3. Find delay between buffered speaker signal and buffered microphone signal
> > 4. Use estimated delay to time-align buffered microphone signal and buffered speaker signal
> > 5. Calculate the amplitude spectrum of the time-aligned (delayed) speaker signal
> > 6. Calculate the amplitude spectrum of the buffered microphone signal
> > 7. Estimate H as Y/X
> > 8. Filter the delayed speaker data using H(z) to obtain output y_hat
> > 9. Subtract y_hat from y and send result to far-end
>
> This looks like a very software-engineerish way to attack a signal
> processing problem. Unfortunately, this sort of sequential thinking
> doesn't work well for real-time signal processing -- particularly if
> you're feeding the speaker from the microphone.
>
> * Continuously input microphone data.
> * Continuously input speaker data.
> * Continuously filter speaker data through the FIR filter to get the
>   estimated response from the speaker: y_hat = h(tau) * x(t) (where '*'
>   is convolution and h(tau) is the impulse response of the FIR).
> * Continuously subtract y_hat from y to get the 'intended' signal from
>   the microphone: y_int = y - y_hat.
> * Sit back and enjoy the lack of echoes, or perhaps the loud feedback
>   squeal because you used the wrong values for h(tau).
>
> You don't need to estimate delays unless there are unknown delays in
> your system -- known delays can just be folded into h(tau).
Do it adaptively, and you don't have to estimate the delay at all.
Reply by Tim Wescott March 16, 2011
On 03/16/2011 09:49 AM, John McDermick wrote:
> Hello,
>
> If I model the microphone signal, y, as the output, y_hat, of an FIR
> filter H(z), and the input to that FIR filter is the speaker signal x,
> are the following then the principles of acoustic echo cancellation:
>
> Initialize H(z) to an all-pass filter.
> Initialize estimated delay to 0.
>
> 0. Buffer block of microphone data
> 1. Buffer block of speaker data
> 2. If near-end speaker is talking, go to 8.
> 3. Find delay between buffered speaker signal and buffered microphone signal
> 4. Use estimated delay to time-align buffered microphone signal and buffered speaker signal
> 5. Calculate the amplitude spectrum of the time-aligned (delayed) speaker signal
> 6. Calculate the amplitude spectrum of the buffered microphone signal
> 7. Estimate H as Y/X
> 8. Filter the delayed speaker data using H(z) to obtain output y_hat
> 9. Subtract y_hat from y and send result to far-end
This looks like a very software-engineerish way to attack a signal
processing problem. Unfortunately, this sort of sequential thinking
doesn't work well for real-time signal processing -- particularly if
you're feeding the speaker from the microphone.

* Continuously input microphone data.
* Continuously input speaker data.
* Continuously filter speaker data through the FIR filter to get the
  estimated response from the speaker: y_hat = h(tau) * x(t) (where '*'
  is convolution and h(tau) is the impulse response of the FIR).
* Continuously subtract y_hat from y to get the 'intended' signal from
  the microphone: y_int = y - y_hat.
* Sit back and enjoy the lack of echoes, or perhaps the loud feedback
  squeal because you used the wrong values for h(tau).

You don't need to estimate delays unless there are unknown delays in
your system -- known delays can just be folded into h(tau).

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
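A direct, minimal rendering of that bullet list, assuming the impulse
response h is already known (in practice it comes from the adaptive
estimation discussed elsewhere in the thread; the function name is
mine):

    import numpy as np

    def cancel_echo(x, y, h):
        # x: speaker drive signal; y: microphone signal;
        # h: (assumed known) speaker-to-mic impulse response, with any
        #    fixed delay folded into its leading taps.
        y_hat = np.convolve(x, h)[:len(y)]   # estimated echo at the mic
        return y - y_hat                     # y_int: the 'intended' signal

Get h(tau) wrong and, exactly as warned above, y_hat reinforces rather
than cancels the echo.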