Technical discussions related to Audio Signal Processing (digital effects, acoustics, noise reduction, musical signal processing, etc).
Hi all,
I am interested in implementing a digital mixer (mixing 2 voice signals). But I have had hard luck finding relevant material so far; I could only find filters in abundance.
Can anyone suggest any book or link that covers the implementation of a mixer?
Currently I have the following books:
i) DSP by Oppenheim & Schafer.
ii) Theory and Application of Digital Signal Processing - Rabiner and Gold
iii) DSP - Salivahanan, Vallavaraj, Gnanapriya
But there is nothing about mixers in these books.
Also, what are the basics of DSP that I should learn from these books before going into the mixer implementation/algorithm?
I need your suggestions. Please reply.
Thanks in advance,
Prakash.
Ulrich Prakash-
> I am interested in implementing a Digital Mixer(mixing 2 voice signals).
> But I had HARD luck in finding the relevant material so far....I could only
> find filters in abundance.
>
> Can anyone suggest me ANY book/link which would have the implementation of
> the Mixer?
>
> Currently I have the following books:
> i)DSP by Oppenheim & Schafer.
> ii)Theory and Application of Digital Signal Processing - Rabiner and Gold
> iii) DSP - Salivahanan,Vallavaraj,Gnanapriya
>
> But there is nothing about Mixers in these books....
>
> Also what are the basics in DSP that I can learn from these books before
> going into the Mixer implementation/algorithm?
Mix == add. Add your signals together, like this:
  y[n] = a*x1[n] + b*x2[n]
Suggest that you maintain a + b = 1. Then put a dial on your MATLAB GUI that allows
the user to adjust between a and b. All the way to left, a = 1. All the way to
right, b = 1.
Jeff Brower
DSP sw/hw engineer
Signalogic
Hi Jeff,
Thanks for the reply.
I have some queries; please see inline.
At 07:34 AM 8/19/02 -0500, Jeff Brower wrote:
>Ulrich Prakash-
>
>Mix == add. Add your signals together, like this:
>
> y[n] = a*x1[n] + b*x2[n]
>
>Suggest that you maintain a + b = 1.
[Prakash]
Can I take a = 0.5 and b = 0.5, or is there a particular criterion for
choosing the values of a and b?
> Then put a dial on your MATLAB GUI that allows
>the user to adjust between a and b. All the way to left, a = 1. All the
>way to
>right, b = 1.
[Prakash]
From what you have said, I assume that if
x1 = {4,2,6} and x2 = {8,10,12}, and a = 0.5, b = 0.5, then
y = {2+4, 1+5, 3+6} = {6, 6, 9}
Is it that straightforward? Will I hear both signals x1 and x2 in the output y?
But I didn't understand what you mean by LEFT and RIGHT here.
Thanks in advance,
Prakash.
Ulrich Prakash-
> >Mix == add. Add your signals together, like this:
> >
> > y[n] = a*x1[n] + b*x2[n]
> >
> >Suggest that you maintain a + b = 1.
>
> [Prakash]
> Can I take a = 0.5 and b = 0.5,or is there any particular factor to
> evaluate the value of a and b?
>
> > Then put a dial on your MATLAB GUI that allows
> >the user to adjust between a and b. All the way to left, a = 1. All the
> >way to
> >right, b = 1.
>
> [Prakash]
> From what you have said,I assume,if
> x1 = {4,2,6} and x2 = {8,10,12},and a=0.5,b=0.5then
> y = {2+4, 1+5, 3+6} = {6, 6, 9}
> Is that so straight-forward?
Yep.
>Will I hear both samples x1 and x2 from y output?
Yep.
> But I didn't understand what you mean to the LEFT and RIGHT here.
When the user turns the dial. You can put a dial (knob) on your GUI to easily
demonstrate the effect; better than asking your users to enter values of a and b.
Jeff Brower
DSP sw/hw engineer
Signalogic
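The exchange above can be sketched in a few lines of Python. This is only an illustration of y[n] = a*x1[n] + b*x2[n] with a + b = 1; the function name `mix` and the `dial` parameter (a knob position from 0 for fully left to 1 for fully right) are assumptions made for the example, not something from the thread.

```python
def mix(x1, x2, dial):
    """Mix two equal-length sample sequences.

    dial is a hypothetical knob position in [0, 1]:
    0.0 = fully left (a = 1, only x1), 1.0 = fully right (b = 1, only x2).
    """
    if not 0.0 <= dial <= 1.0:
        raise ValueError("dial must be in [0, 1]")
    if len(x1) != len(x2):
        raise ValueError("signals must have the same length")
    b = dial
    a = 1.0 - dial  # keeps a + b = 1
    return [a * s1 + b * s2 for s1, s2 in zip(x1, x2)]


# Prakash's worked example: a = b = 0.5
x1 = [4, 2, 6]
x2 = [8, 10, 12]
print(mix(x1, x2, 0.5))  # [6.0, 6.0, 9.0]
```

Keeping a + b = 1 means the output can never exceed the larger of the two input magnitudes, so the sum cannot overflow the sample range.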
Have a look at:
1) Watkinson, J., The art of digital audio, 2nd ed., Focal Press (Butterworth), 1993.
2) Zölzer, U., Digital audio signal processing, John Wiley & Sons, Ltd., 1995.
Regards
Sigmund.
On Mon, Aug 19, 2002 at 08:20:16AM -0500, Jeff Brower wrote:
> > > Then put a dial on your MATLAB GUI that allows
> > >the user to adjust between a and b. All the way to left, a = 1. All the
> > >way to
> > >right, b = 1.
Jeff,
Ulrich didn't say he was using Matlab.
Ulrich,
Don't read so many books. Experiment more. Look at the waveforms and listen to them.
You would do well to learn a little analog signal processing as well.
Regards,
Mark
Mark-
> > > >Mix == add. Add your signals together, like this:
> > > >
> > > > y[n] = a*x1[n] + b*x2[n]
>
> Ulrich didn't say he was using Matlab.
I did not assume he was. Above is standard textbook notation, used widely.
Jeff Brower
DSP sw/hw engineer
Signalogic
Hi Jeff,
In that case, if I need to mix 'm' voice channels (in the form of RTP
packets), then
y(n) = a1*x1 + a2*x2 + ... + am*xm
= SIGMA ai*xi, where i = 1 to m (SIGMA - I don't have the SIGMA symbol
in my mail editor)
and SIGMA ai = 1, where i = 1 to m.
In that case, if there are 10 input voice channels with equal weights, the amplitude of
the original sample x1 in the output y will be around x1/10. Will this not affect
the fidelity of x1?
Regards,
Prakash.
Ulrich-
> In that case,if I need to mix 'm' voice channels(in the form of RTP
> packets),then
> y(n) = a1*x1 + a2*x2 + ... am*xm
>
> = SIGMA ai*xi,where i = 1 to m (SIGMA - i don't have SIGMA symbol
> in my mail editor)
>
> And SIGMA ai = 1,where i = 1 to m.
> In that case,if there are 10 input voice channels,then the amplitude of
> original sample x1 in output y will be around x1/10.Will this not affect
> the fidelity of x1?
Ahh, now you are not mixing but conferencing -- you should have said so earlier. You
need some type of algorithm that dynamically looks for the "most energetic voice" and
gives that channel more weight. I've seen some conferencing algorithms that look
for the 2 or 3 most dominant channels and give them weight.
Clearly, if you just constantly multiply every channel by 0.1, you will not hear the
speaker -- he/she will be averaged with noise/silence. Not everyone talks at once,
right? Or at least you hope that your algorithm does not cause them to do that :-)
-Jeff
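A rough sketch of the "most energetic voice" idea, in Python. Everything here is an assumption made for illustration: the function names, the use of mean squared amplitude per frame as the energy measure, and the choice to split the total weight of 1 equally among the kept channels. A real conferencing mixer would likely add voice activity detection and smoother weight changes.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def dominant_weights(frames, keep=2):
    """Return one weight per channel, nonzero only for the `keep` most energetic.

    The kept channels share a total weight of 1 equally (an illustrative choice).
    """
    energies = [frame_energy(f) for f in frames]
    ranked = sorted(range(len(frames)), key=lambda i: energies[i], reverse=True)
    active = set(ranked[:keep])
    return [1.0 / len(active) if i in active else 0.0 for i in range(len(frames))]


def mix_frames(frames, weights):
    """Weighted sum of equal-length frames, sample by sample."""
    return [sum(w * f[n] for w, f in zip(weights, frames))
            for n in range(len(frames[0]))]


# Four channels: two talkers (channels 0 and 3) and two near-silent lines.
frames = [[100, -120, 90], [1, -2, 1], [0, 1, 0], [60, -50, 70]]
weights = dominant_weights(frames, keep=2)
print(weights)                      # [0.5, 0.0, 0.0, 0.5]
print(mix_frames(frames, weights))  # [80.0, -85.0, 80.0]
```

With a fixed 1/m weight on every channel, each talker would be attenuated by a factor of m regardless of who is speaking; selecting by energy keeps the active talkers near full level.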
Last year I went to SPIE's AeroSense conference and learned about
Independent Component Analysis (ICA) and an application called the 'Cocktail
Party' problem. Basically, it assumes that n source signals are
recorded by n microphones, and ICA is the method that allows the recorded mixtures
to be separated back into the n individual signals. Perhaps a look down this path
may be what you need...
Regards,/s/neil
Neil E. Van de Voorde, Ph.D.
Senior Scientist
Planning Systems Inc.
228.689.8775
Neil-
> Last year I went to SPIE's AeroSense conference and learned about
> Independent Component Analysis and an application called the 'Cocktail
> Party.' Basically, it's a method to assume that n multiple signals are
> recorded by n microphones. ICA is the method that allows the signals to be
> separated into n discrete signals. Perhaps a look down this path may be
> what you need...
ICA and other techniques aimed at separating source signals are intended to locate the
direction of the signals and also to deliberately focus on one signal vs. another. At a
cocktail party, you want to listen to the person talking to you and tune out everyone
else talking around you. In a conferencing situation, you would not normally make an
assumption about one listener's preference vs. another -- several people are in the
room listening to the same speaker. Whoever is speaking has to be heard by all. If
two people are speaking at the same time, then the system would not automatically
make a choice as to who to emphasize. Or if you gave users that option, then it
would be a very sophisticated conference system indeed.
Jeff Brower
DSP sw/hw engineer
Signalogic
Hello, I just subscribed to this list :)
I have a related problem: suppose I have 2 ulaw packets. What is the
algorithm to mix them and create a new ulaw packet, so that when the remote
client receives and decodes it, he can hear both audio streams?
Is it possible?
Kun
Kun-
> I have a related problem: suppose I have 2 ulaw packet, what is the
> algorithm to mix them and create a new ulaw packet. When remote
> client receive and decode it, he can hear both audio.
y[n] = U(a*u(x1[n]) + b*u(x2[n]))
where
U(x) = uLaw(x) (compress)
u(x) = inverse uLaw(x) (expand)
a + b = 1
y[n] is packet data
Jeff Brower
DSP sw/hw engineer
Signalogic
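As a sketch of the expand, mix, compress recipe above: the Python below decodes each 8-bit ulaw byte to a linear sample, forms a*x1 + b*x2, and re-encodes. The encode/decode routines follow the commonly circulated G.711 mu-law layout (sign bit, 3-bit segment, 4-bit mantissa, bias of 132, all bits inverted); treat them as an assumption to be checked against the codec you actually use. The names `ulaw_decode`, `ulaw_encode` and `mix_ulaw` are made up for this example.

```python
BIAS = 0x84   # 132, added before locating the segment
CLIP = 32635  # largest magnitude that still fits after adding the bias


def ulaw_decode(byte):
    """u(x) in the notation above: expand one ulaw byte to a linear sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def ulaw_encode(sample):
    """U(x) in the notation above: compress a linear sample to one ulaw byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    exponent = (magnitude >> 7).bit_length() - 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def mix_ulaw(packet1, packet2, a=0.5):
    """y[n] = U(a*u(x1[n]) + b*u(x2[n])) with b = 1 - a, on equal-length payloads."""
    b = 1.0 - a
    return bytes(
        ulaw_encode(int(round(a * ulaw_decode(p1) + b * ulaw_decode(p2))))
        for p1, p2 in zip(packet1, packet2)
    )
```

The point of expanding first is that ulaw bytes are on a logarithmic scale, so adding or averaging the bytes directly does not correspond to adding the audio.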
Kun-
> Thanks. Where can I find this kind of algorithms as mentioned below to
> dynamically conferencing?
Not sure really. We've done some conferencing work before; we just worried about the
2 channels with the most energy at any one time. 3 channels worked OK, too, but beyond
that, I seem to recall there was not much to be gained -- if 4 people are speaking at
once, well, what can you do... it's a mess "in real life", too. Also, you have to be
careful how you transition the weights -- you can't have a, b, c "jerk" to different
values; use some type of lowpass/averaging filter on the weights, something that
transitions smoothly, over about 50 to 100 msec.
Jeff Brower
DSP sw/hw engineer
Signalogic
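One possible way to do the weight smoothing described above is a one-pole lowpass on each weight, updated once per frame. The sketch below is only an illustration: the 20 ms frame length and 30 ms time constant are assumed values (chosen so that a change is about 96% complete after 100 ms), and the function names are hypothetical.

```python
import math


def smoothing_coeff(frame_ms, time_constant_ms):
    """One-pole lowpass coefficient for a weight updated once per frame."""
    return 1.0 - math.exp(-frame_ms / time_constant_ms)


def smooth_weights(current, target, coeff):
    """Move each current weight a fraction `coeff` of the way to its target."""
    return [c + coeff * (t - c) for c, t in zip(current, target)]


# The talker changes from channel 0 to channel 1.
coeff = smoothing_coeff(frame_ms=20.0, time_constant_ms=30.0)
weights = [1.0, 0.0]
target = [0.0, 1.0]
for frame in range(5):  # 5 frames of 20 ms = 100 ms
    weights = smooth_weights(weights, target, coeff)
    print(frame, [round(w, 3) for w in weights])
```

Because every weight moves by the same fraction toward its target, the weights keep summing to 1 during the transition. Updating once per frame still leaves small steps at frame boundaries; interpolating the weights per sample would remove those.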
Hi,
Using the summing algorithm, I feel that we are not making MAXIMUM
utilization of the mixed output channel, if I have understood it correctly.
That is, although we take a + b + c = 1 (for 3 inputs), we have to subtract each
listener's own weighted input before sending. Hence b + c < 1 and NOT
equal to 1, so the output channel is not fully utilized.
Let me explain the scenario in more detail:
Consider 3 users (A, B, C) in a conference and ONE
conference server which mixes the RTP packets.
For 3 inputs, the mixed output is
y = a*x1 + b*x2 + c*x3, ------(1)
and a + b + c = 1. ---(2)
However, we CANNOT send the output 'y' as it is to A, B and C.
This is because A should not hear his own voice, and likewise for B and C.
Hence we have to calculate 3 separate outputs:
y1 = y - a*x1
y2 = y - b*x2 --------------(3)
y3 = y - c*x3
and then send y1 (put into an RTP packet) to A, y2 to B and y3 to C.
If this is the case,
y1 = y - a*x1 = b*x2 + c*x3, and here b + c < 1.
Hence the output coefficients sum to b + c < 1, NOT 1, and
we are not making maximum utilization of the output channel.
ALTERNATIVE solution:
------------------------------------
The alternative is a calculation like this:
y1 = b1*x2 + c1*x3 , where b1 + c1 = 1,
y2 = a2*x1 + c2*x3 , where a2 + c2 = 1,
y3 = a3*x1 + b3*x2 , where a3 + b3 = 1.
But this method will involve MORE COMPUTATION TIME, due to the larger number of
multiplications and additions and the work of finding the coefficients, if we mix
more than 3 channels.
So should one follow this ALTERNATIVE solution with more computation, or the
PREVIOUS one with lower utilization of the channel?
Also, please let me know if there is some other way of dealing with this situation.
Thanks in advance,
Prakash.
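One way to look at the two options above: since b + c = 1 - a, dividing y1 = y - a*x1 by (1 - a) gives coefficients b/(1 - a) and c/(1 - a), which sum to 1. So the full sum can still be computed once, and each listener's output needs only one subtraction and one scaling. The Python below is a sketch of that observation, not a method proposed in the thread; the name `mix_minus` and the `renormalize` flag are assumptions for the example.

```python
def mix_minus(samples, weights, renormalize=True):
    """Per-listener outputs for one sample instant.

    samples[i] is participant i's current sample and weights[i] its mixing
    weight, with sum(weights) == 1. Output i excludes participant i.
    """
    total = sum(w * x for w, x in zip(weights, samples))  # y, computed once
    outputs = []
    for w, x in zip(weights, samples):
        y_i = total - w * x              # y_i = y - a_i * x_i
        if renormalize and w < 1.0:
            y_i /= (1.0 - w)             # remaining weights now sum to 1
        outputs.append(y_i)
    return outputs


samples = [3.0, 6.0, 9.0]       # x1, x2, x3
weights = [0.5, 0.25, 0.25]     # a, b, c
print(mix_minus(samples, weights, renormalize=False))  # [3.75, 3.75, 3.0]
print(mix_minus(samples, weights))                     # [7.5, 5.0, 4.0]
```

For m participants this costs on the order of m operations per sample instant rather than m*m, because the separate sums for each listener are never formed explicitly.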
At 11:44 PM 8/21/02 -0500, Jeff Brower wrote:
>Kun-
>
> > Thanks. Where can I find this kind of algorithms as mentioned below to
> > dynamically conferencing?
>
>Not sure really. We've done some conferencing work before, we just
>worried about the
>2 channels with most energy at any one time. 3 channels worked Ok, too,
>but beyond
>that, I seem to recall there was not much to be gained -- if 4 people are
>speaking at
>once, well, what can you do...it's a mess "in real life", too. Also you
>have to be
>careful how you transition the weights -- you can't have a,b,c "jerk" to
>different
>values; use some type of lowpass/averaging filter on the weights,
>something that
>transitions smoothly, over about 50 to 100 msec.
>
>Jeff Brower
>DSP sw/hw engineer
>Signalogic
Ulrich-
As you can see, it's more difficult when the speakers are in different locations and
the mixed sound comes from a central server. Do you have control of processing at
each speaker's location? If so, you can do something simple (like a typical Office
Max phone) and mute the sound when a person is speaking -- they can't hear anything,
so your processing requirements stay the same. Basically, that becomes a 1-way sound
situation (also known as half-duplex in communication systems).
If you can't do that, or you have processing ability only at the server, then you
will need to subtract or selectively mix, as you describe.
You are approaching echo canceller / adaptive signal cancellation techniques, which
are used to improve conferencing quality and provide a constant 2-way sound situation
(full-duplex). But if you do that, you will need a DSP or other fast processor, and
cannot rely on the PC/sound card to do it.
Jeff Brower
DSP sw/hw engineer
Signalogic
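A minimal sketch of the "selectively mix" option, following Jeff's earlier advice in this thread (weight the 2 most energetic channels and move the weights smoothly over roughly 50 to 100 msec). The 20 ms frame length, the 75 ms time constant, the equal split between the selected channels, and all function names are assumptions made for illustration, not details given in the thread.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def target_weights(frames, top_k=2):
    """Equal weight for the top_k most energetic channels, zero for the rest."""
    k = min(top_k, len(frames))
    energies = [frame_energy(f) for f in frames]
    ranked = sorted(range(len(frames)), key=lambda i: energies[i], reverse=True)
    chosen = set(ranked[:k])
    return [1.0 / k if i in chosen else 0.0 for i in range(len(frames))]


def smooth_weights(current, target, frame_ms=20.0, time_constant_ms=75.0):
    """One-pole lowpass step that moves each weight toward its target."""
    alpha = 1.0 - math.exp(-frame_ms / time_constant_ms)
    return [c + alpha * (t - c) for c, t in zip(current, target)]


def mix_frame(frames, weights):
    """Weighted sum of the channels, sample by sample."""
    n = len(frames[0])
    return [sum(w * f[i] for w, f in zip(weights, frames)) for i in range(n)]


if __name__ == "__main__":
    frames = [[0.0] * 4, [0.5, -0.5, 0.5, -0.5], [0.1] * 4, [0.9, -0.9, 0.9, -0.9]]
    weights = [0.25] * 4
    weights = smooth_weights(weights, target_weights(frames))
    print(weights, mix_frame(frames, weights))
```

Because each smoothing step is a blend of the current and target weights, the weights keep summing to 1 while they move. The weights here still change in small steps at frame boundaries; interpolating them per sample would be smoother.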
Jeff Brower wrote:
>
> You are approaching echo canceller / adaptive signal cancellation techniques, which
> are used to improve conferencing quality and provide a constant 2-way sound situation
> (full-duplex). But if you do that, you will need a DSP or other fast processor, and
> cannot rely on the PC/sound card to do it.
Well, high quality echo cancellation on PC platforms _is_ possible.
Regards,
Alexander
--
dipl. ing.
alexander lerch
zplane.development
http://www.zplane.de
holsteinische str. 39-42
D-12161 berlin
fon: +49.30.854 09 15.0
fax: +49.30.854 09 15.5
I have tried A+B and A+B/2, but both seem worse. I ended up using the algorithm
A+B-A*B/max (A and B are 2 different audio samples, each ranging from 0 to max).
Kun
--
-----------------------------
Kun Wei
Email: weikun@weik...
Address: MC 256-48, Caltech
Pasadena, CA 91125
Phone: 1-626-395-8767
-----------------------------
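A small sketch of Kun's formula, assuming samples are non-negative values where 0 means silence and max is full scale (the default of 255 and the function name are illustrative). The expression can be rewritten as max - (max - A)*(max - B)/max, which shows why the result never leaves the range 0 to max, unlike a plain A+B.

```python
def mix_unsigned(a, b, max_value=255):
    """Mix two samples in [0, max_value] as A + B - A*B/max."""
    return a + b - (a * b) / max_value


if __name__ == "__main__":
    print(mix_unsigned(0, 100))     # silence mixed with B leaves B unchanged
    print(mix_unsigned(200, 200))   # two loud samples stay below full scale
```

Note that the mapping is not linear: the louder one input is, the less the other contributes, so it behaves more like a soft limiter than a straight sum.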
Alexander-
> > You are approaching echo canceller / adaptive signal cancellation techniques, which
> > are used to improve conferencing quality and provide a constant 2-way sound situation
> > (full-duplex). But if you do that, you will need a DSP or other fast processor, and
> > cannot rely on the PC/sound card to do it.
>
> Well, high quality echo cancellation on PC platforms _is_ possible.
How many channels? Maybe a few is Ok. But then you need G.7xx and GSM codecs, VAD,
jitter buffer, and pretty soon you have a large gateway box that takes forever to
open a Word document. I remember well the problems with WinModems, and I wouldn't go
there again. Unfortunately it's a Win9x and WinXP world, and doesn't mix well with
NSP.
Jeff Brower
DSP sw/hw engineer
Signalogic
Hi Jeff,
Jeff Brower wrote:
> Alexander-
>
>>>You are approaching echo canceller / adaptive signal cancellation techniques, which
>>>are used to improve conferencing quality and provide a constant 2-way sound situation
>>>(full-duplex). But if you do that, you will need a DSP or other fast processor, and
>>>cannot rely on the PC/sound card to do it.
>>
>>Well, high quality echo cancellation on PC platforms _is_ possible.
>
> How many channels? Maybe a few is Ok. But then you need G.7xx and GSM codecs, VAD,
> jitter buffer, and pretty soon you have a large gateway box that takes forever to
> open a Word document. I remember well the problems with WinModems, and I wouldn't go
> there again. Unfortunately it's a Win9x and WinXP world, and doesn't mix well with
> NSP.
>
The number of channels obviously depends on the sample rate and the maximum allowed
delay. At 8 kHz and 200 ms delay we found that you can run 20-30 (optimized) cancellers
in parallel on a 1 GHz machine. However, in most cases only one or two cancellers will
run (especially on machines where you can open Word), and you will have more than enough
performance headroom for other applications. Speech codecs are not so complex that they
would be a problem on a modern PC.
But you are right that a more complex system should be designed carefully.
Regards,
Alexander
--
dipl. ing.
alexander lerch
zplane.development
http://www.zplane.de
holsteinische str. 39-42
D-12161 berlin
fon: +49.30.854 09 15.0
fax: +49.30.854 09 15.5
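For readers unfamiliar with what an echo canceller does, here is a minimal sketch using NLMS (normalized least mean squares), a common textbook adaptive filter. The thread does not say which algorithm anyone used, so NLMS, the step size, the tiny 3-tap echo path, and all names are assumptions for illustration. If the 200 ms figure above is read as the echo tail to be covered, a time-domain filter at 8 kHz would need about 0.2 * 8000 = 1600 taps; the example uses 4 taps to stay short.

```python
import random


def nlms_cancel(far_end, mic, taps, mu=0.5, eps=1e-8):
    """Return the residual (mic minus estimated echo) from an NLMS adaptive FIR filter."""
    w = [0.0] * taps        # adaptive estimate of the echo path
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(wi * hi for wi, hi in zip(w, history))
        e = d - estimate                            # what is left after cancellation
        norm = sum(h * h for h in history) + eps    # input power for normalization
        step = mu * e / norm
        w = [wi + step * hi for wi, hi in zip(w, history)]
        residual.append(e)
    return residual


if __name__ == "__main__":
    random.seed(0)
    echo_path = [0.6, -0.3, 0.1]  # hypothetical echo path, unknown to the canceller
    far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
    mic = [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
           for n in range(len(far))]
    res = nlms_cancel(far, mic, taps=4)
    print(sum(e * e for e in res[-100:]))  # residual energy after convergence
```

The difference from the server-side subtraction discussed earlier is that the echo path here is unknown and has to be estimated, whereas the server already knows exactly which signal it added to the mix.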
Hi Jeff,
At 09:19 AM 8/29/02 -0500, Jeff Brower wrote:
>Ulrich-
>
>If you can't do that, or you have processing ability only at the server, then you
>will need to subtract or selectively mix, as you describe.
Of these two (selectively mix, subtract), which one is better? Any suggestions?
>You are approaching echo canceller / adaptive signal cancellation techniques, which
>are used to improve conferencing quality and provide a constant 2-way sound situation
>(full-duplex).
We would have a central server, where the conferencing is done.
Does what I discussed in the previous mail refer to echo cancellation at
the conference server?
Or does echo cancellation here refer to something else?
> But if you do that, you will need a DSP or other fast processor, and
>cannot rely on the PC/sound card to do it.
Yeah, we will use a DSP.
Thanks & regards,
Prakash.
>Jeff Brower
>DSP sw/hw engineer
>Signalogic
>
>
>Ulrich Prakash wrote:
> >
> > Hi,
> >
> > Using the summing algorithm,I feel that we are not making MAXIMUM
> > utilization of the mixed Output Channel,if I got it right.
> >
> > That is,though we take a + b + c = 1(for 3 inputs),we have to subtract one
> > input with coefficient(i.e. same sample)before sending.Hence,b+c<1 and NOT
> > equal to 1,to have maximum utilization.
> >
> > Let me explain the scenario more in detail:
> >
> > Consider there are 3 users(A,B,C) in a conference and there is ONE
> > conference server which mixes the RTP packets.
> > For 3 inputs,the mixed output,
> > y = ax1+bx2+cx3, ------(1)
> > and a+b+c=1. ---(2)
> >
> > However,we CANNOT send output 'y' as it is,to A,B and C.
> > This is because,A should not hear his own voice,....likewise for B,C also.
> >
> > Hence we have to recalculate 3 outputs,like
> > y1 = y - ax1
> > y2 = y - bx2 --------------(3)
> > y3 = y - cx3
> > and then send y1(put into RTP packet) to A,y2 to B and y3 to C.
> >
> > If this is the case,
> > y1 = y- ax1 = bx2+cx3 and here b+c <1.
> > Hence here actually the outputs coefficients b+c<1,and NOT EQUAL to 1 and
> > hence we are not making maximum utilization of the output channel.
> >
> > ALTERNATIVE solution:
> > ------------------------------------
> > The alternative is doing a calculation like this:
> > y1 = b1*x2 + c1*x3 , where b1 + c1 = 1,
> > y2 = a2*x1 + c2*x3 , where a2 + c2 = 1,
> > y3 = a3*x1 + b3*x2 , where a3 + b3 = 1,
> >
> > But this method will involve MORE COMPUTATION TIME due to more no. of
> > multiplications and additions and finding co-efficients,if we mix more than
> > 3 channels.
> >
> > So must one follow this ALTERNATIVE solution with more computations or the
> > PREVIOUS one with lesser utilization of the channel?
> >
> > Also please let me know if there is some other way of dealing this
> situation.
> >
> > Thanks in advance,
> > Prakash.
> >
> > At 11:44 PM 8/21/02 -0500, Jeff Brower wrote:
> > >Kun-
> > >
> > > > Thanks. Where can I find this kind of algorithms as mentioned below to
> > > > dynamically conferencing?
> > >
> > >Not sure really. We've done some conferencing work before, we just
> > >worried about the
> > >2 channels with most energy at any one time. 3 channels worked Ok, too,
> > >but beyond
> > >that, I seem to recall there was not much to be gained -- if 4 people are
> > >speaking at
> > >once, well, what can you do...it's a mess "in real life", too. Also
you
> > >have to be
> > >careful how you transition the weights -- you can't have a,b,c "jerk"
to
> > >different
> > >values; use some type of lowpass/averaging filter on the weights,
> > >something that
> > >transitions smoothly, over about 50 to 100 msec.
> > >
> > >Jeff Brower
> > >DSP sw/hw engineer
> > >Signalogic
> > >
> > >
> > >
> > > > > Ahh, now you not mixing, but conferencing -- you should have said so
> > > > > earlier. You
> > > > > need some type of algorithm that dynamically looks for "most
> energetic
> > > > > voice" and
> > > > > give that channel more weight. I've seen some conferencing algorithms
> > > that
> > > > > will look
> > > > > for most 2 or 3 dominant channels and give them weight.
> > > >
> > > > Jeff Brower wrote:
> > > > >
> > > > > Kun-
> > > > >
> > > > > > I have a related problem: suppose I have 2 ulaw packet, what is
the
> > > > > > algorithm to mix them and create a new ulaw packet. When remote
> > > > > > client receive and decode it, he can hear both audio.
> > > > >
> > > > > y[n] = U(a*u(x1[n]) + b*u(x2[n]))
> > > > >
> > > > > where
> > > > >
> > > > > U(x) = uLaw(x) (compress)
> > > > > u(x) = inverse uLaw(x) (expand)
> > > > > a + b = 1
> > > > > y[n] is packet data
> > > > >
> > > > > Jeff Brower
> > > > > DSP sw/hw engineer
> > > > > Signalogic
> > > > >
> > > > > > --- In audiodsp@y..., "VandeVoorde, Neil"
<nvandevoorde@p...>
> wrote:
> > > > > > > Last year I went to SPIE's AeroSense conference and learned
about
> > > > > > > Independent Component Analysis and an application called the
> > > > > > 'Cocktail
> > > > > > > Party.' Basically, it's a method to assume that n multiple
> signals
> > > > > > are
> > > > > > > recorded by n microphones. ICA is the method that allows
the
> > > > > > signals to be
> > > > > > > separated into n discrete signals. Perhaps a look down this
path
> > > > > > may be
> > > > > > > what you need...
> > > > > > >
> > > > > > > Regards,/s/neil
> > > > > > >
> > > > > > > Neil E. Van de Voorde, Ph.D.
> > > > > > > Senior Scientist
> > > > > > > Planning Systems Inc.
> > > > > > > 228.689.8775
> > > > > > >
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Jeff Brower [mailto:jbrower@s...]
> > > > > > > Sent: Tuesday, August 20, 2002 7:21 AM
> > > > > > > To: Ulrich Prakash
> > > > > > > Cc: audiodsp@y...
> > > > > > > Subject: Re: [audiodsp] Digital Mixer.
> > > > > > >
> > > > > > > Ulrich-
> > > > > > >
> > > > > > > > In that case,if I need to mix 'm' voice channels(in the
form of
> > > > > > RTP
> > > > > > > > packets),then
> > > > > > > > y(n) = a1*x1 + a2*x2 + ... am*xm
> > > > > > > >
> > > > > > > > = SIGMA ai*xi,where i = 1 to m (SIGMA - i
don't have
> > > > > > SIGMA
> > > > > > > symbol
> > > > > > > > in my mail editor)
> > > > > > > >
> > > > > > > > And SIGMA ai = 1,where i = 1 to m.
> > > > > > > > In that case,if there are 10 input voice channels,then
the
> > > > > > amplitude of
> > > > > > > > original sample x1 in output y will be around
x1/10.Will
> this not
> > > > > > affect
> > > > > > > > the fidelity of x1?
> > > > > > >
> > > > > > > Ahh, now you are not mixing, but conferencing -- you should have said so
> > > > > > > earlier.  You need some type of algorithm that dynamically looks for the
> > > > > > > "most energetic voice" and gives that channel more weight.  I've seen some
> > > > > > > conferencing algorithms that look for the 2 or 3 most dominant channels
> > > > > > > and give them weight.
> > > > > > >
> > > > > > > Clearly if you just constantly multiply every channel by 0.1 you will not
> > > > > > > hear the speaker -- he/she will be averaged with noise/silence.  Not everyone
> > > > > > > talks at once, right?  Or at least you hope that your algorithm does not
> > > > > > > cause them to do that :-)
> > > > > > >
> > > > > > > -Jeff
> > > > > > >
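The "most energetic voice" idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the algorithm Jeff refers to: the names `frame_energy` and `conference_mix`, the per-frame sum-of-squares energy measure, and the choice to weight the k loudest channels equally (1/k each) while muting the rest are all hypothetical. It also assumes the RTP payloads have already been decoded to linear samples, one frame per channel.

```python
def frame_energy(frame):
    """Sum of squared samples: a simple loudness estimate for one frame."""
    return sum(s * s for s in frame)


def conference_mix(frames, k=2):
    """Mix one frame from each of m channels, keeping only the k most energetic.

    frames: list of m equal-length lists of linear (already decoded) samples.
    The k dominant channels each get weight 1/k, so the weights still sum to 1;
    all other channels get weight 0 for this frame.
    """
    m = len(frames)
    if m == 0:
        return []
    n = len(frames[0])
    if any(len(f) != n for f in frames):
        raise ValueError("all channel frames must have the same length")

    k = max(1, min(k, m))  # clamp k to the range 1..m
    energies = [frame_energy(f) for f in frames]
    # Indices of the k channels with the highest energy in this frame.
    dominant = sorted(range(m), key=lambda i: energies[i], reverse=True)[:k]

    weight = 1.0 / k
    return [weight * sum(frames[i][j] for i in dominant) for j in range(n)]


if __name__ == "__main__":
    frames = [
        [100, -100, 100],  # active talker
        [0, 1, 0],         # near silence
        [1, 0, -1],        # near silence
        [50, 50, -50],     # second talker
    ]
    print(conference_mix(frames, k=2))  # [75.0, -25.0, 25.0]
```

With equal weights of 1/m on every channel, the active talker above would be scaled to a quarter of its level; selecting the dominant channels keeps it at half. A practical conference bridge would probably also smooth the channel selection over several frames so that speakers do not switch in and out abruptly.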
> > > > > > >
> > > > > > > > At 08:20 AM 8/19/02 -0500, Jeff Brower wrote:
> > > > > > > > >Ulrich Prakash-
> > > > > > > > >
> > > > > > > > > > >Mix == add.  Add your signals together, like this:
> > > > > > > > > > >
> > > > > > > > > > > y[n] = a*x1[n] + b*x2[n]
> > > > > > > > > > >
> > > > > > > > > > >Suggest that you maintain a + b = 1.
> > > > > > > > > >
> > > > > > > > > > [Prakash]
> > > > > > > > > > Can I take a = 0.5 and b = 0.5, or is there any particular factor to
> > > > > > > > > > evaluate the values of a and b?
> > > > > > > > > >
> > > > > > > > > > > Then put a dial on your MATLAB GUI that allows
> > > > > > > > > > >the user to adjust between a and b.  All the way to left, a = 1.  All the
> > > > > > > > > > >way to right, b = 1.
> > > > > > > > > >
> > > > > > > > > > [Prakash]
> > > > > > > > > > From what you have said, I assume, if
> > > > > > > > > > x1 = {4,2,6} and x2 = {8,10,12}, and a=0.5, b=0.5, then
> > > > > > > > > > y = {2+4, 1+5, 3+6} = {6, 6, 9}
> > > > > > > > > > Is it that straightforward?
> > > > > > > > >
> > > > > > > > >Yep.
> > > > > > > > >
> > > > > > > > > >Will I hear both samples x1 and x2 from y output?
> > > > > > > > >
> > > > > > > > >Yep.
> > > > > > > > >
> > > > > > > > > > But I didn't understand what you mean by LEFT and RIGHT here.
> > > > > > > > >
> > > > > > > > >When the user turns the dial.  You can put a dial (knob) on your GUI to easily
> > > > > > > > >demonstrate the effect; better than asking your users to enter values of a
> > > > > > > > >and b.
> > > > > > > > >
> > > > > > > > >Jeff Brower
> > > > > > > > >DSP sw/hw engineer
> > > > > > > > >Signalogic
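The two-channel mix y[n] = a*x1[n] + b*x2[n] with a + b = 1 and a single dial, as discussed above, could look something like this in Python. The function name `mix_two` and the convention that the dial runs from 0.0 (fully left, a = 1) to 1.0 (fully right, b = 1) are assumptions made for illustration; the thread only specifies the formula and the left/right behaviour.

```python
def mix_two(x1, x2, dial):
    """Mix two equal-length sample sequences: y[n] = a*x1[n] + b*x2[n].

    dial is a hypothetical knob position in [0.0, 1.0]:
      0.0 -> all the way left,  a = 1, b = 0 (only x1 is heard)
      1.0 -> all the way right, a = 0, b = 1 (only x2 is heard)
    Because b = dial and a = 1 - dial, a + b = 1 always holds.
    """
    if not 0.0 <= dial <= 1.0:
        raise ValueError("dial must be between 0.0 and 1.0")
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must have the same length")

    a = 1.0 - dial
    b = dial
    return [a * s1 + b * s2 for s1, s2 in zip(x1, x2)]


if __name__ == "__main__":
    # The worked example from the thread: a = b = 0.5.
    print(mix_two([4, 2, 6], [8, 10, 12], 0.5))  # [6.0, 6.0, 9.0]
```

Tying both gains to one dial means the user cannot push the sum of the weights above 1, so the mixed output cannot exceed the larger of the two input peaks.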