Hi,

Suppose two speakers are talking into two different microphones connected to a mixer. Are there any known algorithms that can separate the speech of the two people from the output signal of the mixer?

Suppose one person is male and the other female. Does this make things simpler?

How about separating the voiced segments alone (say one is uttering the vowel 'aaa' and the other 'eee')?

Can linear prediction analysis still be used to separate the vocal tract filter from the source? If so, what LPC order should be used, assuming an 8 kHz sampling rate?

What will be the characteristics of the source function of such composite speech?

Given such a signal, is there any algorithm that can identify that there are two different speakers?

Would recordings of the two speakers (each speaking alone) be of any use?

Is there any algorithm that can track the speech of one person alone, given recordings of that person speaking alone?

Consider one person singing (opera) into one microphone and the other person reading the news into another microphone. Is there any known algorithm to identify such a case?

Regards,
Rajesh Dachiraju
Separation of speech from multiple speakers speaking in different microphones connected to a mixer
Started by ●May 2, 2008
Reply by ●May 2, 2008
This is the well-known "cocktail party" problem. There are a variety of means for separation, but my interest has recently been focused on exploiting the independence of individual speakers. Look up the phrase "independent component analysis" and you'll find a bunch of references. One of the first is a paper by Comon titled "Independent Component Analysis: A New Concept?", which is a pretty good tutorial on the concepts. I'd also suggest the ICA book by Hyvarinen titled "Independent Component Analysis." This book also covers the concept of principal component analysis, which is a 2nd order processing technique.

Mark
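
To make the ICA suggestion concrete, here is a minimal sketch in Python (standard library only). It is not Comon's algorithm or the FastICA method from the Hyvarinen book: it whitens two mixtures using second-order statistics (the PCA step) and then brute-force searches for the rotation that maximizes a kurtosis-based contrast (the higher-order step). The assumptions are loud ones: two separate mixtures are available (not the single mixer output from the original question), the mixing is instantaneous, and a sine tone and uniform noise stand in for the two talkers. All function names are made up for the example.

```python
import math
import random


def center(x):
    m = sum(x) / len(x)
    return [v - m for v in x]


def whiten(x1, x2):
    """PCA whitening (second-order only): decorrelate, then scale to unit variance."""
    n = len(x1)
    c11 = sum(a * a for a in x1) / n
    c22 = sum(b * b for b in x2) / n
    c12 = sum(a * b for a, b in zip(x1, x2)) / n
    # Rotation angle that diagonalizes the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    p1 = [c * a + s * b for a, b in zip(x1, x2)]
    p2 = [-s * a + c * b for a, b in zip(x1, x2)]
    v1 = sum(a * a for a in p1) / n
    v2 = sum(b * b for b in p2) / n
    return [a / math.sqrt(v1) for a in p1], [b / math.sqrt(v2) for b in p2]


def kurtosis(x):
    """Excess kurtosis of zero-mean data (a fourth-order statistic)."""
    n = len(x)
    m2 = sum(v * v for v in x) / n
    m4 = sum(v ** 4 for v in x) / n
    return m4 / (m2 * m2) - 3.0


def rotate(z1, z2, phi):
    c, s = math.cos(phi), math.sin(phi)
    return ([c * a + s * b for a, b in zip(z1, z2)],
            [-s * a + c * b for a, b in zip(z1, z2)])


def ica_2x2(x1, x2, steps=90):
    """Whiten, then grid-search the rotation maximizing summed squared kurtosis."""
    z1, z2 = whiten(center(x1), center(x2))
    angles = (k * (math.pi / 2) / steps for k in range(steps))
    best = max(angles,
               key=lambda phi: sum(kurtosis(y) ** 2 for y in rotate(z1, z2, phi)))
    return rotate(z1, z2, best)


def corr(a, b):
    a, b = center(a), center(b)
    num = sum(p * q for p, q in zip(a, b))
    return num / math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))


random.seed(0)
n = 2000
s1 = [math.sin(2 * math.pi * 0.013 * t) for t in range(n)]  # stand-in source 1
s2 = [random.uniform(-1.0, 1.0) for _ in range(n)]          # stand-in source 2
x1 = [a + 0.6 * b for a, b in zip(s1, s2)]                  # mixture 1 (assumed gains)
x2 = [0.4 * a + b for a, b in zip(s1, s2)]                  # mixture 2 (assumed gains)

y1, y2 = ica_2x2(x1, x2)
# ICA recovers sources only up to order and sign, so compare by |correlation|.
match1 = max(abs(corr(s1, y1)), abs(corr(s1, y2)))
match2 = max(abs(corr(s2, y1)), abs(corr(s2, y2)))
print(match1, match2)
```

The outputs come back in arbitrary order and with arbitrary sign and scale, which is inherent to ICA. Real speech in a real room is mixed convolutively (delays and reverberation), so this toy case says nothing about how well the approach works on actual recordings.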
Reply by ●May 2, 2008
On May 2, 12:00 pm, "markt" <tak...@pericle.com> wrote:
> This is the well-known "cocktail party" problem. There are a variety of
> means for separation, but my interest has recently been focused on
> exploiting the independence of individual speakers. Look up the phrase
> "independent component analysis" and you'll find a bunch of references.
> One of the first is a paper by Comon titled "Independent Component
> Analysis: A New Concept?" which is a pretty good tutorial on the concepts.
> I'd also suggest the ICA book by Hyvarinen titled "Independent Component
> Analysis." This book also covers the concept of principal component
> analysis, which is a 2nd order processing technique.
>
> Mark

I looked at a book on ICA a couple of years ago (don't remember the title). Near the start of the book it said speaker separation was one of the many great applications. However, at the end of the book, where they addressed applications, they showed that it did not work well at all.

Dirk
Reply by ●May 2, 2008
> However at the end of the book where
> they addressed applications, they showed that it did not work well at
> all.
>
> Dirk

Dig deeper. There are many methods for tackling such a problem and no single one necessarily applies to all situations. I'm using ICA in the presence of multipath fading for DS-CDMA communication systems. ICA is just now getting big. The biggest problem with ICA is the fact that it uses higher-order statistics, which converge slowly.

Mark
Reply by ●May 2, 2008
"markt" <takatz@pericle.com> writes:
>> However at the end of the book where
>> they addressed applications, they showed that it did not work well at
>> all.
>>
>> Dirk
>
> Dig deeper. There are many methods for tackling such a problem and no
> single one necessarily applies to all situations. I'm using ICA in the
> presence of multipath fading for DS-CDMA communication systems. ICA is
> just now getting big. The biggest problem with ICA is the fact that it
> uses higher-order statistics, which converge slowly.
>
> Mark

Mark, do you know of any papers on these latest applications?
--
%  Randy Yates                  % "Rollin' and riding and slippin' and
%% Fuquay-Varina, NC            %  sliding, it's magic."
%%% 919-577-9882                %
%%%% <yates@ieee.org>           % 'Living' Thing', *A New World Record*, ELO
http://www.digitalsignallabs.com
Reply by ●May 2, 2008
On May 2, 12:23 pm, "markt" <tak...@pericle.com> wrote:
> > However at the end of the book where
> > they addressed applications, they showed that it did not work well at
> > all.
> >
> > Dirk
>
> Dig deeper. There are many methods for tackling such a problem and no
> single one necessarily applies to all situations. I'm using ICA in the
> presence of multipath fading for DS-CDMA communication systems. ICA is
> just now getting big. The biggest problem with ICA is the fact that it
> uses higher-order statistics, which converge slowly.
>
> Mark

Mark,

Dig deeper into what? RADC (not the current name) was trying to solve this problem from the 1980s through the early 90s. I followed their efforts at that time. Their efforts up through that period were ultimately admitted not to perform well. The last solution that I am aware of threw out most of the processing of previous solutions, because a much simpler method produced more useful (not good) results.

I would expect the slow convergence of ICA to be a problem because the acoustic environment (head orientation, etc.) often changes pretty quickly.

BTW, what is being described here is NOT the well-known "Cocktail Party" problem. That problem has to do with two microphones (or ears) that are not mixed together. Also, most people don't realize how the actual "Cocktail Party" problem differs from what actually happens at a cocktail party. If you are interested in the problem, you can figure out the huge difference.

Dirk
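
The point about higher-order statistics converging slowly can be checked numerically. The following is a minimal sketch, assuming zero-mean, unit-variance Gaussian test data (not speech) and hypothetical helper names `moment` and `relative_spread`. It compares how much the sample second moment and the sample fourth moment scatter across repeated trials at the same sample count.

```python
import random
import statistics


def moment(x, order):
    """Sample raw moment of the given order (data assumed zero-mean)."""
    return sum(v ** order for v in x) / len(x)


def relative_spread(order, n_samples, n_trials, rng):
    """Standard deviation divided by mean of the moment estimate over many trials."""
    estimates = [
        moment([rng.gauss(0.0, 1.0) for _ in range(n_samples)], order)
        for _ in range(n_trials)
    ]
    return statistics.stdev(estimates) / statistics.mean(estimates)


rng = random.Random(1)
spreads = {}
for n in (100, 1000):
    # (second-order spread, fourth-order spread) at the same sample count
    spreads[n] = (relative_spread(2, n, 200, rng), relative_spread(4, n, 200, rng))
    print(n, spreads[n])
```

For this Gaussian case the relative standard deviation is about sqrt(2/N) for the second moment and sqrt(96/N)/3 for the fourth, so the fourth-moment estimate needs roughly five times as many samples for the same relative accuracy. If the mixing changes within that window, as with a moving head, the estimate may never settle.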
Reply by ●May 5, 2008
> Dig deeper into what? RADC (not the current name) was trying to solve
> this problem in the 1980's-early 90's. I followed their efforts at
> that time. Their efforts up through that period were ultimately
> admitted to not perform well. The last solution that I am aware of
> threw out most of the processing of previous solutions, because a much
> simpler method produced more useful (not good) results.

I gave you two references, both of which are more current than the 80s/90s. Do a simple Google search on "cocktail party problem" and ICA and you'll get a bevy of hits.

> I would expect that slow convergence of ICA would be a problem because
> the acoustic environment (head orientation, etc) often changes pretty
> quickly.

I've implemented ICA on Rayleigh fading channels for a DS-CDMA system not unlike that used in US PCS systems. It is not the end-all, yet, but there's a lot of research in this area. Do the Google search I mention above and the folks at Helsinki U of T are the first that pop up (not coincidentally, Hyvarinen is the author of the book).

> BTW what is being described here is NOT the well known "Cocktail
> Party" problem. That problem has to do with two microphones (or ears)
> that are not mixed together. Also, most people don't realize how the
> actual "Cocktail Party" problem differs from what actually happens at
> a cocktail party. If you are interested in the problem you can figure
> out the huge difference.

I disagree. The cocktail party problem as typically devised is simply the mixing of multiple speakers in a room. Using multiple receivers, e.g. two ears, is merely one method to separate the signals (by exploiting phase differences which provide position information). How they get mixed is really immaterial (well, assuming some linear summation); what matters is their independence.

Try not to be so ignorant in your replies and maybe others will be helpful, too. Remember, YOU are asking for help, not me.

Mark
Reply by ●May 5, 2008
On May 5, 2:57 pm, "markt" <tak...@pericle.com> wrote:
> > Dig deeper into what? RADC (not the current name) was trying to solve
> > this problem in the 1980's-early 90's. I followed their efforts at
> > that time. Their efforts up through that period were ultimately
> > admitted to not perform well. The last solution that I am aware of
> > threw out most of the processing of previous solutions, because a much
> > simpler method produced more useful (not good) results.
>
> I gave you two references, both of which are more current than the
> 80s/90s. Do a simple Google search on "cocktail party problem" and ICA and
> you'll get a bevy of hits.
>
> > I would expect that slow convergence of ICA would be a problem because
> > the acoustic environment (head orientation, etc) often changes pretty
> > quickly.
>
> I've implemented ICA on Rayleigh fading channels for a DS-CDMA system not
> unlike that used in US PCS systems. It is not the end-all, yet, but
> there's a lot of research in this area. Do the Google search I mention
> above and the folks at Helsinki U of T are the first that pop up (not
> coincidentally, Hyvarinen is the author of the book).
>
> > BTW what is being described here is NOT the well known "Cocktail
> > Party" problem. That problem has to do with two microphones (or ears)
> > that are not mixed together. Also, most people don't realize how the
> > actual "Cocktail Party" problem differs from what actually happens at
> > a cocktail party. If you are interested in the problem you can figure
> > out the huge difference.
>
> I disagree. The cocktail party problem as typically devised is simply the
> mixing of multiple speakers in a room. Using multiple receivers, e.g. two
> ears, is merely one method to separate the signals (by exploiting phase
> differences which provides position information). How they get mixed is
> really immaterial (well, assuming some linear summation), what matters is
> their independence.
>
> Try not to be so ignorant in your replies and maybe others will be
> helpful, too. Remember, YOU are asking for help, not me.
>
> Mark

Mark,

Feel free to disagree. If you think about it, how they get mixed is anything but immaterial. At your next cocktail party try listening with only one ear. Then report back to us.

I did the Google search ("cocktail party problem" ICA) and this is from the first thing that popped up:

"COCKTAIL PARTY PROBLEM

Imagine you're at a cocktail party. For you it is no problem to follow the discussion of your neighbours, even if there are lots of other sound sources in the room: other discussions in English and in other languages, different kinds of music, etc. You might even hear a siren from the passing-by police car.

It is not known exactly how humans are able to separate the different sound sources. Independent component analysis is able to do it, if there are at least as many microphones or 'ears' in the room as there are different simultaneous sound sources. In this demo, you can select which sounds are present in your cocktail party. ICA will separate them without knowing anything about the different sound sources or the positions of the microphones."

Doesn't sound like they were just all added together, huh.

Also, if you can read and follow the posts, you will realize I was never asking for help. By reading the posts, you will also note that my latest reference is not out of the 80s or 90s. It is an ICA book (a popular one, I understand) that I read a couple of years ago but didn't keep, after I saw how lousy they said the results were, having promoted it as a solution to the problem earlier in the book.

Try not to be so arrogant and people will not think you are such an asshole.

Dirk
Reply by ●May 5, 2008
> Also, if you can read, and follow the posts, you will realize I was
> never asking for help.

To hell you weren't... first post:

"Are there any known algorithms which can separate the speech of the two people from output signal of the mixer."

I simply provided an answer and you got ignorant.

> Feel free to disagree. If you think about it, how they get mixed is
> anything but immaterial. At your next cocktail party try listening
> with only one ear. Then report back to us.

I implement ICA with multiple mixes and only one receiver. It works. Like I said, dig deeper.

> It is not known exactly how humans are able to separate the different
> sound sources. Independent component analysis is able to do it, if
> there are at least as many microphones or 'ears' in the room as there
> are different simultaneous sound sources. In this demo, you can select
> which sounds are present in your cocktail party. ICA will separate
> them without knowing anything about the different sound sources or the
> positions of the microphones."
>
> Doesn't sound like they were just all added together, huh.

Uh, yes, they are linearly summed sources, which is plain simply from the description. In other words, they are all added (there are also methods for convolutive mixing). If you actually read deep enough about the concept, you'll discover that a simple way around the "more microphones than sources" requirement is to take more samples from a time series and treat each sample of the vector as a separate source. This works well when you have time series representations of signals, which the Hyvarinen work delves into.

The term "cocktail party problem" has nothing to do with how many receivers there are or how you separate them. It merely refers to the concept of multiple sources of some signal present within a single channel. In a room with multiple speakers, or a DS-CDMA channel with multiple users, it is still the classic cocktail party problem.

If you had actually read further, you would have understood this. Instead, you display your ignorance as well as arrogance after I was simply attempting to offer some help.

> Try not to be so arrogant and people will not think you are such an
> asshole.

I suggest you take a look in the mirror.

Mark
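
Two technical points from this exchange can be made concrete: what "at least as many microphones as sources" buys you, and what "take more samples from a time series" looks like mechanically. The sketch below is only an illustration under an instantaneous, noise-free mixing assumption with known gains (ICA would have to estimate them blindly). The names `mix`, `invert_2x2`, and `delay_embed` are made up for the example, and the delay embedding is one reading of the suggestion above, not necessarily the method Mark uses.

```python
def mix(gains, sources):
    """Instantaneous linear mixing: each output is a weighted sum of the sources."""
    n = len(sources[0])
    return [[sum(g * s[t] for g, s in zip(row, sources)) for t in range(n)]
            for row in gains]


def invert_2x2(a):
    """Inverse of a 2x2 matrix; raises if the matrix is singular."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if abs(det) < 1e-12:
        raise ValueError("singular mixing matrix")
    return [[s / det, -q / det], [-r / det, p / det]]


def delay_embed(x, m):
    """Stack m delayed copies of one channel; row k is x delayed by k samples."""
    n = len(x) - m + 1
    if n <= 0:
        raise ValueError("signal shorter than embedding dimension")
    return [x[m - 1 - k: m - 1 - k + n] for k in range(m)]


s1 = [0.5, -1.0, 0.25, 0.0, 1.5, -0.5]
s2 = [1.0, 0.5, -0.75, 2.0, 0.0, 0.25]

# Case 1: two separate channels (2 mixtures of 2 sources), gains assumed known.
gains = [[1.0, 0.6], [0.4, 1.0]]
recovered = mix(invert_2x2(gains), mix(gains, [s1, s2]))  # equals [s1, s2]

# Case 2: one mixer output (1 mixture of 2 sources).
mono = mix([[1.0, 1.0]], [s1, s2])[0]
# A different pair of "sources" that produces the same mono signal,
# so the sum alone cannot tell the two pairs apart.
d = [0.3, -0.2, 0.1, 0.7, -0.4, 0.9]
alt1 = [u + v for u, v in zip(s1, d)]
alt2 = [u - v for u, v in zip(s2, d)]
mono_alt = mix([[1.0, 1.0]], [alt1, alt2])[0]

# Delay embedding turns the single channel into several aligned rows.
rows = delay_embed(mono, 3)
print(len(rows), len(rows[0]))
```

With two separate channels the sources come back exactly; with one summed channel, many different source pairs explain the same signal, so any single-channel method has to lean on extra structure in the sources (for example their time or spectral structure). The embedding supplies multiple rows, but the sketch does not show that ICA on those rows separates two talkers.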
Reply by ●May 5, 2008
On May 5, 4:21 pm, "markt" <tak...@pericle.com> wrote:
> > Also, if you can read, and follow the posts, you will realize I was
> > never asking for help.
>
> To hell you weren't... first post:
>
> "Are there any known algorithms which can separate the speech of the two
> people from output signal of the mixer."
>
> I simply provided an answer and you got ignorant.

Mark,

You are a jackass. The first post wasn't mine. Pay attention to details.

Dirk






