Hello, I have a real-time speech detection algorithm. I wish to connect a microphone and let the input waveform flow through a subplot in a figure at a slow rate. Is it possible? How should I go about doing it: via a figure or a GUI? My algorithm will also process the input waveform in blocks. Say each block has 160 samples. So I guess I will check for when the number of input samples reaches a multiple of 160, then run my speech detection algorithm and display the result on the same figure/GUI too. Any comments will be appreciated. Thanks
need help on speech signal acquisition and display
Started by ●January 2, 2007
Reply by ●January 2, 2007
doggie wrote:
> Hello, I have a real-time speech detection algorithm. I wish to connect a
> microphone and let the input waveform flow through a subplot in a figure
> at a slow rate. Is it possible? How should I go about doing it? Via figure
> or GUI?

Yes and no. Yes, it is possible to display data at a slower rate than they are produced. No, it is not possible to do that *and* maintain real-time operation. If the system produces, say, 1000 samples per second and you display one new sample every 5 ms, it will take 5 s to display 1 s worth of data.

So you have to make a choice: either display data as they come, or store them somewhere and display them in an off-line application.

Rune
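The arithmetic in Rune's example can be checked with a few lines of Python. This is only a sketch of that one calculation; the function name `display_time_s` is made up for illustration and is not from any library.

```python
def display_time_s(duration_s, sample_rate_hz, display_interval_s):
    """Seconds needed to draw `duration_s` of signal, one sample per interval."""
    n_samples = duration_s * sample_rate_hz
    return n_samples * display_interval_s


# Rune's numbers: 1000 samples/s, one new sample drawn every 5 ms.
needed = display_time_s(1.0, 1000, 0.005)
print(f"1 s of data takes about {needed:.1f} s to display")
```

Any display interval longer than one sample period (1 ms here) makes the display fall further behind the input the longer the capture runs.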
Reply by ●January 2, 2007
doggie wrote:
> Hello, I have a real-time speech detection algorithm. I wish to connect a
> microphone and let the input waveform flow through a subplot in a figure
> at a slow rate. Is it possible? How should I go about doing it? Via figure
> or GUI?
>
> My algorithm will also process the input waveform in blocks. Say each
> block has 160 samples. So I guess I will perform checks for the condition
> where the input bin is a multiple of 160, then I run my speech detection
> algorithm and display the result on the same figure/GUI too.
>
> Any comments will be appreciated.

I don't understand what "flow through a subplot" means.

8,000 samples per second doesn't capture all the information in speech, but it's probably adequate. At that rate, 160 samples represent 1/50th of a second. What will you learn in that time? What does your detection algorithm accomplish? What will be displayed?

Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by ●January 2, 2007
On Jan 2, 3:46 pm, "doggie" <elusivetruelove2...@yahoo.com> wrote:
> Hello, I have a real-time speech detection algorithm. I wish to connect a
> microphone and let the input waveform flow through a subplot in a figure
> at a slow rate. Is it possible? How should I go about doing it? Via figure
> or GUI?
>
> My algorithm will also process the input waveform in blocks. Say each
> block has 160 samples. So I guess I will perform checks for the condition
> where the input bin is a multiple of 160, then I run my speech detection
> algorithm and display the result on the same figure/GUI too.

I assume you're talking about MATLAB here. You will need the Data Acquisition Toolbox, which has audio (analogue) capture capabilities.

You can then set up callbacks so that your algorithm and GUI are called every x samples.

--
Oli
Reply by ●January 2, 2007
> I don't understand what "flow through a subplot" means.

Hi, actually I was thinking of letting the input signal scroll from right to left in a subplot so that I can show at the same time whether it is classified as speech or noise. But come to think of it, it will be too fast anyway, so I guess I will do it offline as Rune suggested.

> 8,000 samples per second doesn't capture all the information in speech,
> but it's probably adequate. At that rate, 160 samples represents 1/50th
> of a second. What will you learn in that time.

I guess 16 kHz would have been better.

> What does your detection
> algorithm accomplish? What will be displayed?

My algorithm is an energy-based approach that uses the energy of each frame of noisy speech. I will display the input speech and superimpose on it a line showing 1 for speech and 0 for noise, for easy viewing of results. Currently, I can successfully detect all the speech frames, but some noise frames are misclassified as speech. So I have been looking for another method to post-process my results and correct those misclassified noise frames. Are there any useful methods? Autocorrelation seems to work only for voiced speech, and there seems to be nothing distinctive with which to detect unvoiced speech.

Thanks
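For readers following along, a frame-energy detector of the kind described above might look roughly like this in Python. It is a minimal sketch, not the poster's actual algorithm: the mean-square energy measure, the fixed `threshold`, and the function names are all assumptions made for illustration.

```python
import math

FRAME_LEN = 160  # samples per block, as stated in the thread


def frame_energy(frame):
    """Mean-square energy of one frame."""
    return sum(s * s for s in frame) / len(frame)


def classify_frames(signal, threshold, frame_len=FRAME_LEN):
    """Return one flag per complete frame: 1 = speech, 0 = noise."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        flags.append(1 if frame_energy(frame) > threshold else 0)
    return flags


# Synthetic stand-in for a recording: quiet, loud, quiet (one frame each).
quiet = [0.01 * math.sin(0.3 * n) for n in range(FRAME_LEN)]
loud = [0.5 * math.sin(0.3 * n) for n in range(FRAME_LEN)]
flags = classify_frames(quiet + loud + quiet, threshold=0.01)
print(flags)  # [0, 1, 0]
```

The list of flags is the 1/0 line that would be superimposed on the waveform. A fixed threshold like this one flags any loud noise frame as speech, which is the misclassification described in the post.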
Reply by ●January 2, 2007
"doggie" <elusivetruelove2003@yahoo.com> wrote in news:pr2dnSeEx8M_AgfYnZ2dnUVZ_vShnZ2d@giganews.com:> autocorrelation > seems to work only for voiced speech and there seems to be nothing > distinctive to detect unvoiced speech. > >I'll bite-- what is unvoiced speech? -- Scott Reverse name to reply
Reply by ●January 2, 2007
> I assume you're talking about MATLAB here. You will need the Data
> Acquisition Toolbox, which has audio (analogue) capture capabilities.
>
> You can then set up callbacks so that your algorithm and GUI are called
> every x samples.
>
> --
> Oli

Hi Oli, I guess I will do it exactly as you said, in a figure rather than a GUI, as I'm not really familiar with GUIs. Basically, I should have code that starts data acquisition when run and stores the data in a variable. Each time another x samples have arrived, I will send the latest frame to the detection algorithm and output the updated results in a figure together with the preceding results.

Thanks
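The accumulate-and-dispatch logic described here does not depend on any particular toolbox. A rough Python sketch follows, with the acquisition callback simulated by plain `push` calls. The class `BlockProcessor` and the detector `toy_detector` are hypothetical names invented for this example and are not part of MATLAB or the Data Acquisition Toolbox.

```python
class BlockProcessor:
    """Collects incoming samples and runs a detector on each full frame."""

    def __init__(self, detector, frame_len=160):
        self.detector = detector
        self.frame_len = frame_len
        self.pending = []   # samples not yet forming a complete frame
        self.results = []   # one detector output per processed frame

    def push(self, samples):
        """Call this from the acquisition callback with the new samples."""
        self.pending.extend(samples)
        while len(self.pending) >= self.frame_len:
            frame = self.pending[:self.frame_len]
            del self.pending[:self.frame_len]
            self.results.append(self.detector(frame))
        return self.results


def toy_detector(frame):
    """Placeholder decision rule: 1 if mean-square energy exceeds 0.01."""
    return 1 if sum(s * s for s in frame) / len(frame) > 0.01 else 0


proc = BlockProcessor(toy_detector)
proc.push([0.0] * 100)  # fewer than 160 samples so far, nothing processed
proc.push([0.0] * 100)  # first frame (all zeros) is processed, 40 samples kept
proc.push([0.5] * 280)  # 320 samples pending, so two more frames are processed
print(proc.results)     # [0, 1, 1]
```

Keeping the leftover samples in `pending` means the driver can deliver chunks of any size, so there is no need to test whether the running sample count is an exact multiple of 160. The plot would then be redrawn from `results` after each `push`.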
Reply by ●January 2, 2007
doggie wrote:
>> I don't understand what "flow through a subplot" means.
>
> Hi, actually I was thinking of letting the input signal flow from right to
> left in a subplot figure so that I can show if they are classified as
> speech or noise at the same time. But come to think of it, it will be too
> fast anyway so I guess I will do it offline as Rune suggested.
>
>> 8,000 samples per second doesn't capture all the information in speech,
>> but it's probably adequate. At that rate, 160 samples represents 1/50th
>> of a second. What will you learn in that time.
>
> I guess 16kHz would have been better.

160 samples at 16 kHz is 10 ms. What aspect of speech can you analyze in that time?

>> What does your detection
>> algorithm accomplish? What will be displayed?
>
> My algorithm is an energy based approach on the energy of each frame of
> noisy speech. I will display the input speech and superimpose on it a line
> showing 1 when speech and 0 when noise for easy viewing of results.
> Currently, I can successfully detect all the speech frames but there are
> some noise frames being misclassified. So I have been trying to look for
> another method to post-process my results to correct those noise frames
> misclassified as speech. Is there any useful methods? autocorrelation
> seems to work only for voiced speech and there seems to be nothing
> distinctive to detect unvoiced speech.

Given the spiky nature of speech waveforms, using a contrasting color for the speech/noise indicator and overwriting the waveform with it might be clearer. You might classify noise with greater accuracy by using longer sample chunks.

Jerry
--
Engineering is the art of making what you want from things you can get.
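The frame durations quoted in this exchange are simply the sample count divided by the sample rate. A short Python sketch of that arithmetic is below. The helper names are invented for illustration, and the 100 ms figure is an arbitrary example of a "longer chunk", not a value recommended anywhere in the thread.

```python
def frame_duration_ms(n_samples, fs_hz):
    """Duration of a block of n_samples at sample rate fs_hz, in ms."""
    return 1000.0 * n_samples / fs_hz


def samples_for_duration(duration_ms, fs_hz):
    """Number of samples that span duration_ms at sample rate fs_hz."""
    return round(duration_ms * fs_hz / 1000.0)


print(frame_duration_ms(160, 8000))      # 20.0 ms, i.e. 1/50 s
print(frame_duration_ms(160, 16000))     # 10.0 ms
print(samples_for_duration(100, 16000))  # 1600 samples, i.e. ten 160-sample blocks
```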
Reply by ●January 2, 2007
Scott Seidman wrote:
> "doggie" <elusivetruelove2003@yahoo.com> wrote in
> news:pr2dnSeEx8M_AgfYnZ2dnUVZ_vShnZ2d@giganews.com:
>
>> autocorrelation
>> seems to work only for voiced speech and there seems to be nothing
>> distinctive to detect unvoiced speech.
>
> I'll bite-- what is unvoiced speech?

Speech sounds that don't involve the vocal cords: sibilants, whispering. "Zzzz" is voiced "ssss".

Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by ●January 2, 2007
doggie wrote:
> Currently, I can successfully detect all the speech frames but there are
> some noise frames being misclassified. So I have been trying to look for
> another method to post-process my results to correct those noise frames
> misclassified as speech. Is there any useful methods? autocorrelation
> seems to work only for voiced speech and there seems to be nothing
> distinctive to detect unvoiced speech.

Unvoiced speech IS noise, so there is no way to distinguish it from ambient noise based on energy alone if the spectral characteristics are similar. For that reason, reliable speech detection in noisy recordings is only possible for voiced speech sounds and is based on the pitch cue, or f0. US Patent 7,124,075 is a good place to start...
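To make the pitch-cue idea concrete, here is a rough Python sketch of a periodicity check based on normalized autocorrelation, the tool already mentioned in the thread. It is not the method of the cited patent. The 50 to 400 Hz pitch range, the 40 ms frame, the 0.5 threshold mentioned in the note below, and the name `voicing_strength` are all assumptions chosen for illustration.

```python
import math
import random


def voicing_strength(frame, fs, f0_min=50.0, f0_max=400.0):
    """Peak normalized autocorrelation over lags in the assumed pitch range.

    Values near 1 indicate a strongly periodic (voiced-like) frame;
    values near 0 indicate no clear periodicity.
    """
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), len(frame) // 2)
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        a = frame[:-lag]
        b = frame[lag:]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0.0:
            best = max(best, num / den)
    return best


FS = 8000
N = 320  # 40 ms at 8 kHz, long enough to hold two periods of a 50 Hz pitch

# A 200 Hz tone stands in for a voiced sound, white noise for an unvoiced one.
tone = [math.sin(2 * math.pi * 200 * n / FS) for n in range(N)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(N)]

print("tone :", round(voicing_strength(tone, FS), 3))
print("noise:", round(voicing_strength(noise, FS), 3))
```

A frame could be kept as speech only when it passes the energy test and its `voicing_strength` exceeds some threshold such as 0.5. By the argument above, this rejects unvoiced speech together with the noise. A single 160-sample block at 8 kHz covers only 20 ms, so the lag search would be limited to pitches of 100 Hz and above.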






