Dividing speech signal to pitch-synchronous frames

Started by vjw December 10, 2010
Hello,

For my application, I have to segment a speech waveform into
pitch-synchronous frames, i.e. each frame is two pitch periods long and
centered on the current pitch mark. Can anyone give me an idea of how to
approach this problem (using Matlab)?

Thanks.


On Dec 10, 12:09 pm, "vjw" <jayachamaree@n_o_s_p_a_m.gmail.com> wrote:
> Hello,
>
> For my application, I have to segment a speech waveform in to
> pitch-synchronous frames i.e. each frame is two pitch periods long and
> centered around the current pitch mark. Can anyone give me an idea how to
> approach this problem (using Matlab)?
>
> Thanks.

You mean you are looking for free Matlab code for doing this? Good luck with your search. Please let us know what you find.

And btw, why do you need two pitch periods? Isn't one enough?

On Dec 10, 12:19 pm, fatalist <simfid...@gmail.com> wrote:
> On Dec 10, 12:09 pm, "vjw" <jayachamaree@n_o_s_p_a_m.gmail.com> wrote:
> > Hello,
> >
> > For my application, I have to segment a speech waveform in to
> > pitch-synchronous frames i.e. each frame is two pitch periods long and
> > centered around the current pitch mark. Can anyone give me an idea how to
> > approach this problem (using Matlab)?

do you have a pitch-detection algorithm? that's the first thing that you need. you need to know the period length. if you don't, better look into pitch detection algorithms (often something based on AMDF or autocorrelation or something similar).

and then, depending on your required precision of time, you may need a means of interpolation or resampling, because the period length is not necessarily an integer number of samples.

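For what it's worth, here is a minimal sketch of the correlation-style period estimate mentioned above, for a single voiced frame. It is not anyone's actual code from this thread: the synthetic test frame, the 80 to 400 Hz search range, and all variable names are assumptions made for illustration. It uses a normalized cross-correlation over a fixed window and then a three-point parabolic fit around the peak, which is one simple way to get a period that is not an integer number of samples.

```matlab
% Sketch only: period estimate for ONE voiced frame via normalized
% cross-correlation. All names and constants are illustrative assumptions.
fs = 16000;                                  % sample rate in Hz
n  = (0:639).';                              % 40 ms test frame
x  = sin(2*pi*125*n/fs) + 0.5*sin(2*pi*250*n/fs);   % synthetic voiced frame, f0 = 125 Hz

lagMin = floor(fs/400);                      % shortest period searched (400 Hz)
lagMax = ceil(fs/80);                        % longest period searched (80 Hz)
W      = numel(x) - lagMax;                  % fixed comparison window length
ref    = x(1:W);                             % reference segment at lag 0

c = -inf(lagMax, 1);                         % c(k): similarity of ref and the segment at lag k
for k = lagMin:lagMax
    seg  = x(1+k : W+k);
    c(k) = (ref.' * seg) / sqrt((ref.' * ref) * (seg.' * seg) + eps);
end
[cmax, k0] = max(c);                         % best integer lag

% Parabolic fit through the three points around the peak gives a
% fractional-sample refinement (skipped if the peak sits on the search edge).
delta = 0;
if k0 > lagMin && k0 < lagMax
    a = c(k0-1); b = c(k0); d = c(k0+1);
    delta = 0.5 * (a - d) / (a - 2*b + d);
end
period = k0 + delta;                         % period in samples, possibly fractional
f0     = fs / period;                        % corresponding pitch in Hz
```

On real speech you would run this frame by frame, and you would also want a voicing decision (for example a threshold on `cmax`) before trusting `period`.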
> You mean you are looking for free matlab code for doing this ?
>
> Good luck with your search

yeah, don't expect someone else to work for free for you.

> Please let us know what you find
>
> And btw, why do you need two pitch periods ? Isn't one enough ?

often, for use of these "grains" or little "wavelets" (a term i used in 1995, not to be confused with any wavelet transform) or "diphones", you need two adjacent periods and you end up ramping the first up from zero and the second period down to zero (usually using a Hann window or Flattened Hann or something similar, i s'pose you could use a triangular window).

...

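A rough sketch of the two-period grain idea, under loud assumptions: the period `P` is constant, the pitch marks are already known, and the signal is just noise standing in for speech (all made up here). The Hann window is built by hand so no toolbox is needed. With a window two periods long and marks one period apart, the up-ramp of one grain overlaps the down-ramp of the previous one and the two windows sum to one, so overlap-adding the grains gives back the original signal away from the ends.

```matlab
% Sketch only: two-period, Hann-windowed grains centered on pitch marks.
% P, x, and marks are illustrative stand-ins, not real analysis output.
P     = 100;                                  % pitch period in samples (assumed constant)
x     = randn(1000, 1);                       % stand-in for a speech signal
N     = numel(x);
marks = (P+1 : P : N-P+1).';                  % hypothetical pitch-mark sample indices

% Periodic Hann window of length 2*P; its peak (value 1) falls on the mark.
w = 0.5 * (1 - cos(2*pi*(0:2*P-1).' / (2*P)));

grains = zeros(2*P, numel(marks));            % one column per grain (equal lengths here)
y      = zeros(N, 1);                         % overlap-add reconstruction
for i = 1:numel(marks)
    idx         = (marks(i)-P : marks(i)+P-1).';   % two periods around the mark
    grains(:,i) = w .* x(idx);                % ramp first period up, second period down
    y(idx)      = y(idx) + grains(:,i);       % overlap-add
end
```

When the period varies from mark to mark, the grains have different lengths and the windows no longer sum exactly to one, so the matrix storage and the exact reconstruction both stop holding as written.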
> > Thanks.

yer welcome, FWIW.

r b-j

On Dec 10, 9:09 am, "vjw" <jayachamaree@n_o_s_p_a_m.gmail.com> wrote:
> Hello,
>
> For my application, I have to segment a speech waveform in to
> pitch-synchronous frames i.e. each frame is two pitch periods long and
> centered around the current pitch mark. Can anyone give me an idea how to
> approach this problem (using Matlab)?

You seem to be assuming that something called "pitch" exists at the current mark. If so, estimate it (using a localized pitch estimation algorithm). Verify your assumption and the estimate by correlating one pitch period prior and subsequent to the mark. Interpolate or upsample, as necessary. If the correlation is high enough, use that prior and subsequent data as the frame. If not, do something else.

At least, that's one possible method.

IMHO. YMMV.
--
rhn A.T nicholson d.0.t C-o-M
http://www.nicholson.com/rhn/dsp.html

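The check described above might look roughly like this. It is a sketch under assumptions: the period estimate `P` is taken to be an integer number of samples (so no interpolation step is shown), the mark position `m` and the 0.8 threshold are made up, and the test signal is synthetic.

```matlab
% Sketch only: accept a two-period frame if the period before the mark
% correlates well with the period after it. Names and threshold are assumed.
P = 128;                                      % assumed period estimate at the mark (samples)
n = (0:1023).';
x = sin(2*pi*n/P) + 0.3*sin(4*pi*n/P);        % synthetic periodic test signal
m = 513;                                      % hypothetical pitch-mark sample index

prev = x(m-P : m-1);                          % one period before the mark
next = x(m   : m+P-1);                        % one period from the mark onward
rho  = (prev.' * next) / sqrt((prev.' * prev) * (next.' * next) + eps);

thresh = 0.8;                                 % assumed acceptance threshold
if rho >= thresh
    frame = [prev; next];                     % use the two periods as the frame
else
    frame = [];                               % unvoiced or bad estimate: do something else
end
```

A low `rho` can mean either that the segment is unvoiced or that the period estimate is off, so it may be worth retrying with neighboring period values before giving up on the mark.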
>You seem to be assuming that something called "pitch" exists
>at the current mark. If so, estimate it (using a localized
>pitch estimation algorithm). Verify your assumption and the
>estimate by correlating one pitch period prior and subsequent
>to the mark. Interpolate or upsample, as necessary. If the
>correlation is high enough, use that prior and subsequent data
>as the frame. If not, do something else.
>
>At least, that's one possible method.
>
>IMHO. YMMV.
>--
>rhn A.T nicholson d.0.t C-o-M
> http://www.nicholson.com/rhn/dsp.html

Thanks for the suggestions. And I take back the 'using MATLAB' part :)

I tried this in MATLAB. But as the lengths of the pitch-synchronized frames vary, how can you store the frames for further processing? My MATLAB code is here, and I hope I got the idea correctly.

samp_rate = 16000;
wav = tts('I can speak',[],-5,samp_rate);
%wavplay(wav,fs);
[f0, t, r] = spPitchTrackCorr(wav,samp_rate,20,10,[],0); %pitch detection from Naotoshi Seo
no_of_pitchmarks = length(f0);

%for i=1:no_of_pitchmarks -- Can't store the frames in a matrix
i = 1;
    pitch = f0(i);
    pit_period = 1/pitch;
    no_of_samps_per_period = floor(samp_rate*pit_period);
    sampInd_atPitMark = floor(samp_rate*t(i));
    pit_syn_frames(:,i) = wav(sampInd_atPitMark - no_of_samps_per_period:sampInd_atPitMark + no_of_samps_per_period);
%end

On Dec 11, 3:57 pm, "vjw" <jayachamaree@n_o_s_p_a_m.gmail.com> wrote:
> I tried this in MATLAB. But as the lengths of pitch-synchronized frames
> vary, how can u store the frames for further processing? My MATLAB code is
> here and hope I got the idea correctly.

Dude, storing your data in Matlab is the least of your problems. Try a Matlab cell array.

You are in way over your head if you want this thing to work for real speech.
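
A minimal sketch of the cell-array suggestion, for frames whose lengths differ from mark to mark. The signal, the pitch-mark positions, and the per-mark periods below are made-up stand-ins (not output from any particular pitch tracker), and the frame bounds follow the "mark minus one period to mark plus one period" indexing, which gives 2*period + 1 samples per frame.

```matlab
% Sketch only: variable-length pitch-synchronous frames in a cell array.
% wav, marks, and periods are hypothetical stand-ins for real analysis output.
wav     = randn(2000, 1);                     % stand-in for the speech samples
marks   = [300; 420; 550; 690];               % pitch-mark positions (sample indices)
periods = [120; 125; 135; 140];               % local period at each mark (samples)

frames = cell(numel(marks), 1);               % one cell per frame, any length
for i = 1:numel(marks)
    lo = marks(i) - periods(i);
    hi = marks(i) + periods(i);
    if lo < 1 || hi > numel(wav)
        continue;                             % frame would run off the signal; leave it empty
    end
    frames{i} = wav(lo:hi);                   % 2*periods(i) + 1 samples, mark in the middle
end

lens = cellfun(@numel, frames);               % per-frame lengths for later processing
```

Each frame is then read back as `frames{i}`, so later processing can loop over the cells without padding everything to a common length.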