# Dividing speech signal to pitch-synchronous frames

Started by December 10, 2010
```Hello,

For my application, I have to segment a speech waveform into
pitch-synchronous frames, i.e. each frame is two pitch periods long and
centered on the current pitch mark. Can anyone give me an idea of how to
approach this problem (using MATLAB)?

Thanks.

```
```On Dec 10, 12:09 pm, "vjw" <jayachamaree@n_o_s_p_a_m.gmail.com> wrote:
> Hello,
>
> For my application, I have to segment a speech waveform in to
> pitch-synchronous frames i.e. each frame is two pitch periods long and
> centered around the current pitch mark. Can anyone give me an idea how to
> approach this problem (using Matlab)?
>
> Thanks.

You mean you are looking for free MATLAB code for doing this?

Please let us know what you find.

And btw, why do you need two pitch periods? Isn't one enough?
```
```On Dec 10, 12:19 pm, fatalist <simfid...@gmail.com> wrote:
> On Dec 10, 12:09 pm, "vjw" <jayachamaree@n_o_s_p_a_m.gmail.com> wrote:
>
> > Hello,
>
> > For my application, I have to segment a speech waveform in to
> > pitch-synchronous frames i.e. each frame is two pitch periods long and
> > centered around the current pitch mark. Can anyone give me an idea how to
> > approach this problem (using Matlab)?

do you have a pitch-detection algorithm?  that's the first thing that
you need.  you need to know the period length.  if you don't, better
look into pitch detection algs (often something based on AMDF or
autocorrelation or something similar).

and then, depending on your required precision of time, you may need a
means of interpolation or resampling because the period length is not
necessarily an integer number of samples.

>
> You mean you are looking for free matlab code for doing this ?
>
> Good luck with your search

> Please let us know what you find
>
> And btw, why do you need two pitch periods ?  Isn't one enough ?

often, for use of these "grains" or little "wavelets" (a term i used
in 1995, not to be confused with any wavelet transform) or "diphones",
you need two adjacent periods and you end up ramping the first up from
zero and the second period down to zero (usually using a Hann window
or Flattened Hann or something similar, i s'pose you could use a
triangular window).

...
> > Thanks.

yer welcome, FWIW.

r b-j
```
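
The autocorrelation-plus-interpolation idea in the reply above can be sketched roughly as follows. This is a minimal illustration in plain Python, not code from the thread: the function name `estimate_period`, the 80 to 400 Hz search range, and the use of parabolic interpolation for the fractional part are all assumptions chosen for the example.

```python
import math


def estimate_period(x, fs, f_min=80.0, f_max=400.0):
    """Estimate the pitch period of x in (possibly fractional) samples.

    Picks the lag with the largest normalized autocorrelation inside
    [fs / f_max, fs / f_min], then refines it with parabolic interpolation.
    The search range is an assumed default, not a value from the thread.
    """
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(x) - 2)
    if lag_min < 2 or lag_max < lag_min:
        raise ValueError("signal too short for the requested pitch range")

    def norm_corr(lag):
        # Correlate the signal with a copy of itself delayed by `lag`.
        a = x[:len(x) - lag]
        b = x[lag:]
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        return num / den if den > 0.0 else 0.0

    # One extra lag on each side so the peak always has two neighbours.
    scores = {lag: norm_corr(lag) for lag in range(lag_min - 1, lag_max + 2)}
    best = max(range(lag_min, lag_max + 1), key=lambda lag: scores[lag])

    # A parabola through the peak and its neighbours gives a sub-sample offset.
    y0, y1, y2 = scores[best - 1], scores[best], scores[best + 1]
    curvature = y0 - 2.0 * y1 + y2
    offset = 0.5 * (y0 - y2) / curvature if curvature != 0.0 else 0.0
    return best + offset


if __name__ == "__main__":
    fs = 16000
    # Synthetic 123 Hz tone; its true period is fs / 123 samples.
    tone = [math.sin(2.0 * math.pi * 123.0 * n / fs) for n in range(1024)]
    print(estimate_period(tone, fs))
```

On real speech the search range matters: if it spans more than one octave around the true pitch, the peak at twice the period can win, so some form of tracking or range restriction is usually needed on top of this.
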
```On Dec 10, 9:09 am, "vjw" <jayachamaree@n_o_s_p_a_m.gmail.com> wrote:
> Hello,
>
> For my application, I have to segment a speech waveform in to
> pitch-synchronous frames i.e. each frame is two pitch periods long and
> centered around the current pitch mark. Can anyone give me an idea how to
> approach this problem (using Matlab)?

You seem to be assuming that something called "pitch" exists
at the current mark.  If so, estimate it (using a localized
pitch estimation algorithm).  Verify your assumption and the
estimate by correlating one pitch period prior and subsequent
to the mark.  Interpolate or upsample, as necessary.  If the
correlation is high enough, use that prior and subsequent data
as the frame.  If not, do something else.

At least, that's one possible method.

IMHO. YMMV.
--
rhn A.T nicholson d.0.t C-o-M
http://www.nicholson.com/rhn/dsp.html
```
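
The check described above (correlate the period before the mark with the period after it, and accept the frame only if they match) might look like the sketch below. The name `periods_match` and the 0.8 threshold are assumptions for illustration, and the period is taken as a whole number of samples, so any interpolation or upsampling would have to happen beforehand.

```python
import math


def periods_match(x, mark, period, threshold=0.8):
    """Return (accepted, score) for the two periods around `mark`.

    `score` is the normalized correlation between the `period` samples just
    before `mark` and the `period` samples starting at `mark`.
    The threshold is an assumed value, not one given in the thread.
    """
    start, end = mark - period, mark + period
    if period <= 0 or start < 0 or end > len(x):
        return False, 0.0  # the frame would run off the signal
    before = x[start:mark]
    after = x[mark:end]
    num = sum(p * q for p, q in zip(before, after))
    den = math.sqrt(sum(p * p for p in before) * sum(q * q for q in after))
    score = num / den if den > 0.0 else 0.0
    return score >= threshold, score


if __name__ == "__main__":
    # Synthetic signal with a period of exactly 100 samples.
    x = [math.sin(2.0 * math.pi * n / 100.0) for n in range(1000)]
    print(periods_match(x, 500, 100))  # correct period: high score
    print(periods_match(x, 500, 50))   # half the period: strongly negative score
```

What to do when the check fails ("do something else") is left open in the reply; skipping the mark or falling back to a fixed frame length are two options.
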
```
Thanks for the suggestions. And I take back the 'using MATLAB' part :)
```
```
I tried this in MATLAB. But since the lengths of the pitch-synchronous
frames vary, how can you store the frames for further processing? My MATLAB
code is below; I hope I got the idea right.

samp_rate = 16000;
wav = tts('I can speak',[],-5,samp_rate);
%wavplay(wav,fs);
% Pitch detection from Naotoshi Seo
[f0, t, r] = spPitchTrackCorr(wav,samp_rate,20,10,[],0);
no_of_pitchmarks = length(f0);

%for i=1:no_of_pitchmarks -- Can't store the frames in a matrix
i = 1;
    pitch = f0(i);
    pit_period = 1/pitch;
    no_of_samps_per_period = floor(samp_rate*pit_period);
    % Use the time of the current mark, not the whole time vector
    sampInd_atPitMark = floor(samp_rate*t(i));
    pit_syn_frames(:,i) = wav(sampInd_atPitMark - no_of_samps_per_period : ...
                              sampInd_atPitMark + no_of_samps_per_period);
%end

```
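
One way around the storage problem is to keep the frames in a ragged container instead of a matrix. The sketch below is a plain-Python illustration under assumptions, not the poster's method: the names `hann` and `pitch_synchronous_frames` are hypothetical, pitch marks and periods are assumed to be available already as whole sample counts, and the Hann taper follows the windowing suggestion made earlier in the thread.

```python
import math


def hann(n):
    """Symmetric Hann window of length n (zero at both ends for n > 1)."""
    if n < 2:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def pitch_synchronous_frames(x, marks, periods):
    """Cut one Hann-windowed frame of two pitch periods around each mark.

    marks   -- sample indices of the pitch marks
    periods -- local pitch period at each mark, in whole samples
    Returns a list of (mark, frame) pairs; frames may differ in length.
    """
    frames = []
    for mark, period in zip(marks, periods):
        # Same span as mark-period : mark+period (inclusive) in the post above.
        start, end = mark - period, mark + period + 1
        if period <= 0 or start < 0 or end > len(x):
            continue  # skip unvoiced marks and frames that run off the signal
        window = hann(end - start)
        frame = [s * w for s, w in zip(x[start:end], window)]
        frames.append((mark, frame))
    return frames


if __name__ == "__main__":
    x = [1.0] * 1000
    frames = pitch_synchronous_frames(x, [100, 300, 600], [80, 100, 120])
    for mark, frame in frames:
        print(mark, len(frame))
```

The list of frames plays the role that a cell array (for example `pit_syn_frames{i} = ...`) would play in MATLAB: each entry keeps its own length, so nothing has to be padded or truncated to fit a matrix column.
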
```On Dec 11, 3:57 pm, "vjw" <jayachamaree@n_o_s_p_a_m.gmail.com> wrote:
> I tried this in MATLAB. But as the lengths of pitch-synchronized frames
> vary, how can u store the frames for further processing? My MATLAB code is
> here and hope I got the idea correctly.
>
> samp_rate= 16000;
> wav= tts('I can speak',[],-5,samp_rate);
> %wavplay(wav,fs);
> [f0, t, r] = spPitchTrackCorr(wav,samp_rate,20,10,[],0);%pitch detection
> from Naotoshi Seo
> no_of_pitchmarks = length(f0);
>
> %for i=:no_of_pitchmarks -- Can't store the frames in a matrix
>  i=1;
>     pitch= f0(i);
>     pit_period= 1/pitch;
>     no_of_samps_per_period= floor(samp_rate*pit_period);
>     sampInd_atPitMark= floor(samp_rate*t);
>     pit_syn_frames(:,i) = wav(sampInd_atPitMark -
> no_of_samps_per_period:sampInd_atPitMark + no_of_samps_per_period);
> %end
Dude