Reply by Ilya Druker, March 9, 2005


1. -

2. For better pitch estimation it seems useful to window the signals
x and y in the following formula for the normalized correlation at
lag p:
Cp = sum_over_n( x(n)*y(p+n) ) /
     sqrt( sum_over_n(x(n)^2) * sum_over_n(y(p+n)^2) )
However, the problem is that the additional computational cost is
very high, since the signal y has to be windowed again for every lag
p. I guess that this is the reason why nobody windows the signals for
pitch estimation.

3. Spectral subtraction may help slightly, and only for colored noise.
In general the algorithm may corrupt vital features of the signal,
which may lead to a different pitch estimate.

4. -

5. There is a very robust pitch estimation algorithm that gives
special treatment to pitch doubling. See the MELP speech coder.

Hope this helps a little,
Ilya Druker




Reply by venkat ramanan, March 7, 2005

Hi,
For the voiced/unvoiced decision, energy or zero-crossing rate can be used. This is discussed in "Digital Processing of Speech Signals" by Rabiner and Schafer. A high zero-crossing rate indicates an unvoiced frame and a low rate indicates a voiced one.
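
A rough C sketch of such a decision follows, combining short-time energy and zero-crossing rate per frame. The function names and both thresholds are hypothetical; the thresholds in particular depend on signal scaling and sampling rate and have to be tuned on real data.

```c
#include <stddef.h>

/* Short-time energy: mean of the squared samples in the frame. */
double frame_energy(const double *frame, size_t n)
{
    double e = 0.0;
    for (size_t i = 0; i < n; i++)
        e += frame[i] * frame[i];
    return n ? e / (double)n : 0.0;
}

/* Zero-crossing rate: fraction of adjacent sample pairs whose sign differs. */
double frame_zcr(const double *frame, size_t n)
{
    size_t crossings = 0;
    for (size_t i = 1; i < n; i++)
        if ((frame[i - 1] >= 0.0) != (frame[i] >= 0.0))
            crossings++;
    return n > 1 ? (double)crossings / (double)(n - 1) : 0.0;
}

/* Returns 1 for "voiced" (high energy and low zero-crossing rate),
 * 0 otherwise. Both thresholds are hypothetical tuning parameters. */
int is_voiced(const double *frame, size_t n,
              double energy_thresh, double zcr_thresh)
{
    return frame_energy(frame, n) > energy_thresh &&
           frame_zcr(frame, n) < zcr_thresh;
}
```

Requiring both conditions is only one possible rule. Either measure could also be used on its own, as suggested above.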

Pitch is not a constant parameter. So you can find the pitch for each window and take the maximum value. Then fix a threshold: depending on the range in which the pitch falls, you can decide whether the speaker is male or female.
(It's just an idea.)
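
One possible way to reduce the per-frame values to a single number is sketched below in C. Note that it uses the median over voiced frames rather than the maximum suggested above; this is my own substitution, chosen because a single pitch-doubled frame would otherwise dominate the result. The function names, the convention of marking unvoiced frames with a value <= 0, and the threshold argument are all assumptions.

```c
#include <stdlib.h>
#include <stddef.h>

/* Comparison callback for qsort on doubles. */
static int cmp_double(const void *a, const void *b)
{
    double da = *(const double *)a, db = *(const double *)b;
    return (da > db) - (da < db);
}

/* Median of the per-frame pitch values (Hz) over voiced frames only.
 * Assumed convention: unvoiced frames carry a value <= 0.
 * Returns 0.0 if there is no voiced frame or if allocation fails. */
double median_pitch(const double *pitch_hz, size_t n_frames)
{
    double *voiced = malloc(n_frames * sizeof *voiced);
    size_t count = 0;
    double result = 0.0;

    if (!voiced)
        return 0.0;
    for (size_t i = 0; i < n_frames; i++)
        if (pitch_hz[i] > 0.0)
            voiced[count++] = pitch_hz[i];
    if (count > 0) {
        qsort(voiced, count, sizeof *voiced, cmp_double);
        result = (count % 2)
                     ? voiced[count / 2]
                     : 0.5 * (voiced[count / 2 - 1] + voiced[count / 2]);
    }
    free(voiced);
    return result;
}

/* Returns 1 if the pitch lies above the threshold, 0 otherwise.
 * threshold_hz is a hypothetical value to be tuned on labeled recordings. */
int pitch_above_threshold(double pitch_hz, double threshold_hz)
{
    return pitch_hz > threshold_hz;
}
```

The second function is just the fixed-threshold test from the idea above; which side of the threshold maps to which class, and where the threshold sits, has to come from your own data.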

For windowing, the Hamming window is generally used. (The window length and the amount of overlap have to be chosen.)

All the best
Venkat



Mistakes are not end of the world but repeating them is



Reply by viper4295, March 6, 2005


Hi,
I am currently doing a project on classification and
categorisation on the basis of speech. I have to sort the samples
into male or female and into age categories on the basis of the
speech samples. For this I have used pitch as a feature, and on this
I am trying to achieve the classification. I have a few problems.
It would be great if I could get some help on the following points.
1. I have done the pitch period estimation using the SIFT
algorithm. I have a few problems with the voiced/unvoiced
decision and would like some help with it.
2. I have done the processing frame-wise. Do I need to multiply by
some window, like Hanning?
3. I have also attempted speech enhancement using spectral
subtraction, but I find that the enhancement alters
the pitch of the sample.
4. My program has problems differentiating between voiced and
unvoiced frames, so it calculates the pitch inaccurately. What should
I do?
5. In frame-wise processing, how do I arrive at a single pitch
period value?
All of this coding has been done in C. Thank you.
Cheers
Swanand