I have been trying to write a pitch detector in C++ for voiced
speech. Currently I take the sampled speech, apply centre clipping to
the entire signal, and then find all of the peaks. However, I am not
sure whether I first need to split the data into frames and then
apply a window (a Hamming window, for example) to each frame. Is this necessary?
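
To make it concrete, my centre-clipping step looks roughly like this (a simplified sketch, not my exact code; the 30%-of-peak threshold is just my own arbitrary choice):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Centre clipping: samples inside [-cl, +cl] are set to zero, and samples
// outside have the clipping level subtracted, where cl is a fixed fraction
// of the peak absolute amplitude (30% here -- an arbitrary choice).
std::vector<double> centreClip(const std::vector<double>& x, double fraction = 0.3)
{
    double peak = 0.0;
    for (double s : x)
        peak = std::max(peak, std::abs(s));
    const double cl = fraction * peak;

    std::vector<double> y(x.size(), 0.0);
    for (std::size_t n = 0; n < x.size(); ++n) {
        if (x[n] > cl)
            y[n] = x[n] - cl;
        else if (x[n] < -cl)
            y[n] = x[n] + cl;
        // samples between -cl and +cl stay zero
    }
    return y;
}
```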
Everything so far is done in the time domain. Additionally, I want to
use autocorrelation to determine the pitch period, and to decide
whether a given peak corresponds to a glottal pulse. All of the
autocorrelation examples I have seen use something called a lag, but
they never explain how this lag should be computed.
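
From what I can piece together, the lag seems to be just an offset in samples, and you don't compute a single lag up front: you evaluate the autocorrelation at every candidate lag in a range and pick the lag that maximises it. Is that right? Here is a sketch of what I think is meant (the function names and the 50–500 Hz F0 range are my own assumptions):

```cpp
#include <cstddef>
#include <vector>

// Short-time autocorrelation of one frame at lag k, where k is simply an
// offset in samples: R(k) = sum over n of x[n] * x[n + k].
double autocorrAtLag(const std::vector<double>& frame, std::size_t k)
{
    double r = 0.0;
    for (std::size_t n = 0; n + k < frame.size(); ++n)
        r += frame[n] * frame[n + k];
    return r;
}

// R(k) is evaluated for every candidate lag in a range derived from the
// expected F0 range -- e.g. 50-500 Hz gives k in
// [sampleRate/500, sampleRate/50] -- and the k that maximises R(k) is
// taken as the pitch period in samples.
std::size_t estimatePitchPeriod(const std::vector<double>& frame,
                                std::size_t minLag, std::size_t maxLag)
{
    std::size_t bestLag = minLag;
    double bestR = autocorrAtLag(frame, minLag);
    for (std::size_t k = minLag + 1; k <= maxLag && k < frame.size(); ++k) {
        double r = autocorrAtLag(frame, k);
        if (r > bestR) {
            bestR = r;
            bestLag = k;
        }
    }
    return bestLag;
}
```

For example, on a pulse train with one pulse every 20 samples, the largest autocorrelation in the search range falls at lag 20, which would be the pitch period. Is this the idea, or am I missing something?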
So, is it necessary to use framing and windowing? And how should I
calculate the glottal period from the other peaks?
Thanks in advance for any advice.

Sub.