Ladies & Gentlemen
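An additional criterion for the voicing decision in MELP is peakiness, r
(the ratio of the RMS value of the LPC residual to its mean absolute
value). When r exceeds 1.4 the voicing of the first band is set to 1
(strong voicing); when r exceeds 1.6 the voicing of the second and third
bands is also set to 1.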
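
To make the rule concrete, here is a minimal sketch of the override as I
read it. The function name, the list of per-band voicing strengths and
the default thresholds are my own illustration (the thresholds are the
ones quoted above), not code taken from any MELP reference
implementation.

```python
def apply_peakiness_override(r, band_voicing, low=1.4, high=1.6):
    """Return a copy of band_voicing with the peakiness override applied.

    r            -- peakiness of the LPC residual for the current frame
    band_voicing -- per-band voicing strengths, lowest band first
                    (hypothetical representation, at least 3 bands)
    low, high    -- thresholds as quoted in this post (assumed values)
    """
    voicing = list(band_voicing)
    if r > low:
        # Peaky residual: force strong voicing in the lowest band.
        voicing[0] = 1.0
    if r > high:
        # Very peaky residual: force the second and third bands as well.
        voicing[1] = 1.0
        voicing[2] = 1.0
    return voicing


if __name__ == "__main__":
    print(apply_peakiness_override(1.5, [0.2, 0.3, 0.1, 0.0, 0.0]))
    print(apply_peakiness_override(1.7, [0.2, 0.3, 0.1, 0.0, 0.0]))
```

Note that the override only ever raises voicing strengths, so a frame
whose residual is peaky for any reason gets pushed towards voiced.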
According to the classic paper
"Selective Modeling of the LPC residual during unvoiced frames: White
noise or pulse excitation" by D. Thomson & D. Prezas
for Gaussian noise r=sqrt(pi/2)=1.2533. You can easily check that for
the sequence +1,0,-1,0,... r=sqrt(2)=1.41... (a strictly alternating
+1,-1,+1,-1,... gives r=1).
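
These values can be checked numerically. The sketch below assumes that
peakiness is defined as RMS divided by mean absolute value, which is
the definition that yields sqrt(pi/2) for Gaussian noise. The function
name, the test sequences and the sample counts are my own choices.
Under this definition a strictly alternating +1,-1 sequence gives
exactly 1, while +1,0,-1,0,... gives sqrt(2).

```python
import math
import random


def peakiness(x):
    """Ratio of RMS value to mean absolute value (assumed definition)."""
    rms = math.sqrt(sum(s * s for s in x) / len(x))
    mean_abs = sum(abs(s) for s in x) / len(x)
    return rms / mean_abs


rng = random.Random(0)  # fixed seed so the run is repeatable
gaussian = [rng.gauss(0.0, 1.0) for _ in range(200000)]
alternating = [1.0, -1.0] * 1000          # +1,-1,+1,-1,...
with_zeros = [1.0, 0.0, -1.0, 0.0] * 1000  # +1,0,-1,0,...

r_gauss = peakiness(gaussian)    # should be close to sqrt(pi/2)
r_alt = peakiness(alternating)   # RMS = 1 and mean |x| = 1
r_zeros = peakiness(with_zeros)  # RMS = sqrt(1/2), mean |x| = 1/2

if __name__ == "__main__":
    print("gaussian   :", r_gauss, "theory:", math.sqrt(math.pi / 2))
    print("+1,-1,...  :", r_alt)
    print("+1,0,-1,0..:", r_zeros, "theory:", math.sqrt(2))
```

For short frames the Gaussian estimate fluctuates around 1.2533, so
individual noise frames can land well above that figure.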
Thus it is not quite clear why MELP's developers chose these thresholds
for voiced/unvoiced discrimination: they may classify ordinary noise
as voiced, which leads to buzziness.
Does anybody know why these thresholds were chosen in MELP?
Thanks, Ilya Druker