Reply by Richard Dobson●February 19, 2011
On 19/02/2011 01:53, Raeldor wrote:
> Are there any good techniques to either remove or detect the amount of
> non-harmonic data (white noise?) in speech? I really want to remove
> or detect the 's', 'z' etc. sounds in the sample. Is this possible?
>
> Thanks
> Rael
There is a standard name for this in audio production - a "de-esser".
Googling this will provide plenty of information. The standard approach
(inasmuch as there is one) is band-specific dynamic range compression,
based on the "sss" part of the sound being (a) much higher in frequency
than the voiced parts of speech (and higher than "fff" and "th", for
example), and (b) higher in energy. A wide range of commercial and free
plugins (VST etc.) is available.
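To make the idea of band-specific compression concrete, here is a minimal offline sketch in Python with NumPy. It is an illustration under stated assumptions, not how any particular plugin works: the function name `deess`, the 5 kHz split frequency, the threshold, the ratio and the frame length are all hypothetical choices, and the brick-wall FFT band split stands in for the crossover filters a real-time de-esser would use.

```python
import numpy as np

def deess(x, fs, f_split=5000.0, threshold=0.1, ratio=4.0, frame=256):
    """Attenuate the band above f_split whenever its RMS level exceeds threshold.

    All parameter values are illustrative assumptions, not recommended settings.
    """
    x = np.asarray(x, dtype=float)
    # Split into low and high bands with a brick-wall FFT filter (offline only).
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    high = np.fft.irfft(np.where(freqs >= f_split, spectrum, 0), n=len(x))
    low = x - high

    # Frame-wise compressor gain computed from the high band only.
    gain = np.ones(len(x))
    for start in range(0, len(x), frame):
        seg = high[start:start + frame]
        rms = np.sqrt(np.mean(seg ** 2))
        if rms > threshold:
            # Above threshold, the level grows only 1/ratio as fast (in dB).
            gain[start:start + frame] = (threshold / rms) ** (1.0 - 1.0 / ratio)

    # Smooth the gain with a moving average to avoid clicks at frame edges.
    pad = frame // 2
    padded = np.pad(gain, pad, mode="edge")
    gain = np.convolve(padded, np.ones(frame) / frame, mode="valid")[:len(x)]

    # Recombine: the low band passes untouched, the high band is compressed.
    return low + gain * high
```

Because only the high band is scaled, voiced sounds below the split frequency are left alone, while loud "sss" energy is turned down rather than removed.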
I am not so sure about removing "zz" sounds, as that combines the noise
with a (pitched) voiced component; remove the noise part and the whole
phoneme has changed. It will change "lose" to "loo", and "physical" to
"fi-ickle".
Richard Dobson
Reply by maury●February 18, 2011
On Feb 18, 7:53 pm, Raeldor <rael...@gmail.com> wrote:
> Are there any good techniques to either remove or detect the amount of
> non-harmonic data (white noise?) in speech? I really want to remove
> or detect the 's', 'z' etc. sounds in the sample. Is this possible?
>
> Thanks
> Rael
"complement theorem". Take the universe, subtract what you don't want,
and what is left is what you do want. Find a way to detect/remove
harmonic data, and you will be left with non-harmonic data (harmonic
data may be easier to detect than non-harmonic data).
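One possible way to detect harmonic data, offered only as a sketch, is the peak of the normalized autocorrelation within a plausible pitch range: voiced (periodic) frames score close to 1, noise-like frames such as 's' score close to 0. The function name `harmonicity` and the 60-400 Hz pitch range are assumptions made for this illustration.

```python
import numpy as np

def harmonicity(frame, fs, f_min=60.0, f_max=400.0):
    """Return the largest normalized autocorrelation value over pitch lags.

    Values near 1 suggest a periodic (harmonic) frame, values near 0 a
    noise-like one. The pitch range is an assumed default for speech.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - np.mean(frame)
    energy = np.dot(frame, frame)
    if energy == 0.0:
        return 0.0
    # Autocorrelation for non-negative lags; ac[0] equals the frame energy.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    return float(np.max(ac[lag_min:lag_max + 1]) / energy)
```

Frames whose score falls below some threshold (which would have to be tuned on real speech) are then the candidates for non-harmonic content.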
Another approach: an adaptive line enhancer (ALE) predicts/detects
harmonic information. Instead of using the "traditional" output from
the ALE (the predicted data), use the "error" output.
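A rough sketch of the ALE idea, assuming a normalized LMS update (one common choice, not necessarily the one intended above): a linear predictor estimates each sample from delayed past samples, so the predictable (harmonic) part ends up in the prediction and the unpredictable (noise-like) part in the error. The filter order, delay and step size below are illustrative guesses.

```python
import numpy as np

def ale(x, order=32, delay=1, mu=0.1):
    """Adaptive line enhancer using NLMS (an assumed update rule).

    Returns (pred, err): pred is the usual ALE output (predictable part),
    err is the residual, i.e. the non-harmonic part of interest here.
    """
    x = np.asarray(x, dtype=float)
    w = np.zeros(order)
    pred = np.zeros(len(x))
    err = np.zeros(len(x))
    for k in range(order + delay - 1, len(x)):
        # The `order` most recent samples ending `delay` samples before x[k].
        u = x[k - delay - order + 1:k - delay + 1][::-1]
        pred[k] = w @ u
        err[k] = x[k] - pred[k]
        # Normalized LMS weight update; the small constant avoids dividing by 0.
        w += mu * err[k] * u / (u @ u + 1e-8)
    return pred, err
```

For a steady tone the error output decays toward zero once the filter converges, while for white noise it stays at roughly the input level, which is what makes it usable as a noise detector.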
Maybe these suggestions will spark something.
Reply by Raeldor●February 18, 2011
Are there any good techniques to either remove or detect the amount of
non-harmonic data (white noise?) in speech? I really want to remove
or detect the 's', 'z' etc. sounds in the sample. Is this possible?
Thanks
Rael