DSPRelated.com
Forums

Can Huffman coding be directly applied to speech samples?

Started by Jaydeep Inamdar January 16, 2004
Hello all,

Can you please tell me whether Huffman coding (or adaptive Huffman coding)
can be applied directly to speech samples?
If yes, how much does it help to reduce the bitrate? And if no, why can it
not be applied?
I am eagerly awaiting an answer (although the question may be stupid).
Thanks in advance.

regards
Jaydeep



Hi Jaydeep,

To derive the Huffman codes for a given source, you need to know the
probability distribution of the various symbols. A speech signal is highly
random: the loudness of speech can vary over about 30 dB for the same
speaker, and generally there will be more than one speaker. Hence I doubt that
it is possible to estimate an optimal model of the probability distribution
at the source, as far as speech is concerned.

Moreover, Huffman coding is a lossless coding technique. Most standard
speech codecs use codebooks, which are lossy but still give great
compression. Hence, directly representing speech samples using Huffman coding
may not be a very effective way of compressing speech.
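As a quick sanity check, here is a minimal sketch (my own toy experiment, not taken from any real codec) that builds a Huffman code over zero-centred 8-bit "speech-like" samples and measures the average code length against fixed 8-bit PCM:

```python
import collections
import heapq
import random

random.seed(0)
# Toy stand-in for 8-bit speech samples: zero-centred, peaked around 0.
samples = [max(-128, min(127, int(random.gauss(0, 30)))) for _ in range(50000)]

def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for a Huffman code built on data."""
    freq = collections.Counter(data)
    # Heap entries: (weight, tiebreaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    uid = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: d + 1 for s, d in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, uid, merged))
        uid += 1
    return heap[0][2]

lengths = huffman_code_lengths(samples)
counts = collections.Counter(samples)
avg_bits = sum(counts[s] * lengths[s] for s in counts) / len(samples)
print(f"fixed PCM: 8.00 bits/sample, Huffman: {avg_bits:.2f} bits/sample")
```

On a distribution like this the saving is typically only around a bit per sample or less -- nowhere near the compression a lossy codebook-based codec achieves.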

best regards,
Sameer.


Sameer-

> To derive the Huffman codes for a given source, you need to know the
> probability distribution of the various symbols. A speech signal is highly
> random: the loudness of speech can vary over about 30 dB for the same
> speaker, and generally there will be more than one speaker. Hence I doubt that
> it is possible to estimate an optimal model of the probability distribution
> at the source, as far as speech is concerned.
>
> Moreover, Huffman coding is a lossless coding technique. Most standard
> speech codecs use codebooks, which are lossy but still give great
> compression. Hence, directly representing speech samples using Huffman coding
> may not be a very effective way of compressing speech.

That is a great answer.

I might add that I have heard of people doing research on applying lossless
methods to the resulting bitstream; i.e., looking for patterns and consistent
probability distributions in the compressed stream. The idea is that the codec
itself removes certain variations in human speech due to speakers, loudness,
background noise, etc.

The tradeoff in these approaches seems to be latency, as a lot of bits need to
be considered, making real-time and full-duplex communication difficult. You
can think of it as "zipping" up a chunk of compressed bitstream -- and consider
how much delay that might pose in the worst case.
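To make the chunk-size/latency tradeoff concrete, here is a small sketch (my own toy numbers, using generic zlib on a made-up skewed byte stream rather than any real codec bitstream) that compresses the same data at different chunk sizes:

```python
import random
import zlib

random.seed(1)
# Toy stand-in for a codec bitstream: bytes with a skewed symbol distribution.
stream = bytes(random.choices(range(8),
                              weights=[40, 20, 10, 10, 8, 6, 4, 2],
                              k=1 << 16))

for chunk in (256, 4096, 65536):
    # Each chunk is compressed independently: that's what you could ship
    # without waiting for more data. Bigger chunks give better ratios,
    # but you must buffer (i.e., delay) that many bytes before sending.
    total = sum(len(zlib.compress(stream[i:i + chunk]))
                for i in range(0, len(stream), chunk))
    print(f"chunk={chunk:6d} bytes -> {total / len(stream):.2f}x of original size")
```

The ratio improves as the chunk grows, which is exactly the latency cost described above.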

-Jeff






Dear All,

One more point to add to this, regarding the concepts of lossy and lossless
data compression techniques.

Although lossy techniques are branded as 'information lossy', the point to note
is that, for speech, we should address compression from a psychoacoustic point
of view.

The question to ask in speech compression is: can our ears actually perceive
the "lossy"-ness? That is, is it really necessary to duplicate the original PDF
at the receiver end, or can we use loudness-perception techniques, which tell
us that the human ear works on a logarithmic scale?

Speech, as mentioned earlier, exhibits a Laplacian PDF, and techniques like
mu-law, A-law and ADPCM, among others, swear by this. These exhibit artifacts
in the error-signal noise, but if we look at MP3 coders, AC-3, etc., they
basically use quantization tables, BUT with a psychoacoustic touch: they are
lossy, but since the tables encompass filter functions catered to the human
auditory system, allowing greater quantization noise in certain frequency zones
(critical bands) than in others, the ear simply cannot tell the difference.
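To illustrate the logarithmic-loudness idea, here is a minimal sketch of the continuous mu-law companding curve (the idealized formula behind G.711-style telephony, not a bit-exact implementation of the standard):

```python
import math

MU = 255  # mu-law parameter used in North American/Japanese telephony

def mulaw_compress(x):
    """Map x in [-1, 1] onto a roughly logarithmic loudness scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Quiet samples get far more of the code range than a linear quantizer
# would give them, which matches how the ear perceives loudness.
for x in (0.01, 0.1, 1.0):
    print(f"x={x:5.2f} -> compressed {mulaw_compress(x):.3f}")
```

A quiet sample at 0.01 is mapped to roughly a fifth of full scale, so after uniform quantization of the compressed value, quiet and loud passages see similar *relative* error.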

In short, speech cannot be treated like raw binary communication data, and the
use of lossless coding such as Huffman would be overkill.

Also, as Jeff mentioned, in real cases codecs which use 1-bit oversampling
sigma-delta techniques (with noise-shaping filters) inherently condition the
signal to keep the quantization noise away from the in-band speech region. If
you were to process this signal with Huffman coding, I am not sure what would
result. The conditioned signal will, for one, have greater SNR compared to a
SAR-approximated codec...
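For anyone who wants to poke at this, a first-order 1-bit sigma-delta loop can be sketched in a few lines (a bare error-feedback model of my own, nothing like a production modulator with proper shaping filters):

```python
def sigma_delta_1bit(x):
    """First-order 1-bit sigma-delta: quantize to +/-1 and feed the error
    back, which pushes quantization noise toward high (out-of-band)
    frequencies."""
    out, acc = [], 0.0
    for s in x:
        acc += s                       # integrate the input
        y = 1.0 if acc >= 0 else -1.0  # 1-bit quantizer
        out.append(y)
        acc -= y                       # feed the quantized value back
    return out

# The local average of the 1-bit stream tracks the input level:
bits = sigma_delta_1bit([0.25] * 10000)
print(f"input 0.25 -> mean of 1-bit output: {sum(bits) / len(bits):.3f}")
```

Note that the output is a dense, strongly patterned +/-1 stream, so it is not obvious that symbol-by-symbol Huffman coding of it would buy anything; the redundancy lives in the run structure, not in the symbol frequencies.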

Anybody want to give this a shot?

Dr. Arijit (from TU/Eindhoven), it's your call, dude :)

cheers
Shree Jaisimha

---------------------------------