DSPRelated.com

why DCT

Started by Neo May 3, 2005
sandeep_mc81@yahoo.com schrieb:
> Can you explain what is meant by "derivative go to zero" or "function
> go to zero"? And when does it go to zero?

See Figure 1 on this page: http://cnx.rice.edu/content/m11092/latest/

The diagram is a bit rough, but you should see the zero slopes of the underlying cosine functions at the ends of the interval.

Regards
Guido
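A small numerical check may make the zero-slope remark concrete. This is only a sketch: it assumes the cosines meant here are cos(pi*k*x/N) on the interval [0, N], which are the continuous functions sampled by the usual DCT-II, and it compares them with sines of the same frequency (as a DST-II-style basis would use). The names `cos_slope` and `sin_slope` are made up for this illustration.

```python
import math

N = 8  # block length, as in the 8-point DCT discussed in this thread

def cos_slope(k, x):
    """Derivative of cos(pi*k*x/N), the cosine underlying DCT-II basis k."""
    return -(math.pi * k / N) * math.sin(math.pi * k * x / N)

def sin_slope(k, x):
    """Derivative of sin(pi*k*x/N), a sine of the same frequency."""
    return (math.pi * k / N) * math.cos(math.pi * k * x / N)

# Print the slopes at both ends of the interval for every frequency index.
for k in range(1, N):
    print(k,
          round(cos_slope(k, 0.0), 6), round(cos_slope(k, N), 6),
          round(sin_slope(k, 0.0), 6), round(sin_slope(k, N), 6))
```

The cosine slopes vanish at x = 0 and x = N for every k, while the sine slopes do not. That is the boundary behaviour that lets a block be mirrored at its edges without introducing a kink.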
Thanks Guido, that was illuminating, but I made a mistake while framing
the question. It should have read: why DCT for video compression, as in
MPEG? I am sorry, but does it change your answer?

Neo wrote:
> thanks guido, that was illuminating but I made a mistake while framing
> the question. it should have read why DCT for video compression as in
> mpeg. I am sorry, but does it change your answer?

Essentially not. MPEG video is coded as a sequence of frame images, which are basically encoded with the same DCT as used in JPEG. So with a proper understanding of this fundamental DCT property, the MPEG folks could make their videos more scalable. Unfortunately, as in the case of JPEG, they do not recognize this simple but basic property and pursue rather inferior approaches in actual development.

Regards
Guido
Why don't the DFT and DST provide the same scalability feature? 

Mark

Mark wrote:
> Why don't the DFT and DST provide the same scalability feature?

I think the DFT and DST provide similar features, but the DCT is real (while the DFT is complex) and has better boundary conditions for block decomposition than the DST (see the zero-slope property of the cosines at the interval ends). I think the key point of my explanation in this regard is this:

> The 8-point DCT gives you 8 linearly increasing resolution representations
> from 8 spatial sample values. You can hardly do better than that.

Other transforms could do something similar, but hardly better than that.

Regards
Guido
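One way to read the "8 resolutions from 8 samples" remark is the following sketch: keep only the first m coefficients of the 8-point DCT and run an m-point inverse DCT on them, which yields an m-sample version of the block for m = 1..8. This is an assumed interpretation, not something spelled out in the thread; the helper names (`dct`, `idct`, `rescale`) and the sample values are made up for illustration.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(s * sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n)
                           for i in range(n)))
    return out

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
                * math.cos(math.pi * k * (i + 0.5) / n) for k in range(n))
            for i in range(n)]

def rescale(x, m):
    """Represent block x with m samples using its first m DCT coefficients."""
    n = len(x)
    coeffs = dct(x)[:m]           # drop the n - m highest frequencies
    gain = math.sqrt(m / n)       # keeps the mean level unchanged
    return [gain * v for v in idct(coeffs)]

block = [10, 12, 15, 20, 26, 30, 31, 31]  # made-up sample values
for m in range(1, 9):
    print(m, [round(v, 2) for v in rescale(block, m)])
```

With m = 1 the result is just the block mean, and with m = 8 the original samples come back; the steps in between are the intermediate resolutions.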
Mark wrote:
> Why don't the DFT and DST provide the same scalability feature?

BTW, there is also the DHT (Discrete Hartley Transform, a kind of combination of DCT and DST), which has similar features. A paper by Andrew B. Watson of NASA ARC discusses these features for the DFT and DHT:

"Ideal Shrinking and Expansion of Discrete Sequences" (1986)
http://vision.arc.nasa.gov/publications/IdealShrinking.pdf

This is the first reference I found regarding such a feature for DFT-like transforms. Interestingly, it doesn't mention the DCT, which is widely used in image coding today, based on the same property.

Regards
Guido
"Neo" <zingafriend@yahoo.com> wrote in message 
news:1115110312.874275.309550@o13g2000cwo.googlegroups.com...
> Hi guys,
> I was curious to know why is it that DCT is used in image compression.
> why not FFT?

Argh. I don't like any of the answers you have so far.

The energy compaction property is useful for the lossless parts of the compression process, but the reason the DCT is used for *lossy* compression is that it separates the image into relatively important components, to which the eye is quite sensitive, and relatively unimportant components, to which the eye is relatively insensitive.

This allows the important components to be recorded accurately, with more bits, while the unimportant components can be encoded less accurately, with fewer bits. Reducing the accuracy of these unimportant components is called quantization. It is the lossy part of lossy compression.

The DCT is used instead of the DFT because it accomplishes this separation better w.r.t. the human visual response. It has basis vectors for the most important components, like overall brightness and ramps, while the DFT and DST do not.

The basic procedure of transformation and quantization, followed by lossless compression, is used in all of your favourite lossy compression schemes (image, audio, or video), but schemes for different media use different kinds of transformations, to match the response of the human sense to which they are directed. Audio compressors don't use the DCT.

--
Matt
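The transform-then-quantize procedure described above can be sketched in one dimension. This is a toy illustration under stated assumptions: a single 8-sample row instead of an 8x8 block, and a made-up step table `STEPS` that is fine for low frequencies and coarse for high ones. Real codecs use their own 2-D quantization tables, which are not reproduced here.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
                * math.cos(math.pi * k * (i + 0.5) / n) for k in range(n))
            for i in range(n)]

# Hypothetical step sizes: small steps (more bits) for the low frequencies,
# large steps (fewer bits) for the high frequencies.
STEPS = [1, 2, 4, 8, 16, 24, 32, 48]

def quantize(coeffs, steps):
    """The lossy step: round each coefficient to a multiple of its step."""
    return [round(c / s) for c, s in zip(coeffs, steps)]

def dequantize(levels, steps):
    """Map the integer levels back to approximate coefficient values."""
    return [q * s for q, s in zip(levels, steps)]

block = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up pixel row
levels = quantize(dct(block), STEPS)      # integers handed to the lossless coder
restored = idct(dequantize(levels, STEPS))
print(levels)
print([round(v, 1) for v in restored])
```

The integer `levels` are what a lossless entropy coder would then compress; the information lost is confined to the rounding in `quantize`.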
A most unique argument. I don't think it is true. Audio uses MDCT.

Regards
Piyush
<piyushkaul@gmail.com> wrote in message 
news:1115278504.831743.235190@g14g2000cwa.googlegroups.com...
>A most unique argument. I don't think it is true. Audio uses MDCT.
MDCT is different. And the argument is obviously true if it's successfully communicated ;-)

For the more mathematically inclined, I can say it this way:

Assume a linear approximation to the average human visual response to an 8x8 block: a symmetric 64x64 matrix that transforms pixel differences into perceptual differences, so that the magnitude of the transformed difference corresponds to its perceived significance.

The goal, then, is to minimize the maximum possible perceptual error for a block encoded at a given overall quantization level.

Remembering that the quantization step quantizes each coefficient independently, one way to accomplish the goal is:

1) Decompose the image block in terms of the normalized eigenvectors of the response matrix; and
2) Quantize each resulting coefficient with the same perceived difference between quantization levels.

The DCT is a fair approximation to that eigenvector decomposition.

--
Matt
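The two steps above can be tried on a toy stand-in. Everything specific here is an assumption made for illustration: a 1-D block of 8 samples instead of 8x8, and a hypothetical symmetric "response" matrix R[i][j] = RHO**|i-j| that weights smooth (low-frequency) differences most strongly. No measured visual response is used. The sketch checks how close the DCT comes to diagonalizing R, then derives step sizes that give every coefficient the same perceived step DELTA.

```python
import math

N = 8

def dct_matrix(n):
    """Rows are the orthonormal DCT-II basis vectors."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n)]
            for k in range(n)]

# Hypothetical 1-D stand-in for the perceptual response matrix (symmetric).
RHO = 0.95
R = [[RHO ** abs(i - j) for j in range(N)] for i in range(N)]

# Express R in the DCT basis: T = D R D^T. If the DCT basis vectors were
# exact eigenvectors of R, T would be diagonal.
D = dct_matrix(N)
T = [[sum(D[a][i] * R[i][j] * D[b][j] for i in range(N) for j in range(N))
      for b in range(N)] for a in range(N)]

total = sum(v * v for row in T for v in row)
diag = sum(T[k][k] ** 2 for k in range(N))
diag_fraction = diag / total   # 1.0 would mean exactly diagonal

# Step 2: treat T[k][k] as the approximate eigenvalue for coefficient k and
# choose its step so that (eigenvalue * step) equals the same DELTA for all k.
DELTA = 1.0
steps = [DELTA / T[k][k] for k in range(N)]
print(round(diag_fraction, 4))
print([round(s, 2) for s in steps])
```

For this particular R, almost all of the energy of T sits on the diagonal, and the resulting steps are small for the DC term and large for the highest frequency. Whether a real visual response matrix behaves the same way is exactly the claim under discussion, and this toy does not settle it.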
Matt Timmermans wrote:
> The energy compaction property is useful for the lossless parts of the
> compression process, but the reason the DCT is used for *lossy* compression
> is that it separates the image into relatively important components, to
> which they eye is quite sensitive, and relatively unimportant components, to
> which the eye is relatively insensitive.

Here we have another example of a rather useless explanation. "Important" and "unimportant" components are terms that are too general; they hide the true reason and don't explain why the DCT is so useful for image compression. These are just phrases, and they don't explain anything.

But this is typical of the current state of this field: the relevant people ignore and deny the true reasons, so they go around in circles and no progress is made.

Regards
Guido