# DCT in video compression

Started by July 15, 2018
```The mp3 standard uses the discrete cosine transform (DCT),
rather than the full DFT.

Doesn't that imply throwing away half the information?
Can anyone explain this?

--
Rich
```
```Am 15.07.18 um 08:17 schrieb RichD:
> The mp3 standard uses the discrete cosine transform (DCT),
> rather than the full DFT.

Yes

> Doesn't that imply throwing away half the information?

No. It implies different boundary conditions. While the mathematical
definition of the (continuous) Fourier transform operates on an infinite
signal, these discrete transforms only operate over a buffer. Therefore,
it needs to be defined, what the data "outside" this buffer looks like.
If you do DFT (or FFT, which is the same result but faster), then it is
assumed that everything repeats at the boundary from start. If you do
DCT, it is assumed that it is mirrored. Let's say your buffer contains

1,2,3,4

Then for DFT it is assumed to be

....1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,....

For DCT:

....1,2,3,4,4,3,2,1,1,2,3,4,4,3,2,1,....

When the latter is transformed, due to symmetry, only cosine terms can
be there, the sine terms give zero. When you transform the former, due
to symmetry, half of the coefficients are the complex conjugate of the
other and also redundant. Therefore, there is always the same amount of
information.

The DCT assumption has the advantage that there is can not be a jump at
the boundary.

Christian
```
```On Sat, 14 Jul 2018 23:17:27 -0700 (PDT), RichD
<r_delaney2001@yahoo.com> wrote:

>The mp3 standard uses the discrete cosine transform (DCT),
>rather than the full DFT.
>
>Doesn't that imply throwing away half the information?
>Can anyone explain this?
>
>--
>Rich

No, it's real-input, real-output.

```
```Eric Jacobsen <theman@ericjacobsen.org> wrote:

><r_delaney2001@yahoo.com> wrote:

>>The mp3 standard uses the discrete cosine transform (DCT),
>>rather than the full DFT.

>>Doesn't that imply throwing away half the information?
>>Can anyone explain this?

>No, it's real-input, real-output.

The math works.  The input is extended by reflection to create an even
signal twice as long as the original signal; this is then subjected
to a discrete cosine transform (preserving information, as the signal
is even). In this cosine transform, the bins are more closely spaced by a
factor of two, relative to the base case of taking the original
(unextended) signal and subjecting it to a discrete Fourier
transform.  So there are twice as many bins with the cosine method.

Had the original input been complex, this cosine method would still
work -- but the transformed values would also be complex.

Steve
```
```spope384@gmail.com (Steve Pope) writes:

> Eric Jacobsen <theman@ericjacobsen.org> wrote:
>
>><r_delaney2001@yahoo.com> wrote:
>
>>>The mp3 standard uses the discrete cosine transform (DCT),
>>>rather than the full DFT.
>
>>>Doesn't that imply throwing away half the information?
>>>Can anyone explain this?
>
>>No, it's real-input, real-output.
>
> The math works.  The input is extended by reflection to create an even
> signal twice as long as the original signal; this is then subjected
> to a discrete cosine transform (preserving information, as the signal
> is even).

Steve,

I don't see anything about "extension by reflection" in the definition
of the DCT here (e.g., DCT-II):

https://en.wikipedia.org/wiki/Discrete_cosine_transform#Formal_definition

To what are you referring?

I think the key concept to focus on, and which is mentioned early on in
the Wikipedia article, is that the transform can be represented by an
invertible NxN matrix, where N is the input sequence length. The fact it
is invertible is what guarantees the transform does not "throw away
information."
--
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
```
```On Monday, July 16, 2018 at 12:35:20 AM UTC-7, Randy Yates wrote:
> spope384@gmail.com (Steve Pope) writes:
...
>
> I don't see anything about "extension by reflection" in the definition
> of the DCT here (e.g., DCT-II):
>
>   https://en.wikipedia.org/wiki/Discrete_cosine_transform#Formal_definition
>
> To what are you referring?
>
...
> Randy Yates

Randy

Look for "implicit even/odd extensions of DCT input data". I think the mp3 mentioned by the OP is DCT-IV. But what do you expect from using a wikipedia as a technical reference?

Dale B. Dalrymple
```
```dbd <d.dalrymple@sbcglobal.net> writes:

> On Monday, July 16, 2018 at 12:35:20 AM UTC-7, Randy Yates wrote:
>> spope384@gmail.com (Steve Pope) writes:
>  ...
>>
>> I don't see anything about "extension by reflection" in the definition
>> of the DCT here (e.g., DCT-II):
>>
>>   https://en.wikipedia.org/wiki/Discrete_cosine_transform#Formal_definition
>>
>> To what are you referring?
>>
> ...
>> Randy Yates
>
> Randy
>
> Look for "implicit even/odd extensions of DCT input data". I think the
> mp3 mentioned by the OP is DCT-IV. But what do you expect from using a
> wikipedia as a technical reference?

Hi Dale,

It seemed to me that Steve was saying you first have to extend the input
sequence before applying the DCT. Based on the wikipedia article
DCT definitions, that is not the case.

Are the Wikipedia article definitions wrong? If not, then the quality of
the rest of the article is irrelevent as I understand the OP's question.

The OP's question seems to be: how can we represent the input sequence
with only cosines instead of both sines and cosines? Don't we lose
information? By "lose information" I take him to mean that we cannot
fully represent an arbitrary N-sample input sequence with the
transformed output sequence.

My answer is that, since the transform is invertible, you can always get
back the exact N input samples given the N transformed samples.

As far as I can tell, the discussions regarding extensions of the DCT
input data give insight on the utility of the DCT in compression and
are not relevent to the OPs "loss of information" question.
--
Randy Yates, Embedded Linux Developer
The Garner Underground, Inc.
http://www.garnerundergroundinc.com
```
```Randy Yates  <randyy@garnerundergroundinc.com> wrote:

>dbd <d.dalrymple@sbcglobal.net> writes:

>> On Monday, July 16, 2018 at 12:35:20 AM UTC-7, Randy Yates wrote:

>>> spope384@gmail.com (Steve Pope) writes:

>>> I don't see anything about "extension by reflection" in the definition
>>> of the DCT here (e.g., DCT-II):

>>>   https://en.wikipedia.org/wiki/Discrete_cosine_transform#Formal_definition

>>> To what are you referring?

>> Look for "implicit even/odd extensions of DCT input data". I think the
>> mp3 mentioned by the OP is DCT-IV. But what do you expect from using a
>> wikipedia as a technical reference?

>Hi Dale,
>
>It seemed to me that Steve was saying you first have to extend the input
>sequence before applying the DCT. Based on the wikipedia article
>DCT definitions, that is not the case.

You don't have to literally extend the data by reflection, this
is just one way of visualizing the math.

And technically, if you do not perform said extension, then
the transform you subsequently perform is not strictly speaking a DCT, it
could maybe be called a discrete half-cosine transform.

In any case, the math works, and can be shown to be the equivalent
of a DFT.

Steve
```
```spope384@gmail.com (Steve Pope) writes:

> Randy Yates  <randyy@garnerundergroundinc.com> wrote:
>
>>dbd <d.dalrymple@sbcglobal.net> writes:
>
>>> On Monday, July 16, 2018 at 12:35:20 AM UTC-7, Randy Yates wrote:
>
>>>> spope384@gmail.com (Steve Pope) writes:
>
>>>> I don't see anything about "extension by reflection" in the definition
>>>> of the DCT here (e.g., DCT-II):
>
>>>>   https://en.wikipedia.org/wiki/Discrete_cosine_transform#Formal_definition
>
>>>> To what are you referring?
>
>>> Look for "implicit even/odd extensions of DCT input data". I think the
>>> mp3 mentioned by the OP is DCT-IV. But what do you expect from using a
>>> wikipedia as a technical reference?
>
>>Hi Dale,
>>
>>It seemed to me that Steve was saying you first have to extend the input
>>sequence before applying the DCT. Based on the wikipedia article
>>DCT definitions, that is not the case.
>
> You don't have to literally extend the data by reflection, this
> is just one way of visualizing the math.
>
> And technically, if you do not perform said extension, then
> the transform you subsequently perform is not strictly speaking a DCT, it
> could maybe be called a discrete half-cosine transform.

[mallat, p.349] does use your exact terminology. I'll have to study
more on the definitions and notions of extensions.

@book{mallat,
title = "A Wavelet Tour of Signal Processing",
author = "St{\'e}phane Mallat",
edition = "second",
year = "1999"}

--
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
```
```On Saturday, July 14, 2018 at 11:46:59 PM UTC-7, Christian Gollwitzer wrote:
> Am 15.07.18 um 08:17 schrieb RichD:
> > The mp3 standard uses the discrete cosine transform (DCT),
> > rather than the full DFT.

> Yes

> > Doesn't that imply throwing away half the information?

> No. It implies different boundary conditions. While the mathematical
> definition of the (continuous) Fourier transform operates on an infinite
> signal, these discrete transforms only operate over a buffer. Therefore,
> it needs to be defined, what the data "outside" this buffer looks like.
> If you do DFT (or FFT, which is the same result but faster), then it is
> assumed that everything repeats at the boundary from start. If you do
> DCT, it is assumed that it is mirrored. Let's say your buffer contains

> 1,2,3,4

The more usual description is that the FFT has periodic boundary
conditions, DST has the function go to zero at the boundary,
and DCT has the derivative go to zero.

Putting transform blocks together, as is usually done in
image reconstruction, the boundaries are less visible with DCT.

> Then for DFT it is assumed to be

> ....1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,....

Specifically periodic boundary conditions.

> For DCT:

> ....1,2,3,4,4,3,2,1,1,2,3,4,4,3,2,1,....

The derivative is zero, such that the basis functions,
and so the result of the transform have reflection
symmetry.

> When the latter is transformed, due to symmetry, only cosine terms can
> be there, the sine terms give zero. When you transform the former, due
> to symmetry, half of the coefficients are the complex conjugate of the
> other and also redundant. Therefore, there is always the same amount of
> information.

> The DCT assumption has the advantage that there is can not be a jump at
> the boundary.

The derivative has a jump, though.

It is the same math as standing waves with shorted (DST)
or open (DCT) boundary conditions on a transmission line.

As to the OP question, there are the same number of basis
functions, but they are different functions.

In the 8 element case, the DFT has four sines and four cosines,
DST has 8 sines, and DCT has 8 cosines. (Given real input data.)

```