# Discussion Groups | Comp.DSP | Understanding a DCT

There are 8 messages in this thread.

# Understanding a DCT - wond3rboy - 2012-10-26 16:42:00

```Hello, I am sorry for this naive question but it is something that has been
nagging me. I wanted to understand the interpretation of the DCT of any
image. These are the questions I have, hope someone can answer them for
me:

1. Are DCTs always done on 8x8 pixel sets? If so, each DCT block will
consist of 64 basis image coefficients (white represents a presence and
black indicates an absence).

2. Can DCTs be done on 16x16 pixel sets of an image? If so then each DCT
coefficient block will consist of 16x16=256 coefficients as weights for its
basis images?

3. What is the effect of performing a larger or smaller size DCT?

Thank you.
```
______________________________

# Re: Understanding a DCT - glen herrmannsfeldt - 2012-10-26 17:54:00

```wond3rboy <47228@dsprelated> wrote:

> Hello, I am sorry for this naive question but it is something that has been
> nagging me. I wanted to understand the interpretation of the DCT of any
> image. These are the questions I have, hope someone can answer them for
> me:

> 1. Are DCTs always done on 8x8 pixel sets? If so, each DCT block will
> consist of 64 basis image coefficients (white represents a presence and
> black indicates an absence).

> 2. Can DCTs be done on 16x16 pixel sets of an image? If so then each DCT
> coefficient block will consist of 16x16=256 coefficients as weights for its
> basis images?

> 3. What is the effect of performing a larger or smaller size DCT?

These sound suspiciously like homework, but I will answer them anyway.

The answers won't be quite good enough to turn in, but might get you
in the right direction.

The DCT, in general, is one dimensional. Like the Fourier transform
in general, and other discrete transforms, it is separable in
rectangular coordinates. (Look up separable in any book on partial
differential equations.)

Being separable makes the computation in 2D easier.

In any case, the DCT itself can be, and often is, done on lengths
other than 8. For image processing, 8x8 is popular, but it is a
choice made by compression formats based on the DCT, not a property
of the DCT itself.

If it works right, the 8x8 squares will not be visible in the
decompressed image, but that isn't always true. Especially for MPEG,
with fast changing scenes, they are often visible.

I remember noticing it during fireworks for the 2008 Olympics,
first believing it to be a neat special effect, and then, after
a few seconds, deciding that it was the side effect of the
algorithm.

DCT is preferred over DST or DFT for image compression, as
the boundaries between squares are less visible.

My guess is that as processing power increases they will
move to larger transforms like 16x16.

Best would be to transform the whole image, but that takes too long.

-- glen
```
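
A minimal sketch of the separability point above, assuming the unnormalised DCT-II (the function names are illustrative and not from any library). It computes the 2-D transform of a block as 1-D DCTs over the rows followed by 1-D DCTs over the columns, and compares that with the double-sum definition. The 1-D routine works for any length, not only 8.

```python
import math

def dct_1d(v):
    """Unnormalised DCT-II of a sequence of any length."""
    n = len(v)
    return [sum(v[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]

def dct_2d_separable(block):
    """2-D DCT as 1-D DCTs over every row, then over every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]  # transform each column
    return [list(r) for r in zip(*cols)]          # transpose back to [k][l]

def dct_2d_direct(block):
    """Same transform written as the full double sum, for comparison."""
    h, w = len(block), len(block[0])
    return [[sum(block[y][x]
                 * math.cos(math.pi * k * (y + 0.5) / h)
                 * math.cos(math.pi * l * (x + 0.5) / w)
                 for y in range(h) for x in range(w))
             for l in range(w)]
            for k in range(h)]

def max_abs_diff(a, b):
    """Largest element-wise difference between two equally sized 2-D lists."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# Arbitrary made-up 8x8 test data, only there to exercise the functions.
block = [[(3 * y + 5 * x * x) % 17 for x in range(8)] for y in range(8)]
difference = max_abs_diff(dct_2d_separable(block), dct_2d_direct(block))
print("max difference between separable and direct:", difference)
```

The separable version needs on the order of N^3 multiplications for an NxN block, against N^4 for the direct double sum, which is the practical reason separability matters.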
______________________________

# Re: Understanding a DCT - Richard Owlett - 2012-10-26 21:57:00

```wond3rboy wrote:
> Hello, I am sorry for this naive question but it is something that has been
> nagging me. I wanted to understand the interpretation of the DCT of any
> image. These are the questions I have, hope someone can answer them for
> me:
>
> 1. Are DCTs always done on 8x8 pixel sets? If so, each DCT block will
> consist of 64 basis image coefficients (white represents a presence and
> black indicates an absence).
>
> 2. Can DCTs be done on 16x16 pixel sets of an image? If so then each DCT
> coefficient block will consist of 16x16=256 coefficients as weights for its
> basis images?
>
> 3. What is the effect of performing a larger or smaller size DCT?
>
> Thank you.
>

if all else fails
what is/are relevant definition(s) ?

```
______________________________

# Re: Understanding a DCT - Christian Gollwitzer - 2012-10-27 02:16:00

```Am 26.10.12 22:42, schrieb wond3rboy:
> 1. Are DCTs always done on 8x8 pixel sets? If so, each DCT block will
> consist of 64 basis image coefficients (white represents a presence and
> black indicates an absence).

As pointed out by glen, the DCT, like the DFT, can be done at any size, not
necessarily a power of 2. A power of 2 just leads to the fastest algorithm
in general.

> 2. Can DCTs be done on 16x16 pixel sets of an image? If so then each DCT
> coefficient block will consist of 16x16=256 coefficients as weights for its
> basis images?

8x8 is the standard size for JPEG still image compression, but others
are possible and used. For instance, H.264 (also known as MPEG-4 AVC) can
use 4x4 or 8x8. Nothing prevents you from applying basically the same
principles with a 16x16 or even 17x13 transform.

> 3. What is the effect of performing a larger or smaller size DCT?

The idea of using DCT is to detect correlations between the pixels and
reduce the blocks to a small number of coefficients. This works well
when you have a constant color or something like a smooth variation - in
essence only low-frequency components. But have a look at the higher
order basis functions for large block sizes. Unless you are taking
photographs of a zebra, where could you find a regular spaced
stripe-pattern in an arbitrarily cut block from an image?

The inverse transform does exist mathematically, so these zebra patterns
are needed to describe any possible input. When you increase the block
size, you are looking for longer correlations at the expense of
distortions at sudden changes, where the higher order basis functions
are desperately needed. Look for "Gibbs phenomenon" if you are not
familiar with that. Larger blocks do better in smooth areas, where they
can exploit the long range correlations, whereas smaller blocks are
better at edges, where locality is needed.

Christian

```
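
A rough illustration of the smooth-versus-edge point above, assuming an orthonormal 1-D DCT-II so that energy in the coefficients equals energy in the samples. It keeps only the two lowest-frequency coefficients of an 8-sample block and reports what fraction of the energy is thrown away, once for a smooth ramp and once for a sharp edge. The signals and the function names are made up for this sketch, and it does not vary the block size.

```python
import math

def dct_ortho(v):
    """Orthonormal DCT-II; works for any length, including odd ones."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
        out.append(s * math.sqrt((1.0 if k == 0 else 2.0) / n))
    return out

def discarded_energy_fraction(v, keep):
    """Fraction of signal energy lost when only the first `keep`
    (lowest-frequency) coefficients are retained."""
    coeffs = dct_ortho(v)
    total = sum(c * c for c in coeffs)
    dropped = sum(c * c for c in coeffs[keep:])
    return dropped / total

ramp = [float(i) for i in range(8)]   # smooth variation across the block
edge = [0.0] * 4 + [1.0] * 4          # sudden change in the middle

ramp_loss = discarded_energy_fraction(ramp, keep=2)
edge_loss = discarded_energy_fraction(edge, keep=2)
print("ramp: fraction of energy discarded =", ramp_loss)
print("edge: fraction of energy discarded =", edge_loss)
```

The ramp is described almost completely by two coefficients, while the edge loses a noticeably larger share, which is why the high-order basis functions are needed at sudden changes.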
______________________________

# Re: Understanding a DCT - Vladimir Vassilevsky - 2012-10-27 10:53:00

```"Christian Gollwitzer" <a...@gmx.de> wrote:

> 8x8 is the standard size for JPEG still image compression, but others are
> possible and used. For instance, H.264 or MPEG4 AVC can use 4x4 or 8x8.
> Nothing prevents you from applying basically the same principles with a
> 16x16 or even 17x13 transform

I saw a book where the author proposed a 3-dimensional DCT for moving picture
compression. He claimed a lower computational burden than traditional
motion compensation.

DSP and Mixed Signal Consultant
www.abvolt.com

```
______________________________

# Re: Understanding a DCT - wond3rboy - 2012-10-27 15:38:00

```Thank you all for your replies. This is not homework; we had a lecture about
DCTs that just went over my head. I have gone through separability and
understood that the kernels for DCT (forward and reverse) are separable.
One more question I have is how to interpret DCT outputs? I wanted to know
whether my understanding is correct and would be very thankful if you would
help me out.

When I do a DCT on a 256x256 image in 8x8 pixel blocks, I should get 1024
blocks in the DCT output. Each block will consist of 64 basis
coefficients (represented by squares of intensity from white through black)
arranged in 8 rows and 8 columns. An intensity of white means a strong
presence and an intensity of black means no presence? In the top left
corner is the DC and then the increasing frequencies in both directions (to
the sides and downward).

Thank you.
```
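
A small sketch of the block layout described above, assuming an orthonormal 2-D DCT-II with the vertical frequency as the row index and the horizontal frequency as the column index (the function name and test blocks are made up for illustration). It checks the block count for a 256x256 image, shows that a flat block has only the top-left (DC) coefficient, and shows that coefficients are signed. Because of the sign, a white-to-black picture of a coefficient block is usually a plot of magnitudes, or of values with an offset added.

```python
import math

def dct2_block(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows.
    Result is indexed [vertical frequency][horizontal frequency]."""
    n = len(block)
    scale = lambda k: math.sqrt((1.0 if k == 0 else 2.0) / n)
    return [[scale(k) * scale(l) * sum(
                 block[y][x]
                 * math.cos(math.pi * k * (y + 0.5) / n)
                 * math.cos(math.pi * l * (x + 0.5) / n)
                 for y in range(n) for x in range(n))
             for l in range(n)]
            for k in range(n)]

# Sizes taken from the question: 256x256 image cut into 8x8 blocks.
IMAGE_SIDE, BLOCK_SIDE = 256, 8
n_blocks = (IMAGE_SIDE // BLOCK_SIDE) ** 2    # 32 blocks across, 32 down
coeffs_per_block = BLOCK_SIDE * BLOCK_SIDE    # one weight per basis image

# A flat grey block: everything ends up in the top-left (DC) coefficient.
flat = [[100.0] * 8 for _ in range(8)]
flat_coeffs = dct2_block(flat)

# Vertical stripes that vary only along x: a single coefficient in row 0.
stripes = [[math.cos(math.pi * 3 * (x + 0.5) / 8) for x in range(8)]
           for _ in range(8)]
stripe_coeffs = dct2_block(stripes)

# The same stripes with inverted contrast give the same coefficient, negated.
inverted_coeffs = dct2_block([[-v for v in row] for row in stripes])

print(n_blocks, "blocks of", coeffs_per_block, "coefficients")
print("flat block DC:", flat_coeffs[0][0])
print("stripes [0][3]:", stripe_coeffs[0][3], "inverted:", inverted_coeffs[0][3])
```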
______________________________

# Re: Understanding a DCT - glen herrmannsfeldt - 2012-10-27 23:00:00

```wond3rboy <47228@dsprelated> wrote:

(snip)

> When I do a DCT on a 256x256 image in 8x8 pixel blocks, I should get 1024
> blocks in the DCT output. Each block will consist of 64 basis
> coefficients (represented by squares of intensity from white through black)
> arranged in 8 rows and 8 columns. An intensity of white means a strong
> presence and an intensity of black means no presence? In the top left
> corner is the DC and then the increasing frequencies in both directions (to
> the sides and downward).

For DCT, there is a choice for each end of having the boundary on or
half way between sample points. Otherwise,

f(x)=sum A(k)cos(k x pi/8) and g(y)=sum B(l)cos(l y pi/8)

where the appropriate x, y, k, and l, depend on the boundary
conditions.

Then h(x,y)=sum C(k,l)cos(k x pi/8)cos(l y pi/8)

The x's, y's, k's and l's are either integers or odd
half integers.

-- glen
```
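
As one concrete instance of the formulas above, assume the boundary at both ends lies half way between sample points, which gives the DCT-II used by JPEG. The sample positions are then the odd half integers x = n + 1/2 and y = m + 1/2, and the frequencies k and l are integers. The weights a(k) below are one possible normalisation, chosen here so that the first line reproduces h exactly; other texts distribute the factors differently.

```latex
\begin{align}
h\!\left(n+\tfrac12,\; m+\tfrac12\right)
  &= \sum_{k=0}^{7}\sum_{l=0}^{7} C(k,l)\,
     \cos\!\left(\frac{k\,(n+\tfrac12)\,\pi}{8}\right)
     \cos\!\left(\frac{l\,(m+\tfrac12)\,\pi}{8}\right),
  \qquad n, m = 0,\dots,7 \\
C(k,l)
  &= a(k)\,a(l) \sum_{n=0}^{7}\sum_{m=0}^{7}
     h\!\left(n+\tfrac12,\; m+\tfrac12\right)
     \cos\!\left(\frac{k\,(n+\tfrac12)\,\pi}{8}\right)
     \cos\!\left(\frac{l\,(m+\tfrac12)\,\pi}{8}\right),
  \qquad a(0)=\tfrac18,\quad a(k)=\tfrac14 \ \text{for } k>0
\end{align}
```

C(0,0) is the DC term, and k and l count the number of half cycles across the block in each direction.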
______________________________

# Re: Understanding a DCT - wond3rboy - 2012-10-29 17:29:00

```>wond3rboy <47228@dsprelated> wrote:
>
>(snip)
>
>> When I do a DCT on a 256x256 image in 8x8 pixel blocks, I should get 1024
>> blocks in the DCT output. Each block will consist of 64 basis
>> coefficients (represented by squares of intensity from white through black)
>> arranged in 8 rows and 8 columns. An intensity of white means a strong
>> presence and an intensity of black means no presence? In the top left
>> corner is the DC and then the increasing frequencies in both directions (to
>> the sides and downward).
>
>For DCT, there is a choice for each end of having the boundary on or
>half way between sample points. Otherwise,
>
>f(x)=sum A(k)cos(k x pi/8) and g(y)=sum B(l)cos(l y pi/8)
>
>where the appropriate x, y, k, and l, depend on the boundary
>conditions.
>
>Then h(x,y)=sum C(k,l)cos(k x pi/8)cos(l y pi/8)
>
>The x's, y's, k's and l's are either integers or odd
>half integers.
>
>-- glen
>

Thank you very much!
```
______________________________