Forums

How to do Chroma Decimation

Started by Unknown October 19, 2005
Hi,

I am a university student working on a real-time video conferencing
project.  I'm trying to find a compression solution that will be quick
enough to give me the throughput I need, but will also give me the
compression ratio I need.  Each of my video frames are 320x240 pixels
and each pixel has 3 bytes associated with it (R,G,B).  For further
information, see a post that I made to comp.compression below.

A poster on comp.compression suggested that I consider using chroma
decimation as a way to reduce my input size prior to compression:

>Incidentally, if the raw image is RGB, you can compress it down >to 37.5% of its original size just by 4:1:0 chroma decimation. But >from your data rate of 3375 Kb/s and frame rate of 15 fps, I'm >guessing 160x120 luma and 80x60 chroma. In that case you could >still decimate it to 75% of its original size without damaging the >picture quality too much.
First of all, given the metrics above, do you agree with this person's assessment? Secondly, I don't have any experience with the subject of chroma decimation, and I was wondering if anyone knew of some tutorial which would explain the concept and discuss how to implement it. Thanks so much for your help. ------------------- Original post to comp.compression: Hi, I am a university student in a Real-Time Embedded Systems class.A partner and I are working on developing a real-time video conferencing system using 2 Pentium systems (yes, that's original Pentium) running the VxWorks real-time operating system. I recently spent some time investigating compression algorithm choices, and I decided to use arithmetic compression, as it seemed to provide a good balance between compression ratio and complexity of coding. To elaborate, I tested a couple of codecs I found at http://datacompression.info on a test image and found that the result was about 35%-37% smaller than the original. Also, being that this is a semster course and it's mid-October, I have very limited time to implement my system, which precludes choices like JPEG and MPEG. My problem is this: I tested one of the arithmetic coders that I downloaded and its throughput was only 1950.51 Kb/s *on my Pentium 4*. If my target system was a Pentium 4, this would only get me a frame rate of 8.67 frames/second, as compared to my target of 15 frames/second. Being that the code will run on an original Pentium, it will be completely out of the question. So I am looking for an algorithm that meets the following specifications: 1) Can't be too complex to code--something on the level of arithmetic coding or Huffman coding is okay, JPEG and MPEG are out. 2) Algorithm needs to be inherently fast--must achieve 3375 Kb/s or higher throughput on an original Pentium. 3) Must compress input to 35% or less of its original size. Another thing to keep in mind is that given the time constraints, the quality of the code I produce is bound to be less than optimal. My professor has said that students have achieved decent frame rates (barring fast moving images) with run-length encoding plus difference images. Perhaps the algorithm would have run much faster if I had been compressing an image with many contiguous zeroes as a result of differencing. Would it run fast enough? I would appreciate any thoughts on the best choice to meet my requirements. I would really appreciate any references on run-length encoding that would provide background and implementation tips. Also, is there something I'm missing about arithmetic coding? It seemed like such a good algorithm, but in practice, at least with the codec I tried, http://www.cs.mu.oz.au/~alistair/arith_coder/), it was so slow.
<posting0@yahoo.com> wrote in message
news:1129758646.349361.21830@o13g2000cwo.googlegroups.com...
> Hi, > > I am a university student working on a real-time video conferencing > project. I'm trying to find a compression solution that will be quick > enough to give me the throughput I need, but will also give me the > compression ratio I need. Each of my video frames are 320x240 pixels > and each pixel has 3 bytes associated with it (R,G,B). For further > information, see a post that I made to comp.compression below. > > A poster on comp.compression suggested that I consider using chroma > decimation as a way to reduce my input size prior to compression: > > >Incidentally, if the raw image is RGB, you can compress it down > >to 37.5% of its original size just by 4:1:0 chroma decimation. But > >from your data rate of 3375 Kb/s and frame rate of 15 fps, I'm > >guessing 160x120 luma and 80x60 chroma. In that case you could > >still decimate it to 75% of its original size without damaging the > >picture quality too much. > > First of all, given the metrics above, do you agree with this person's > assessment? > > Secondly, I don't have any experience with the subject of chroma > decimation, and I was wondering if anyone knew of some tutorial which > would explain the concept and discuss how to implement it. > > Thanks so much for your help. >
Have a look at http://en.wikipedia.org/wiki/Chroma_Subsampling Basically, you just convert RGB to YUV (brightness[Y] and colour[UV]) and then store the colour information at a lower resolution.
> ------------------- > Original post to comp.compression: > > Hi, > > > I am a university student in a Real-Time Embedded Systems class.A > partner and I are working on developing a real-time video conferencing > system using 2 Pentium systems (yes, that's original Pentium) running > the VxWorks real-time operating system. > > I recently spent some time investigating compression algorithm choices, > and I decided to use arithmetic compression, as it seemed to provide a > good balance between compression ratio and complexity of coding. To > elaborate, I tested a couple of codecs I found at > http://datacompression.info on a test image and found that the result > was about 35%-37% smaller than the original. Also, being that this is > a semster course and it's mid-October, I have very limited time to > implement my system, which precludes choices like JPEG and MPEG. > > My problem is this: > > I tested one of the arithmetic coders that I downloaded and its > throughput was only 1950.51 Kb/s *on my Pentium 4*. If my target > system was a Pentium 4, this would only get me a frame rate of 8.67 > frames/second, as compared to my target of 15 frames/second. Being > that the code will run on an original Pentium, it will be completely > out of the question. > > So I am looking for an algorithm that meets the following > specifications: 1) Can't be too complex to code--something on the level > of arithmetic coding or Huffman coding is okay, JPEG and MPEG are out. > > 2) Algorithm needs to be inherently fast--must achieve 3375 Kb/s or > higher throughput on an original Pentium. > > 3) Must compress input to 35% or less of its original size. > > Another thing to keep in mind is that given the time constraints, the > quality of the code I produce is bound to be less than optimal. > > My professor has said that students have achieved decent frame rates > (barring fast moving images) with run-length encoding plus difference > images. > > Perhaps the algorithm would have run much faster if I had been > compressing an image with many contiguous zeroes as a result of > differencing. Would it run fast enough? > > I would appreciate any thoughts on the best choice to meet my > requirements. > > I would really appreciate any references on run-length encoding that > would provide background and implementation tips. > > Also, is there something I'm missing about arithmetic coding? It > seemed like such a good algorithm, but in practice, at least with the > codec I tried, http://www.cs.mu.oz.au/~alistair/arith_coder/), it was > so slow. >
Hello,

First of all according to your assignment you really do not have to use
lossless compression method. 
Arithmetic encoding by itself is a lossless process.
You could use lossy compression, since it will result in much higher
sompreaaion ratios.

In general a lossy compression consist of three parts. 
In the first part you transform your pixel image data space into a space
where energies tend to concentrate in one area of the space. For example
DCT, DWT, etc.
Second you perform a quantization step in which you throw away the
information that has little affect on the reconstructed image.
In third step you further compress this quantized information into a final
bit-stream. For this you could use arithmetic encoding.

The beauty of this is that in each step you are dealing with less and less
information.

If you don't want to perform any tranform on your original data, here is
what I recommend:

1. Convert your 320x240 RGB color space into YUV 4:2:0 space. Where Y is
320x240 and U, V are 160x120.
2. Split this data into 8x8 blocks. (40x30 blocks for Y, 20x15 blocks for
U, V).
3. For all Y, U, V components and for every block find a difference with
the previous frame.
4. Setup a threshold value for Y, U, V and if a max absolute value of a
difference of corresponding block is lower than that do not transmit that
block. 
4a. Optionally quantize the block difference. 
5. Encode the Y, U, V blocks difference data using arithmetic encoder. 
6. Keep the track of the blocks that you skipped in previous steps and
every so often send them, to avoid error accumulation over a long period
of time. 
7. Assuming that not a lot of things move in your video conferencing setup
and with the right choice of threashold values, this method will give you
good compression ratios.

-Alex

>Hi, > >I am a university student working on a real-time video conferencing >project. I'm trying to find a compression solution that will be quick >enough to give me the throughput I need, but will also give me the >compression ratio I need. Each of my video frames are 320x240 pixels >and each pixel has 3 bytes associated with it (R,G,B). For further >information, see a post that I made to comp.compression below. > >A poster on comp.compression suggested that I consider using chroma >decimation as a way to reduce my input size prior to compression: > >>Incidentally, if the raw image is RGB, you can compress it down >>to 37.5% of its original size just by 4:1:0 chroma decimation. But >>from your data rate of 3375 Kb/s and frame rate of 15 fps, I'm >>guessing 160x120 luma and 80x60 chroma. In that case you could >>still decimate it to 75% of its original size without damaging the >>picture quality too much. > >First of all, given the metrics above, do you agree with this person's >assessment? > >Secondly, I don't have any experience with the subject of chroma >decimation, and I was wondering if anyone knew of some tutorial which >would explain the concept and discuss how to implement it. > >Thanks so much for your help. > >------------------- >Original post to comp.compression: > >Hi, > > >I am a university student in a Real-Time Embedded Systems class.A >partner and I are working on developing a real-time video conferencing >system using 2 Pentium systems (yes, that's original Pentium) running >the VxWorks real-time operating system. > >I recently spent some time investigating compression algorithm choices, >and I decided to use arithmetic compression, as it seemed to provide a >good balance between compression ratio and complexity of coding. To >elaborate, I tested a couple of codecs I found at >http://datacompression.info on a test image and found that the result >was about 35%-37% smaller than the original. Also, being that this is >a semster course and it's mid-October, I have very limited time to >implement my system, which precludes choices like JPEG and MPEG. > >My problem is this: > >I tested one of the arithmetic coders that I downloaded and its >throughput was only 1950.51 Kb/s *on my Pentium 4*. If my target >system was a Pentium 4, this would only get me a frame rate of 8.67 >frames/second, as compared to my target of 15 frames/second. Being >that the code will run on an original Pentium, it will be completely >out of the question. > >So I am looking for an algorithm that meets the following >specifications: 1) Can't be too complex to code--something on the level >of arithmetic coding or Huffman coding is okay, JPEG and MPEG are out. > >2) Algorithm needs to be inherently fast--must achieve 3375 Kb/s or >higher throughput on an original Pentium. > >3) Must compress input to 35% or less of its original size. > >Another thing to keep in mind is that given the time constraints, the >quality of the code I produce is bound to be less than optimal. > >My professor has said that students have achieved decent frame rates >(barring fast moving images) with run-length encoding plus difference >images. > >Perhaps the algorithm would have run much faster if I had been >compressing an image with many contiguous zeroes as a result of >differencing. Would it run fast enough? > >I would appreciate any thoughts on the best choice to meet my >requirements. > >I would really appreciate any references on run-length encoding that >would provide background and implementation tips. > >Also, is there something I'm missing about arithmetic coding? It >seemed like such a good algorithm, but in practice, at least with the >codec I tried, http://www.cs.mu.oz.au/~alistair/arith_coder/), it was >so slow. > >
Hello,

First of all according to your assignment you really do not have to use
lossless compression method. 
Arithmetic encoding by itself is a lossless process.
You could use lossy compression, since it will result in much higher
sompreaaion ratios.

In general a lossy compression consist of three parts. 
In the first part you transform your pixel image data space into a space
where energies tend to concentrate in one area of the space. For example
DCT, DWT, etc.
Second you perform a quantization step in which you throw away the
information that has little affect on the reconstructed image.
In third step you further compress this quantized information into a final
bit-stream. For this you could use arithmetic encoding.

The beauty of this is that in each step you are dealing with less and less
information.

If you don't want to perform any tranform on your original data, here is
what I recommend:

1. Convert your 320x240 RGB color space into YUV 4:2:0 space. Where Y is
320x240 and U, V are 160x120.
2. Split this data into 8x8 blocks. (40x30 blocks for Y, 20x15 blocks for
U, V).
3. For all Y, U, V components and for every block find a difference with
the previous frame.
4. Setup a threshold value for Y, U, V and if a max absolute value of a
difference of corresponding block is lower than that do not transmit that
block. 
4a. Optionally quantize the block difference. 
5. Encode the Y, U, V blocks difference data using arithmetic encoder. 
6. Keep the track of the blocks that you skipped in previous steps and
every so often send them, to avoid error accumulation over a long period
of time. 
7. Assuming that not a lot of things move in your video conferencing setup
and with the right choice of threashold values, this method will give you
good compression ratios.

-Alex

>Hi, > >I am a university student working on a real-time video conferencing >project. I'm trying to find a compression solution that will be quick >enough to give me the throughput I need, but will also give me the >compression ratio I need. Each of my video frames are 320x240 pixels >and each pixel has 3 bytes associated with it (R,G,B). For further >information, see a post that I made to comp.compression below. > >A poster on comp.compression suggested that I consider using chroma >decimation as a way to reduce my input size prior to compression: > >>Incidentally, if the raw image is RGB, you can compress it down >>to 37.5% of its original size just by 4:1:0 chroma decimation. But >>from your data rate of 3375 Kb/s and frame rate of 15 fps, I'm >>guessing 160x120 luma and 80x60 chroma. In that case you could >>still decimate it to 75% of its original size without damaging the >>picture quality too much. > >First of all, given the metrics above, do you agree with this person's >assessment? > >Secondly, I don't have any experience with the subject of chroma >decimation, and I was wondering if anyone knew of some tutorial which >would explain the concept and discuss how to implement it. > >Thanks so much for your help. > >------------------- >Original post to comp.compression: > >Hi, > > >I am a university student in a Real-Time Embedded Systems class.A >partner and I are working on developing a real-time video conferencing >system using 2 Pentium systems (yes, that's original Pentium) running >the VxWorks real-time operating system. > >I recently spent some time investigating compression algorithm choices, >and I decided to use arithmetic compression, as it seemed to provide a >good balance between compression ratio and complexity of coding. To >elaborate, I tested a couple of codecs I found at >http://datacompression.info on a test image and found that the result >was about 35%-37% smaller than the original. Also, being that this is >a semster course and it's mid-October, I have very limited time to >implement my system, which precludes choices like JPEG and MPEG. > >My problem is this: > >I tested one of the arithmetic coders that I downloaded and its >throughput was only 1950.51 Kb/s *on my Pentium 4*. If my target >system was a Pentium 4, this would only get me a frame rate of 8.67 >frames/second, as compared to my target of 15 frames/second. Being >that the code will run on an original Pentium, it will be completely >out of the question. > >So I am looking for an algorithm that meets the following >specifications: 1) Can't be too complex to code--something on the level >of arithmetic coding or Huffman coding is okay, JPEG and MPEG are out. > >2) Algorithm needs to be inherently fast--must achieve 3375 Kb/s or >higher throughput on an original Pentium. > >3) Must compress input to 35% or less of its original size. > >Another thing to keep in mind is that given the time constraints, the >quality of the code I produce is bound to be less than optimal. > >My professor has said that students have achieved decent frame rates >(barring fast moving images) with run-length encoding plus difference >images. > >Perhaps the algorithm would have run much faster if I had been >compressing an image with many contiguous zeroes as a result of >differencing. Would it run fast enough? > >I would appreciate any thoughts on the best choice to meet my >requirements. > >I would really appreciate any references on run-length encoding that >would provide background and implementation tips. > >Also, is there something I'm missing about arithmetic coding? It >seemed like such a good algorithm, but in practice, at least with the >codec I tried, http://www.cs.mu.oz.au/~alistair/arith_coder/), it was >so slow. > >