Hi all, I have a vector of real numbers in Matlab. How do I compress them? Of course this has to be lossless, since I need to be able to recover them. The goal is to study the Shannon rate and entropy of these real numbers, so I decided to compress them and see how much of a compression ratio I can get. I don't need to write the result into compressed files, so headers, etc. are just overhead for me that skews my entropy calculation. I just need a bare version of the compression ratio. Any pointers? Thanks a lot!
How do I compress an array of floating numbers in Matlab?
Started by ●April 2, 2010
Reply by ●April 2, 2010
On Apr 2, 3:50 pm, Luna Moon <lunamoonm...@gmail.com> wrote:
> Hi all,
>
> I have a vector of real numbers in Matlab. How do I compress them? Of
> course this has to be lossless, since I need to be able to recover
> them.
>
> The goal is to study the Shannon rate and entropy of these real
> numbers, so I decide to compress them and see how much compression
> ratio I can have.
>
> I don't need to write the result into compressed files, so those
> headers, etc. are just overhead for me which affect me calculating the
> Entropy... so I just need a bare version of the compress ratio...
>
> Any pointers?
>
> Thanks a lot!

Consider the array of numbers in binary form. Rearrange the bits so all the ones are sequential, and do the same for the zeros. The number of ones followed by the number of zeros is your compressed file.

John
Reply by ●April 2, 2010
John wrote:
>On Apr 2, 3:50 pm, Luna Moon <lunamoonm...@gmail.com> wrote:
>> Hi all,
>>
>> I have a vector of real numbers in Matlab. How do I compress them? Of
>> course this has to be lossless, since I need to be able to recover
>> them.
>>
>> The goal is to study the Shannon rate and entropy of these real
>> numbers, so I decide to compress them and see how much compression
>> ratio I can have.
>>
>> I don't need to write the result into compressed files, so those
>> headers, etc. are just overhead for me which affect me calculating the
>> Entropy... so I just need a bare version of the compress ratio...
>>
>> Any pointers?
>>
>> Thanks a lot!
>
>Consider the array of numbers in binary form. Rearrange the bits so
>all the ones are sequential, and do the same for the zeros. The number
>of ones followed by the number of zeros is your compressed file.

That's hardly optimal (effectively Run-Length Encoding (RLE)), and will, in general, result in a falsely high estimate of "information content". How many PCX images do you see floating around?
Reply by ●April 2, 2010
Michael wrote:
>John wrote:
>>On Apr 2, 3:50 pm, Luna Moon <lunamoonm...@gmail.com> wrote:
>>> Hi all,
>>>
>>> I have a vector of real numbers in Matlab. How do I compress them? Of
>>> course this has to be lossless, since I need to be able to recover
>>> them.
>>>
>>> The goal is to study the Shannon rate and entropy of these real
>>> numbers, so I decide to compress them and see how much compression
>>> ratio I can have.
>>>
>>> I don't need to write the result into compressed files, so those
>>> headers, etc. are just overhead for me which affect me calculating the
>>> Entropy... so I just need a bare version of the compress ratio...
>>>
>>> Any pointers?
>>>
>>> Thanks a lot!
>>
>>Consider the array of numbers in binary form. Rearrange the bits so
>>all the ones are sequential, and do the same for the zeros. The number
>>of ones followed by the number of zeros is your compressed file.
>
>That's hardly optimal (effectively Run-Length Encoding (RLE)), and will, in
>general, result in a falsely high estimate of "information content". How
>many PCX images do you see floating around?
>

Sorry, I should have said "it's throwing away information, and then RLE". So it's going to give nonsense.
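To make the "throwing away information" point concrete, here is a minimal sketch in Python (standing in for Matlab, since the thread shows no code). The helper name `count_bits` is made up for illustration, and the sketch assumes the values are stored as 64-bit IEEE 754 doubles. Two different inputs produce the same pair of counts, so the original array cannot be recovered from them.

```python
import struct


def count_bits(values):
    """Return (number of one bits, number of zero bits) for a list of doubles.

    This is the "compressed file" of the bit-sorting scheme: only the two
    counts survive, not the positions of the bits.
    """
    raw = struct.pack(f"<{len(values)}d", *values)
    ones = sum(bin(byte).count("1") for byte in raw)
    return ones, 8 * len(raw) - ones


# 2.0 is 0x4000000000000000 and -0.0 is 0x8000000000000000:
# one set bit each, in different positions.
print(count_bits([2.0]))   # (1, 63)
print(count_bits([-0.0]))  # (1, 63)

# Reordering the array also leaves the counts unchanged.
print(count_bits([1.0, 2.0]) == count_bits([2.0, 1.0]))  # True
```

Any mapping that sends distinct inputs to the same output is lossy, so the size of its output says nothing useful about the entropy of the data.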
Reply by ●April 3, 2010
On Apr 2, 3:50 pm, Luna Moon <lunamoonm...@gmail.com> wrote:
> Hi all,
>
> I have a vector of real numbers in Matlab. How do I compress them? Of
> course this has to be lossless, since I need to be able to recover
> them.
>
> The goal is to study the Shannon rate and entropy of these real
> numbers, so I decide to compress them and see how much compression
> ratio I can have.
>
> I don't need to write the result into compressed files, so those
> headers, etc. are just overhead for me which affect me calculating the
> Entropy... so I just need a bare version of the compress ratio...
>
> Any pointers?
>

do you know about Huffman coding? it's in Wikipedia.

if the floating-point numbers are sorta random, not derived from a "normal-looking" signal, there is not much you can do to compress. if the range of the numbers is limited (at least probabilistically) then Huffman coding might help a little. but i tend to think that it would be only the exponent bits that would be compressible and there is not much to gain, since the exponent bits are a small portion of the floating-point word. the mantissa bits will look pretty random, and there is not much a lossless scheme can do about that.

if the signal is reasonably bandlimited, you can use LPC (linear predictive coding): predict the next sample (from the previous N samples), and encode the *difference* between the predicted value and what you really have. if the prediction is good, the difference should be small and the number of bits needed to represent it should be small (and you might Huffman code those).

i know for audio, lossless compression doesn't gain a lot of saving of space. it might save maybe 50%.

> Thanks a lot!

FWIW,

r b-j
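Both ideas in this reply can be sketched in a few lines of Python (used here in place of Matlab; only the standard library is needed). Everything below is an illustrative assumption rather than anything from the thread: the function names are hypothetical, the "entropy" is a simple zeroth-order estimate from symbol counts, the test data is Gaussian noise and an integer-valued sine, and a fixed second-order predictor stands in for a properly fitted LPC.

```python
import math
import random
import struct
from collections import Counter


def empirical_entropy(symbols):
    """Zeroth-order empirical entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def byte_plane_entropy(values):
    """Entropy of each of the 8 bytes of big-endian IEEE 754 doubles.

    Byte 0 holds the sign and the top 7 exponent bits; bytes 2..7 hold
    only mantissa bits.
    """
    packed = [struct.pack(">d", v) for v in values]
    return [empirical_entropy([p[i] for p in packed]) for i in range(8)]


def predict_residual(samples):
    """Residual of the fixed predictor 2*x[n-1] - x[n-2] on integer samples."""
    residual = list(samples[:2])  # the first two samples are kept as-is
    for n in range(2, len(samples)):
        residual.append(samples[n] - (2 * samples[n - 1] - samples[n - 2]))
    return residual


def reconstruct(residual):
    """Exactly invert predict_residual, so the scheme stays lossless."""
    out = list(residual[:2])
    for n in range(2, len(residual)):
        out.append(residual[n] + 2 * out[n - 1] - out[n - 2])
    return out


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(20000)]
planes = byte_plane_entropy(noise)
print("bits per byte, exponent end to mantissa end:",
      [round(h, 2) for h in planes])

tone = [round(1000 * math.sin(2 * math.pi * n / 200)) for n in range(2000)]
residual = predict_residual(tone)
print("raw entropy:", round(empirical_entropy(tone), 2),
      "residual entropy:", round(empirical_entropy(residual), 2))
```

In this setup the leading byte (sign and exponent) carries far fewer bits than the trailing mantissa bytes, and the prediction residual of the slowly varying tone needs fewer bits per sample than the raw samples. The predictor works on integers so that the reconstruction is exact; differencing floating-point values directly would generally introduce rounding and break losslessness.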
Reply by ●April 3, 2010
Luna Moon wrote:
> Hi all,
>
> I have a vector of real numbers in Matlab. How do I compress them? Of
> course this has to be lossless, since I need to be able to recover
> them.
>
> The goal is to study the Shannon rate and entropy of these real
> numbers, so I decide to compress them and see how much compression
> ratio I can have.
>
> I don't need to write the result into compressed files, so those
> headers, etc. are just overhead for me which affect me calculating the
> Entropy... so I just need a bare version of the compress ratio...
>
> Any pointers?

Find another approach to getting an answer, maybe.

First, most lossless compression algorithms are designed for things like text, executables, and databases -- they don't do well with floating point numbers, tending to see them as "random" even when they're not.

Second, if you measure a bunch of meaningless white noise and put the result into floating point numbers, then put them into a lossless algorithm that _can_ handle floating point, it's not going to compress at all, because the algorithm can't distinguish between white noise and a signal that's chock-full of information. In effect you'll have _given_ it a signal full of information, in great detail, about the noise.

I think you're leading yourself down the garden path.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
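A quick way to try this out, and to get the header-free ratio the original post asks for, is a raw DEFLATE stream. The sketch below uses Python's `zlib` module instead of Matlab; passing a negative `wbits` value asks zlib for a stream with no header or checksum. The function names and the test signals (Gaussian noise, a sine, a constant) are assumptions chosen for illustration, and DEFLATE is just one general-purpose compressor, not a measure of true entropy.

```python
import math
import random
import struct
import zlib


def deflate_raw(values):
    """Pack doubles as bytes and compress them with headerless DEFLATE."""
    raw = struct.pack(f"<{len(values)}d", *values)
    # wbits = -15 selects a raw stream: no zlib header, no trailing checksum.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    packed = compressor.compress(raw) + compressor.flush()
    return raw, packed


def bare_ratio(values):
    """Compressed size divided by original size (smaller is better)."""
    raw, packed = deflate_raw(values)
    # Check that the round trip really is lossless.
    assert zlib.decompress(packed, -15) == raw
    return len(packed) / len(raw)


random.seed(1)
n = 5000
white_noise = [random.gauss(0.0, 1.0) for _ in range(n)]
sine = [math.sin(2 * math.pi * k / 50) for k in range(n)]
constant = [1.0] * n

for name, data in [("white noise", white_noise),
                   ("sine", sine),
                   ("constant", constant)]:
    print(f"{name:12s} ratio = {bare_ratio(data):.3f}")
```

White noise stored as doubles barely shrinks, which matches the second point above: the compressor has no way to know that the bits describe noise rather than information. The sine is there to probe the first point, since it is fully predictable yet a byte-oriented compressor only sees its mantissa bits.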
Reply by ●April 3, 2010
On Apr 2, 9:19 pm, "Michael Plante" <michael.plante@n_o_s_p_a_m.gmail.com> wrote:
> Michael wrote:
> >John wrote:
> >>On Apr 2, 3:50 pm, Luna Moon <lunamoonm...@gmail.com> wrote:
> >>> Hi all,
> >>>
> >>> I have a vector of real numbers in Matlab. How do I compress them? Of
> >>> course this has to be lossless, since I need to be able to recover
> >>> them.
> >>>
> >>> The goal is to study the Shannon rate and entropy of these real
> >>> numbers, so I decide to compress them and see how much compression
> >>> ratio I can have.
> >>>
> >>> I don't need to write the result into compressed files, so those
> >>> headers, etc. are just overhead for me which affect me calculating the
> >>> Entropy... so I just need a bare version of the compress ratio...
> >>>
> >>> Any pointers?
> >>>
> >>> Thanks a lot!
> >>
> >>Consider the array of numbers in binary form. Rearrange the bits so
> >>all the ones are sequential, and do the same for the zeros. The number
> >>of ones followed by the number of zeros is your compressed file.
> >
> >That's hardly optimal (effectively Run-Length Encoding (RLE)), and will, in
> >general, result in a falsely high estimate of "information content". How
> >many PCX images do you see floating around?
>
> Sorry, I should have said "it's throwing away information, and then RLE".
> So it's going to give nonsense.

Nobody in here has a sense of humor
Reply by ●April 4, 2010
On Apr 3, 7:13 pm, John <sampson...@gmail.com> wrote:
> Nobody in here has a sense of humor

i got it. your file compression scheme sorta rearranged the order of data.

but yer right. no sense of humor to be found here.

r b-j
Reply by ●April 4, 2010
robert bristow-johnson wrote:
> On Apr 3, 7:13 pm, John <sampson...@gmail.com> wrote:
>
>>Nobody in here has a sense of humor
>
> i got it. your file compression scheme sorta rearranged the order of
> data.
>
> but yer right. no sense of humor to be found here.

JFYI: http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
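For anyone following that link: the Burrows-Wheeler transform also rearranges the data, but it keeps one extra index so the rearrangement can be undone. Below is a naive Python sketch (quadratic memory, fine for toy inputs only). The function names are made up for illustration, and this variant stores the row index of the original string rather than appending an end-of-string marker.

```python
def bwt(data):
    """Naive Burrows-Wheeler transform of a bytes object.

    Returns the last column of the sorted rotation table and the row
    at which the original string appears.
    """
    n = len(data)
    if n == 0:
        return b"", 0
    rotations = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in rotations)
    return last, rotations.index(0)


def inverse_bwt(last, index):
    """Recover the original bytes from the last column and the row index."""
    n = len(last)
    # A stable sort of positions by byte value links each row of the first
    # column to the matching position in the last column.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = index
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)


transformed, idx = bwt(b"banana")
print(transformed, idx)               # b'nnbaaa' 3
print(inverse_bwt(transformed, idx))  # b'banana'
```

The transform does not shrink anything by itself. It groups equal symbols together so that a later stage (run-length or entropy coding) has an easier job, and unlike sorting the bits it loses nothing.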
Reply by ●April 4, 2010
On Apr 4, 12:52 am, robert bristow-johnson <r...@audioimagination.com> wrote:
> On Apr 3, 7:13 pm, John <sampson...@gmail.com> wrote:
>
> > Nobody in here has a sense of humor
>
> i got it. your file compression scheme sorta rearranged the order of
> data.
>
> but yer right. no sense of humor to be found here.
>
> r b-j

I was one day too late.