Reply by Ben Bradley ● November 15, 2008
On Fri, 07 Nov 2008 00:10:26 -0600, Tim Wescott
<tim@justseemywebsite.com> wrote:
>On Thu, 06 Nov 2008 14:19:42 -0800, DigitalSignal wrote:
>
>> Hi there, A quick question: Is there any way to compress the single
>> point floating point data? Apparently most of the research and
>> development work focuses on fixed point compression.
>>
>> James
>> www.go-ci.com
>
>Yes. Set all the values in your vector to zero. Then transmit the
>number of samples in your vector.
>
>Clarify your question and maybe you'll get a meaningful answer.
>
>Lossy? Lossless? Any specific type of input data, such as still
>pictures, video, generic audio or voice? There are any number of lossy
>compression algorithms that are just as meaningful with floating point
>data as the source stream as with fixed point; but if you're talking
>lossless compression then you're pretty much down to the algorithms you
>find in zip, and their aunts, uncles, cousins and in-laws.
Actually, there are newer lossless audio compression algorithms
that provide much improved compression for digital signals (audio,
seismic data, and similar signals from a digitized transducer) over
the standard data-processing compression algorithms used in .zip and
similar formats. Such legacy algorithms barely give 10 percent
compression or so on such signal data, hardly better than compressing
a random file. Newer algorithms rely on the "signal" nature of the
data (the fact that successive samples are highly correlated rather
than being a string of random data) for their compression and can give
up to 50 percent lossless compression.
One of the DVD encoding methods is basically ADPCM with error bits
included in the data so that each sample is perfectly recreated on
decompression.
Read here, especially the Modeling and Residual Coding parts:
http://flac.sourceforge.net/documentation_format_overview.html
And here, under Comparisons:
http://en.wikipedia.org/wiki/Free_Lossless_Audio_Codec
"FLAC is specifically designed for efficient packing of audio data,
unlike general lossless algorithms such as ZIP and gzip. While ZIP may
compress a CD-quality audio file by 10 - 20%, FLAC achieves
compression rates of 30 - 50% for most music, with significantly
greater compression for voice recordings."
After seeing the most recent post from the OP and looking at the
website http://www.go-ci.com/, it seems to me this sort of compression
is exactly the thing for a device storing data coming from a 24-bit
A/D. Not sure how a floating-point DSP would handle that, though.
Reply by DigitalSignal ● November 8, 2008
Hendrik,
CoCo-80 (www.go-ci.com) reaches 130~150 dB in general. If we just
conduct a simple data acquisition task, we will not convert it into
floating point, so the compression can be done in fixed-point format.
The issue is that with all kinds of filtering and spectral analysis
processes, the data manipulation gets very complicated. We would rather
keep all the data in single-precision floating point. This is a
practical matter.
James
www.go-ci.com
Reply by Vladimir Vassilevsky ● November 8, 2008
Hendrik van der Heijden wrote:
> DigitalSignal schrieb:
>
>> Sorry, I should make it clearer. We tried to find a way to compress
>> the single precision floating point data streams losslessly. As a
>> general case, the data acquisition system stores time domain data up
>> to a few gigabytes.
Probably because of excessive oversampling and improper gain
scaling. Perhaps the data size can be reduced severalfold.
>> It is expensive to store the data in the portable
>> device and slow to transfer them.
>
>
> I wonder what kind of portable device generates floating point data.
> Is there no ADC or is there FP-based processing which results you'd
> like to store?
Good point. On another note: it is generally impossible to do any
floating point operation losslessly. So the first step would be
denormalization to integers; then some predictive algorithm, then
Huffman coding.
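The "denormalize first" step rests on the fact that samples which entered the system as A/D integers can be mapped back to integers exactly, even though floating-point arithmetic in general rounds. A minimal sketch of that first step (the 24-bit scaling here is illustrative, not from any particular product):

```python
SCALE = 1 << 23  # full scale of a hypothetical 24-bit A/D

# Floating-point arithmetic itself is not lossless: adding and then
# subtracting a large offset destroys the low-order bits.
x = 0.1
assert (x + 1e8) - 1e8 != x

# But samples that came from a 24-bit converter are integer multiples
# of 2**-23, and those map back to their integer codes exactly.
adc_codes = [-(1 << 23), -12345, 0, 1, 678901, (1 << 23) - 1]
as_float = [k / SCALE for k in adc_codes]          # stored as floats
denormalized = [round(f * SCALE) for f in as_float]
assert denormalized == adc_codes  # exact round trip
print(denormalized)
```

Once the stream is back in integer form, a predictor and an entropy coder can run on it without any representation-induced loss.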
Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Reply by Hendrik van der Heijden ● November 8, 2008
DigitalSignal schrieb:
> Sorry, I should make it clearer. We tried to find a way to compress
> the single precision floating point data streams losslessly. As a
> general case, the data acquisition system stores time domain data up
> to a few gigabytes. It is expensive to store the data in the portable
> device and slow to transfer them.
I wonder what kind of portable device generates floating point data.
Is there no ADC, or is there FP-based processing whose results you'd
like to store?
Hendrik vdH
Reply by Rune Allnor ● November 8, 2008
On 8 Nov, 05:02, Tim Wescott <t...@justseemywebsite.com> wrote:
> On Fri, 07 Nov 2008 14:12:45 -0600, Vladimir Vassilevsky wrote:
> > Do the processing of the raw data in place. Remove the clutter and store
> > only relevant information.
>
> That's actually a pretty good description of any compression algorithm:
> remove the clutter and save the relevant.
>
> What you consider to be clutter vs. relevant has a big effect on whether
> you use lossless or lossy compression (and what lossy compression
> algorithm you use); beyond that, how you tell the clutter from the
> relevant strongly guides the algorithm.
There is another aspect which maybe happens only on rare
occasions, but is very relevant when it does: The economy
involved in acquiring the data and the psychology of the
responsible decision-makers.
These days memory is cheap (a few hundred dollars per terabyte of
disk space), but that was not the case in the mid '90s. A friend
of mine wrote his MSc thesis on efficient compression and storage
of seismic data. This was the time when the company we both worked
with installed the 2nd TByte disk system nation-wide, and the guys
brought back truck-loads of Exabyte tapes (each of which stored
5-10 GByte of data and took two hours to load) after they had been
to sea.
So the logistics of data handling and storage was a big deal
at the time.
My friend did a good job with lossy compression. He stored
the essentials contained in the data in far less space than
was needed for the uncompressed data. He worked closely with
the data processors, so every trick he introduced in his
storage scheme was evaluated for effects both on the storage
and on the seismic images that were processed from the
reconstructed data. As far as I could tell, the processors
were able to get the same from the compressed data as from
the original data.
But my friend never received much interest in the method.
Lots of people held the opinion that
"We've spent tens, maybe hundreds, of millions of $$$ collecting
these data. We will not do anything that might compromise their
present usefulness and future value."
Which is a perfectly understandable way of looking at things.
In fact, I think I agree with it.
Rune
Reply by Tim Wescott ● November 8, 2008
On Fri, 07 Nov 2008 14:12:45 -0600, Vladimir Vassilevsky wrote:
> DigitalSignal wrote:
>
>> Sorry, I should make it clearer. We tried to find a way to compress the
>> single precision floating point data streams losslessly. As a general
>> case, the data acquisition system stores time domain data up to a few
>> gigabytes. It is expensive to store the data in the portable device and
>> slow to transfer them.
>
> Do the processing of the raw data in place. Remove the clutter and store
> only relevant information.
>
That's actually a pretty good description of any compression algorithm:
remove the clutter and save the relevant.
What you consider to be clutter vs. relevant has a big effect on whether
you use lossless or lossy compression (and what lossy compression
algorithm you use); beyond that, how you tell the clutter from the
relevant strongly guides the algorithm.
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" gives you just what it says.
See details at http://www.wescottdesign.com/actfes/actfes.html
Reply by Vladimir Vassilevsky ● November 7, 2008
DigitalSignal wrote:
> Sorry, I should make it clearer. We tried to find a way to compress
> the single precision floating point data streams losslessly. As a
> general case, the data acquisition system stores time domain data up
> to a few gigabytes. It is expensive to store the data in the portable
> device and slow to transfer them.
Do the processing of the raw data in place. Remove the clutter and store
only relevant information.
Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Reply by Glen Herrmannsfeldt ● November 7, 2008
DigitalSignal wrote:
> Sorry, I should make it clearer. We tried to find a way to compress
> the single precision floating point data streams losslessly. As a
> general case, the data acquisition system stores time domain data up
> to a few gigabytes. It is expensive to store the data in the portable
> device and slow to transfer them.
To compress it, you (or a compression program) have to find some
pattern in the data such that it can be coded more efficiently.
For fixed point data that pattern will often be high order
zero bits. LZW and related algorithms will usually find them
and compress them out fairly well. If, for example, you stored
12 bit random data in 16 bit words, LZW would compress that
down pretty close to 12 bits each.
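That claim is easy to check with zlib, whose DEFLATE algorithm is a cousin of LZW rather than the identical scheme, so the exact ratio is only indicative:

```python
import random
import struct
import zlib

random.seed(1)
N = 10000
# Random 12-bit values stored in 16-bit words: the high 4 bits of
# every word are always zero, a pattern the compressor can exploit.
words12 = struct.pack("<%dH" % N, *[random.getrandbits(12) for _ in range(N)])
# Fully random 16-bit words for comparison: essentially incompressible.
words16 = struct.pack("<%dH" % N, *[random.getrandbits(16) for _ in range(N)])

c12 = len(zlib.compress(words12, 9))
c16 = len(zlib.compress(words16, 9))
print("12-in-16: %d -> %d bytes; full 16-bit: %d -> %d bytes"
      % (len(words12), c12, len(words16), c16))
```

The 12-bit-in-16-bit stream shrinks toward its 12 bits of real content, while the fully random stream does not compress at all.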
You say lossless, but in most cases there has already been loss
in the conversion/arithmetic operations on floating point data.
Reasonably often there is no useful information in the low
bits of a floating point value, but you and the compression
algorithm don't know that.
-- glen
Reply by DigitalSignal ● November 7, 2008
Sorry, I should make it clearer. We tried to find a way to compress
the single precision floating point data streams losslessly. As a
general case, the data acquisition system stores time domain data up
to a few gigabytes. It is expensive to store the data in the portable
device and slow to transfer them.
James
www.go-ci.com