
how to measure entropy of music?

Started by lucy December 2, 2004
"lucy" <losemind@yahoo.com> wrote in message news:<coo4it$2s2$1@news.Stanford.EDU>...
> I have vaguely heard about this method... and I am very interested in it...
> could anybody give me some pointers?
>
> After learning how to measure entropy of music... I can begin to measure
> entropy of texts, etc.. that's going to be fun!
Compress it using a suitable lossless compressor. Provided that the compressor knows how to detect (some amount of) musical redundancy, this will give you the Kolmogorov complexity with respect to a particular non-universal computer, i.e. algorithmic entropy, which is more fundamental than Shannon entropy.

This has already been exploited using the information distance of Vitanyi et al. Precise clustering of music files has been achieved with ordinary compressors like bzip2. Find the paper "Algorithmic Clustering of Music" by Rudi Cilibrasi and Paul Vitanyi. Needless to say, you have to compress uncompressed wav files...

Regards,

--
Eray Ozkural
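A minimal sketch of this compression-based approach in Python, using only the standard-library bz2 and wave modules. The file names are placeholders, and (as noted above) the input has to be uncompressed PCM, not MP3:

import bz2
import wave

def pcm_bytes(path):
    # Read the raw PCM sample bytes from a WAV file.
    with wave.open(path, "rb") as w:
        return w.readframes(w.getnframes())

def c(data):
    # Compressed size in bytes: a crude stand-in for Kolmogorov complexity.
    return len(bz2.compress(data, 9))

def ncd(x, y):
    # Normalized compression distance between two byte strings (0 = identical).
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

if __name__ == "__main__":
    a = pcm_bytes("piece_a.wav")   # hypothetical file names
    b = pcm_bytes("piece_b.wav")
    print("complexity estimates (bytes):", c(a), c(b))
    print("NCD:", ncd(a, b))

The c() function on a single file already gives a rough complexity estimate in bytes; ncd() is the normalized compression distance used in the Cilibrasi/Vitanyi clustering paper.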
john_bailey@rochester.rr.com (John Bailey) wrote:

> Try:
> http://crl.research.compaq.com/publications/techreports/techreports.html
> search the page for Logan or music.
Some of Beth's papers are available from her now-HPLabs web-site:

http://www.hpl.hp.com/research/crl/publications/papers.html

Ciao,

Peter K.

Eray Ozkural exa wrote:
(snip, and previously snipped discussion of music entropy)

> Compress it using a suitable lossless compressor. Provided that the
> compressor knows how to detect (some amount of) musical redundancy,
> this will give you the Kolmogorov complexity with respect to a
> particular non-universal computer, i.e. algorithmic entropy, which is
> more fundamental than Shannon entropy.
> This has already been exploited using the information distance of
> Vitanyi et al. Precise clustering of music files has been achieved
> with ordinary compressors like bzip2. Find the paper "Algorithmic
> Clustering of Music" by Rudi Cilibrasi and Paul Vitanyi. Needless to
> say, you have to compress uncompressed wav files...
I would think a pure sine wave should be low complexity, so maybe you should compress the Fourier transform. Maybe that won't quite do it, but I can imagine wav files of low complexity audio signals not compressing very well. Has anyone ever done an FFT of a whole CD?

-- glen
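A quick numerical sketch of this point in Python (numpy plus the standard bz2 module): compare the bzip2-compressed size of one second of a 440 Hz sine stored as raw 16-bit samples against its coarsely quantized magnitude spectrum. The quantization scheme is an arbitrary choice for illustration, and since phase is thrown away this is not an invertible representation; it just shows where the redundancy becomes visible to a general-purpose compressor:

import bz2
import numpy as np

fs = 44100
t = np.arange(fs) / fs                    # one second of signal
sine = np.sin(2 * np.pi * 440.0 * t)

# Raw 16-bit PCM, as it would sit in a wav file.
pcm = (sine * 32767).astype(np.int16).tobytes()

# One-sided magnitude spectrum, quantized to 16 bits (phase discarded).
mag = np.abs(np.fft.rfft(sine))
spec = np.round(mag / mag.max() * 32767).astype(np.int16).tobytes()

print("raw 16-bit samples, bzip2'd:  ", len(bz2.compress(pcm)))
print("quantized |FFT| bins, bzip2'd:", len(bz2.compress(spec)))

The spectrum of the tone is almost all zeros except around one bin, so it compresses far better than the raw samples do.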
john_bailey@rochester.rr.com (John Bailey) wrote in message news:<41b07a0d.180894873@news-server.rochester.rr.com>...

<snip>

>
> The model of entropy that can be readily recognized:
> In music--how often can an informed listener infer the next note in a
> phrase; i.e. how many bits are needed to specify the next note.
> In sound--how many bits are needed to specify the next value of the
> signal.
<snip>

The important 20th century composer Paul Hindemith once wrote that he found music to be interesting to listen to when he had a low success rate at predicting what would happen next. I think he was thinking of higher level structures than single notes, but couldn't the concept of entropy apply to these higher structures as well?
gowan wrote:

...
> The important 20th century composer Paul Hindemith once wrote that he
> found music to be interesting to listen to when he had a low success
> rate at predicting what would happen next. I think he was thinking of
> higher level structures than single notes, but couldn't the concept of
> entropy apply to these higher structures as well?
It can apply at all levels; even within one note in the hands of a master player/singer. The real interest here is the "middle ground" (to borrow a term from Schenkerian analysis), where a predictable context is established (which in turn demands a knowing listener: a cognoscente), in order for a subsequent unpredictable event to have a desired rhetorical effect.

One of the reasons "total serialism" ultimately failed over time as an idiom was that it was easy to demonstrate that its outputs were cognitively indistinguishable from randomly generated music. So total entropy is bad, and total un-predictability is bad also.

Composers such as Trevor Wishart (waving the flag for UK composers here!), working in the medium of electro-acoustic music (aka "tape music"), go to considerable lengths to establish a viable cognitive framework (there is typically no score or equivalent text to follow, from which to learn structure), e.g. by using straight repetition to establish "this is a theme [= significant source material to be noted by the listener]", where that theme is some abstract sound. The idiomatic gestures of this music are unfamiliar to those brought up on mainstream Western Art Music - it takes some time to get used to the idea of timbre being a primary determinant of musical structure. Schoenberg started it with his idea of "Klangfarbenmelodie" (melody of tone colours), Varese ran with it ("organised sound"), and current electro-acoustic composers have gone super-luminary with it.

Music has been described as "audible mathematics" (an obvious Pythagorean notion); but (writing as a confirmed non-mathematician) I wonder how keen mathematicians really are on surprises, contradictions, non-sequiturs, etc! African musicians say "every wrong note is a new style" (cited by Christopher Small in his book "Music of the Common Tongue"); but a wrong formula is just plain wrong, leaving very little scope within mathematics for rhetoricism. I suspect much of the deep background of the debates and spats seen on this list reflects a profound need to be rhetorical, expressive and, dare I say it, emotional! Music, in contrast, is not a simple binary right/wrong medium: hence (when we have that middle ground) it is something one can not only learn, but also learn from.

"Discuss..." :-)

Richard Dobson
On Fri, 03 Dec 2004 16:38:23 +0000, John Bailey wrote:
> For written language, the analog is how many bits are needed to
> confirm an informed guess as to the next letter in a text. Perhaps no
> more than three. For some TV shows these days, its only two.
>
> For music, the question gets really challenging as one considers the
> entropy of scores--the parts played by accompanying instruments and
> the choices of these instruments requires a lot of encoding. In this
> case, the number of bits needed to feed a high level orchestral
> synthesizer might establish a lower bound.
Very much a lower bound, I suspect, because even a fantastic orchestral synthesizer would be to an orchestra as a typesetter would be to the handwritten word: you don't only have the notes, you have the playing of each of them, and the interplay of the playing with the acoustics of the hall, etc.

Still, it's a neat thought experiment. And you could indeed do an interesting analysis of musical scores, rather than musical recordings.

-- Andrew
On Fri, 03 Dec 2004 17:49:47 +0000, Timothy Murphy
<tim@birdsnest.maths.tcd.ie> wrote:

> As a matter of interest,
> do you consider high entropy good or bad?
On 3 Dec 2004 14:42:51 -0800, gowan4@hotmail.com (gowan) wrote:
>
> The important 20th century composer Paul Hindemith once wrote that he
> found music to be interesting to listen to when he had a low success
> rate at predicting what would happen next. I think he was thinking of
> higher level structures than single notes, but couldn't the concept of
> entropy apply to these higher structures as well?
> Presumably a completely random series of notes would have very high entropy,
> while absolute silence has very low entropy.
> I wouldn't have thought either was very enjoyable.
Both of your comments are quite stimulating. Of course! The idea that good music is neither too rote nor too random is not new, but the idea that entropy as a concept allows a deeper investigation of what people like is exciting.

Conjecture: classes of composers cluster around various levels of entropy.

On further thought--what is predictable changes with audience familiarity with a style. Music that is on the leading edge of predictability is the most interesting. It's probably all about what gives our music neurons a good workout.

John Bailey
http://home.rochester.rr.com/jbxroads/mailto.html
"lucy" <losemind@yahoo.com> wrote in message news:<coo4it$2s2$1@news.Stanford.EDU>...
> I have vaguely heard about this method... and I am very interested in it...
> could anybody give me some pointers?
>
> After learning how to measure entropy of music... I can begin to measure
> entropy of texts, etc.. that's going to be fun!
Well, theoretically speaking, you should just treat the "notes" of the music as "symbols" for the computation of the entropy. Then there is no difference between finding the entropy of text and finding the entropy of music.

But if you sample the signal of the music, encode it, and compute the entropy, I doubt it makes sense to do that, given that you already have the "notes" of the music.

If you just want to know what "entropy of data" is about, you might start with any textbook on information theory, or some websites, e.g.:

http://www.ScienceOxygen.com/signal.html
http://www.ScienceOxygen.com/electrical.html

Have fun,
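A minimal sketch of that symbol-level computation in Python: a zeroth-order Shannon entropy estimate from empirical symbol frequencies, applied identically to a note sequence and a string of text. The melody below is made up; a real experiment would read the notes from a MIDI file or a score:

from collections import Counter
from math import log2

def entropy_bits_per_symbol(symbols):
    # Zeroth-order Shannon entropy: -sum p log2 p over the empirical distribution.
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

melody = ["C4", "D4", "E4", "C4", "E4", "G4", "E4", "C4"]   # made-up example
print(f"{entropy_bits_per_symbol(melody):.3f} bits per note")
print(f"{entropy_bits_per_symbol('to be or not to be'):.3f} bits per character")

Note that this ignores context entirely; the "how many bits to guess the next note" question raised earlier in the thread corresponds to conditional (higher-order) entropies, which are lower once the model is allowed to see the preceding symbols.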
glen herrmannsfeldt wrote:

> I would think a pure sine wave should be low complexity, so maybe
> you should compress the Fourier transform.
I think a vocoder does something similar: linear predictive coding tries to fit an AR model to the input, effectively searching for the formants (resonant peaks) of the "vocal tract" transfer function. If you initialize an IIR filter with the LPC coefficients and the states (delays) with the signal, you can let it "ring" like an oscillator - the output will converge to a sum of pure damped sine waves at the estimated frequencies of the formants. However, instead of storing the sine waves, the vocoder stores the LPC coefficients and the residue (prediction error).

To take this back to entropy, the energy of the prediction error could well be regarded as the entropy of a vocal signal - it gives a good measure of how unexpected the signal is, given its past history.

Recent work has shown that this method is also suited to general audio signals, not just speech. The order of the AR model just increases drastically (for interpolation, orders of 1000 - 3000 are used). The same applies here for the prediction error and the entropy.
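A rough sketch of this prediction-error idea in Python (numpy + scipy): fit an AR model by the autocorrelation (Yule-Walker) method and report the residual energy, relative to the signal energy, as a crude unpredictability score. The order of 16 is nowhere near the 1000 - 3000 mentioned above for general audio; it is only chosen so the toy example runs instantly:

import numpy as np
from scipy.signal import lfilter

def lpc(x, order):
    # Autocorrelation (Yule-Walker) method: autocorrelation r[k] for lags 0..order,
    # then solve the Toeplitz system for the AR coefficients a.
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a, *_ = np.linalg.lstsq(R, r[1:], rcond=None)  # lstsq tolerates the near-singular pure-tone case
    return np.concatenate(([1.0], -a))             # prediction-error filter A(z) = 1 - sum a_k z^-k

def residual_energy_ratio(x, order=16):
    # Energy of the LPC prediction error relative to the signal energy.
    e = lfilter(lpc(x, order), [1.0], x)
    return np.sum(e**2) / np.sum(x**2)

rng = np.random.default_rng(0)
n, fs = 44100, 44100
tone = np.sin(2 * np.pi * 440 * np.arange(n) / fs)
noise = rng.standard_normal(n)

print("sine tone  :", residual_energy_ratio(tone))    # close to 0: highly predictable
print("white noise:", residual_energy_ratio(noise))   # close to 1: unpredictable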
> Has anyone ever done an FFT of a whole CD?
If you take a music CD, I would expect a spectrum following a 1/f^a trend (a straight line on a log-log plot). This is a common model for music signals, and it agrees well with findings on long-range correlated time series. For speech, I would guess a roughly rectangular spectral shape: the formants vary across a certain frequency range, but are limited from above and below.
Regards,
Andor
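A rough way to check the 1/f^a conjecture above on an actual track in Python (numpy + scipy): estimate the power spectral density and fit a straight line to log-power versus log-frequency; the negative slope estimates a. The file name and the fit band below are arbitrary choices:

import numpy as np
from scipy.io import wavfile
from scipy.signal import welch

fs, x = wavfile.read("track.wav")   # hypothetical uncompressed CD rip
if x.ndim > 1:
    x = x.mean(axis=1)              # mix stereo down to mono
f, pxx = welch(x.astype(float), fs=fs, nperseg=8192)

band = (f > 20) & (f < 10000)       # fit away from DC and the very top end
slope, _ = np.polyfit(np.log10(f[band]), np.log10(pxx[band]), 1)
print(f"estimated exponent a = {-slope:.2f}  (PSD ~ 1/f^a)")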
Eric Jacobsen wrote:
> On Fri, 3 Dec 2004 15:36:04 +0100, Stephan M. Bernsee
> <spam@dspdimension.com> wrote:
>
> >On 2004-12-03 14:19:03 +0100, Ken Prager <prager_me_@ieee.org> said:
> >> <http://eigenradio.media.mit.edu/>
> >
> >ROTFL!!! This is *really* cool.
>
> Very interesting. I doubt any of it is going to make it onto my MP3
> player, but it's an interesting concept.
I just downloaded the eigenradio Christmas album - I find it very relaxing! Included is a neat collection of high resolution fractal snowflakes (called eigenflakes). I think that is the CD cover :-).

Regards,
Andor