
how to measure entropy of music?

Started by lucy December 2, 2004
"lucy" <losemind@yahoo.com> wrote in message news:<coo4it$2s2$1@news.Stanford.EDU>...
> I have vaguely heard about this method... and I am very interested in it...
> could anybody give me some pointers?
>
> After learning how to measure entropy of music... I can begin to measure
> entropy of texts, etc.. that's going to be fun!
Compress it using a suitable lossless compressor. Provided that the compressor knows how to detect (some amount of) musical redundancy, this will give you the Kolmogorov complexity with respect to a particular non-universal computer, i.e. algorithmic entropy, which is more fundamental than Shannon entropy.

This has already been exploited using the information distance of Vitanyi et al. Precise clustering of music files has been achieved with ordinary compressors like bzip2. Find the paper "Algorithmic Clustering of Music" by Rudi Cilibrasi and Paul Vitanyi. Needless to say, you have to compress uncompressed wav files...

Regards,

--
Eray Ozkural
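A minimal sketch of this compression-based approach in Python, using only the standard-library bz2 and wave modules. The file names are placeholders, and (as noted above) the input has to be uncompressed PCM, not MP3:

import bz2
import wave

def pcm_bytes(path):
    # Read the raw PCM sample bytes from a WAV file.
    with wave.open(path, "rb") as w:
        return w.readframes(w.getnframes())

def c(data):
    # Compressed size in bytes: a crude stand-in for Kolmogorov complexity.
    return len(bz2.compress(data, 9))

def ncd(x, y):
    # Normalized compression distance between two byte strings (0 = identical).
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

if __name__ == "__main__":
    a = pcm_bytes("piece_a.wav")   # hypothetical file names
    b = pcm_bytes("piece_b.wav")
    print("complexity estimates (bytes):", c(a), c(b))
    print("NCD:", ncd(a, b))

The c() function on a single file already gives a rough complexity estimate in bytes; ncd() is the normalized compression distance used in the Cilibrasi/Vitanyi clustering paper.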
john_bailey@rochester.rr.com (John Bailey) wrote:

> Try:
> http://crl.research.compaq.com/publications/techreports/techreports.html
> search the page for Logan or music.
Some of Beth's papers are available from her now-HPLabs web-site:

http://www.hpl.hp.com/research/crl/publications/papers.html

Ciao,

Peter K.

Eray Ozkural exa wrote:
(snip, and previously snipped discussion of music entropy)

> Compress it using a suitable lossless compressor. Provided that the
> compressor knows how to detect (some amount of) musical redundancy,
> this will give you the Kolmogorov complexity with respect to a
> particular non-universal computer, i.e. algorithmic entropy, which is
> more fundamental than Shannon entropy.
> This has already been exploited using the information distance of
> Vitanyi et al. Precise clustering of music files has been achieved
> with ordinary compressors like bzip2. Find the paper "Algorithmic
> Clustering of Music" by Rudi Cilibrasi and Paul Vitanyi. Needless to
> say, you have to compress uncompressed wav files...
I would think a pure sine wave should be low complexity, so maybe you should compress the Fourier transform. Maybe that won't quite do it, but I can imagine wav files of low complexity audio signals not compressing very well. Has anyone ever done an FFT of a whole CD?

-- glen
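A quick numerical sketch of this point in Python (numpy plus the standard bz2 module): compare the bzip2-compressed size of one second of a 440 Hz sine stored as raw 16-bit samples against its coarsely quantized magnitude spectrum. The quantization scheme is an arbitrary choice for illustration, and since phase is thrown away this is not an invertible representation; it just shows where the redundancy becomes visible to a general-purpose compressor:

import bz2
import numpy as np

fs = 44100
t = np.arange(fs) / fs                    # one second of signal
sine = np.sin(2 * np.pi * 440.0 * t)

# Raw 16-bit PCM, as it would sit in a wav file.
pcm = (sine * 32767).astype(np.int16).tobytes()

# One-sided magnitude spectrum, quantized to 16 bits (phase discarded).
mag = np.abs(np.fft.rfft(sine))
spec = np.round(mag / mag.max() * 32767).astype(np.int16).tobytes()

print("raw 16-bit samples, bzip2'd:  ", len(bz2.compress(pcm)))
print("quantized |FFT| bins, bzip2'd:", len(bz2.compress(spec)))

The spectrum of the tone is almost all zeros except around one bin, so it compresses far better than the raw samples do.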
john_bailey@rochester.rr.com (John Bailey) wrote in message news:<41b07a0d.180894873@news-server.rochester.rr.com>...

<snip>

>
> The model of entropy that can be readily recognized:
> In music--how often can an informed listener infer the next note in a
> phrase; i.e. how many bits are needed to specify the next note.
> In sound--how many bits are needed to specify the next value of the
> signal.
<snip>

The important 20th century composer Paul Hindemith once wrote that he found music to be interesting to listen to when he had a low success rate at predicting what would happen next. I think he was thinking of higher level structures than single notes, but couldn't the concept of entropy apply to these higher structures as well?
gowan wrote:

...
> The important 20th century composer Paul Hindemith once wrote that he
> found music to be interesting to listen to when he had a low success
> rate at predicting what would happen next. I think he was thinking of
> higher level structures than single notes, but couldn't the concept of
> entropy apply to these higher structures as well?
It can apply at all levels; even within one note in the hands of a master player/singer. The real interest here is the "middle ground" (to borrow a term from Schenkerian analysis), where a predictable context is established (which in turn demands a knowing listener: a cognoscente), in order for a subsequent unpredictable event to have a desired rhetorical effect.

One of the reasons "total serialism" ultimately failed over time as an idiom was that it was easy to demonstrate that its outputs were cognitively indistinguishable from randomly generated music. So total entropy is bad, and total un-predictability is bad also.

Composers such as Trevor Wishart (waving the flag for UK composers here!), working in the medium of electro-acoustic music (aka "tape music"), go to considerable lengths to establish a viable cognitive framework (there is typically no score or equivalent text to follow, from which to learn structure), e.g. by using straight repetition to establish "this is a theme [= significant source material to be noted by the listener]", where that theme is some abstract sound. The idiomatic gestures of this music are unfamiliar to those brought up on mainstream Western Art Music - it takes some time to get used to the idea of timbre being a primary determinant of musical structure. Schoenberg started it with his idea of "Klangfarbenmelodie" (melody of tone colours), Varese ran with it ("organised sound"), and current electro-acoustic composers have gone super-luminary with it.

Music has been described as "audible mathematics" (an obvious Pythagorean notion); but (writing as a confirmed non-mathematician) I wonder how keen mathematicians really are on surprises, contradictions, non-sequiturs, etc! African musicians say "every wrong note is a new style" (cited by Christopher Small in his book "Music of the Common Tongue"); but a wrong formula is just plain wrong, leaving very little scope within mathematics for rhetoricism. I suspect much of the deep background of the debates and spats seen on this list reflects a profound need to be rhetorical, expressive and, dare I say it, emotional! Music, in contrast, is not a simple binary right/wrong medium: hence (when we have that middle ground) it is something one can not only learn, but also learn from.

"Discuss..." :-)

Richard Dobson
On Fri, 03 Dec 2004 16:38:23 +0000, John Bailey wrote:
> For written language, the analog is how many bits are needed to
> confirm an informed guess as to the next letter in a text. Perhaps no
> more than three. For some TV shows these days, its only two.
>
> For music, the question gets really challenging as one considers the
> entropy of scores--the parts played by accompanying instruments and
> the choices of these instruments requires a lot of encoding. In this
> case, the number of bits needed to feed a high level orchestral
> synthesizer might establish a lower bound.
Very much a lower bound, I suspect, because even a fantastic orchestral synthesizer would be to an orchestra as a typesetter would be to the handwritten word: you don't only have the notes, you have the playing of each of them, and the interplay of the playing with the acoustics of the hall, etc.

Still, it's a neat thought experiment. And you could indeed do an interesting analysis of musical scores, rather than musical recordings.

-- Andrew
On Fri, 03 Dec 2004 17:49:47 +0000, Timothy Murphy
<tim@birdsnest.maths.tcd.ie> wrote:

> As a matter of interest,
> do you consider high entropy good or bad?
On 3 Dec 2004 14:42:51 -0800, gowan4@hotmail.com (gowan) wrote:
>
> The important 20th century composer Paul Hindemith once wrote that he
> found music to be interesting to listen to when he had a low success
> rate at predicting what would happen next. I think he was thinking of
> higher level structures than single notes, but couldn't the concept of
> entropy apply to these higher structures as well?
> Presumably a completely random series of notes would have very high entropy,
> while absolute silence has very low entropy.
> I wouldn't have thought either was very enjoyable.
Both of your comments are quite stimulating. Of course! The idea that good music is neither too rote nor too random is not new, but the idea that entropy as a concept allows a deeper investigation of what people like is exciting.

Conjecture: classes of composers cluster around various levels of entropy.

On further thought--what is predictable changes with audience familiarity with a style. Music that is on the leading edge of predictability is the most interesting. It's probably all about what gives our music neurons a good workout.

John Bailey
http://home.rochester.rr.com/jbxroads/mailto.html
"lucy" <losemind@yahoo.com> wrote in message news:<coo4it$2s2$1@news.Stanford.EDU>...
> I have vaguely heard about this method... and I am very interested in it...
> could anybody give me some pointers?
>
> After learning how to measure entropy of music... I can begin to measure
> entropy of texts, etc.. that's going to be fun!
Well, theoretically speaking, you should just treat the "notes" of the music as "symbols" for the computation of the entropy. Then there is no difference between finding the entropy of text and finding the entropy of music.

But if you sample the signal of the music, encode it, and compute the entropy, I doubt it makes sense to do that, given that you already have the "notes" of the music.

If you just want to know what "entropy of data" is about, you might start with any textbook on information theory, or some websites, e.g.:

http://www.ScienceOxygen.com/signal.html
http://www.ScienceOxygen.com/electrical.html

Have fun,
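A minimal sketch of that symbol-level computation in Python: a zeroth-order Shannon entropy estimate from empirical symbol frequencies, applied identically to a note sequence and a string of text. The melody below is made up; a real experiment would read the notes from a MIDI file or a score:

from collections import Counter
from math import log2

def entropy_bits_per_symbol(symbols):
    # Zeroth-order Shannon entropy: -sum p log2 p over the empirical distribution.
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

melody = ["C4", "D4", "E4", "C4", "E4", "G4", "E4", "C4"]   # made-up example
print(f"{entropy_bits_per_symbol(melody):.3f} bits per note")
print(f"{entropy_bits_per_symbol('to be or not to be'):.3f} bits per character")

Note that this ignores context entirely; the "how many bits to guess the next note" question raised earlier in the thread corresponds to conditional (higher-order) entropies, which are lower once the model is allowed to see the preceding symbols.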
glen herrmannsfeldt wrote:

> I would think a pure sine wave should be low complexity, so maybe
> you should compress the Fourier transform.
I think a vocoder does something similar: linear predictive coding tries to fit an AR model to the input, effectively searching for the formants (resonant peaks) of the "vocal tract" transfer function. If you initialize an IIR filter with the LPC coefficients and the states (delays) with the signal, you can let it "ring" like an oscillator - the output will converge to a sum of pure damped sine waves at the estimated frequencies of the formants. However, instead of storing the sine waves, the vocoder stores the LPC coefficients and the residue (prediction error).

To take this back to entropy, the energy of the prediction error could well be regarded as the entropy of a vocal signal - it gives a good measure of how unexpected the signal is, given its past history.

Recent work has shown that this method is also suited to general audio signals, not just speech. The order of the AR model just increases drastically (for interpolation, orders of 1000 - 3000 are used). The same applies here for the prediction error and the entropy.
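A rough sketch of this prediction-error idea in Python (numpy + scipy): fit an AR model by the autocorrelation (Yule-Walker) method and report the residual energy, relative to the signal energy, as a crude unpredictability score. The order of 16 is nowhere near the 1000 - 3000 mentioned above for general audio; it is only chosen so the toy example runs instantly:

import numpy as np
from scipy.signal import lfilter

def lpc(x, order):
    # Autocorrelation (Yule-Walker) method: autocorrelation r[k] for lags 0..order,
    # then solve the Toeplitz system for the AR coefficients a.
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a, *_ = np.linalg.lstsq(R, r[1:], rcond=None)  # lstsq tolerates the near-singular pure-tone case
    return np.concatenate(([1.0], -a))             # prediction-error filter A(z) = 1 - sum a_k z^-k

def residual_energy_ratio(x, order=16):
    # Energy of the LPC prediction error relative to the signal energy.
    e = lfilter(lpc(x, order), [1.0], x)
    return np.sum(e**2) / np.sum(x**2)

rng = np.random.default_rng(0)
n, fs = 44100, 44100
tone = np.sin(2 * np.pi * 440 * np.arange(n) / fs)
noise = rng.standard_normal(n)

print("sine tone  :", residual_energy_ratio(tone))    # close to 0: highly predictable
print("white noise:", residual_energy_ratio(noise))   # close to 1: unpredictable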
> Has anyone ever done an FFT of a whole CD?
If you take a music CD, I would expect a spectrum following a 1/f^a trend (a straight line on a log-log plot). This is a common model for music signals, and it agrees well with findings on long-range correlated time series. For speech, I would guess a roughly rectangular spectral shape: the formants vary across a certain frequency range, but are limited from above and below.
Regards,
Andor
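A rough way to check the 1/f^a conjecture above on an actual track in Python (numpy + scipy): estimate the power spectral density and fit a straight line to log-power versus log-frequency; the negative slope estimates a. The file name and the fit band below are arbitrary choices:

import numpy as np
from scipy.io import wavfile
from scipy.signal import welch

fs, x = wavfile.read("track.wav")   # hypothetical uncompressed CD rip
if x.ndim > 1:
    x = x.mean(axis=1)              # mix stereo down to mono
f, pxx = welch(x.astype(float), fs=fs, nperseg=8192)

band = (f > 20) & (f < 10000)       # fit away from DC and the very top end
slope, _ = np.polyfit(np.log10(f[band]), np.log10(pxx[band]), 1)
print(f"estimated exponent a = {-slope:.2f}  (PSD ~ 1/f^a)")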
Eric Jacobsen wrote:
> On Fri, 3 Dec 2004 15:36:04 +0100, Stephan M. Bernsee
> <spam@dspdimension.com> wrote:
>
> >On 2004-12-03 14:19:03 +0100, Ken Prager <prager_me_@ieee.org> said:
> >> <http://eigenradio.media.mit.edu/>
> >
> >ROTFL!!! This is *really* cool.
>
> Very interesting. I doubt any of it is going to make it onto my MP3
> player, but it's an interesting concept.
I just downloaded the eigenradio Christmas album - I find it very relaxing! Included is a neat collection of high resolution fractal snowflakes (called eigenflakes). I think that is the CD cover :-).

Regards,
Andor