## Maximum Entropy Property of the Gaussian Distribution

### Entropy of a Probability Distribution

The *entropy* of a probability density function (PDF) $p(x)$ is defined as [48]

$$h(p) \triangleq \int_{-\infty}^{\infty} p(x)\,\lg\frac{1}{p(x)}\,dx \tag{D.29}$$

where $\lg$ denotes the logarithm base 2. The entropy of $p(x)$ can be interpreted as the average number of bits needed to specify random variables $x$ drawn at random according to $p(x)$:

$$h(p) = E_p\left\{\lg\frac{1}{p(x)}\right\} \tag{D.30}$$

The term $\lg[1/p(x)]$ can be viewed as the number of bits which should be assigned to the value $x$. (The most common values of $x$ should be assigned the fewest bits, while rare values can be assigned many bits.)

### Example: Random Bit String

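Consider a random sequence of 1s and 0s, *i.e.*, the probability of a 0 or 1 is always $P(0) = P(1) = 1/2$. The corresponding probability density function is

$$p_b(x) = \frac{1}{2}\,\delta(x) + \frac{1}{2}\,\delta(x-1) \tag{D.31}$$

and the entropy is

$$h(p_b) = \frac{1}{2}\lg(2) + \frac{1}{2}\lg(2) = \lg(2) = 1 \tag{D.32}$$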

Thus, 1 bit is required for each bit of the sequence. In other words, the sequence cannot be compressed. There is no redundancy.

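If instead the probability of a 0 is 1/4 and that of a 1 is 3/4, we get

$$p_b(x) = \frac{1}{4}\,\delta(x) + \frac{3}{4}\,\delta(x-1), \qquad h(p_b) = \frac{1}{4}\lg(4) + \frac{3}{4}\lg\frac{4}{3} \approx 0.811$$

and the sequence can be compressed by about 19%.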

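In the degenerate case for which the probability of a 0 is 0 and that of a 1 is 1, we get

$$p_b(x) = \lim_{\epsilon\to 0}\left[\epsilon\,\delta(x) + (1-\epsilon)\,\delta(x-1)\right], \qquad h(p_b) = \lim_{\epsilon\to 0}\left[\epsilon\lg\frac{1}{\epsilon} + (1-\epsilon)\lg\frac{1}{1-\epsilon}\right] = 0$$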

Thus, the entropy is 0 when the sequence is perfectly predictable.
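The three bit-string cases above can be checked numerically. The following is a minimal sketch in Python; the helper name `entropy_bits` is an assumption introduced here for illustration, and it treats the two-valued density as a discrete distribution over the symbols 0 and 1.

```python
import math

def entropy_bits(probs):
    """Entropy in bits of a discrete distribution given as a list of probabilities."""
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 are skipped, since p * lg(1/p) -> 0 as p -> 0.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

fair = entropy_bits([0.5, 0.5])       # equally likely bits
biased = entropy_bits([0.25, 0.75])   # P(0) = 1/4, P(1) = 3/4
certain = entropy_bits([0.0, 1.0])    # perfectly predictable sequence

print(f"fair:    {fair:.5f} bits per symbol")
print(f"biased:  {biased:.5f} bits per symbol")
print(f"certain: {certain:.5f} bits per symbol")
print(f"compressibility of biased source: {100 * (1 - biased):.1f}%")
```

The last line reports one minus the entropy per symbol, which is the fraction by which an ideal coder could shorten the biased sequence on average.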

### Maximum Entropy Distributions

#### Uniform Distribution

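Among probability distributions $p(x)$ which are nonzero over a *finite* range of values $x \in [a,b]$, the maximum-entropy distribution is the *uniform* distribution. To show this, we must maximize the entropy,

$$h(p) \triangleq -\int_a^b p(x)\,\lg p(x)\,dx \tag{D.33}$$

with respect to $p(x)$, subject to the constraints

$$p(x) \ge 0, \qquad \int_a^b p(x)\,dx = 1.$$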

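Using the method of *Lagrange multipliers* for optimization in the presence of constraints [86], we may form the *objective function* (written with the natural logarithm, which only rescales the entropy by a constant factor)

$$J(p) \triangleq -\int_a^b p(x)\,\ln p(x)\,dx + \lambda_0\left(\int_a^b p(x)\,dx - 1\right) \tag{D.34}$$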

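and differentiate with respect to $p(x)$ (and renormalize by dropping the $dx$ factor multiplying all terms) to obtain

$$\frac{\partial}{\partial p(x)} J(p) = -\ln p(x) - 1 + \lambda_0 \tag{D.35}$$

Setting this to zero and solving for $p(x)$ gives

$$p(x) = e^{\lambda_0 - 1} \tag{D.36}$$

(Setting the partial derivative with respect to $\lambda_0$ to zero merely restates the constraint.)

Choosing $\lambda_0$ to satisfy the constraint gives $\lambda_0 = 1 - \ln(b-a)$, yielding

$$p(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\[4pt] 0, & \text{otherwise} \end{cases} \tag{D.37}$$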

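That this solution is a maximum rather than a minimum or inflection point can be verified by ensuring the sign of the second partial derivative is negative for all $x$:

$$\frac{\partial^2}{\partial p(x)^2} J(p) = -\frac{1}{p(x)} \tag{D.38}$$

Since the solution automatically satisfies $p(x) > 0$ on $[a,b]$, the second derivative is negative there, and the solution is a maximum.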
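As a numerical sanity check on the uniform case, one can compare the entropy of the uniform density on an interval with that of some other density supported on the same interval. The sketch below is illustrative only: the interval $[0,4]$, the symmetric triangular density used for comparison, and the function names are all assumptions chosen here, and the integral is estimated with a simple midpoint rule.

```python
import math

def differential_entropy_bits(pdf, a, b, n=100_000):
    """Midpoint-rule estimate of -integral of p(x) lg p(x) over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for k in range(n):
        p = pdf(a + (k + 0.5) * dx)
        if p > 0:  # p * lg(p) -> 0 as p -> 0, so zero-density points add nothing
            total -= p * math.log2(p) * dx
    return total

a, b = 0.0, 4.0  # hypothetical finite support

def uniform_pdf(x):
    return 1.0 / (b - a)

def triangular_pdf(x):
    # Symmetric triangle on [a, b], peak 2/(b-a) at the midpoint, unit area.
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    return max(0.0, (half - abs(x - mid)) / half**2)

h_uniform = differential_entropy_bits(uniform_pdf, a, b)
h_triangular = differential_entropy_bits(triangular_pdf, a, b)

print(f"uniform:    {h_uniform:.4f} bits (lg(b-a) = {math.log2(b - a):.4f})")
print(f"triangular: {h_triangular:.4f} bits")
```

A single comparison of course proves nothing; it only illustrates that a competing density on the same interval comes out with lower entropy than the uniform one.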

#### Exponential Distribution

Among probability distributions $p(x)$ which are nonzero over a *semi-infinite* range of values $x \in [0,\infty)$ and having a finite mean $\mu$, the *exponential* distribution has maximum entropy.

To the previous case, we add the new constraint

$$\int_0^\infty x\,p(x)\,dx = \mu < \infty \tag{D.39}$$

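resulting in the objective function

$$J(p) \triangleq -\int_0^\infty p(x)\,\ln p(x)\,dx + \lambda_0\left(\int_0^\infty p(x)\,dx - 1\right) + \lambda_1\left(\int_0^\infty x\,p(x)\,dx - \mu\right)$$

Now the partials with respect to $p(x)$ are

$$\begin{aligned} \frac{\partial}{\partial p(x)} J(p) &= -\ln p(x) - 1 + \lambda_0 + \lambda_1 x \\ \frac{\partial^2}{\partial p(x)^2} J(p) &= -\frac{1}{p(x)} \end{aligned}$$

and $p(x)$ is of the form $p(x) = e^{(\lambda_0 - 1) + \lambda_1 x}$. The unit-area and finite-mean constraints result in $e^{\lambda_0 - 1} = 1/\mu$ and $\lambda_1 = -1/\mu$, yielding

$$p(x) = \begin{cases} \dfrac{1}{\mu}\,e^{-x/\mu}, & x \ge 0 \\[4pt] 0, & \text{otherwise} \end{cases} \tag{D.40}$$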
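The exponential case can be illustrated the same way: fix a mean, then compare the entropy of the exponential density with other densities on $[0,\infty)$ having that mean. In this sketch the mean value, the two comparison densities (a half-normal and a uniform on $[0, 2\mu]$), the truncation point of the integral, and all function names are assumptions made for illustration. Entropies are in nats, matching the natural logarithm used in the objective function.

```python
import math

def integrate(f, a, b, n=100_000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx

def mean_and_entropy(pdf, upper):
    """Return (mean, entropy in nats) of a density on [0, upper]."""
    def x_p(x):
        return x * pdf(x)
    def neg_p_ln_p(x):
        p = pdf(x)
        return -p * math.log(p) if p > 0 else 0.0
    return integrate(x_p, 0.0, upper), integrate(neg_p_ln_p, 0.0, upper)

mu = 2.0            # hypothetical mean shared by all three densities
upper = 50.0 * mu   # truncation point; the neglected tails are negligible here
sigma = mu * math.sqrt(math.pi / 2.0)  # half-normal scale giving mean mu

def exponential_pdf(x):
    return math.exp(-x / mu) / mu

def half_normal_pdf(x):
    return math.sqrt(2.0 / math.pi) / sigma * math.exp(-x * x / (2.0 * sigma**2))

def uniform_pdf(x):
    return 1.0 / (2.0 * mu) if x <= 2.0 * mu else 0.0

m_exp, h_exp = mean_and_entropy(exponential_pdf, upper)
m_half, h_half = mean_and_entropy(half_normal_pdf, upper)
m_unif, h_unif = mean_and_entropy(uniform_pdf, upper)

for name, m, h in [("exponential", m_exp, h_exp),
                   ("half-normal", m_half, h_half),
                   ("uniform", m_unif, h_unif)]:
    print(f"{name:12s} mean = {m:.4f}  entropy = {h:.4f} nats")
```

For reference, the exponential density with mean $\mu$ has entropy $1 + \ln\mu$ nats, which the numerical estimate should approximate.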

#### Gaussian Distribution

The Gaussian distribution has maximum entropy relative to all probability distributions covering the entire real line $x \in (-\infty,\infty)$ but having a finite mean $\mu$ and *finite variance* $\sigma^2$.

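Proceeding as before, we obtain the objective function

$$\begin{aligned} J(p) \triangleq{} & -\int_{-\infty}^{\infty} p(x)\,\ln p(x)\,dx + \lambda_0\left(\int_{-\infty}^{\infty} p(x)\,dx - 1\right) \\ & + \lambda_1\left(\int_{-\infty}^{\infty} x\,p(x)\,dx - \mu\right) + \lambda_2\left(\int_{-\infty}^{\infty} (x-\mu)^2\,p(x)\,dx - \sigma^2\right) \end{aligned}$$

and partial derivatives

$$\begin{aligned} \frac{\partial}{\partial p(x)} J(p) &= -\ln p(x) - 1 + \lambda_0 + \lambda_1 x + \lambda_2 (x-\mu)^2 \\ \frac{\partial^2}{\partial p(x)^2} J(p) &= -\frac{1}{p(x)} \end{aligned}$$

leading to

$$p(x) = e^{(\lambda_0 - 1) + \lambda_1 x + \lambda_2 (x-\mu)^2} = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{D.41}$$

where the unit-area, mean, and variance constraints give $e^{\lambda_0 - 1} = 1/\sqrt{2\pi\sigma^2}$, $\lambda_1 = 0$, and $\lambda_2 = -1/(2\sigma^2)$.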
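The Gaussian case can likewise be illustrated by comparing densities on the real line that share the same mean and variance. This sketch assumes zero mean and unit variance, uses a Laplace density and a uniform density as comparison cases, and truncates the integral to a finite interval; these choices and all names are assumptions made here, not part of the derivation. Entropies are in nats.

```python
import math

def entropy_nats(pdf, lo, hi, n=160_000):
    """Midpoint-rule estimate of -integral of p(x) ln p(x) over [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for k in range(n):
        p = pdf(lo + (k + 0.5) * dx)
        if p > 0:  # skip points where the density is (numerically) zero
            total -= p * math.log(p) * dx
    return total

sigma = 1.0                    # hypothetical common standard deviation (mean 0)
b = sigma / math.sqrt(2.0)     # Laplace scale with variance 2*b**2 = sigma**2
w = sigma * math.sqrt(3.0)     # uniform half-width with variance w**2/3 = sigma**2

def gaussian_pdf(x):
    return math.exp(-x * x / (2.0 * sigma**2)) / math.sqrt(2.0 * math.pi * sigma**2)

def laplace_pdf(x):
    return math.exp(-abs(x) / b) / (2.0 * b)

def uniform_pdf(x):
    return 1.0 / (2.0 * w) if abs(x) <= w else 0.0

lo, hi = -40.0 * sigma, 40.0 * sigma  # truncation; neglected tails are negligible
h_gauss = entropy_nats(gaussian_pdf, lo, hi)
h_laplace = entropy_nats(laplace_pdf, lo, hi)
h_uniform = entropy_nats(uniform_pdf, lo, hi)

print(f"Gaussian: {h_gauss:.4f} nats")
print(f"Laplace:  {h_laplace:.4f} nats")
print(f"uniform:  {h_uniform:.4f} nats")
```

The Gaussian estimate can be compared against the closed form $\frac{1}{2}\ln(2\pi e\sigma^2)$; the two alternatives come out lower for the same variance.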

For more on entropy and maximum-entropy distributions, see [48].
