## Maximum Entropy Property of the Gaussian Distribution

### Entropy of a Probability Distribution

The *entropy* of a probability density function (PDF) $p(x)$ is defined as [48]

$$
h(p) \triangleq \int_{-\infty}^{\infty} p(x)\,\lg\frac{1}{p(x)}\,dx
\qquad\text{(D.29)}
$$

where $\lg$ denotes the logarithm base 2. The entropy of $p$ can be interpreted as the average number of bits needed to specify random variables $x$ drawn at random according to $p$:

$$
h(p) = E_p\!\left\{\lg\frac{1}{p(x)}\right\}
\qquad\text{(D.30)}
$$


The term $\lg[1/p(x)]$ can be viewed as the number of bits that should be assigned to the value $x$. (The most common values of $x$ should be assigned the fewest bits, while rare values can be assigned many bits.)
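As a quick numerical illustration (my own sketch, not from the text), the discrete analogue of this definition can be computed directly; the helper name `entropy_bits` is an assumption:

```python
from math import log2

def entropy_bits(probs):
    """Entropy in bits: sum of p * lg(1/p) over the distribution.

    Illustrative helper (not from the text); zero-probability
    values contribute nothing to the sum.
    """
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

# A fair coin: each outcome gets lg(1/0.5) = 1 bit, so h = 1 bit.
print(entropy_bits([0.5, 0.5]))   # 1.0
# Four equally likely values need lg 4 = 2 bits each.
print(entropy_bits([0.25] * 4))   # 2.0
```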

### Example: Random Bit String

Consider a random sequence of 1s and 0s, *i.e.*, the probability of a 0 or 1 is always $1/2$. The corresponding probability density function is

$$
p(x) = \frac{1}{2}\,\delta(x) + \frac{1}{2}\,\delta(x-1)
\qquad\text{(D.31)}
$$

and the entropy is

$$
h(p) = \frac{1}{2}\lg 2 + \frac{1}{2}\lg 2 = 1 \text{ bit}
\qquad\text{(D.32)}
$$

Thus, 1 bit is required for each bit of the sequence. In other words, the sequence cannot be compressed; there is no redundancy. If instead the probability of a 0 is $1/4$ and that of a 1 is $3/4$, we get

$$
h(p) = \frac{1}{4}\lg 4 + \frac{3}{4}\lg\frac{4}{3} \approx 0.81 \text{ bits}
$$

so that, on average, each bit of the sequence carries only about 0.81 bits of information.
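The 0.81-bit figure is easy to verify numerically (a quick sketch, not from the original text):

```python
from math import log2

# Biased bit: P(0) = 1/4, P(1) = 3/4.
h = 0.25 * log2(1 / 0.25) + 0.75 * log2(1 / 0.75)
print(round(h, 2))  # 0.81
```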

### Maximum Entropy Distributions

#### Uniform Distribution

Among probability distributions $p(x)$ which are nonzero over a *finite* range of values $x \in [a, b]$, the maximum-entropy distribution is the *uniform* distribution. To show this, we must maximize the entropy,

$$
h(p) = -\int_a^b p(x)\,\lg p(x)\,dx
\qquad\text{(D.33)}
$$

with respect to $p(x)$, subject to the constraints

$$
p(x) \ge 0 \quad\text{and}\quad \int_a^b p(x)\,dx = 1
$$

Using the method of *Lagrange multipliers* for optimization in the presence of constraints [86], we may form the *objective function*

$$
J(p) \triangleq -\int_a^b p(x)\,\lg p(x)\,dx
+ \lambda\!\left[\int_a^b p(x)\,dx - 1\right]
\qquad\text{(D.34)}
$$

and differentiate with respect to $p(x)$ (renormalizing by dropping the factor $\lg e$ that multiplies all terms, absorbing it into $\lambda$) to obtain

$$
\frac{\partial J}{\partial p(x)} = -\ln p(x) - 1 + \lambda
\qquad\text{(D.35)}
$$

Setting this to zero and solving for $p(x)$ gives

$$
p(x) = e^{\lambda - 1}
\qquad\text{(D.36)}
$$

(Setting the partial derivative with respect to $\lambda$ to zero merely restates the unit-area constraint.) Since $p(x)$ is a constant, choosing $\lambda$ to satisfy the constraint gives $\lambda = 1 - \ln(b-a)$, yielding

$$
p(x) = \frac{1}{b-a}, \quad a \le x \le b
\qquad\text{(D.37)}
$$

That this solution is a maximum rather than a minimum or inflection point can be verified by checking that the second partial derivative is negative for all $x$:

$$
\frac{\partial^2 J}{\partial p(x)^2} = -\frac{1}{p(x)} < 0
\qquad\text{(D.38)}
$$

Since the solution spontaneously satisfied $p(x) > 0$, it is a maximum.
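This conclusion can be spot-checked numerically. The sketch below (my own, not from the text) compares the differential entropy of the uniform density on $[0, 1]$ with that of the triangular density $p(x) = 2x$ on the same interval, using a simple midpoint-rule integration; the helper name `diff_entropy_bits` is an assumption:

```python
from math import log2

def diff_entropy_bits(p, a, b, n=100000):
    """Differential entropy in bits of density p on [a, b] (midpoint rule)."""
    dx = (b - a) / n
    h = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        px = p(x)
        if px > 0:
            h += px * log2(1.0 / px) * dx
    return h

h_uniform = diff_entropy_bits(lambda x: 1.0, 0.0, 1.0)       # 0 bits (lg 1 = 0)
h_triangle = diff_entropy_bits(lambda x: 2.0 * x, 0.0, 1.0)  # about -0.28 bits
print(h_uniform > h_triangle)  # True
```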

#### Exponential Distribution

Among probability distributions $p(x)$ which are nonzero over a *semi-infinite* range of values $x \in [0, \infty)$ and having a finite mean $\mu$, the *exponential* distribution has maximum entropy. To the previous case, we add the new constraint

$$
\int_0^{\infty} x\,p(x)\,dx = \mu
\qquad\text{(D.39)}
$$

resulting in the objective function

$$
J(p) = -\int_0^{\infty} p(x)\,\lg p(x)\,dx
+ \lambda_0\!\left[\int_0^{\infty} p(x)\,dx - 1\right]
+ \lambda_1\!\left[\int_0^{\infty} x\,p(x)\,dx - \mu\right]
\qquad\text{(D.40)}
$$

Differentiating and setting to zero as before yields the exponential density $p(x) = (1/\mu)\,e^{-x/\mu}$ for $x \ge 0$.
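As a numerical sanity check (my own sketch, not from the text), compare the exponential density with mean 1 against another nonnegative density with the same mean, the uniform density on $[0, 2]$. The exponential's differential entropy is $\lg e \approx 1.44$ bits, the uniform's is $\lg 2 = 1$ bit; the helper name `diff_entropy_bits` is an assumption:

```python
from math import exp, log2

def diff_entropy_bits(p, a, b, n=200000):
    """Differential entropy in bits of density p on [a, b] (midpoint rule)."""
    dx = (b - a) / n
    h = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        px = p(x)
        if px > 0:
            h += px * log2(1.0 / px) * dx
    return h

# Exponential with mean 1, truncated at x = 40 (tail mass ~ e^-40, negligible).
h_exp = diff_entropy_bits(lambda x: exp(-x), 0.0, 40.0)  # about 1.44 bits
h_unif = diff_entropy_bits(lambda x: 0.5, 0.0, 2.0)      # exactly 1 bit
print(h_exp > h_unif)  # True
```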

#### Gaussian Distribution

The Gaussian distribution has maximum entropy relative to all probability distributions covering the entire real line $x \in (-\infty, \infty)$ but having a finite mean $\mu$ and *finite variance* $\sigma^2$. Proceeding as before, we obtain the objective function

$$
J(p) = -\int_{-\infty}^{\infty} p(x)\,\lg p(x)\,dx
+ \lambda_0\!\left[\int_{-\infty}^{\infty} p(x)\,dx - 1\right]
+ \lambda_1\!\left[\int_{-\infty}^{\infty} (x-\mu)^2\,p(x)\,dx - \sigma^2\right]
\qquad\text{(D.41)}
$$

whose solution is the Gaussian density $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x-\mu)^2/2\sigma^2}$.
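The claim admits a quick closed-form check (my own illustration, not from the text): for unit variance, the Gaussian's differential entropy $\frac{1}{2}\lg(2\pi e \sigma^2)$ exceeds that of a uniform density with the same variance (width $\sqrt{12}\,\sigma$, entropy $\lg(\sqrt{12}\,\sigma)$):

```python
from math import pi, e, log2, sqrt

sigma = 1.0
h_gauss = 0.5 * log2(2 * pi * e * sigma**2)  # about 2.05 bits
w = sqrt(12.0) * sigma                       # uniform width giving variance sigma^2
h_unif = log2(w)                             # about 1.79 bits
print(h_gauss > h_unif)  # True
```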

For more on entropy and maximum-entropy distributions, see [48].
