## Random Variables & Stochastic Processes

For a full treatment of random variables and stochastic processes
(sequences of random variables), see, *e.g.*, [201]. For
practical everyday signal analysis, the simplified definitions and
examples below will suffice.

### Probability Distribution

**Definition:**
A *probability distribution* $p(x)$ may be defined as a
non-negative real function of all possible outcomes $x$ of some random
event. The sum of the probabilities of all possible outcomes is
defined as 1, and probabilities can never be negative.

**Example:**
A *coin toss* has two outcomes, ``heads'' (H) or ``tails'' (T),
which are equally likely if the coin is ``fair''. In this case, the
probability distribution is

$$p(\mathrm{H}) = p(\mathrm{T}) = \frac{1}{2} \tag{C.1}$$

where $p(x)$ denotes the *probability* of outcome $x$. That is, the total ``probability mass'' is divided equally between the two possible outcomes heads or tails. This is an example of a *discrete* probability distribution because all probability is assigned to two discrete points, as opposed to some continuum of possibilities.

### Independent Events

Two probabilistic events $A$ and $B$ are said to be
*independent* if the probability of $A$ and $B$
occurring together equals the
*product* of the probabilities of $A$ and $B$
individually, *i.e.*,

$$p(AB) = p(A)\,p(B) \tag{C.2}$$

where $p(AB)$ denotes the probability of $A$ and $B$ occurring together.

**Example:**
Successive *coin tosses* are normally independent.
Therefore, the probability of getting heads twice in a row is
given by

$$p(\mathrm{HH}) = p(\mathrm{H})\,p(\mathrm{H}) = \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{4} \tag{C.3}$$
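The product rule for independent events can be checked by direct enumeration. The following is a minimal sketch, assuming a fair coin and exact rational arithmetic; the names `p`, `joint`, and `p_hh` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Probability distribution of a single fair coin toss (assumed fair).
p = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Joint distribution of two independent tosses: p(AB) = p(A) p(B).
joint = {a + b: p[a] * p[b] for a, b in product(p, repeat=2)}

p_hh = joint["HH"]
print(p_hh)                 # prints 1/4 (heads twice in a row)
print(sum(joint.values()))  # prints 1 (total probability mass)
```

Because the four joint outcomes HH, HT, TH, and TT are built from the product rule, their probabilities again sum to 1.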

### Random Variable

**Definition:**
A *random variable* $v$
is defined as a real- or complex-valued
function of some random event, and is fully characterized by its
probability distribution.

**Example:**
A random variable can be defined based on a coin toss by defining
numerical values for heads and tails. For example, we may assign 0 to
tails and 1 to heads. The probability distribution for this random
variable is then

$$p(x) = \begin{cases} \frac{1}{2}, & x = 0 \\ \frac{1}{2}, & x = 1 \\ 0, & \text{otherwise} \end{cases} \tag{C.4}$$

**Example:**
A *die* can be used to generate integer-valued random variables
between 1 and 6. Rolling the die provides an underlying random event.
The probability distribution of a fair die is the
*discrete uniform distribution* between 1 and 6. *I.e.*,

$$p(x) = \begin{cases} \frac{1}{6}, & x = 1, 2, \ldots, 6 \\ 0, & \text{otherwise} \end{cases} \tag{C.5}$$

**Example:**
A *pair of dice* can be used to generate integer-valued random
variables between 2 and 12 (the sum of the two faces). Rolling the dice provides an underlying
random event. The probability distribution of two fair dice is given by

$$p(x) = \begin{cases} \frac{6 - |x - 7|}{36}, & x = 2, 3, \ldots, 12 \\ 0, & \text{otherwise} \end{cases} \tag{C.6}$$

This may be called a discrete *triangular* distribution. It can be shown to be given by the *convolution* of the discrete uniform distribution for one die with itself. This is a general fact for sums of independent random variables (the distribution of the sum equals the convolution of the component distributions).
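The convolution property can be verified numerically. The sketch below assumes two fair, independent dice and uses exact fractions; `convolve` is a hypothetical helper written for this illustration, and the closed form $(6-|x-7|)/36$ is used here as the triangular shape to compare against.

```python
from fractions import Fraction

def convolve(p, q):
    """Distribution of the sum of two independent discrete random variables,
    given as dicts mapping outcome -> probability."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, Fraction(0)) + px * qy
    return out

# Discrete uniform distribution of one fair die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# Distribution of the sum of two dice, by convolution.
two_dice = convolve(die, die)

# Triangular closed form on 2..12 (assumed for comparison).
triangular = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

print(two_dice == triangular)  # prints True
print(two_dice[7])             # prints 1/6 (the most likely sum)
```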

**Example:**
Consider a random experiment in which a sewing needle is dropped onto
the ground from a high altitude. For each such event, the angle of
the needle with respect to north is measured. A reasonable model for
the distribution of angles (neglecting the earth's magnetic field) is
the *continuous uniform distribution* on $[0, 2\pi)$, *i.e.*, for
any real numbers $a$ and $b$ in the interval $[0, 2\pi)$, with
$a \le b$, the probability of the needle angle falling within the interval $[a, b]$
is

$$\frac{1}{2\pi}\int_a^b d\theta = \frac{b - a}{2\pi} \tag{C.7}$$

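Note, however, that the probability of any *single* angle $\theta$ is zero. This is our first example of a *continuous probability distribution*. Therefore, we cannot simply define the probability of outcome $\theta$ for each $\theta \in [0, 2\pi)$. Instead, we must define the *probability density function* (PDF):

$$p_\theta(x) = \begin{cases} \frac{1}{2\pi}, & 0 \le x < 2\pi \\ 0, & \text{otherwise} \end{cases} \tag{C.8}$$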

To calculate a probability, the PDF must be *integrated* over one or more *intervals*. As follows from Lebesgue integration theory (``measure theory''), the probability of any countably infinite set of discrete points is zero when the PDF is finite. This is because such a set of points is a ``set of measure zero'' under integration. Note that we write $p(x)$ for discrete probability distributions and $p_v(x)$ (subscripted by the random variable $v$) for PDFs. A discrete probability distribution such as that in (C.4) can be written as

$$p_v(x) = \frac{1}{2}\delta(x) + \frac{1}{2}\delta(x - 1) \tag{C.9}$$

where $\delta(x)$ denotes an *impulse*.^{C.1}
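As a small illustration of integrating a PDF over an interval, the sketch below evaluates the needle-angle probability for the continuous uniform model on $[0, 2\pi)$. The function name `needle_probability` is hypothetical, and the closed form $(b-a)/(2\pi)$ follows from integrating the constant density $1/(2\pi)$.

```python
import math

def needle_probability(a, b):
    """Probability that the needle angle lies in [a, b], with
    0 <= a <= b < 2*pi, from integrating the constant PDF 1/(2*pi)."""
    if not (0.0 <= a <= b < 2.0 * math.pi):
        raise ValueError("need 0 <= a <= b < 2*pi")
    return (b - a) / (2.0 * math.pi)

quarter = needle_probability(0.0, math.pi / 2)  # one quadrant
single = needle_probability(1.0, 1.0)           # a single angle

print(quarter)  # one quarter of the probability mass
print(single)   # a zero-width interval carries zero probability
```

Shrinking the interval toward a single point drives the probability to zero, which is why a density, rather than a per-outcome probability, is needed in the continuous case.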

### Stochastic Process

(Again, for a more complete treatment, see [201] or the like.)

**Definition:**
A *stochastic process* $v(n)$
is defined as a sequence of random
variables $v(n)$, $n = 0, 1, 2, \ldots\,$.

A stochastic process may also be called a *random process*,
*noise process*, or simply *signal* (when the context
is understood to exclude deterministic components).

### Stationary Stochastic Process

**Definition:**
We define a *stationary* stochastic process
$v(n)$, $n = 0, 1, 2, \ldots\,$,
as a stochastic process consisting of
*identically distributed* random variables $v(n)$. In
particular, all statistical measures are *time-invariant*.

When a stochastic process is stationary, we may measure statistical
features by *averaging over time*. Examples below include the
sample mean and sample variance.

### Expected Value

**Definition:**
The *expected value* of a continuous random variable $v$
is denoted $E\{v\}$
and is defined by

$$E\{v\} \triangleq \int_{-\infty}^{\infty} x\, p_v(x)\, dx \tag{C.12}$$

where $p_v(x)$ denotes the *probability density function* (PDF) for the random variable $v$.

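**Example:**
Let the random variable $v$
be uniformly distributed between $a$ and $b$, *i.e.*,

$$p_v(x) = \begin{cases} \frac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases} \tag{C.13}$$

Then the expected value of $v$ is computed as

$$E\{v\} = \int_a^b \frac{x}{b - a}\, dx = \frac{1}{b - a}\left[\frac{x^2}{2}\right]_a^b = \frac{b^2 - a^2}{2(b - a)} = \frac{a + b}{2} \tag{C.14}$$

Thus, the expected value of a random variable uniformly distributed between $a$ and $b$ is simply the average of $a$ and $b$.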
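The uniform-distribution result can be checked by numerical integration. This is a rough sketch, assuming a midpoint-rule approximation of the expected-value integral; the helper `expected_value` and the endpoints `a = 2`, `b = 5` are arbitrary choices for illustration.

```python
def expected_value(pdf, lo, hi, steps=10_000):
    """Approximate the integral of x * pdf(x) over [lo, hi]
    using the midpoint rule with the given number of steps."""
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx  # midpoint of the k-th subinterval
        total += x * pdf(x) * dx
    return total

a, b = 2.0, 5.0  # arbitrary example endpoints

def uniform_pdf(x):
    """PDF of a random variable uniformly distributed between a and b."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

ev = expected_value(uniform_pdf, a, b)
print(ev, (a + b) / 2)  # numerical estimate vs. the average of a and b
```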

For a stochastic process, which is simply a sequence of random
variables, $E\{v(n)\}$ means the expected value of $v(n)$ over
``all realizations'' of the random process $v(\cdot)$. This is also
called an *ensemble average*. In other words, for each ``roll of
the dice,'' we obtain an entire signal $v(n)$, $n = 0, 1, 2, \ldots\,$, and to compute
$E\{v(5)\}$, say, we average
together all of the values of $v(5)$
obtained for all ``dice rolls.''

For a stationary random process $v(n)$, the random variables $v(n)$
which make it up
are identically distributed. As a result, we may normally compute
expected values by *averaging over time* within a *single
realization* of the random process, instead of having to average
``vertically'' at a single time instant over many realizations of the
random process.^{C.2} Denote time averaging by

$$\{v(n)\}_n \triangleq \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} v(n) \tag{C.15}$$

Then, for a stationary random process, we have $E\{v(n)\} = \{v(n)\}_n$. That is, for *stationary* random signals, ensemble averages equal time averages.
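The equivalence of ensemble and time averages can be illustrated with a simulation. The sketch below assumes a particular stationary process, independent samples uniformly distributed on $[0, 1)$ with true mean 0.5, and a fixed seed for repeatability; the helper `realization` and the sample counts are illustrative choices.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def realization(length):
    """One realization of the assumed stationary noise process:
    independent samples, each uniform on [0, 1)."""
    return [rng.random() for _ in range(length)]

# Ensemble average: fix the time index n = 5 and average
# "vertically" across many independent realizations.
num_realizations = 20_000
ensemble_avg = sum(realization(6)[5] for _ in range(num_realizations)) / num_realizations

# Time average: average along a single long realization.
single = realization(20_000)
time_avg = sum(single) / len(single)

print(ensemble_avg, time_avg)  # both should be close to 0.5
```

With finitely many samples the two averages agree only approximately; the match improves as the number of realizations and the realization length grow.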

We are concerned only with stationary stochastic processes in this
book. While the statistics of noise-like signals must be allowed
to evolve over time in high-quality spectral models, we may require
essentially time-invariant statistics within a single *frame* of
data in the time domain. In practice, we choose our spectrum analysis
window short enough to impose this. For audio work, 20 ms is a
typical choice for a frequency-independent frame length.^{C.3} In a multiresolution system, in which the frame length
can vary across frequency bands, several periods of the band
center-frequency is a reasonable choice. As discussed in
§5.5.2, the minimum number of periods required under
the window for resolution of spectral peaks depends on the window type
used.

### Mean

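**Definition:**
The *mean* of a stochastic process $v(n)$
at time $n$
is defined as
the expected value of $v(n)$:

$$\mu_v(n) \triangleq E\{v(n)\} = \int_{-\infty}^{\infty} x\, p_{v(n)}(x)\, dx \tag{C.16}$$

where $p_{v(n)}(x)$ is the probability density function for the random variable $v(n)$.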

For a *stationary stochastic process* $v(n)$, the mean is given by
the expected value of $v(n)$ for any $n$. *I.e.*,
$\mu_v(n) = \mu_v$ for all $n$.

### Sample Mean

**Definition:**
The *sample mean* of a set of $N$
samples from a particular
realization of a *stationary stochastic process* $v(n)$
is defined
as the *average* of those samples:

$$\hat{\mu}_v \triangleq \frac{1}{N} \sum_{n=0}^{N-1} v(n) \tag{C.17}$$

For a *stationary stochastic process*, the sample mean is an *unbiased estimator* of the mean, *i.e.*,

$$E\{\hat{\mu}_v\} = \mu_v \tag{C.18}$$
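Unbiasedness of the sample mean can be confirmed exactly for a small case. The sketch below assumes the samples are independent rolls of a fair die and enumerates every equally likely sequence of $N = 3$ rolls; both the process and the choice of $N$ are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # outcomes of a fair die (assumed process)
N = 3                # number of samples per realization

true_mean = Fraction(sum(faces), 6)

# Sample mean of every equally likely length-N sequence of rolls.
sample_means = [Fraction(sum(s), N) for s in product(faces, repeat=N)]

# Expected value of the sample mean: average over all sequences.
expected_sample_mean = sum(sample_means) / len(sample_means)

print(expected_sample_mean, true_mean)  # prints 7/2 7/2
```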

### Variance

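**Definition:**
The *variance* or *second central moment* of a stochastic
process $v(n)$
at time $n$
is defined as the expected value of
$|v(n) - \mu_v(n)|^2$:

$$\sigma_v^2(n) \triangleq E\{|v(n) - \mu_v(n)|^2\} = \int_{-\infty}^{\infty} |x - \mu_v(n)|^2\, p_{v(n)}(x)\, dx \tag{C.19}$$

where $p_{v(n)}(x)$ is the probability density function for the random variable $v(n)$.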

For a *stationary stochastic process* $v(n)$, the variance is given
by the expected value of
$|v(n) - \mu_v|^2$ for any $n$.

### Sample Variance

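**Definition:**
The *sample variance* of a set of $N$
samples from a particular
realization of a *stationary stochastic process* $v(n)$
is defined
as the *average squared magnitude* after removing the *known mean* $\mu_v$:

$$\hat{\sigma}_v^2 \triangleq \frac{1}{N} \sum_{n=0}^{N-1} |v(n) - \mu_v|^2 \tag{C.20}$$

The sample variance is an *unbiased estimator* of the true variance $\sigma_v^2$ when the *mean is known*, *i.e.*,

$$E\{\hat{\sigma}_v^2\} = \sigma_v^2 \tag{C.21}$$

This is easy to show by taking the expected value:

$$E\{\hat{\sigma}_v^2\} = \frac{1}{N} \sum_{n=0}^{N-1} E\{|v(n) - \mu_v|^2\} = \frac{1}{N} \sum_{n=0}^{N-1} \sigma_v^2 = \sigma_v^2 \tag{C.22}$$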

When the mean is *unknown*, the sample mean $\hat{\mu}_v$ (the average of the $N$ samples) is used in its place:

$$\hat{\sigma}_v^2 \triangleq \frac{1}{N-1} \sum_{n=0}^{N-1} |v(n) - \hat{\mu}_v|^2 \tag{C.23}$$

The normalization by $N-1$ instead of $N$ is necessary to make the sample variance an *unbiased* estimator of the true variance. This adjustment is necessary because the sample mean $\hat{\mu}_v$ is *correlated* with the term $v(n)$ in the sample variance expression. This is revealed by replacing $\mu_v$ with $\hat{\mu}_v$ in the calculation of (C.22).
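The effect of the normalization can be seen exactly in a small case. The sketch below assumes independent rolls of a fair die with $N = 2$ samples and enumerates all equally likely pairs; it compares the expected value of three estimators: known mean with $1/N$, sample mean with $1/N$, and sample mean with $1/(N-1)$. The helper names `expected` and `sum_sq_dev` are hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # outcomes of a fair die (assumed process)
N = 2                # number of samples per realization

mu = Fraction(sum(faces), 6)                      # true mean, 7/2
true_var = sum((x - mu) ** 2 for x in faces) / 6  # true variance, 35/12

def expected(estimator):
    """Average an estimator over all equally likely length-N sequences."""
    values = [estimator(s) for s in product(faces, repeat=N)]
    return sum(values) / len(values)

def sum_sq_dev(s):
    """Sum of squared deviations from the sample mean of s."""
    m = Fraction(sum(s), N)
    return sum((x - m) ** 2 for x in s)

known_mean = expected(lambda s: sum((x - mu) ** 2 for x in s) / N)
biased = expected(lambda s: sum_sq_dev(s) / N)
unbiased = expected(lambda s: sum_sq_dev(s) / (N - 1))

print(true_var, known_mean, biased, unbiased)  # prints 35/12 35/12 35/24 35/12
```

In this example, dividing by $N$ with the sample mean in place of the true mean underestimates the variance by the factor $(N-1)/N = 1/2$, and dividing by $N-1$ removes the bias.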
