The
correlation operator (defined in §
7.2.5) plays
a major role in
statistical signal processing.
For a proper development, see,
e.g.,
[
27,
33,
65]. This section introduces
only some of the most basic elements of statistical
signal processing
in a simplified manner, with emphasis on illustrating applications of
the
DFT.

Definition: The
circular cross-correlation of two
signals 
and

in

may be defined by
(Note that the ``lag''

is an integer variable, not the constant

.) The
DFT correlation operator `

' was first defined in
§
7.2.5.
The term ``cross-correlation'' comes from
statistics, and what we have defined here is more properly
called a ``sample cross-correlation.''
That is,

is an
estimator8.8 of the true
cross-correlation

which is an assumed statistical property
of the signal itself. This definition of a sample cross-correlation is only valid for
stationary stochastic processes,
e.g., ``steady
noises'' that
sound unchanged over time. The statistics of a stationary stochastic
process are by definition
time invariant, thereby allowing
time-averages to be used for estimating statistics such
as cross-correlations. For brevity below, we will typically
not include ``sample'' qualifier, because all computational
methods discussed will be sample-based methods intended for use on
stationary data segments.
The DFT of the cross-correlation may be called the
cross-spectral
density, or ``cross-power
spectrum,'' or even simply ``cross-spectrum'':
The last equality above follows from the
correlation theorem
(§
7.4.7).
Recall that the cross-
correlation operator is
cyclic (circular)
since

is interpreted modulo

. In practice, we are normally
interested in estimating the
acyclic cross-correlation
between two
signals. For this (more realistic) case, we may define
instead the
unbiased cross-correlation
where we choose

(
e.g.,

) in order to have
enough lagged products

at the highest lag

so that a reasonably accurate average is obtained. Note that the
summation stops at

to avoid cyclic wrap-around of

modulo

. The term ``unbiased'' refers to the fact that the expected
value
8.9[
33] of

is the true cross-correlation

of

and

(assumed to be samples from stationary stochastic
processes).
An unbiased acyclic cross-correlation may be computed faster via
DFT
(
FFT) methods using
zero padding:
where
Note that

and

belong to

while

and

belong to

. The zero-padding may be
causal (as defined in
§
7.2.8)
because the signals are assumed to be be stationary, in which case all
signal statistics are time-invariant. As usual when embedding acyclic
correlation (or
convolution) within the cyclic variant given by the
DFT, sufficient zero-padding is provided so that only zeros are ``time
aliased'' (wrapped around in time) by
modulo indexing.
Cross-correlation is used extensively in audio signal processing for
applications such as
time scale modification,
pitch
shifting,
click removal, and many others.
Autocorrelation
The
cross-correlation of a
signal with itself gives its
autocorrelation:
The autocorrelation function is Hermitian:
When

is real, its autocorrelation is
real and
even
(symmetric about lag zero).
The
unbiased cross-correlation similarly reduces to an unbiased
autocorrelation when

:
 |
(8.2) |
The
DFT of the true autocorrelation function

is the (sampled)
power spectral density (
PSD), or
power spectrum, and may
be denoted
The complete (not sampled) PSD is

, where the
DTFT is defined in Appendix
B (it's just an
infinitely long DFT). The DFT of

thus provides a sample-based
estimate of the PSD:
8.10
We could call

a ``sampled sample power spectral
density''.
At lag zero, the autocorrelation function reduces to the
average
power (mean square) which we defined in §
5.8:
Replacing ``
correlation'' with ``covariance'' in the above definitions
gives corresponding zero-mean versions. For example, we may define
the
sample circular cross-covariance as
where

and

denote the means of

and

,
respectively. We also have that

equals the sample
variance of the signal

:
The
cross-correlation function is used extensively in
pattern
recognition and
signal detection. We know from Chapter
5
that projecting one signal onto another is a means of measuring how
much of the second signal is present in the first. This can be used
to ``detect'' the presence of known signals as components of more
complicated signals. As a simple example, suppose we record

which we think consists of a signal

that we are looking for
plus some additive measurement
noise 
. That is, we assume the
signal model

. Then the projection of

onto

is
(recalling §
5.9.9)
since the projection of random, zero-mean
noise 
onto

is small
with probability one. Another term for this process is
matched filtering. The
impulse response of the ``matched
filter'' for a real signal

is given by

.
8.11 By time-reversing

, we transform the
convolution implemented by filtering into a
sliding cross-
correlation operation between the input signal

and
the sought signal

. (For complex known signals

, the matched
filter is

.) We detect occurrences of

in

by
detecting peaks in

.
In the same way that
FFT convolution is faster than direct convolution
(see Table
7.1), cross-correlation and matched filtering are
generally carried out most efficiently using an FFT algorithm (Appendix
A).
Estimating an
impulse response from input-output measurements is
called
system identification, and a large literature exists on
this topic (
e.g., [
39]).
Cross-correlation can be used to compute the
impulse response

of a
filter from the cross-
correlation of its input and output
signals

and

, respectively. To see this, note that, by
the
correlation theorem,
Therefore, the
frequency response equals the input-output
cross-spectrum divided by the input power
spectrum:
where multiplication and division of
spectra are defined pointwise,
i.e.,

.
A
Matlab program illustrating these relationships is listed in
Fig.
8.13.
Figure 8.13:
FIR system identification example in matlab.
% sidex.m - Demonstration of the use of FFT cross-
% correlation to compute the impulse response
% of a filter given its input and output.
% This is called "FIR system identification".
Nx = 32; % input signal length
Nh = 10; % filter length
Ny = Nx+Nh-1; % max output signal length
% FFT size to accommodate cross-correlation:
Nfft = 2^nextpow2(Ny); % FFT wants power of 2
x = rand(1,Nx); % input signal = noise
%x = 1:Nx; % input signal = ramp
h = [1:Nh]; % the filter
xzp = [x,zeros(1,Nfft-Nx)]; % zero-padded input
yzp = filter(h,1,xzp); % apply the filter
X = fft(xzp); % input spectrum
Y = fft(yzp); % output spectrum
Rxx = conj(X) .* X; % energy spectrum of x
Rxy = conj(X) .* Y; % cross-energy spectrum
Hxy = Rxy ./ Rxx; % should be the freq. response
hxy = ifft(Hxy); % should be the imp. response
hxy(1:Nh) % print estimated impulse response
freqz(hxy,1,Nfft); % plot estimated freq response
err = norm(hxy - [h,zeros(1,Nfft-Nh)])/norm(h);
disp(sprintf(['Impulse Response Error = ',...
'%0.14f%%'],100*err));
err = norm(Hxy-fft([h,zeros(1,Nfft-Nh)]))/norm(h);
disp(sprintf(['Frequency Response Error = ',...
'%0.14f%%'],100*err));
|
Next Section: Power Spectral Density EstimationPrevious Section: Filters and Convolution