## Correlation Analysis

The*correlation operator*(defined in §7.2.5) plays a major role in statistical signal processing. For a proper development, see,

*e.g.*, [27,33,65]. This section introduces only some of the most basic elements of statistical signal processing in a simplified manner, with emphasis on illustrating applications of the DFT.

### Cross-Correlation

**Definition:**The

*circular cross-correlation*of two signals and in may be defined by

*DFT correlation operator*`' was first defined in §7.2.5. The term ``cross-correlation'' comes from

*statistics*, and what we have defined here is more properly called a ``sample cross-correlation.'' That is, is an

*estimator*

^{8.8}of the true cross-correlation which is an assumed statistical property of the signal itself. This definition of a sample cross-correlation is only valid for

*stationary*stochastic processes,

*e.g.*, ``steady noises'' that sound unchanged over time. The statistics of a stationary stochastic process are by definition

*time invariant*, thereby allowing

*time-averages*to be used for estimating statistics such as cross-correlations. For brevity below, we will typically

*not*include ``sample'' qualifier, because all computational methods discussed will be sample-based methods intended for use on stationary data segments. The DFT of the cross-correlation may be called the

*cross-spectral density*, or ``cross-power spectrum,'' or even simply ``cross-spectrum'':

### Unbiased Cross-Correlation

Recall that the cross-correlation operator is*cyclic*(circular) since is interpreted modulo . In practice, we are normally interested in estimating the

*acyclic*cross-correlation between two signals. For this (more realistic) case, we may define instead the

*unbiased cross-correlation*

*e.g.*, ) in order to have enough lagged products at the highest lag so that a reasonably accurate average is obtained. Note that the summation stops at to avoid cyclic wrap-around of modulo . The term ``unbiased'' refers to the fact that the expected value

^{8.9}[33] of is the true cross-correlation of and (assumed to be samples from stationary stochastic processes). An unbiased acyclic cross-correlation may be computed faster via DFT (FFT) methods using zero padding:

*time scale modification*,

*pitch shifting*,

*click removal*, and many others.

### Autocorrelation

The cross-correlation of a signal with itself gives its*autocorrelation*:

*real*and

*even*(symmetric about lag zero). The unbiased cross-correlation similarly reduces to an unbiased autocorrelation when :

The DFT of the true autocorrelation function is the (sampled)

*power spectral density*(PSD), or

*power spectrum*, and may be denoted

*estimate*of the PSD:

^{8.10}

*average power*(mean square) which we defined in §5.8:

*sample circular cross-covariance*as

*variance*of the signal :

### Matched Filtering

The cross-correlation function is used extensively in*pattern recognition*and

*signal detection*. We know from Chapter 5 that projecting one signal onto another is a means of measuring how much of the second signal is present in the first. This can be used to ``detect'' the presence of known signals as components of more complicated signals. As a simple example, suppose we record which we think consists of a signal that we are looking for plus some additive measurement noise . That is, we assume the signal model . Then the projection of onto is (recalling §5.9.9)

*matched filtering*. The impulse response of the ``matched filter'' for a real signal is given by .

^{8.11}By time-reversing , we transform the convolution implemented by filtering into a sliding cross-correlation operation between the input signal and the sought signal . (For complex known signals , the matched filter is .) We detect occurrences of in by detecting peaks in . In the same way that FFT convolution is faster than direct convolution (see Table 7.1), cross-correlation and matched filtering are generally carried out most efficiently using an FFT algorithm (Appendix A).

### FIR System Identification

Estimating an impulse response from input-output measurements is called*system identification*, and a large literature exists on this topic (

*e.g.*, [39]). Cross-correlation can be used to compute the impulse response of a filter from the cross-correlation of its input and output signals and , respectively. To see this, note that, by the correlation theorem,

*i.e.*, . A Matlab program illustrating these relationships is listed in Fig.8.13.

% sidex.m - Demonstration of the use of FFT cross- % correlation to compute the impulse response % of a filter given its input and output. % This is called "FIR system identification". Nx = 32; % input signal length Nh = 10; % filter length Ny = Nx+Nh-1; % max output signal length % FFT size to accommodate cross-correlation: Nfft = 2^nextpow2(Ny); % FFT wants power of 2 x = rand(1,Nx); % input signal = noise %x = 1:Nx; % input signal = ramp h = [1:Nh]; % the filter xzp = [x,zeros(1,Nfft-Nx)]; % zero-padded input yzp = filter(h,1,xzp); % apply the filter X = fft(xzp); % input spectrum Y = fft(yzp); % output spectrum Rxx = conj(X) .* X; % energy spectrum of x Rxy = conj(X) .* Y; % cross-energy spectrum Hxy = Rxy ./ Rxx; % should be the freq. response hxy = ifft(Hxy); % should be the imp. response hxy(1:Nh) % print estimated impulse response freqz(hxy,1,Nfft); % plot estimated freq response err = norm(hxy - [h,zeros(1,Nfft-Nh)])/norm(h); disp(sprintf(['Impulse Response Error = ',... '%0.14f%%'],100*err)); err = norm(Hxy-fft([h,zeros(1,Nfft-Nh)]))/norm(h); disp(sprintf(['Frequency Response Error = ',... '%0.14f%%'],100*err)); |

**Next Section:**

Power Spectral Density Estimation

**Previous Section:**

Filters and Convolution