Geometric Interpretation of Least Squares

Typically, the number of frequency constraints is much greater than the number of design variables (filter coefficients). In these cases, we have an overdetermined system of equations (more equations than unknowns). Therefore, we cannot generally satisfy all the equations, and are left with minimizing some error criterion to find the ``optimal compromise'' solution.

In the case of least-squares approximation, we are minimizing the Euclidean distance, which suggests the geometrical interpretation shown in Fig.4.19.

% latex2html id marker 14494\psfrag{Ax}{{\Large $\mathbf{A}{\underline{\hat{h}}}$}}\psfrag{b}{{\Large ${\underline{d}}$}}\psfrag{column}{{\Large column-space of $\mathbf{A}$}}\psfrag{space}{}\begin{figure}[htbp]
\caption{Geometric interpretation of orthogonal
projection of the vector ${\underline{d}}$\ onto the column-space of $\mathbf{A}$.}
Thus, the desired vector $ {\underline{d}}$ is the vector sum of its best least-squares approximation $ \mathbf{A}{\underline{\hat{h}}}$ plus an orthogonal error $ e$ :

$\displaystyle {\underline{d}}\eqsp \mathbf{A}{\underline{\hat{h}}}+ {\underline{e}}.$ (5.42)

In practice, the least-squares solution $ {\underline{\hat{h}}}$ can be found by minimizing the sum of squared errors:

$\displaystyle \hbox{Minimize}_{\underline{h}}\Vert{\underline{e}}\Vert _2 \eqsp \Vert{\underline{d}}-\mathbf{A}{\underline{h}}\Vert _2$ (5.43)

Figure 4.19 suggests that the error vector $ {\underline{d}}-\mathbf{A}{\underline{\hat{h}}}$ is orthogonal to the column space of the matrix $ \mathbf {A}$ , hence it must be orthogonal to each column in $ \mathbf {A}$ :

$\displaystyle \mathbf{A}^T({\underline{d}}-\mathbf{A}{\underline{\hat{h}}})\eqsp 0 \quad\Rightarrow\quad \mathbf{A}^T\mathbf{A}{\underline{\hat{h}}}\eqsp \mathbf{A}^T{\underline{d}}$ (5.44)

This is how the orthogonality principle can be used to derive the fact that the best least squares solution is given by

$\displaystyle {\underline{\hat{h}}}\eqsp (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T {\underline{d}}\eqsp \mathbf{A}^\dagger {\underline{d}}$ (5.45)

In matlab, it is numerically superior to use ``h= A $ \backslash$ h'' as opposed to explicitly computing the pseudo-inverse as in ``h = pinv(A) * d''. For a discussion of numerical issues in matrix least-squares problems, see, e.g., [92].

We will return to least-squares optimality in §5.7.1 for the purpose of estimating the parameters of sinusoidal peaks in spectra.

Next Section:
Matlab Support for Least-Squares FIR Filter Design
Previous Section:
Filter Design using Lp Norms