### Auditory Filter Banks

*Auditory filter banks* are non-uniform bandpass filter banks
designed to imitate the frequency resolution of human hearing
[307,180,87,208,255].
Classical auditory filter banks include constant-Q filter banks such
as the widely used *third-octave filter bank*.
Digital constant-Q filter banks have also been developed for audio
applications [29,30].
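
As a rough illustration of what "constant-Q" means in the third-octave case, the sketch below computes band edges for a few bands around a reference frequency and checks that the ratio of center frequency to bandwidth is the same in every band. The 1000 Hz reference, the exact base-2 spacing, and the function names are assumptions made for illustration; standardized third-octave banks use rounded nominal center frequencies, which this sketch does not reproduce.

```python
import math

def third_octave_bands(f_ref=1000.0, n_below=3, n_above=3):
    """Return (lower, center, upper) frequencies in Hz for third-octave bands.

    Bands are centered at f_ref * 2**(k/3); f_ref is an illustrative choice.
    """
    step = 2.0 ** (1.0 / 3.0)  # ratio between adjacent center frequencies
    half = 2.0 ** (1.0 / 6.0)  # ratio between a center frequency and its band edge
    bands = []
    for k in range(-n_below, n_above + 1):
        fc = f_ref * step ** k
        bands.append((fc / half, fc, fc * half))
    return bands

def quality_factor(band):
    """Q = center frequency divided by bandwidth."""
    lo, fc, hi = band
    return fc / (hi - lo)

bands = third_octave_bands()
qs = [quality_factor(b) for b in bands]
for (lo, fc, hi), q in zip(bands, qs):
    print(f"{lo:8.1f} {fc:8.1f} {hi:8.1f}  Q = {q:.3f}")
```

Every band reports the same Q (about 4.32), even though the bandwidth in Hz grows in proportion to the center frequency.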
More recently, constant-Q filter
banks for audio have been devised based on the *wavelet
transform*, including the auditory wavelet filter bank
[110]. Auditory filter banks have also been based
more directly on psychoacoustic measurements, leading to
approximations of the auditory filter frequency response in terms of a
Gaussian function [205], a "rounded exponential"
[207], and more recently the
*gammatone* (or "Patterson-Holdsworth") filter bank
[208,255].
The *gamma-chirp* filter bank further adds a level-dependent
asymmetric correction to the basic gammatone channel frequency
response, thus providing a more accurate approximation to the
auditory frequency response
[112,111].
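
To make the gammatone channel concrete, here is a minimal sketch of one channel's sampled impulse response, using the commonly quoted form t^(n-1) exp(-2 pi b t) cos(2 pi fc t). The order n = 4, the bandwidth rule b = 1.019 ERB(fc), and the Glasberg-Moore ERB approximation are assumptions taken from common practice rather than from the text above, and the function names are hypothetical. The level-dependent gammachirp correction is not modeled.

```python
import math

def erb_hz(fc):
    """Equivalent rectangular bandwidth in Hz (assumed Glasberg-Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.05, order=4, b_scale=1.019):
    """Sampled gammatone impulse response, normalized to unit peak magnitude.

    fc: center frequency (Hz), fs: sampling rate (Hz), duration: length in seconds.
    """
    b = b_scale * erb_hz(fc)          # envelope decay parameter in Hz
    n = int(round(duration * fs))     # number of samples
    h = []
    for i in range(n):
        t = i / fs
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        h.append(envelope * math.cos(2.0 * math.pi * fc * t))
    peak = max(abs(x) for x in h)
    return [x / peak for x in h]

h = gammatone_ir(1000.0, 16000.0)
print(len(h), erb_hz(1000.0))
```

A full filter bank would repeat this for a set of center frequencies and convolve the input with each response; the direct loop here favors readability over speed.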

The output power from an auditory filter bank at a particular time
defines the so-called *excitation pattern* versus frequency at
that time [87,179,305]. It may
be considered analogous to the average power of the physical
excitation applied to the *hair cells* of the inner ear by the
vibrating *basilar membrane* in the cochlea. The shape of the
excitation pattern can thus be thought of
as approximating the envelope of the basilar membrane vibration.
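
The idea of an excitation pattern can be sketched for the simple case of stationary sinusoidal components, where the output power of each channel is the component power weighted by that channel's power response. The example below uses a rounded-exponential weighting (1 + pg) exp(-pg), with g the normalized deviation from the center frequency and p = 4 fc / ERB(fc). The symmetric single-parameter roex form, the ERB approximation, the channel centers, and all function names are illustrative assumptions.

```python
import math

def erb_hz(fc):
    """Equivalent rectangular bandwidth in Hz (assumed Glasberg-Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def roex_weight(f, fc):
    """Power weighting of a symmetric roex(p) filter centered at fc, evaluated at f."""
    p = 4.0 * fc / erb_hz(fc)   # slope parameter chosen so the filter ERB matches erb_hz(fc)
    g = abs(f - fc) / fc        # normalized distance from the center frequency
    return (1.0 + p * g) * math.exp(-p * g)

def excitation_pattern(components, centers):
    """Output power of each channel for a list of (frequency_hz, power) components."""
    return [sum(power * roex_weight(f, fc) for f, power in components)
            for fc in centers]

centers = [250.0, 500.0, 1000.0, 2000.0, 4000.0]
pattern = excitation_pattern([(1000.0, 1.0)], centers)
for fc, e in zip(centers, pattern):
    print(f"{fc:7.1f} Hz  {e:.3e}")
```

For a single 1 kHz tone the pattern peaks at the 1 kHz channel and falls off on either side. A time-varying signal would need the power of each channel output measured over a short window instead.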

The excitation pattern produced from an auditory filter bank, together
with appropriate equalization (frequency-dependent gain) and
nonlinear compression, can be used to define *specific
loudness* as a function of time and frequency
[306,305,177,182,88].
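
The chain of equalization followed by compression can be sketched as a per-channel gain and a power law. The exponent of 0.23, the gain values, and the function name below are placeholders chosen for illustration, not values taken from the cited loudness models, which include further terms such as thresholds in quiet.

```python
def specific_loudness(excitation, gains, exponent=0.23):
    """Apply a per-channel power gain, then compressive power-law scaling.

    excitation: channel powers (linear units); gains: per-channel power gains.
    The exponent is an assumed, illustrative compression parameter.
    """
    if len(excitation) != len(gains):
        raise ValueError("excitation and gains must have the same length")
    return [(g * e) ** exponent for e, g in zip(excitation, gains)]

quiet = specific_loudness([1.0, 1.0], [1.0, 0.5])
loud = specific_loudness([10.0, 10.0], [1.0, 0.5])
print(quiet, loud)
```

Because the exponent is below one, a tenfold increase in excitation power raises the result by much less than a factor of ten, which is the compressive behavior the paragraph describes.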

Because the channels of an auditory filter bank are distributed
non-uniformly versus frequency, they can be regarded as a basis for a
*non-uniform sampling* of the frequency axis.
From this point of view, the auditory-filter frequency response becomes
the (frequency-dependent) *interpolation kernel* used to extract
a frequency sample at the filter's center frequency. See
§7.3.3 below for further details.
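
One way to picture the non-uniform sampling grid is to place channel center frequencies at equal steps on an auditory frequency scale. The sketch below assumes the Glasberg-Moore ERB-rate scale, 21.4 log10(4.37 f / 1000 + 1), and a hypothetical range of 100 Hz to 8 kHz with 16 channels; other scales (such as Bark) would give a different but similarly warped grid.

```python
import math

def hz_to_erb_rate(f):
    """Map frequency in Hz to ERB-rate units (assumed Glasberg-Moore scale)."""
    return 21.4 * math.log10(4.37 * f / 1000.0 + 1.0)

def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def erb_spaced_centers(f_lo, f_hi, n):
    """Return n center frequencies equally spaced on the ERB-rate scale (n >= 2)."""
    e_lo, e_hi = hz_to_erb_rate(f_lo), hz_to_erb_rate(f_hi)
    step = (e_hi - e_lo) / (n - 1)
    return [erb_rate_to_hz(e_lo + k * step) for k in range(n)]

centers = erb_spaced_centers(100.0, 8000.0, 16)
spacing = [b - a for a, b in zip(centers, centers[1:])]
print([round(fc) for fc in centers])
```

The samples are dense at low frequencies and sparse at high frequencies, so the spacing in Hz grows from one channel to the next.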
