Bias of Parabolic Peak Interpolation

Since the true window transform is not a parabola (except for the conceptual case of a Gaussian window transform expressed in dB), there is generally some error in the interpolated peak due to this mismatch. Such a systematic error in an estimated quantity (due to modeling error, not noise), is often called a bias. Parabolic interpolation is unbiased when the peak occurs at a spectral sample (FFT bin frequency), and also when the peak is exactly half-way between spectral samples (due to symmetry of the window transform about its midpoint). For other peak frequencies, quadratic interpolation yields a biased estimate of both peak frequency and peak amplitude. (Phase is essentially unbiased [1].)

Since zero-padding in the time domain gives ideal interpolation in the frequency domain, there is no bias introduced by this type of interpolation. Thus, if enough zero-padding is used so that a spectral sample appears at the peak frequency, simply finding the largest-magnitude spectral sample will give an unbiased peak-frequency estimator. (We will learn in §4.8.2 that this is also the maximum likelihood estimator for the frequency of a sinusoid in additive white Gaussian noise.)

While we could choose our zero-padding factor large enough to yield any desired degree of accuracy in peak frequency measurements, it is more efficient in practice to combine zero-padding with parabolic interpolation (or some other simple, low-order interpolator). In such hybrid schemes, the zero-padding is simply chosen large enough so that the bias due to parabolic interpolation is negligible. In §4.8 below, the Quadratically Interpolated FFT (QIFFT) method is described as one such hybrid scheme.

Minimum Zero-Padding for High-Frequency Peaks

Table: Minimum zero-padding factors $ L_{\hbox {min}}=N_{\hbox {min}}/M$ for keeping peak-frequency bias below approximately $ \Delta $ percent of the sampling-rate divided by the window length [1]. This table is overly conservative for peak-frequencies below 1 kHz. Here, $ N_{\hbox {min}}$ denotes the minimum FFT length, and $ M$ denotes the window length. The zero-padding therefore consists of $ N_{\hbox {min}}-M$ zeros. $ L_{\hbox {min}}$ is calculated using the formulas in [1] and rounded to two significant digits.
Window Type $ \mathbf{\Delta}$ (%) $ \mathbf{L_{\hbox{min}}}$
Rectangular $ 1$ 2.1
Gen. Hamming $ 1$ 1.2
Blackman $ 1$ $ 1.0$
Rectangular $ 0.1$ 4.1
Gen. Hamming $ 0.1$ 2.4
Blackman $ 0.1$ $ 1.8$

Table 4.3 gives zero-padding factors sufficient for keeping the bias below $ 0.01\cdot\Delta\cdot f_s/M$ Hz, where $ f_s$ denotes the sampling rate in Hz, and $ M$ is the window length in samples. For fundamental frequency estimation, $ \Delta $ can be interpreted as the relative frequency error `` $ \Delta f/f$'' when the window length is one period. In this case, $ f_s/M$ is the fundamental frequency in Hz. More generally, $ f_s/M$ is the bandwidth of each side-lobe in the DTFT of a length $ M$ rectangular, generalized Hamming, or Blackman window (any member of the Blackman-Harris window family, as elaborated in Chapter 3).

Note from Table 4.3 that the Blackman window requires no zero-padding at all when only $ 1$% accuracy is required in peak-frequency measurement. It should also be understood that a frequency error of $ 0.1$% is inaudible in most audio applications.5.13

Minimum Zero-Padding for Low-Frequency Peaks

Sharper bounds on the zero-padding factor needed for low-frequency peaks (below roughly 1 kHZ) may be obtained based on the measured Just-Noticeable-Difference (JND) in frequency and/or amplitude [259]. In particular, a $ 0.1$% relative-error spec is good above 1 kHz (being conservative by approximately a factor of 2), but overly conservative at lower frequencies where the JND flattens out. Below 1 kHz, a fixed 1 Hz spec satisfies perceptual requirements and gives smaller minimum zero-padding factors than the $ 0.1$% relative-error spec.

The following data, extracted from [259, Table I, p. 89] gives frequency JNDs at a presentation level of 60 dB SPL (the most sensitive case measured):

  f =    [    62,    125,    250,    500,   1000,   2000,   4000];
  dfof = [0.0346, 0.0269, 0.0098, 0.0035, 0.0034, 0.0018, 0.0020];
Thus, the frequency JND at 4 kHz was measured to be two tenths of a percent. (These measurements were made by averaging experimental results for five men between the ages of 20 and 30.) Converting relative frequency to absolute frequency in Hz yields (in matlab syntax):
  df = dfof .* f; % = [2.15, 3.36, 2.45, 1.75, 3.40, 3.60, 8.00];
For purposes of computing the minimum zero-padding factor required, we see that the absolute tuning error due to bias can be limited to 1 Hz, based on measurements at 500 Hz (at 60 dB). Doing this for frequencies below 1 kHz yields the results shown in Table 4.4. Note that the Blackman window needs no zero padding below 125 Hz, and the Hamming/Hann window requires no zero padding below 62.5 Hz.

Table: Minimum zero-padding factors $ L_{\hbox {min}}=N_{\hbox {min}}/M$ for keeping peak-frequency bias below approximately 1 Hz (well under 1.75 Hz), assuming the window length $ M$ to span one period of the fundamental frequency.
Window Type $ f$ (Hz) $ \mathbf{L_{\hbox{min}}}$
Rectangular 1000 4.1
  500 3.3
  250 2.6
  125 2.1
  62.5 1.7
Gen. Hamming 1000 2.4
  500 1.9
  250 1.5
  125 1.2
  62.5 1
Blackman 1000 1.8
  500 1.5
  250 1.2
  125 1
  62.5 1

Matlab for Computing Minimum Zero-Padding Factors

The minimum zero-padding factors in the previous two subsections were computed using the matlab function zpfmin listed in §G.2.4. For example, both tables above are included in the output of the following matlab program:

  for i=1:length(windows)
    w = sprintf("%s",windows(i))
    for j=1:length(freqs)
      f = freqs(j);
      zpfmin(w,1/f,0.01*f)  % 1 percent spec (large for audio)
      zpfmin(w,1/f,0.001*f) % 0.1 percent spec (good > 1 kHz)
      zpfmin(w,1/f,1)       % 1 Hz spec (good below 1 kHz)

In addition to ``perceptually exact'' detection of spectral peaks, there are times when we need to find spectral parameters as accurately as possible, irrespective of perception. For example, one can estimate the stiffness of a piano string by measuring the stretched overtone-frequencies in the spectrum of that string's vibration. Additionally, we may have measurement noise, in which case we want our measurements to be minimally influenced by this noise. The following sections discuss optimal estimation of spectral-peak parameters due to sinusoids in the presence of noise.

Next Section:
Least Squares Sinusoidal Parameter Estimation
Previous Section:
Quadratic Interpolation of Spectral Peaks