Equivalent Rectangular Bandwidth

A first-order conformal map (bilinear transform) can also provide a good match to the ERB scale [269]. Moore and Glasberg [177] have revised Zwicker's loudness model to better explain (1) how equal-loudness contours change as a function of level, (2) why loudness remains constant as the bandwidth of a fixed-intensity sound increases up to the critical bandwidth, and (3) the loudness of partially masked sounds. The modification that is relevant here is the replacement of the Bark scale by the equivalent rectangular bandwidth (ERB) scale. The ERB of the auditory filter is assumed to be closely related to the critical bandwidth, but it is measured using the notched-noise method [205,206,251,181,87] rather than by classical masking experiments involving a narrow-band masker and probe tone [306,307,304]. As a result, the ERB is said not to be affected by the detection of beats or intermodulation products between the signal and masker. Since the ERB scale is defined analytically, it is also more smoothly behaved than the Bark scale data.

Figure E.10: Bark critical bandwidth and equivalent rectangular bandwidth as a function of frequency. Also plotted are the classical rule of thumb that a critical band is 100 Hz wide for center frequencies below 500 Hz and 20% of the center frequency above 500 Hz, and the empirically determined formula, CB bandwidth in Hz $ \approx 94+71f^{3/2}$, with $ f$ in kHz [37]. The ERBs are computed from (E.5), and the Bark CB bandwidths were computed by differencing the band-edge frequencies listed in §E.1, plotting each difference over its corresponding band center (also listed in §E.1).

At moderate sound levels, the ERB in Hz is defined by [177]

   $\displaystyle \mathrm{ERB}(f) = 0.108\,f + 24.7$

where $ f$ is the center frequency in Hz, normally in the range 100 Hz to 10 kHz. The ERB is generally narrower than the classical critical bandwidth (CB), being about $ 11$% of center frequency at high frequencies, and leveling off to about $ 25$ Hz at low frequencies. The classical CB, on the other hand, is approximately $ 20$% of center frequency, leveling off to $ 100$ Hz below $ 500$ Hz. An overlay of ERB and CB bandwidths is shown in Fig. E.10. Also shown is the approximate classical CB bandwidth, as well as a more accurate analytical expression for Bark bandwidth vs. Hz [3]. Finally, note that the frequency interval [$ 400$ Hz, $ 6.5$ kHz] corresponds to good agreement between the psychophysical ERB and the directly physical audio filter bandwidths defined in terms of place along the basilar membrane [96, p. 2601].
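As a quick numerical check of the comparison above, the following minimal sketch evaluates the ERB formula next to the classical rule of thumb (100 Hz below 500 Hz, 20% of center frequency above). The function names `erb_hz` and `classical_cb_hz` are illustrative and do not come from the cited references.

```python
def erb_hz(f_hz):
    """ERB in Hz at center frequency f_hz (Hz), for moderate sound levels."""
    return 0.108 * f_hz + 24.7


def classical_cb_hz(f_hz):
    """Rule-of-thumb critical bandwidth: 100 Hz below 500 Hz, 20% of f above."""
    return 100.0 if f_hz < 500.0 else 0.2 * f_hz


if __name__ == "__main__":
    # Tabulate both bandwidths over the range where the ERB formula is normally used.
    for f in (100.0, 500.0, 1000.0, 4000.0, 10000.0):
        print(f"{f:8.0f} Hz   ERB = {erb_hz(f):7.1f} Hz   CB = {classical_cb_hz(f):7.1f} Hz")
```

Over the range 100 Hz to 10 kHz the ERB stays below the rule-of-thumb CB, consistent with the statement that the ERB is generally the narrower of the two.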

Figure E.11: Bark and ERB frequency warpings for a sampling rate of $ 31$ kHz. a) Linear input frequency scale. b) Log input frequency scale. Note that sampling is uniform across the vertical axis (corresponding to the desired audio frequency scale). As a result, the plotted samples align horizontally rather than vertically.

The ERB scale is defined as the number of ERBs below each frequency

   $\displaystyle \mathrm{ERBS}(f) = 21.4 \log_{10}(0.00437\,f + 1)$

for $ f$ in Hz [177]. An overlay of the normalized Bark and ERB frequency warpings is shown in Fig. E.11. The ERB warping is determined by scaling the inverse of the ERB-scale formula above, evaluated along a uniform frequency grid from zero to the number of ERBs at half the sampling rate, so that dc maps to zero and half the sampling rate maps to $ \pi$.
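The construction just described can be sketched as follows. This is a minimal illustration, not the code used to produce the figures: the names `erbs`, `erbs_inverse`, and `erb_warping` are assumed, and the grid size is arbitrary. The grid is uniform on the ERB axis, so the returned input frequencies are nonuniformly spaced.

```python
import math


def erbs(f_hz):
    """Number of ERBs below frequency f_hz (Hz)."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)


def erbs_inverse(e):
    """Frequency in Hz lying e ERBs above dc (algebraic inverse of erbs)."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437


def erb_warping(fs_hz, n_points=9):
    """Return (input, warped) radian-frequency pairs on a uniform ERB grid.

    Both frequencies are normalized so that dc maps to 0 and fs/2 maps to pi.
    """
    e_max = erbs(fs_hz / 2.0)  # number of ERBs at half the sampling rate
    pairs = []
    for k in range(n_points):
        e = e_max * k / (n_points - 1)                      # uniform step in ERBs
        omega_in = 2.0 * math.pi * erbs_inverse(e) / fs_hz  # input frequency (rad/sample)
        omega_out = math.pi * e / e_max                     # desired warped frequency
        pairs.append((omega_in, omega_out))
    return pairs


if __name__ == "__main__":
    for omega_in, omega_out in erb_warping(31000.0):
        print(f"{omega_in:8.5f} -> {omega_out:8.5f}")
```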

Proceeding in the same manner as for the Bark-scale case, allpass coefficients giving a best approximation to the ERB-scale warping were computed for sampling rates near twice the Bark band edge frequencies (chosen to facilitate comparison between the ERB and Bark cases). The resulting optimal map coefficients are shown in Fig. E.12. The allpass parameter increases with increasing sampling rate, as in the Bark-scale case, but it covers a significantly narrower range, as a comparison with Fig. E.3 shows. Also, the Chebyshev solution is now systematically larger than the least-squares solutions, and the least-squares and weighted equation-error cases are no longer essentially identical. The fact that the arctangent formula is optimized for the Chebyshev case is much more evident in the error plot of Fig. E.12b than it was in Fig. E.3b for the Bark warping parameter.

Figure E.12: a) Optimal allpass coefficients $ \rho ^*$ for the ERB case, plotted as a function of sampling rate $ f_s$. Also shown is the arctangent approximation. b) Same as a) with the arctangent formula subtracted out.

Figure E.13: Root-mean-square and peak frequency-mapping errors (conformal map minus ERB) versus sampling rate for Chebyshev, least squares, weighted equation-error, and arctangent optimal maps. The rms errors are nearly coincident along the lower line, while the peak errors form an upper group well above the rms errors.

The peak and rms mapping errors are plotted versus sampling rate in Fig. E.13. Compare these results for the ERB scale with those for the Bark scale in Fig. E.4. The ERB map errors are plotted in Barks to facilitate comparison. The rms error of the conformal map fit to the ERB scale increases nearly linearly with log-sampling-rate. The ERB-scale error increases very smoothly with frequency while the Bark-scale error is non-monotonic (see Fig. E.4). The smoother behavior of the ERB errors appears to be due in part to the fact that the ERB scale is defined analytically while the Bark scale is defined more directly in terms of experimental data: The Bark-scale fit is so good as to be within experimental deviation, while the ERB-scale fit has a much larger systematic error component. The peak error in Fig. E.13 also grows close to linearly on a log-frequency scale and is similarly two to three times the Bark-scale errors of Fig. E.4.

Figure E.14: ERB frequency mapping errors versus frequency for a sampling rate of $ 31$ kHz.

The frequency mapping errors are plotted versus frequency in Fig. E.14 for a sampling rate of $ 31$ kHz. Unlike the Bark-scale case in Fig. E.5, there is now a visible difference between the weighted equation-error and optimal least-squares mappings for the ERB scale. The figure also shows that the peak error when warping to an ERB scale is about three times larger than the peak error when warping to the Bark scale, growing from 0.64 Barks to 1.9 Barks. The peak errors also occur at lower frequencies (moving from 1.3 and 8.8 kHz in the Bark-scale case to 0.7 and 8.2 kHz in the ERB-scale case).

ERB Relative Bandwidth Mapping Error

Figure E.15: ERB RBME for $ f_s= 31$ kHz, with explicit minimization of RBME.

The optimal relative bandwidth-mapping error (RBME) for the ERB case is plotted in Fig. E.15 for a $ 31$ kHz sampling rate. The peak error has grown from close to 20% for the Bark-scale case to more than 60% for the ERB case. Thus, frequency intervals are mapped to the ERB scale with up to three times as much relative error (60%) as when mapping to the Bark scale (20%). Because the auditory filter bandwidth continues to narrow as frequency decreases on the ERB scale, the conformal map cannot supply sufficient stretching of the low-frequency axis. The Bark scale, on the other hand, is much better accommodated at low frequencies by the first-order conformal map.
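The shortfall in low-frequency stretching can be illustrated with a small calculation. The sketch below compares the dc slope of a first-order allpass frequency map, $(1+\rho)/(1-\rho)$, with the dc slope required by the normalized ERB warping. The allpass slope is the standard small-angle result for a first-order allpass and is assumed here rather than taken from this section; the function names and the value $\rho = 0.725$ (roughly what the ERB-case RBME arctangent formula gives at 31 kHz) are illustrative. The sketch does not compute the RBME itself.

```python
import math


def allpass_dc_slope(rho):
    """Slope at dc of the first-order allpass frequency map (assumed formula)."""
    return (1.0 + rho) / (1.0 - rho)


def erb_dc_slope(fs_hz):
    """Slope at dc of the normalized ERB warping (both axes in rad/sample).

    The target warping is pi * ERBS(f) / ERBS(fs/2) with f = omega * fs / (2*pi),
    so its derivative with respect to omega is fs * ERBS'(f) / (2 * ERBS(fs/2)).
    """
    e_max = 21.4 * math.log10(0.00437 * fs_hz / 2.0 + 1.0)
    derbs_df_at_dc = 21.4 * 0.00437 / math.log(10.0)  # d ERBS / d f at f = 0
    return fs_hz * derbs_df_at_dc / (2.0 * e_max)


if __name__ == "__main__":
    fs_hz = 31000.0
    rho = 0.725  # illustrative value, see the lead-in
    print("required dc slope (ERB):", erb_dc_slope(fs_hz))
    print("supplied dc slope (allpass):", allpass_dc_slope(rho))
```

Under these assumptions the ERB warping asks for a dc slope near 16, while the allpass map supplies only about 6.3, which is one way to see why the largest relative errors occur at low frequencies.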

Figure E.16: RMS and peak relative-bandwidth-mapping errors versus sampling rate for Chebyshev, least squares, weighted equation-error, and arctangent optimal maps, with explicit minimization of RBME used in all optimizations. The peak errors form a group lying well above the lower-lying rms group.

Figure E.16 shows the rms and peak ERB RBME as a function of sampling rate. Near a 10 kHz sampling rate, for example, the Chebyshev ERB RBME is increased from 12% in the Bark-scale case to around 37%, again a tripling of the peak error. We can also see in Fig. E.16 that the arctangent formula gives a very good approximation to the optimal Chebyshev solution at all sampling rates. The optimal least-squares and weighted equation-error solutions are quite different, with the weighted equation-error solution moving from being close to the least-squares solution at low sampling rates to being close to the Chebyshev solution at the higher sampling rates. The rms error is very similar in all four cases, as it was in the Bark-scale case, although the Chebyshev and arctangent formula solutions show a noticeable increase in the rms error at low sampling rates, where they also show a reduction in peak error of 5% or so.

Arctangent Approximations for $ \rho ^*(f_s)$, ERB Case

For an approximation to the optimal Chebyshev ERB frequency mapping, the arctangent formula becomes

   $\displaystyle \rho^*_{\gamma}(f_s) = 0.7446\left[\frac{2}{\pi}\arctan(0.1418\,f_s)\right]^{1/2}+0.03237,$

where $ f_s$ is in kHz. This formula is plotted along with the various optimal $ \rho ^*$ curves in Fig. E.12a, and the approximation error is shown in Fig. E.12b. The performance of the arctangent approximation can be seen in Fig. E.13.
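A minimal sketch of how this formula might be used follows. It evaluates the arctangent approximation and measures the peak deviation between the resulting first-order allpass frequency map and the normalized ERB warping. The allpass phase expression $\omega + 2\arctan[\rho\sin\omega/(1-\rho\cos\omega)]$ is the standard first-order allpass result and is assumed here, since this section does not restate it. All function names are illustrative, the grid is uniform in input frequency (unlike the ERB-uniform grid described earlier), and the error is reported in radians of normalized frequency rather than in Barks.

```python
import math


def rho_erb_arctan(fs_khz):
    """Arctangent approximation to the Chebyshev-optimal ERB allpass coefficient."""
    return 0.7446 * math.sqrt((2.0 / math.pi) * math.atan(0.1418 * fs_khz)) + 0.03237


def allpass_map(omega, rho):
    """Frequency map of a first-order allpass with coefficient rho (assumed form)."""
    return omega + 2.0 * math.atan2(rho * math.sin(omega), 1.0 - rho * math.cos(omega))


def erbs(f_hz):
    """Number of ERBs below frequency f_hz (Hz)."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)


def peak_mapping_error(fs_khz, n_points=512):
    """Return (rho, peak |allpass map - ERB warping|) in rad/sample."""
    fs_hz = 1000.0 * fs_khz
    rho = rho_erb_arctan(fs_khz)
    e_max = erbs(fs_hz / 2.0)
    peak = 0.0
    for k in range(n_points + 1):
        omega = math.pi * k / n_points                 # input frequency in [0, pi]
        f_hz = omega * fs_hz / (2.0 * math.pi)
        target = math.pi * erbs(f_hz) / e_max          # normalized ERB warping
        peak = max(peak, abs(allpass_map(omega, rho) - target))
    return rho, peak


if __name__ == "__main__":
    rho, peak = peak_mapping_error(31.0)
    print(f"rho = {rho:.4f}, peak error = {peak:.3f} rad")
```

At a 31 kHz sampling rate this sketch should give a coefficient near 0.72 and a peak deviation of roughly a quarter of a radian.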

When the optimality criterion is chosen to minimize relative bandwidth mapping error in the ERB case, the arctangent formula optimization yields

   $\displaystyle \rho^*_{\gamma}(f_s) = 0.7164\left[\frac{2}{\pi}\arctan(0.09669\,f_s)\right]^{1/2}+0.08667.$

The performance of this formula is shown in Fig. E.16. It follows the optimal Chebyshev map parameter very well.
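For reference, the RBME-oriented formula can be evaluated as in the short sketch below. The function name and the list of sampling rates are illustrative choices, not values from the text.

```python
import math


def rho_erb_rbme_arctan(fs_khz):
    """Arctangent formula for the ERB case when RBME is the optimality criterion."""
    return 0.7164 * math.sqrt((2.0 / math.pi) * math.atan(0.09669 * fs_khz)) + 0.08667


if __name__ == "__main__":
    # fs_khz is the sampling rate in kHz, as in the formula above.
    for fs_khz in (8.0, 16.0, 22.05, 31.0, 44.1):
        print(f"fs = {fs_khz:6.2f} kHz  ->  rho = {rho_erb_rbme_arctan(fs_khz):.4f}")
```

Because the arctangent term saturates at $\pi/2$, the coefficient given by this formula approaches $0.7164 + 0.08667 \approx 0.803$ as the sampling rate grows, so it always remains below one.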
