###

Inverse Transforming STFT Bin Groups

The solution proposed in [265] is to compute
*multiple* time samples for each high-frequency channel, so that
one hop of the FFT produces all needed samples in each band. That way
all channels can use the same hop-size without redundancy. If the
narrowest band gets one sample per hop, then a band
times as wide
must produce
samples per hop.

A fast way to compute multiple time samples from the frequency-samples
(FFT bins) of a given band is an inverse FFT, as shown in Fig.10.30.
In matlab notation, let `X(1:N)` denote the FFT (length `N`) of the
current frame of data in the STFT, and denote the lower and upper
spectral samples for band `k` by `lsn(k)` and `hsn(k)`,
respectively. Then we may specify the matlab computation of the
full-rate time-domain signal corresponding to band `k` as follows:

BandK = X(lsn(k):hsn(k)); z1 = zeros(1,lsn(k)-1); z2 = zeros(1,N-hsn(k)); BandKzp = [z1, BandK, z2]; % (1) x(k,:) = ifft(BandKzp);where

`x(k,:)`denotes the output signal vector (length

`N`) for the

`k`th filter-bank channel for the current time-domain STFT window.

Let denote the number of FFT bins in band . When is a power of 2, we can apply an inverse FFT only to the nonzero samples in the band:

xd{k} = ifft(BandK);where

`xd{k}`now denotes a

*cell array*for holding the ``downsampled'' signal vectors (since the downsampling factor is typically different for different bands).

We may relate `x(k,:)` and `xd{k}` by noting that, when
`Lk = N/Nk` is an integer, we have that the relation

BandK == alias(BandKzp,Lk) % (2)is true, for each element, where

`alias(x,K)`denotes aliasing of

`K`equal partitions of the vector

`x`(§2.3.12):

^{11.17}

function y = alias(x,L) Nx = length(x); Np = Nx/L; % aliasing-partition length x = x(:); % ensure column vector y = zeros(Np,1); for i=1:L y = y + x(1+(i-1)*Np : i*Np); end

By the *aliasing theorem* (*downsampling theorem*) for
Discrete Fourier Transforms (DFTs) (§2.3.12), the relation
`(2)` in the frequency domain corresponds to

xd{k} == Lk * x(k,1:Lk:end)in the time domain,

*i.e.*,

`xd{k}`is obtained from

`x(k,:)`by means of

*downsampling*by the factor

`Lk`. This produces

`N/Lk == Nk`samples. That is, for a band that is

`Nk`bins wide, we obtain

`Nk`time-domain samples for each STFT frame, when critically sampled. (At the full rate, we obtain

`N`samples from each channel for each frame.)

We see that taking an inverse FFT of only the bins in a given channel
computes the critically downsampled signal for that channel.
Downsampling factors between 1 and `Lk` can be implemented by choosing
a zero-padding factor for the band somewhere between 1 and `Lk`
samples.

Note that this filter bank has the *perfect reconstruction* (PR)
property [287]. That is, the original input signal
can be exactly reconstructed (to within a delay and possible scale
factor) from the channel signals.
The PR property follows immediately
from the exact invertibility of the discrete Fourier transform.
Specifically, we can recover the
original signal frame by taking an FFT of each channel-signal frame
and abutting the spectral bins so computed to form the original frame
spectrum. Of course, the underlying STFT (uniform filter bank) must
be PR as well, as it routinely is in practice.

**Next Section:**

Improving the Channel Filters

**Previous Section:**

Summing STFT Bins