Sign in

username:

password:



Not a member?

Search Online Books



Search tips

Free Online Books



Chapters

See Also

Embedded SystemsFPGAElectronics
Chapter Contents:

Search Spectral Audio Signal Processing

  

Book Index | Global Index


Would you like to be notified by email when Julius Orion Smith III publishes a new entry into his blog?

  

Inverse Transforming STFT Bin Groups

The solution proposed here is to compute multiple time samples for each high-frequency channel, so that one hop of the FFT produces all needed samples in each band, so that all channels can use the same hop size without redundancy. If the narrowest band gets one sample per hop, then a band $ N$ times as wide must produce $ N$ samples per hop.

To efficiently compute multiple time samples from the frequency-samples (FFT bins) of a given band, an inverse FFT can be used, as shown in Fig.9.32. In Matlab/Octave notation, let X(1:N) denote the FFT (length N) of the current frame of data in the STFT, and denote the lower and upper spectral samples for band k by lsn(k) and hsn(k), respectively. Then we may compute (in matlab) the full-rate time-domain signal corresponding to band k as follows:

BandK = X(lsn(k):hsn(k));
z1 = zeros(1,lsn(k)-1);
z2 = zeros(1,N-hsn(k));
BandKzp = [z1, BandK, z2];     % (1)
x(k,:) = ifft(BandKzp);
where x(k,:) denotes the output signal vector (length N) for the kth filter-bank channel for the current time-domain STFT window.

Let $ \texttt{Nk}= \texttt{hsn(k)} - \texttt{lsn(k)} + 1$ denote the number of FFT bins in band $ k$. When $ \texttt{Nk}$ is a power of 2, we can apply an inverse FFT only to the nonzero samples in the band:

xd{k} = ifft(BandK);
where xd{k} now denotes a cell array for holding the ``downsampled'' signal vectors (since the downsampling factor is typically different for different bands).

We may relate x(k,:) and xd{k} by noting that, when Lk = N/Nk is an integer, we have that the relation

BandK == alias(BandKzp,Lk)     % (2)
is true, for each element, where alias(x,K) denotes aliasing of K equal partitions of the vector x:10.12
function y = alias(x,L)
  Nx = length(x);
  Np = Nx/L; % aliasing-partition length
  x = x(:);  % ensure column vector
  y = zeros(Np,1);
  for i=1:L
    y = y + x(1+(i-1)*Np : i*Np);
  end

By the aliasing theorem (downsampling theorem) for Discrete Fourier Transforms (DFTs) (§2.3.12), the relation (2) in the frequency domain corresponds to

xd{k} == Lk * x(k,1:Lk:end)
in the time domain, i.e., xd{k} is obtained from x(k,:) by means of downsampling by the factor Lk. This produces N/Lk == Nk samples. That is, for a band that is Nk bins wide, we obtain Nk time-domain samples for each STFT frame, when critically sampled. (At the full rate, we obtain N samples from each channel for each frame.)

We see that taking an inverse FFT of only the bins in a given channel computes the critically downsampled signal for that channel. Downsampling factors between 1 and Lk can be implemented by choosing a zero-padding factor for the band somewhere between 1 and Lk samples.

Note that this filter bank has the perfect reconstruction (PR) property [266]. That is, the original input signal can be exactly reconstructed (to within a delay and possible scale factor) from the channel signals. The PR property follows immediately from the exact invertibility of the FFT. Specifically, we can recover the original signal frame by taking an FFT of each channel-signal frame and abutting the spectral bins so computed to form the original frame spectrum. Of course, the underlying STFT (uniform filter bank) must be PR as well, as it routinely is in practice.



Subsections
Previous: Summing STFT Bins
Next: Improving the Channel Filters

Order a Hardcopy of Spectral Audio Signal Processing


About the Author: Julius Orion Smith III
Julius Smith's background is in electrical engineering (BS Rice 1975, PhD Stanford 1983). He is presently Professor of Music and Associate Professor (by courtesy) of Electrical Engineering at Stanford's Center for Computer Research in Music and Acoustics (CCRMA), teaching courses and pursuing research related to signal processing applied to music and audio systems. See http://ccrma.stanford.edu/~jos/ for details.


Comments


No comments yet for this page


Add a Comment
You need to login before you can post a comment (best way to prevent spam). ( Not a member? )