
Normalized STFT Basis

The Short-Time Fourier Transform (STFT) is defined as a time-ordered sequence of DTFTs, and is implemented in practice as a sequence of FFTs (see §6.1). Thus, the signal basis functions are naturally defined as the DFT sinusoids multiplied by time-shifted windows, suitably normalized for unit $L_2$ norm:

$\displaystyle \varphi_{mk}(n) \triangleq
\frac{w(n-mR)e^{j\omega_k n}}{\left\Vert\,w(\cdot-mR)e^{j\omega_k (\cdot)}\,\right\Vert}
= \frac{w(n-mR) e^{j\omega_k n}}{\sqrt{\sum_n{w^2(n)}}},
$

$\displaystyle \omega_k = \frac{2\pi k}{N}, \quad
k \in [0,N-1], \quad
n\in (-\infty,\infty),\quad w(n)\in{\cal R},
$

and $ N$ is the DFT length.
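As a minimal numerical sketch of this definition, the following builds one basis function on a finite index range and checks that it has unit norm. The helper name `stft_basis`, the Hann window, and the parameter values are illustrative assumptions, not part of the text; the window is assumed to be supported on $0 \le n \le M-1$.

```python
import numpy as np

def stft_basis(w, m, k, R, N, length):
    """Hypothetical helper: phi_mk(n) for n = 0..length-1.

    Assumes w is real, supported on 0..M-1, and that frame m fits
    entirely inside the range [0, length).
    """
    M = len(w)
    n = np.arange(length)
    shifted = np.zeros(length)
    shifted[m * R : m * R + M] = w            # w(n - mR)
    omega_k = 2 * np.pi * k / N               # DFT bin frequency
    # Divide by sqrt(sum_n w^2(n)) so the basis function has unit L2 norm.
    return shifted * np.exp(1j * omega_k * n) / np.sqrt(np.sum(w ** 2))

w = np.hanning(8)                              # illustrative window, M = 8
phi = stft_basis(w, m=2, k=3, R=4, N=8, length=32)
norm = np.linalg.norm(phi)                     # should be 1 up to rounding
```

The normalization depends only on the window, not on $m$ or $k$, because shifting and modulating by a unit-magnitude sinusoid leave the norm unchanged.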

When successive windows overlap (i.e., the hop size $ R$ is less than the window length $ M$), the basis functions are not orthogonal. In this case, we may say that the basis set is overcomplete.
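The loss of orthogonality under overlap can be illustrated with a short sketch. It takes two basis functions in the same frequency bin but in adjacent, overlapping frames and computes their inner product. The Hann window, the 50% overlap, and the helper `basis` are assumptions chosen only for illustration.

```python
import numpy as np

def basis(w, m, k, R, N, length):
    """Hypothetical helper: normalized STFT basis function on 0..length-1."""
    n = np.arange(length)
    wm = np.zeros(length)
    wm[m * R : m * R + len(w)] = w             # w(n - mR)
    return wm * np.exp(2j * np.pi * k * n / N) / np.linalg.norm(w)

M = N = 8                                      # window length and DFT length
R = 4                                          # hop size R < M: frames overlap
w = np.hanning(M)

a = basis(w, m=0, k=2, R=R, N=N, length=16)
b = basis(w, m=1, k=2, R=R, N=N, length=16)

# np.vdot conjugates its first argument: sum_n conj(a(n)) b(n).
overlap = np.vdot(a, b)
```

Because both functions share the same bin, the complex exponentials cancel in the product and `overlap` reduces to the normalized correlation of the two shifted windows, which is nonzero whenever their supports overlap with positive window values.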

The basis signals are orthonormal when $ R=M=N$ and the rectangular window is used ($ w=w_R$). That is, two rectangularly windowed DFT sinusoids are orthogonal when either the frequency bin-numbers or the time frame-numbers differ, provided that the window length $ M$ equals the number of DFT frequencies $ N$ (no zero padding). In other words, we obtain an orthogonal basis set in the STFT when the hop size, window length, and DFT length are all equal (in which case the rectangular window must be used to retain the perfect-reconstruction property). In this case, we can write

$\displaystyle \varphi_{mk}= \hbox{\sc Shift}_{mN}\left[\hbox{\sc ZeroPad}_\infty\left(\varphi_k ^{\hbox{\tiny DFT}}\right)\right],
$

i.e.,

$\displaystyle \varphi_{mk}(n) = \left\{\begin{array}{ll}
\frac{e^{j\omega_k n}}{\sqrt{N}}, & mN \leq n \leq (m+1)N-1 \\ [5pt]
0, & \mbox{otherwise.} \\
\end{array}\right.
$
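The orthonormality claim for $R=M=N$ with the rectangular window can be checked numerically. The sketch below stacks every $\varphi_{mk}$ for a few frames into a matrix and tests whether its Gram matrix is the identity. The sizes (`N = 4`, three frames) are arbitrary illustrative choices, and the infinite time axis is truncated to the frames considered.

```python
import numpy as np

N = 4                                          # DFT length = window length = hop size
frames = 3                                     # number of frames considered
length = N * frames
n = np.arange(length)

rows = []
for m in range(frames):
    # Rectangular window shifted to frame m: 1 on mN <= n <= (m+1)N - 1.
    support = (n >= m * N) & (n <= (m + 1) * N - 1)
    for k in range(N):
        rows.append(support * np.exp(2j * np.pi * k * n / N) / np.sqrt(N))

B = np.array(rows)                             # one basis function per row

# gram[i, j] = sum_n conj(B[i, n]) * B[j, n]
gram = B.conj() @ B.T
is_orthonormal = np.allclose(gram, np.eye(length))
```

Functions in different frames are orthogonal because their supports are disjoint; functions in the same frame are orthogonal because distinct DFT sinusoids are orthogonal over exactly $N$ samples.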

The coefficient of projection can be written
$\displaystyle \begin{array}{rcl}
\left<\varphi_{mk},x\right> &=& \displaystyle \frac{1}{\sqrt{N}} \sum_{n=-\infty}^{\infty} x(n)\, w_R(n-mN)\, e^{-j\omega_k n} \\ [5pt]
&\triangleq& \displaystyle \frac{\hbox{STFT}_{N,m,k}(x)}{\sqrt{N}} \triangleq \frac{X_m(\omega_k )}{\sqrt{N}}
\end{array}
$

so that the signal expansion can be interpreted as
$\displaystyle \begin{array}{rcl}
x(n) &=& \displaystyle \sum_{m=-\infty}^{\infty}\sum_{k=0}^{N-1} \left<\varphi_{mk},x\right> \varphi_{mk}(n) \\ [5pt]
&=& \displaystyle \sum_{m=-\infty}^{\infty} w_R(n-mN)\frac{1}{N}\sum_{k=0}^{N-1} X_m(\omega_k )e^{j\omega_k n} \\ [5pt]
&=& \displaystyle \sum_{m=-\infty}^{\infty} \hbox{\sc Shift}_{mN,n}\left\{\hbox{\sc ZeroPad}_\infty\left[\hbox{DFT}_N^{-1}(X_m)\right]\right\} \\ [5pt]
&\triangleq& \hbox{STFT}_{N,n}^{-1}(X)
\end{array}
$
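A small sketch of this orthonormal expansion, assuming a finite real test signal whose length is a multiple of $N$: the analysis step is an FFT of each non-overlapping length-$N$ frame, and the synthesis step is an inverse FFT of each frame placed back at its original position. The random test signal and the value of `N` are illustrative assumptions.

```python
import numpy as np

N = 8                                          # hop size = window length = DFT length
rng = np.random.default_rng(0)
x = rng.standard_normal(5 * N)                 # five non-overlapping frames

# Rectangular windowing with hop N is just a reshape into frames.
frames = x.reshape(-1, N)

# X[m, k] = X_m(omega_k). Using frame-relative time is equivalent here,
# because omega_k * m * N is a multiple of 2*pi.
X = np.fft.fft(frames, axis=1)

# Coefficients of projection onto the orthonormal basis.
coeffs = X / np.sqrt(N)

# Inverse STFT: inverse DFT per frame, then put frames back in order.
x_rec = np.fft.ifft(X, axis=1).real.reshape(-1)

energy_in = np.sum(x ** 2)
energy_coeffs = np.sum(np.abs(coeffs) ** 2)
```

Since the basis is orthonormal in this case, the reconstruction is exact and the coefficient energy equals the signal energy (Parseval).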

In the overcomplete case, we get a special case of weighted overlap-add (§7.6):

$\displaystyle \begin{array}{rcl}
x(n) &=& \displaystyle \sum_{m=-\infty}^{\infty}\sum_{k=0}^{N-1} \left<\varphi_{mk},x\right> \varphi_{mk}(n) \\ [5pt]
&=& \displaystyle \sum_{m=-\infty}^{\infty} \frac{1}{N}\sum_{k=0}^{N-1} X_m(\omega_k )\, w(n-mR)\,e^{j\omega_k n}
\end{array}
$
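The weighted overlap-add form in the last line can be sketched numerically under one added assumption that the text does not state here: the window is scaled so that $\sum_m w^2(n-mR) = 1$ for all $n$. The square root of a periodic Hann window with 50% overlap is one window that satisfies this, and it is used below purely as an illustration. Only samples covered by two frames are compared, since the finite test signal lacks the neighboring frames at its edges.

```python
import numpy as np

M = N = 8                                      # window length and DFT length
R = M // 2                                     # hop size R < M (overcomplete case)

# Root of a periodic Hann window: w^2(n) + w^2(n + R) = 1 (illustrative choice).
w = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * np.arange(M) / M))

rng = np.random.default_rng(0)
x = rng.standard_normal(10 * R)

num_frames = (len(x) - M) // R + 1
y = np.zeros(len(x))
for m in range(num_frames):
    start = m * R
    # Analysis: window the frame and take its DFT. Frame-relative time is
    # used; the phase factor relative to absolute time cancels on synthesis.
    X_m = np.fft.fft(x[start:start + M] * w, N)
    # Synthesis: inverse DFT, apply the window again, overlap-add.
    y[start:start + M] += w * np.fft.ifft(X_m).real

# Samples covered by two overlapping frames.
interior = slice(R, len(x) - R)
max_error = np.max(np.abs(y[interior] - x[interior]))
```

With a window that does not satisfy the squared overlap-add condition, the same loop returns $x(n)$ multiplied by $\sum_m w^2(n-mR)$ rather than $x(n)$ itself.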




About the Author: Julius Orion Smith III
Julius Smith's background is in electrical engineering (BS Rice 1975, PhD Stanford 1983). He is presently Professor of Music and Associate Professor (by courtesy) of Electrical Engineering at Stanford's Center for Computer Research in Music and Acoustics (CCRMA), teaching courses and pursuing research related to signal processing applied to music and audio systems. See http://ccrma.stanford.edu/~jos/ for details.

