On Dec 8, 7:32 pm, "manishp" <58525@dsprelated> wrote:
> Sirs,
>
> I would like to know the reason for having different transforms (fourier,
> cosine, z transform etc.)
>
> are all these related to conversion from time domain to frequency domain?
>
> Thanks,
Windowed Fourier transforms are basically "coherent state" transforms
adapted to the Heisenberg group (i.e. the translation group for the
time-frequency plane). The affine group (or group adapted to the time-
scale plane) yields coherent states that correspond to wavelets.
Windowed Fourier is relatively simple as are the other transforms in
that family. They suffer from a rather large problem: they're adapted
to a linear frequency scale and they use a fixed-size window. That's
bad because the amount of action that takes place at a given
frequency is proportional to the number of cycles that happen at that
frequency. So a time window should be of a size inversely proportional
to the frequency. By the Heisenberg relation that means the frequency
window should be proportional to the frequency -- i.e. the frequency
should be put on a logarithmic scale. In other words: as octaves.
Time-scale transforms (meaning wavelets) fix that problem. But they
also have their own problems: badly shaped windows.
This generalizes to arbitrary symmetry groups (like the Euclidean
group for space or Galilean or Poincaré group for space-time). In
that case, the transforms can be used to extract objects in a manner
that is robust against symmetry transforms. For the Euclidean group
that means, for instance, the ability to extract letters based on a
template, independently of how the letter is rotated, flipped or
shifted (shearing and resizing call for the larger affine group). For
moving objects, it means the ability to accurately gauge (and target)
objects (like missiles).
The hybrids of the time-frequency and time-scale transforms are the
S-transforms. They have some rather unusual properties that even the
research literature doesn't (yet) know of. Their main disadvantage is
the difficulty in getting them to run efficiently on a computer, though
that's no longer the problem it once was.
All these transforms have inverses.
I'll do a quick run up to that, since I have an interest in it right
now, because of the above-mentioned "unusual and heretofore unknown
properties." Among other things, it recovers the concept (well-known
to physicists) of "instantaneous frequency" and it leads directly to a
*non-linear* transform that removes the problem of spectral leakage.
Here's the run up. Use the following notation: 1^x = exp(2 pi i x).
Fourier transforms written as:
f#(n) = integral f(t) 1^{-nt} dt, f(t) = integral f#(n) 1^{nt} dn.
The short-time Fourier transforms use a windowing function g(t) and
are defined by:
f_F(q, p) = integral f(t) (g(t - q) 1^{p(t - q)})* dt
q = time-domain location, p = frequency-domain location.
()* = conjugation
The inverse is
f(t) = integral f_F(q, p) g(t - q) 1^{p(t - q)} dq dp
which requires the condition integral |g(t)|^2 dt = 1, which is
usually achieved by suitably rescaling the function g(t).
Normally in the literature, you see the transforms written only as
f_F(q, p) = integral f(t) (g(t - q) 1^{pt})* dt
f(t) = integral f_F(q, p) g(t - q) 1^{pt} dq dp
where the extra factor 1^{-pq} is lost. It's better to keep it in,
since with it in, the phase of monochromatic signals is kept intact.
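The phase claim can be checked numerically. What follows is only a minimal sketch, assuming NumPy, a Gaussian window scaled so that integral |g|^2 = 1, and a made-up test signal f(t) = A 1^{nt + C}; the names stft_point and gauss_window are illustrative. Tuned to p = n, the transform should come out as A 1^{nq + C} times the constant integral of g, so its phase tracks the signal's own phase at time q.

```python
import numpy as np

def stft_point(f, t, q, p, g):
    """Riemann-sum estimate of f_F(q, p) = integral f(t) (g(t-q) 1^{p(t-q)})* dt."""
    dt = t[1] - t[0]
    kernel = g(t - q) * np.exp(2j * np.pi * p * (t - q))
    return np.sum(f * np.conj(kernel)) * dt

def gauss_window(u):
    """Gaussian window scaled so that integral |g|^2 du = 1 (an assumed choice)."""
    return 2.0 ** 0.25 * np.exp(-np.pi * u ** 2)

# Hypothetical monochromatic test signal f(t) = A 1^{nt + C}.
A, n, C = 2.0, 5.0, 0.125
t = np.linspace(-8.0, 8.0, 16001)
f = A * np.exp(2j * np.pi * (n * t + C))

for q in (0.0, 0.3):
    val = stft_point(f, t, q, n, gauss_window)
    # Phase in cycles; with the 1^{-pq} factor kept it follows n*q + C modulo 1.
    print(q, abs(val), (np.angle(val) / (2 * np.pi)) % 1.0)
```

Dropping the (t - q) in the exponent in favour of plain t would make the printed phase stay at C for every q.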
The points (q, p) make up the time-frequency plane. So, this provides
a time-frequency spectrum for f(t). Normally you only map the
amplitude |f_F(q, p)| or its square -- which is a really bad thing to
do!
It's better to map the amplitude as brightness and the phase as color.
Then you'll end up seeing some rather interesting (and revealing)
patterns. Colorizing the transform shows the first signs of the
emergence of the Holy Grail that I'm leading up to.
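One way to do that colorizing, as a rough sketch: treat the phase as the hue and the (normalized) amplitude as the brightness in HSV color space. The function name phase_color and the max_amp normalization are assumptions made for illustration, not a prescribed mapping.

```python
import colorsys
import numpy as np

def phase_color(z, max_amp):
    """Map one complex transform value to an RGB triple in [0, 1].

    Hue encodes the phase (one full turn of phase = one trip around the
    color wheel); brightness encodes |z| relative to max_amp.
    """
    hue = float((np.angle(z) / (2 * np.pi)) % 1.0)
    value = min(abs(z) / max_amp, 1.0)
    return colorsys.hsv_to_rgb(hue, 1.0, value)

# Zero phase at full amplitude is pure red; zero amplitude is black.
print(phase_color(1 + 0j, 1.0))
print(phase_color(0j, 1.0))
```

Applying this to every point (q, p) of the transform gives the color image; a steadily advancing phase then shows up as repeating stripes of hue.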
The time-scale transforms work in the time-scale plane, with
coordinates (q, s) where now p is replaced by "scale" s. Since
integrals have the measure (dq ds/s^2), it's better to replace s by p
= 1/s, and treat this like the time-frequency plane. In that case, the
transform is
f_W(q, p) = integral f(t) (|p|^{1/2} g(p(t - q)))* dt
f(t) = integral f_W(q, p) |p|^{1/2} g(p(t - q)) dq dp.
The condition is the one which forces you to make weird choices
for the function g:
integral |g#(n)|^2 dn/|n| = 1.
But the time-windowing is now inversely proportional to p. So it's
working with octaves.
The S-transform fixes the problem with g. In its absolutely most
general form it is defined by
f_S(q, p) = integral f(t) (|p| g(p(t - q)))* dt
and has inverse
f(t) = integral f_S(q, p) 1^{p(t - q)} dq dp
The condition is that the Fourier transform satisfies g#(1) = 1. Consequently,
it is more common to rewrite the windowing function with an extra
factor 1^{p(t - q)} taken out. So the transform then becomes:
f_S(q, p) = integral f(t) |p| (g(p(t - q)) 1^{p(t - q)})* dt.
Under this revision, the sole condition on g is that
1 = g#(0) = integral g(t) dt.
In the literature (as with the short-time Fourier transform), the
1^{p(t - q)} is replaced by 1^{pt}. This simplifies the formula for
the inverse.
This, like the wavelet and Fourier transform, has discrete forms. It's
difficult (but not impossible) to implement S-transforms efficiently.
The best way that comes to mind (and what I'm presently using) is to
do the transform on a p-by-p basis -- in the time domain -- rescaling
f(t) to suit the frequency p and using a simple lookup table for g(t).
Then the integral would be written as
f_S(q, p) = integral f(q + L/p) g(L) 1^{-L} dL.
In particular, the windowing function g(L) = 1 for L between -1/2 and
1/2; g(L) = 0 else has the effect of matching a *single cycle*
centered on t = q to the wave form. This preserves the frequency. For
a monochromatic wave f(t) = A 1^{nt + C} it produces the transform
f_S(q, p) = A 1^{nq + C} sinc(n/p - 1), where sinc(x) = sin(pi x)/(pi x).
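The single-cycle result is easy to confirm numerically. Here is a minimal sketch, assuming NumPy, p > 0, and a trapezoidal rule in place of a lookup table; s_transform_point and wave are illustrative names and the signal parameters are made up.

```python
import numpy as np

def s_transform_point(f, q, p, num=4001):
    """Trapezoidal estimate of f_S(q, p) = integral f(q + L/p) g(L) 1^{-L} dL
    for the boxcar window g(L) = 1 on [-1/2, 1/2], 0 elsewhere."""
    L = np.linspace(-0.5, 0.5, num)
    vals = f(q + L / p) * np.exp(-2j * np.pi * L)
    dL = L[1] - L[0]
    return (np.sum(vals) - 0.5 * (vals[0] + vals[-1])) * dL

# Hypothetical monochromatic wave f(t) = A 1^{nt + C}.
A, n, C, q = 2.0, 5.0, 0.125, 0.3

def wave(t):
    return A * np.exp(2j * np.pi * (n * t + C))

for p in (4.0, 5.0, 6.0):
    val = s_transform_point(wave, q, p)
    predicted = A * np.exp(2j * np.pi * (n * q + C)) * np.sinc(n / p - 1)
    # Amplitude varies with p through the sinc factor; the phase does not.
    print(p, abs(val), abs(predicted), (np.angle(val) / (2 * np.pi)) % 1.0)
```

np.sinc is the normalized sinc, sin(pi x)/(pi x), which is the convention the formula needs.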
The same wave form is found when tuning into it at any frequency p in
the same ballpark as n and -- when the phase is color-coded as
mentioned above -- the stripe patterns are clearly seen, and oscillate
at the same frequency, no matter what frequency you tune into them at.
It only needs to be in the same ballpark (because of the sinc factor).
So, under the color-coded spectrograph, it shows up clearly as a
distinct object and you can easily separate it out from whatever other
objects it's overlaid on top of. For color-coded phases, you see
clearly-distinguishable candy-stripe patterns corresponding to those
points where there are sound elements.
The reason this happens with the S-transform more than with the others
is due to an unusual property of the transform that hasn't seen the
light of the literature to date. Recall the Physicists' definition
of ...
Instantaneous Frequency = Rate of Change of Phase.
This applies to any waveform. But it applies especially well when the
waveform has been segregated into different bands. The segregation
need not be exact (because of the above-mentioned "stability in the
same ballpark" property). Even the sloppiest segregation will yield
separation of the component objects.
The actual formula for the instantaneous frequency is one seen in
quantum theory. Consider the complex value z = A 1^B, with A and B
real. Its conjugate is z* = A 1^{-B}. Their differentials are:
dz = (dA/A + 2 pi i dB) z
dz* = (dA/A - 2 pi i dB) z*
Thus, combining, you get:
z* dz - dz* z = 4 pi i A^2 dB = 4 pi i z* z dB.
Consequently, the instantaneous frequency n = dB/dt comes out of the
expression
n = 1/(4 pi i z* z) (z* dz/dt - dz*/dt z)
which is basically the same formula used for defining the matter
current density in quantum theory.
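On sampled data the formula can be applied with finite differences. This is a minimal sketch, assuming NumPy, uniformly spaced samples and central differences for dz/dt; the function name and the test chirp are made up for illustration.

```python
import numpy as np

def instantaneous_frequency(z, dt):
    """n = (z* dz/dt - dz*/dt z) / (4 pi i z* z), with dz/dt taken by
    central differences. Returns values for the interior samples z[1:-1]."""
    dz = (z[2:] - z[:-2]) / (2.0 * dt)
    zc = z[1:-1]
    num = np.conj(zc) * dz - np.conj(dz) * zc
    return (num / (4j * np.pi * np.abs(zc) ** 2)).real

# Hypothetical chirp with a slowly varying amplitude:
# B(t) = 3t + 2t^2, so the instantaneous frequency is dB/dt = 3 + 4t.
dt = 1e-4
t = np.arange(0.0, 1.0, dt)
z = (1.0 + 0.5 * t) * np.exp(2j * np.pi * (3.0 * t + 2.0 * t ** 2))
n_est = instantaneous_frequency(z, dt)
print(n_est[0], n_est[-1])  # should sit close to 3 + 4t at the interior ends
```

The amplitude factor cancels between numerator and denominator, which is why the changing envelope does not disturb the estimate.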
The reason this is relevant for the S-transform is that it's lurking
behind the scenes there -- the S-transform has a Parseval identity and
it directly involves the instantaneous frequency.
Go back to the monochrome wave. Its amplitude was modulated to A
sinc(n/p - 1). The amplitude squared, A^2, can be recovered by
integrating with respect to (n/p - 1).
A^2 = integral (A sinc(n/p - 1))^2 d(n/p - 1)
= integral (A sinc(n/p - 1))^2 n/p^2 dp
= integral |f_S(q, p)|^2 n/p^2 dp.
This yields the squared amplitude at a particular time q. Integrating over all
q yields the total "energy" of the wave -- or would, if the wave were
localized (monochrome waves are not). Nonetheless, this generalizes.
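The recovery of A^2 rests on the integral of sinc^2 over the whole line being 1. A quick numerical check, as a sketch only: it assumes NumPy, works in the variable x = n/p - 1 rather than p (integrating in p directly is awkward near p = 0), and truncates the range, so the result falls slightly short of A^2. The name recovered_power and the amplitude are illustrative.

```python
import numpy as np

def recovered_power(A, half_width=500.0, step=0.01):
    """Riemann-sum estimate of integral (A sinc(x))^2 dx over
    [-half_width, half_width], where x stands for n/p - 1."""
    count = int(round(2 * half_width / step)) + 1
    x = np.linspace(-half_width, half_width, count)
    dx = x[1] - x[0]
    return np.sum((A * np.sinc(x)) ** 2) * dx

A = 2.0  # hypothetical amplitude
print(recovered_power(A), A ** 2)  # the two agree up to the truncated tails
```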
The key to the generalization is to replace n by the instantaneous
frequency. The resulting formula is:
integral f(t)* F(t) dt =
integral 1/(4 pi i) f_S(q, p)* (d/dq - d/dq*) F_S(q, p) dp/p^2 dq
where the d/dq* applies to the left to f_S(q, p)*. For a single wave
form this yields:
integral |f(t)|^2 dt = integral |f_S(q, p)|^2 n(q, p) dp/p^2 dq.
So the instantaneous frequency is built right into the very core of
the S-transform, literally.
Finally, this leads up to the Holy Grail. Since the natural frequency
of this component is n(q, p), then it is just as natural to redraw the
spectrograph by moving this amplitude up from frequency p to frequency
n.
What this does, as a result, is essentially plug up the "spectral
leakage" and refocus the waveform component back to its natural
frequency. The above integral and Parseval identity ultimately yield
the appropriate formula for making that conversion. The re-defined
spectral density is given by
rho(q, n) = integral |f_S(q, p)|^2 delta(n - n(q, p)) dp
which is a non-linear transform conditioned by the S-transform.
This is what I'm in the process of setting up right now.
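On a sampled grid, the delta function in rho(q, n) amounts to a weighted histogram: each sample at frequency p drops its weight |f_S(q, p)|^2 dp into the bin containing n(q, p). The following is a minimal sketch of that discretization at one fixed time q, assuming NumPy; reassigned_density, the bin edges and the test data are assumptions for illustration, and the instantaneous frequencies are taken as given rather than computed.

```python
import numpy as np

def reassigned_density(fs_row, inst_freq, p, freq_edges):
    """Discrete rho(q, n) = integral |f_S(q, p)|^2 delta(n - n(q, p)) dp
    at one fixed q, as a weighted histogram over the bins in freq_edges."""
    dp = np.gradient(p)                      # local spacing of the p grid
    weights = np.abs(fs_row) ** 2 * dp
    hist, _ = np.histogram(inst_freq, bins=freq_edges, weights=weights)
    return hist / np.diff(freq_edges)        # weight per unit frequency

# Hypothetical data: the single-cycle transform of a monochromatic wave
# at n = 5, whose instantaneous frequency is n at every p.
A, n = 2.0, 5.0
p = np.linspace(2.0, 10.0, 801)
fs_row = A * np.sinc(n / p - 1)
inst_freq = np.full_like(p, n)
freq_edges = np.arange(0.25, 10.0, 0.5)

rho = reassigned_density(fs_row, inst_freq, p, freq_edges)
print(np.count_nonzero(rho), freq_edges[np.argmax(rho)])
```

In this idealized case everything that leaked across p = 2..10 lands back in the single bin around n = 5.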