Wavelets II - Vanishing Moments and Spectral Factorization

Vincent Herrmann, October 11, 2016

In the previous blog post I described the workings of the Fast Wavelet Transform (FWT) and how wavelets and filters are related. As promised, in this article we will see how to construct useful filters. Concretely, we will find a way to calculate the Daubechies filters, named after Ingrid Daubechies, who invented them and also laid much of the mathematical foundations for wavelet analysis.

Besides the content of the last post, you should be familiar with basic complex algebra, the Fourier transform, Taylor series and binomial coefficients. Just to get started, we recall that the scaling function $\phi(t)$ and the wavelet function $\psi(t)$ are defined by the filter coefficients $h_0(k)$ and $h_1(k)$ via the dilation/refinement equations from the previous post.


Vanishing Moments and Continuous Functions

The idea of vanishing moments is the additional ingredient we need to create our wavelet functions. In our implementation of the FWT, the input signal is split into two parts. In the limit of many stages, one part is analyzed with the scaling function $\phi(t)$ and the other with the wavelet function $\psi(t)$. If a signal $f(t)$ is orthogonal to $\psi(t)$, i.e. $\int f(t)\,\psi(t)\,dt = 0$, then the whole signal is represented by the $\phi$ half alone and can nevertheless be perfectly reconstructed. This scenario is of course great for compression, and in general for data manipulation or analysis. So one measure of quality of a wavelet function is how many important functions it is orthogonal to. In the case of so-called vanishing moments, these important target functions are polynomials.

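The $a$th moment of a function $\psi(t)$ is defined as:

$$\int_{-\infty}^{\infty} t^a\,\psi(t)\,dt$$

If this moment vanishes, then $\psi(t)$ is orthogonal to $t^a$. If we say a function has $A$ vanishing moments, we usually mean it has vanishing moments for $a = 0, \dots, A-1$. Because of the linearity of the integration operation, this means that the function is orthogonal to all polynomials up to degree $A-1$:

$$\int_{-\infty}^{\infty} \psi(t) \sum_{a=0}^{A-1} c_a t^a\,dt = 0$$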

If the wavelet function $\psi$ has a certain number of vanishing moments, then discretely shifted versions of the scaling function $\phi$ can be used to reconstruct all the polynomials that $\psi$ stands orthogonal to. For $A = 1$ this is clear: we can analyze a constant signal with the Haar scaling function, which is simply a rectangle. The synthesis scaling function looks exactly the same and the signal can be reconstructed.

Figure 1 The separate synthesis Haar scaling functions of a constant signal (top) and their sum reconstructing a segment of the signal (bottom)
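As a quick numerical illustration of the moment definition, the sketch below approximates the first two moments of the Haar wavelet with a midpoint rule. The helper names `haar_wavelet` and `moment` are made up for this example, and the Haar wavelet is assumed to be $+1$ on $[0, 0.5)$ and $-1$ on $[0.5, 1)$.

```python
import numpy as np

def haar_wavelet(t):
    """Haar wavelet (assumed convention): +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    first_half = np.where((t >= 0.0) & (t < 0.5), 1.0, 0.0)
    second_half = np.where((t >= 0.5) & (t < 1.0), 1.0, 0.0)
    return first_half - second_half

def moment(func, a, lo=0.0, hi=1.0, n=200_000):
    """Midpoint-rule approximation of the a-th moment, i.e. the integral of t**a * func(t)."""
    dt = (hi - lo) / n
    t = lo + (np.arange(n) + 0.5) * dt
    return float(np.sum(t**a * func(t)) * dt)

m0 = moment(haar_wavelet, 0)  # zeroth moment: vanishes
m1 = moment(haar_wavelet, 1)  # first moment: does not vanish
print(m0, m1)
```

The zeroth moment comes out as zero while the first one does not, which is what having exactly one vanishing moment means.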

With $A = 2$, things are not as obvious. A linear signal cannot be constructed by simply adding together fixed line segments, since it has to work for all slopes. But the Daubechies 4 (DB4) functions that we saw in the last post have exactly this property, i.e. two vanishing moments.

Figure 2 The separate synthesis DB4 scaling functions of a linear signal (top) and their sum reconstructing a segment of the signal (bottom)

It seems a bit strange that these odd spiky scaling functions sum up to a smooth straight line. But if we add one at a time, we see that they fit together like the pieces of a puzzle to reconstruct a (shifted) segment of the original signal. The DB6 wavelet has three vanishing moments ($A = 3$). So any quadratic signal can be constructed with the DB6 scaling function:

Figure 3 The separate synthesis DB6 scaling functions of a quadratic signal (top) and their sum reconstructing a segment of the signal (bottom)

Polynomials of high degrees can approximate many different signals. This means a wavelet function with many vanishing moments stands orthogonal or almost orthogonal to a wide range of different signals. The corresponding scaling function in turn has the expressive power to reconstruct all these signals. This is of course a highly desirable property, and there exist wavelets with any number of vanishing moments. In the rest of this post we will see how to construct the filters resulting in these wavelets. The downside is, as it turns out, that for more vanishing moments we need longer filters in the FWT implementation.

Vanishing Moments and Discrete Filters

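In a FWT we do not explicitly use the functions $\phi$ and $\psi$; instead they emerge from the filters $h_0$ and $h_1$. We have to transfer the idea of vanishing moments to these filters. To do this, we plug the wavelet equation into the moment definition and let the moments vanish (constant factors do not matter here, because the right-hand side is zero):

$$\int t^a \sum_k h_1(k)\,\phi(2t - k)\,dt = 0$$

We assume that $\phi$ has the necessary moments, but they are not zero. Now we create a new variable $u = 2t - k$ and again drop the constant factor:

$$\sum_k h_1(k) \int (u + k)^a\,\phi(u)\,du = 0$$

Let’s look at this for the first few moments. For $a = 0$ we have:

$$\sum_k h_1(k) \int \phi(u)\,du = 0$$

The integral is invariant to changes in $k$, so it is constant for all $k$. With that, the requirement for one vanishing moment is:

$$\sum_k h_1(k) = 0$$

For $a = 1$ we get:

$$\sum_k h_1(k)\,k \int \phi(u)\,du + \sum_k h_1(k) \int u\,\phi(u)\,du = 0$$

The second term is zero, we know that from the condition for the $0$th vanishing moment. From the first term follows the new condition for vanishing moment $a = 1$:

$$\sum_k h_1(k)\,k = 0$$

For $a = 2$ we get:

$$\sum_k h_1(k)\,k^2 \int \phi(u)\,du + 2\sum_k h_1(k)\,k \int u\,\phi(u)\,du + \sum_k h_1(k) \int u^2\,\phi(u)\,du = 0$$

Now both the second and third term are zero, based on the previous conditions. And the new condition for vanishing moment $a = 2$ is:

$$\sum_k h_1(k)\,k^2 = 0$$

In general, for every new vanishing moment $a$, there will be a term proportional to $\sum_k h_1(k)\,k^a$. All other terms will have a factor $\sum_k h_1(k)\,k^b$ with $b < a$; those will automatically be zero, based on the previous conditions. With that we have the condition for filter coefficients of a wavelet function with $A$ vanishing moments:

$$\sum_k h_1(k)\,k^a = 0 \quad \text{for } a = 0, \dots, A-1$$

Or, because $h_1(k) = (-1)^k h_0(N - k)$ (the alternating flip from the previous post, with $N$ the index of the last coefficient):

$$\sum_k (-1)^k h_0(k)\,k^a = 0 \quad \text{for } a = 0, \dots, A-1$$

We see from this that the concept of vanishing moments is directly applicable to discrete filters: $h_1$ stands orthogonal to all sampled polynomials up to degree $A-1$. This means that in a Fast Wavelet Transform (FWT) setup those signals produce zero wavelet coefficients and are therefore highly compressible!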
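To make the discrete condition concrete, here is a minimal sketch that checks it for the DB4 filter. It assumes the closed-form DB4 coefficients scaled so that they sum to 2, and the alternating flip $h_1(k) = (-1)^k h_0(N-k)$ for the highpass filter; the helper `discrete_moment` is a name invented for this example.

```python
import numpy as np

# DB4 lowpass filter in closed form, scaled so that sum(h0) == 2 (assumed normalization).
s3 = np.sqrt(3.0)
h0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 4.0

# Highpass filter via the alternating flip h1(k) = (-1)^k * h0(N - k), N = len(h0) - 1.
k = np.arange(len(h0))
h1 = (-1.0) ** k * h0[::-1]

def discrete_moment(h, a):
    """Return sum_k h(k) * k**a."""
    idx = np.arange(len(h), dtype=float)
    return float(np.sum(h * idx**a))

# DB4 has two vanishing moments: a = 0 and a = 1 vanish, a = 2 does not.
moments = [discrete_moment(h1, a) for a in range(3)]

# Filtering a sampled linear signal with h1 gives zeros wherever the filter fully overlaps.
ramp = 0.7 * np.arange(32) - 3.0
detail = np.convolve(ramp, h1, mode="valid")
print(moments, np.max(np.abs(detail)))
```

The `detail` values are what the highpass branch of an FWT stage would see before downsampling, so a linear segment of a signal contributes nothing to the wavelet coefficients.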

Filter Construction in the Frequency Domain

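Recall that we only need to construct $h_0$ for our FWT; the other three filters $h_1$, $g_0$ and $g_1$ then emerge from the rules given in the previous post. For $h_0$ we now have two constraints:

  • $\sum_k h_0(k)\,h_0(k + 2n) = 0$ for all $n \neq 0$, this is the shift orthogonality condition that guarantees perfect reconstruction (the value for $n = 0$ is fixed by the normalization of the filter).

  • $\sum_k (-1)^k h_0(k)\,k^a = 0$ for $a = 0, \dots, A-1$, which we just saw, where we can set $A$ to be the number of vanishing moments we want.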
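Both constraints are easy to check numerically for a known filter. The sketch below does this for DB4, assuming the closed-form coefficients scaled so that they sum to 2 (with this scaling the $n = 0$ correlation equals 2). The function names are hypothetical helpers for this example.

```python
import numpy as np

# DB4 lowpass filter, scaled so that sum(h0) == 2 (assumed normalization).
s3 = np.sqrt(3.0)
h0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 4.0

def shift_correlation(h, n):
    """Return sum_k h(k) * h(k + 2n) for n >= 0."""
    shift = 2 * n
    if shift >= len(h):
        return 0.0
    return float(np.dot(h[: len(h) - shift], h[shift:]))

def alternating_moment(h, a):
    """Return sum_k (-1)^k * h(k) * k**a."""
    idx = np.arange(len(h), dtype=float)
    signs = (-1.0) ** np.arange(len(h))
    return float(np.sum(signs * h * idx**a))

corr = [shift_correlation(h0, n) for n in range(3)]   # shift orthogonality
alt = [alternating_moment(h0, a) for a in range(2)]   # two vanishing moments
print(corr, alt)
```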

These two conditions yield the famous Daubechies wavelets. Unfortunately, the construction is quite fiddly, we will need some complex algebra and Fourier analysis, a bit of number theory and a little combinatorics. But I hope this is interesting and in the end we will arrive at a relatively simple algorithm. If you want, you can now first take a look at this algorithm and then read on from here to find out why it works.

We now have to move to the frequency domain, and because some trigonometric identities will turn out to be helpful we use the Fourier transform instead of the z-transform. By definition, the Fourier transform of $h_0$ is

$$H(\omega) = \sum_k h_0(k)\,e^{-i\omega k}$$

We already know the z-transform of the shift orthogonality condition. This was equation from the last article. Converting between the z- and Fourier transform is done using and therefore as well as . Because are real values, there also is and . This means we can write

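Now we need to convert the vanishing moment condition to the frequency domain. We can write the $a$th derivative of $H(\omega)$ as

$$H^{(a)}(\omega) = \sum_k h_0(k)\,(-ik)^a\,e^{-i\omega k}$$

If we set $\omega = \pi$, we get $e^{-i\pi k} = (-1)^k$. This means

$$H^{(a)}(\pi) = (-i)^a \sum_k (-1)^k h_0(k)\,k^a$$

so the vanishing moment condition is equivalent to $H^{(a)}(\pi) = 0$ for $a = 0, \dots, A-1$.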

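The easiest way to achieve $H^{(a)}(\pi) = 0$ for all $a < A$ would be setting $H(\omega)$ to $(1 + e^{-i\omega})^A$. This unfortunately interferes with the shift orthogonality condition. But if we multiply $(1 + e^{-i\omega})^A$ with an arbitrary function $C(\omega)$, then the vanishing moment condition still holds because of the product rule. We want our coefficients $h_0(k)$ to be real values, this means $C(\omega)$ would have to be a polynomial with real coefficients in $e^{-i\omega}$ as well. This means we can set

$$H(\omega) = \left(\frac{1 + e^{-i\omega}}{2}\right)^A C(\omega)$$

and only need to search for a function $C(\omega)$ so that the shift orthogonality condition is satisfied (we have divided the base of the first factor by 2, you will see why straightaway).

Now we need our trigonometric identities. $\cos x = \frac{e^{ix} + e^{-ix}}{2}$ in combination with $\cos^2\frac{x}{2} = \frac{1 + \cos x}{2}$ gives us

$$|H(\omega)|^2 = \left(\cos^2\frac{\omega}{2}\right)^A |C(\omega)|^2$$

We can see directly from the definition of the Fourier transform that $|H(-\omega)|^2 = |H(\omega)|^2$. This makes $|H(\omega)|^2$ an even function, which means it is symmetric with respect to $\omega = 0$ and can be expressed in terms of only cosine functions. The same then has to be true for $|C(\omega)|^2$. With $\cos\omega = 1 - 2\sin^2\frac{\omega}{2}$ and $P$ being a polynomial we can write:

$$|C(\omega)|^2 = P\left(\sin^2\frac{\omega}{2}\right)$$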

Now we can write the shift orthogonality condition as

If we set , which also means , we can write this as
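The frequency-domain versions of both conditions can be checked numerically for a known filter. This is a small sketch for DB4, assuming the closed-form coefficients scaled to sum to 2 and the convention $H(\omega) = \sum_k h_0(k) e^{-i\omega k}$; `freq_response` is a helper name invented here.

```python
import numpy as np

# DB4 lowpass filter, scaled so that sum(h0) == 2 (assumed normalization).
s3 = np.sqrt(3.0)
h0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 4.0

def freq_response(h, omega, derivative=0):
    """Evaluate the given derivative of H(omega) = sum_k h(k) * exp(-i * omega * k)."""
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    k = np.arange(len(h), dtype=float)
    # d^a/d omega^a of exp(-i omega k) is (-i k)^a * exp(-i omega k).
    weights = (-1j) ** derivative * k**derivative
    return (np.exp(-1j * np.outer(omega, k)) * weights) @ h

omega = np.linspace(0.0, np.pi, 257)
# Shift orthogonality: |H(w)|^2 + |H(w + pi)|^2 is the same constant for every w.
power_sum = (np.abs(freq_response(h0, omega)) ** 2
             + np.abs(freq_response(h0, omega + np.pi)) ** 2)

# Vanishing moments: H and its first derivative vanish at pi, the second does not.
at_pi = [abs(freq_response(h0, np.pi, a)[0]) for a in range(3)]
print(power_sum.min(), power_sum.max(), at_pi)
```

The constant equals $|H(0)|^2$, i.e. the squared sum of the coefficients, so its numerical value depends only on how the filter is scaled.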

Bézout’s Identity

We are now probably at the most abstract point of the wavelet construction. Why all this effort to get the equation into this form? The reason is a number-theoretic result called Bézout’s identity. You can read about it further elsewhere, but in our case it says that there exists a polynomial $P$ of degree $A-1$ that satisfies the equation. What exactly it looks like, however, we have to find out ourselves. First we solve for $P(y)$:

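The second term has a factor $y^A$, so there is no way we can model it with a polynomial of degree $A-1$. The best we can do is create a Taylor expansion at $y = 0$ of the first term, which is a constant multiple of $(1-y)^{-A}$. For a function $f(y)$, a Taylor approximation of degree $A-1$ at $y = 0$ means

$$f(y) \approx \sum_{k=0}^{A-1} \frac{f^{(k)}(0)}{k!}\,y^k$$

Our $f(y)$ is $(1-y)^{-A}$, as we said. We can write out its $k$th derivative:

$$f^{(k)}(y) = A(A+1)\cdots(A+k-1)\,(1-y)^{-A-k} = \frac{(A+k-1)!}{(A-1)!}\,(1-y)^{-A-k}$$

This means the coefficient of $y^k$ in our Taylor series is

$$\frac{f^{(k)}(0)}{k!} = \binom{A+k-1}{k}$$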

Now we can write out our solution for :

Verifying that this indeed is a solution of is not trivial, but it can be done. We define

We now need to prove that this expression takes the same constant value for all $A$.

Here we have multiplied every term with , which of course is a legitimate thing to do. If we expand these products, we can write

Let’s just look at :

From this we see that

Now we have

Because this means

and by induction .
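The induction argument can also be sanity-checked numerically. The sketch below evaluates $(1-y)^A P(y) + y^A P(1-y)$ on a grid for several $A$, using the truncated binomial series $P(y) = \sum_{k=0}^{A-1} \binom{A+k-1}{k} y^k$. It assumes $P$ is scaled so that the right-hand side equals 1; any other normalization just multiplies both sides by a constant. The function names are made up for this example.

```python
from math import comb
import numpy as np

def bezout_polynomial(y, A):
    """Evaluate P(y) = sum_{k=0}^{A-1} binom(A+k-1, k) * y**k (scaled so that P(0) = 1)."""
    y = np.asarray(y, dtype=float)
    return sum(comb(A + k - 1, k) * y**k for k in range(A))

def bezout_lhs(y, A):
    """Evaluate (1-y)^A * P(y) + y^A * P(1-y)."""
    y = np.asarray(y, dtype=float)
    return (1 - y) ** A * bezout_polynomial(y, A) + y**A * bezout_polynomial(1 - y, A)

y = np.linspace(0.0, 1.0, 101)
# Largest deviation from 1 on the grid, for A = 1..10.
deviations = {A: float(np.max(np.abs(bezout_lhs(y, A) - 1.0))) for A in range(1, 11)}
print(deviations)
```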

Spectral Factorization

Now it is time to go back and basically reverse all steps to get our coefficients $h_0(k)$. Combining the results above, we have

We have defined , this means

From this we need to find . Like any polynomial, can be factorized as

where are the roots of and is a scaling constant. This obviously means

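The value of $H(0)$ follows from the shift orthogonality condition in combination with the vanishing moment condition; it in turn fixes $C(0)$ and $P(0)$, and we set the scaling constant $c$ accordingly. If we compare the factorization with our solution for $P$, we see that they fit together: the roots of $P$, written as a function of $z$, come in reciprocal pairs, which is just what we want for $C(z)\,C(z^{-1})$. To get $C(z)$, we need one half of these roots. If we don’t want to find the roots of a rational function, we can define the polynomial

$$Q(z) = z^{A-1}\,C(z)\,C(z^{-1})$$

and find the roots of $Q(z)$. We can do this because the factor $z^{A-1}$ leaves the existing roots untouched and does not introduce any new roots, since $Q(0) \neq 0$.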

As we said, for any root of $Q(z)$, its reciprocal will automatically be a root too. To get the correct $C(z)$, we have to select only one of each pair as $z_k$. If a root is complex, we also have to select its complex conjugate, which will also be a root, to get real coefficients. A simple way to meet both of these conditions is to only select the roots inside the unit circle, meaning those with an absolute value less than 1. This will give us the original Daubechies coefficients. Because $Q(z)$ has $2(A-1)$ roots, $C(z)$ will have a degree of $A-1$. With the roots selected, we have

and with

Expanding this into the form

gives us the filter coefficients $h_0(k)$ we are looking for. As we see, a Daubechies filter with $A$ vanishing moments has $2A$ coefficients.

Algorithm for calculating Daubechies coefficients

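  • Define $A$ as the number of vanishing moments you want from your filters
  • Calculate the roots of the polynomial $Q(z) = \sum_{k=0}^{A-1} \binom{A+k-1}{k} \left(-\tfrac{1}{4}\right)^k (z-1)^{2k} z^{A-1-k}$ (a constant factor in front of $Q$ does not change the roots). Do this with your favourite math library, or implement an algorithm that finds the roots using Newton’s method.
  • Set $z_k$, where $k = 1, \dots, A-1$, to the roots inside the unit circle, i.e. $|z_k| < 1$ must be true for all $k$.
  • Expand the polynomial $H(z) = c\,(1 + z^{-1})^A \prod_{k=1}^{A-1} (1 - z_k z^{-1})$ into the form $\sum_{k=0}^{2A-1} h_0(k)\,z^{-k}$, with the constant $c$ chosen to match the filter normalization. The values $h_0(k)$ are the coefficients of the lowpass analysis filter.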
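The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a tuned implementation: the function name `daubechies_lowpass` and its `total` parameter are invented here, $Q(z)$ is built from the binomial solution with $y = -(z-1)^2/(4z)$, and the result is scaled so that the coefficients sum to `total` (2 by default, which is an assumption about the normalization; use $\sqrt{2}$ for the orthonormal convention).

```python
from math import comb
import numpy as np

def daubechies_lowpass(A, total=2.0):
    """Lowpass filter with A vanishing moments, scaled so that the coefficients sum to `total`."""
    # Build Q(z) = sum_k binom(A+k-1, k) * (-1/4)^k * (z-1)^(2k) * z^(A-1-k),
    # stored as coefficients with the highest power of z first.
    Q = np.zeros(1)
    for k in range(A):
        term = np.array([comb(A + k - 1, k) * (-0.25) ** k])
        for _ in range(2 * k):
            term = np.polymul(term, [1.0, -1.0])               # multiply by (z - 1)
        term = np.polymul(term, [1.0] + [0.0] * (A - 1 - k))    # multiply by z^(A-1-k)
        Q = np.polyadd(Q, term)

    # Keep one root of each reciprocal pair: the ones inside the unit circle.
    roots = np.roots(Q)
    inside = roots[np.abs(roots) < 1.0]

    # Expand (1 + z^-1)^A * prod_k (1 - z_k z^-1); entries are coefficients of z^0, z^-1, ...
    h = np.array([1.0])
    for _ in range(A):
        h = np.polymul(h, [1.0, 1.0])
    for zk in inside:
        h = np.polymul(h, [1.0, -zk])
    h = np.real(h)  # imaginary parts cancel because roots come in conjugate pairs
    return h * total / h.sum()

haar = daubechies_lowpass(1)
db4 = daubechies_lowpass(2)
db6 = daubechies_lowpass(3)
print(db6)
```

For large $A$ the roots of $Q(z)$ become harder to compute accurately with a generic root finder, so this sketch is only meant for moderate filter lengths.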

Example: Calculating the DB6 coefficients

We set $A = 3$, which gives us the polynomial

As we see, the real part is symmetric and the imaginary part is antisymmetric with respect to the real axis. Two roots (the small white spheres) lie inside, two outside the unit circle.

Figure 4 The $Q(z)$ polynomial for the DB6 coefficients (red: real, blue: imaginary) and its roots

We can ask WolframAlpha for the roots and get the solutions $z \approx 0.2873 + 0.1529i$, $0.2873 - 0.1529i$, $2.7127 + 1.4439i$ and $2.7127 - 1.4439i$. Actually, in this case there exist exact solutions, but they are a bit impractical for this demonstration.
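If you prefer not to rely on WolframAlpha, the roots can also be computed via the substitution $y = -(z-1)^2/(4z)$. The sketch below assumes that for $A = 3$ the binomial solution is proportional to $P(y) = 1 + 3y + 6y^2$: it first solves the quadratic in $y$ and then, for each $y$, the quadratic $z^2 + (4y - 2)z + 1 = 0$ in $z$.

```python
import numpy as np

# Roots of P(y) = 6 y^2 + 3 y + 1 (assumed form of the A = 3 solution, up to a constant).
y_roots = np.roots([6.0, 3.0, 1.0])

# y = -(z - 1)^2 / (4 z)  is equivalent to  z^2 + (4 y - 2) z + 1 = 0.
z_roots = np.concatenate([np.roots([1.0, 4.0 * y - 2.0, 1.0]) for y in y_roots])

# Each quadratic in z yields one reciprocal pair; keep the roots inside the unit circle.
inside = z_roots[np.abs(z_roots) < 1.0]
outside = z_roots[np.abs(z_roots) > 1.0]
print(inside, outside)
```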

We set $z_1 \approx 0.2873 + 0.1529i$ and $z_2 \approx 0.2873 - 0.1529i$ because they are complex conjugates and have an absolute value smaller than 1. Now we have

That gives us the DB6 coefficients:


We saw the significance of vanishing moments to continuous wavelets as well as to the filters of a FWT implementation (these filters define wavelets with the same number of vanishing moments via the refinement equations). They make sections of signals that resemble polynomials highly compressible and very easy to handle. We then derived an algorithm to calculate Daubechies filters with any number of vanishing moments. This was done by bringing the shift orthogonality and the vanishing moment condition to the frequency domain. We combined them into a single polynomial equation, which we were able to solve using Bézout’s identity. To get back to $h_0$, we used a procedure called spectral factorization. We factorized $C(z)$ using the roots of the polynomial $Q(z)$ we found earlier.

In practice, you probably won’t need to calculate the Daubechies coefficients, they are available online from DB2 to at least DB20. But there are many other wavelets with pros and cons for different applications. With these two articles you should have the background to understand most papers explaining them, or even invent new wavelets yourself.

For the next post, I am planning to write about the Dual Tree Wavelet Transform, which uses complex wavelets. It is currently mainly used in image recognition, but I think it also has great potential for audio processing. But I am open to suggestions for the next topic as well! Again, any feedback is appreciated. Thank you for reading!
