
Is multistage rate-changing actually more efficient?

Started by Oli Charlesworth July 8, 2009
Hello all,

This is a question which probably boils down to "what am I missing?".

It seems to be commonly stated that for interpolation or decimation,
it's generally more computationally efficient to perform the operation
in multiple stages, i.e. factorise the rate-change factor, and design a
lower-order filter for each stage.  I completely accept that designing
the individual filters in this way achieves a lower complexity, but I'm
beginning to question whether one actually needs to implement the rate
change in separate stages.

Consider a system where we need to decimate by L = L1.L2.  (For the sake
of argument, let's assume that neither L1 nor L2 equals 2, so that we
can't "cheat" by implementing half-band filters.)  Using the typical
approach, we might come up with two decimation filters, H1 and H2, of
length N1 and N2 respectively.  i.e. our system is:

--> H1 --> v L1 --> H2 --> v L2 -->

If implemented as polyphase filters, the number of operations per input
sample is (N1 / L1) + (N2 / L1.L2).

But we could construct an equivalent overall filter H as the convolution
of H1 and H2', where H2' is H2 simply upsampled by L1 by zero-stuffing.
 The equivalent overall system would be:

--> H --> v L1.L2 -->

The order of H is now approximately N1 + L1.N2.  Therefore, the number
of operations per input sample in this approach is (N1 + L1.N2) / L1.L2
= (N1 / L1.L2) + (N2 / L2).
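
As a quick sanity check of that length claim, here's a small Matlab
sketch (the lengths and factor below are arbitrary examples):

% Arbitrary example lengths and first-stage factor
N1 = 10;  N2 = 15;  L1 = 4;
h1 = randn(1, N1);
h2 = randn(1, N2);
h2_stuffed = kron(h2, [1 zeros(1, L1-1)]);  % H2 zero-stuffed (upsampled) by L1
h = conv(h1, h2_stuffed);                   % combined filter H
fprintf('length(h) = %d,  N1 + L1*N2 = %d\n', length(h), N1 + L1*N2);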

It would seem, therefore, that depending on the particular values being
used, the single stage approach may be as efficient or more efficient
than the multi-stage approach.

What am I missing?  (Or, where have I made a mistake?)


-- 
Oli
On Jul 8, 5:10 pm, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> It would seem, therefore, that depending on the particular values being
> used, the single stage approach may be as efficient or more efficient
> than the multi-stage approach.
>
> What am I missing?  (Or, where have I made a mistake?)

Not sure why you call this a "single stage" approach, because you still
have to design H1 and H2 before you can implement it.  In fact, the two
are the same in terms of efficiency.  Consider the conventional
approach:

--> H1 --> v L1 --> H2 --> v L2 -->

Applying a "noble identity" conversion to just the portion

--> v L1 --> H2

you get:

--> H1 --> H2(z^L1) --> v L1 --> v L2 -->    (H2(z^L1) means H2 upsampled by L1)

Thus H1(z)*H2(z^L1) is the H filter in your approach, and you can
combine the decimators into v L1.L2 as well.  I don't see why it can be
"more efficient" than the multirate design.

As an example, say you need to downsample by 12 = 3 * 4 (it can't be
2*6, as you specified).  You can let H1 correspond to either L1 = 3 or
L1 = 4, with L2 determined accordingly.  That is:

L1 = 3:  --> H1 --> v 3 --> H2 --> v 4 -->
L1 = 4:  --> H2' --> v 4 --> H1' --> v 3 -->

Note that H2' does not equal H2, because the decimation filters need to
be alias-free.  Is this anti-aliasing feature what you mean by "more
efficient"?
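
A quick numerical check of that noble-identity step, in the same vein
as the Matlab further down the thread (the filter length, rate and test
signal below are arbitrary):

% Noble identity: (v L1 then H2)  ==  (H2(z^L1) then v L1)
L1 = 3;                 % arbitrary example rate
h2 = randn(1, 8);       % arbitrary example filter
x  = randn(1, 200);     % arbitrary test signal

% Left-hand side: decimate by L1, then filter by H2
a = filter(h2, 1, x(1:L1:end));

% Right-hand side: filter by H2(z^L1) (zero-stuffed H2), then decimate
b = filter(kron(h2, [1 zeros(1, L1-1)]), 1, x);
b = b(1:L1:end);

max(abs(a - b))         % should be at machine-precision level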

Oli Charlesworth wrote:

> Hello all,
>
> This is a question which probably boils down to "what am I missing?".

This boils down to N versus log(N) operations, exactly like with the FFTs. Look into the classic books such as "Multirate Signal Processing" by Rabiner.

> It seems to be commonly stated that for interpolation or decimation,
> it's generally more computationally efficient to perform the operation
> in multiple stages, i.e. factorise the rate-change factor, and design a
> lower-order filter for each stage.

Although this is generally true, this doesn't account for the overhead
associated with the setup of each filtering stage.  The overhead may or
may not be substantial.

> I completely accept that designing
> the individual filters in this way achieves a lower complexity, but I'm
> beginning to question whether one actually needs to implement the rate
> change in separate stages.

That entirely depends on the particularities of your application.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

On Jul 8, 5:10 pm, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> Hello all,
>
> This is a question which probably boils down to "what am I missing?".
> ...

The following statement is false:

> But we could construct an equivalent overall filter H as the convolution
> of H1 and H2', where H2' is H2 simply upsampled by L1 by zero-stuffing.
> The equivalent overall system would be:
> ...
> What am I missing?  (Or, where have I made a mistake?)
>
> --
> Oli

The multistage design of H1 has far too wide a transition band for the
single stage scheme to work.  (This relaxation in transition band is one
of the major reasons for multistage implementation efficiency.)

The zero-stuffed H2 does not have the correct response for the single
stage filter design scheme given.

Combining two filters as in the single stage scheme is not an efficient
filter design method for comparison to a multistage design.

You post enough to know how to query 'multistage decimation' on Google
or search comp.dsp for a recommended text.  There are a lot of
explanations and examples available.

Good luck

Dale B. Dalrymple
dbd wrote:
> The zero-stuffed H2 does not have the correct response for the single
> stage filter design scheme given.

Yes, but when convolved with H1, it does.

Would you not agree that, by a Noble identity, the following are
equivalent:

--> H1 --> v L1 --> H2 --> v L2 -->
--> H1 --> H2' --> v L1 --> v L2 -->

i.e. we can push H2 through the decimator, where formally
H2'(z) = H2(z^L1)?

If so, it is then a trivial step to combine the two adjacent filters by
convolution, and also to combine the two adjacent decimators into one.
The resulting system will be mathematically identical to the original
two-stage implementation.

(In fact, what I've just written mirrors precisely what Verictor has
written in another reply.)

> Combining two filters as in the single stage scheme is not an
> efficient filter design method for comparison to a multistage design.
>
> You post enough to know how to query 'multistage decimation' on Google
> or search comp.dsp for a recommended text.  There are a lot of
> explanations and examples available.

I'm aware that designing a single filter straight off is less efficient
than designing a separate filter for each stage, and I understand that
the reason for this is the inverse relationship between normalised
transition bandwidth and filter length, and that the normalised
transition bandwidth decreases as the decimation ratio increases.  In
that respect, I've done plenty of reading in the past.

The problem I face is why the example I've given would appear to
demonstrate otherwise!

--
Oli
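
To put some rough numbers on that inverse relationship, here is a small
sketch using Harris's rule-of-thumb length estimate N ~ A / (22 * df),
where A is the stopband attenuation in dB and df is the transition
width normalised to the rate at which the filter runs.  The spec below
(decimate by 12, 60 dB) is an arbitrary example, not anything from the
thread:

% Example spec: decimate by 12, 60 dB stopband attenuation
A  = 60;                  % stopband attenuation, dB
L1 = 4;  L2 = 3;  L = L1*L2;
fp  = 0.8 / (2*L);        % passband edge (input sample rate = 1)
fst = 1 / (2*L);          % final stopband edge = output Nyquist

% Single filter at the input rate: very narrow normalised transition band
N_single = A / (22 * (fst - fp));

% Two stages: stage 1 only needs to keep aliases out of the final band,
% so its stopband edge can sit at 1/L1 - fst (a much wider transition)
N1 = A / (22 * ((1/L1 - fst) - fp));
% Stage 2 runs at 1/L1 of the input rate; normalise its transition to that
N2 = A / (22 * L1*(fst - fp));

fprintf('Single stage: ~%3.0f taps -> %4.1f ops per input sample\n', ...
        N_single, N_single/L);
fprintf('Two stages  : ~%3.0f + %3.0f taps -> %4.1f ops per input sample\n', ...
        N1, N2, N1/L1 + N2/(L1*L2));

With these example numbers the two separately designed stages come out
far cheaper, because N1 and N2 together are much shorter than the
single filter would have to be.
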
Verictor wrote:
> Not sure why you call this a "single stage" approach, because you
> still have to design H1 and H2 before you can implement it.

Agreed.  What I'm questioning is whether, once one has the two filters,
one actually needs to bother with two separate filters and two separate
decimators.

> In fact, the two are the same in terms of efficiency.  Consider the
> conventional approach:
>
> --> H1 --> v L1 --> H2 --> v L2 -->
>
> Applying a "noble identity" conversion to just the portion
>
> --> v L1 --> H2
>
> you get:
>
> --> H1 --> H2(z^L1) --> v L1 --> v L2 -->    (H2(z^L1) means H2 upsampled by L1)
>
> Thus H1(z)*H2(z^L1) is the H filter in your approach, and you can
> combine the decimators into v L1.L2 as well.

Indeed. I completely agree with you up to this point.

> I don't see why it can be "more efficient" than the multirate design.

Would you agree that:

* the order of H1(z)*H2(z^L1) is approximately (N1 + L1.N2)?  (The
  difference is negligible for large L1 and L2.)
* the number of operations per input sample of a polyphase decimation
  filter is N/L?

From that, it is simple to obtain the number of operations per input
sample of the two approaches (as before):

(N1 / L1) + (N2 / L1.L2) for the two-stage approach
(N1 / L1.L2) + (N2 / L2) for the combined approach

Setting these as an inequality and rearranging, I find that when
(N1/N2) > (L1 - 1)/(L2 - 1), the combined approach will have fewer
operations.
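
To check that inequality against the raw operation counts, here is a
small sketch with two arbitrary example pairs of filter lengths, one on
each side of the threshold:

% Decimation factors
L1 = 4;  L2 = 3;

% Each row is an example [N1 N2] pair (arbitrary values)
examples = [ 20  60;      % short H1, long H2 (typical of a two-stage design)
            100  10 ];    % the other way round
for k = 1:size(examples, 1)
    N1 = examples(k, 1);
    N2 = examples(k, 2);
    two_stage = N1/L1      + N2/(L1*L2);   % ops per input sample, two stages
    combined  = N1/(L1*L2) + N2/L2;        % ops per input sample, combined filter
    fprintf('N1 = %3d, N2 = %3d:  two-stage = %5.2f,  combined = %5.2f\n', ...
            N1, N2, two_stage, combined);
end

The first pair (which is what a sensibly designed two-stage decimator
tends to look like) favours the two stages; the second, contrived pair
favours the combined filter, consistent with the inequality.
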
> As an example, say you need to downsample by 12 = 3 * 4 (it can't be
> 2*6, as you specified).  You can let H1 correspond to either L1 = 3 or
> L1 = 4, with L2 determined accordingly.  That is:
>
> L1 = 3:  --> H1 --> v 3 --> H2 --> v 4 -->
> L1 = 4:  --> H2' --> v 4 --> H1' --> v 3 -->
>
> Note that H2' does not equal H2, because the decimation filters need
> to be alias-free.  Is this anti-aliasing feature what you mean by
> "more efficient"?

I'm not entirely sure what you're trying to show with your example.
Could you expand on this a little?

--
Oli

Vladimir Vassilevsky wrote:
> Oli Charlesworth wrote:
>
>> This is a question which probably boils down to "what am I missing?".
>
> This boils down to N versus log(N) operations, exactly like with the
> FFTs.  Look into the classic books such as "Multirate Signal Processing"
> by Rabiner.

I have "Multirate Signal Processing" by Harris, and I've thoroughly read "Multirate Systems and Filter Banks" by Vaidyanathan in the past. Would you recommend Rabiner's book as a useful addition?

>> It seems to be commonly stated that for interpolation or decimation,
>> it's generally more computationally efficient to perform the operation
>> in multiple stages, i.e. factorise the rate-change factor, and design a
>> lower-order filter for each stage.
>
> Although this is generally true, this doesn't account for the overhead
> associated with the setup of each filtering stage.  The overhead may
> or may not be substantial.

Agreed.  In particular, two separate filters require significantly less
storage for delay lines and filter coefficients than a combined
approach.

--
Oli

On Jul 9, 12:11 am, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> dbd wrote:
>
> > The zero-stuffed H2 does not have the correct response for the single
> > stage filter design scheme given.
>
> Yes, but when convolved with H1, it does.

No. The H1 from an efficiently designed two stage version does not
have a narrow enough transition band to work with your H2' to form a
single stage decimator with the same pass and stop bands.

> ...
> The problem I face is why the example I've given would appear to
> demonstrate otherwise!

You have compared the operations in a two stage filter that works (by
definition) for the desampling task and a single stage filter that
doesn't meet specs.

> Oli

If you have the information to, try actually specifying the filters
for an example before you compare. Or try it with one of the examples
in the references.

Dale B. Dalrymple
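
As a rough illustration of that suggestion, here is an attempt at
actually specifying the filters for a concrete decimate-by-12 example.
This assumes the Signal Processing Toolbox's firpmord order estimator;
the passband/stopband numbers are arbitrary, not anything from the
thread:

% Decimate by 12 = 4 * 3; ~1% passband ripple, 60 dB stopband
fs  = 1;   L1 = 4;   L2 = 3;   L = L1*L2;
fp  = 0.8 * fs/(2*L);          % passband edge
fst = fs/(2*L);                % stopband edge = final output Nyquist
dp  = 0.01;                    % passband deviation
ds  = 0.001;                   % stopband deviation (60 dB)

% Single filter at the full input rate
n_single = firpmord([fp fst], [1 0], [dp ds], fs);

% Stage 1 (decimate by 4): its stopband only has to start where energy
% would alias onto the final band, i.e. at fs/L1 - fst
n1 = firpmord([fp (fs/L1 - fst)], [1 0], [dp/2 ds], fs);

% Stage 2 (decimate by 3) runs at the reduced rate fs/L1
n2 = firpmord([fp fst], [1 0], [dp/2 ds], fs/L1);

fprintf('Single stage: order %d  (%.1f ops per input sample)\n', ...
        n_single, n_single/L);
fprintf('Two stages  : orders %d and %d  (%.1f ops per input sample)\n', ...
        n1, n2, n1/L1 + n2/(L1*L2));
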
On Jul 9, 10:53 am, dbd <d...@ieee.org> wrote:
> On Jul 9, 12:11 am, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> > dbd wrote:
> >
> > > The zero-stuffed H2 does not have the correct response for the single
> > > stage filter design scheme given.
> >
> > Yes, but when convolved with H1, it does.
>
> No.  The H1 from an efficiently designed two stage version does not
> have a narrow enough transition band to work with your H2' to form a
> single stage decimator with the same pass and stop bands.

I'm sorry, but yes it does work (try it!).  The order of operations can
quite clearly be rearranged as I suggested in my previous post (as did
Verictor), with the result being mathematically equivalent.

If you disagree, then which of the following steps is incorrect?  (i.e.
which transformation no longer results in a mathematically equivalent
system?)

1.) --> H1(z) --> v L1 --> H2(z) --> v L2 -->
2.) --> H1(z) --> H2(z^L1) --> v L1 --> v L2 -->
3.) --> H1(z) --> H2(z^L1) --> v L1.L2 -->
4.) --> H1(z).H2(z^L1) --> v L1.L2 -->

If you want to see it work in practice, here's some Matlab code:

% Random filters
h1 = randn(1, 10);
h2 = randn(1, 15);

% Decimation rates
L1 = 2;
L2 = 3;

% Random samples
x = randn(1, 1000);

%%%%%%%%%%%%%%%%%%%
% Approach 1 - separate stages
%%%%%%%%%%%%%%%%%%%

% Filter by h1
t1 = filter(h1, 1, x);
% Decimate
t2 = t1(1:L1:end);
% Filter by h2
t3 = filter(h2, 1, t2);
% Decimate
y1 = t3(1:L2:end);

%%%%%%%%%%%%%%%%%%%
% Approach 2 - combine
%%%%%%%%%%%%%%%%%%%

% Combined filter
h = conv(h1, kron(h2, [1 zeros(1, L1-1)]));
% Filter samples by h
t4 = filter(h, 1, x);
% Combined decimate
y2 = t4(1:(L1*L2):end);

%%%%%%%%%%%%%%%%%%%
% Compare
%%%%%%%%%%%%%%%%%%%
cla, hold on, plot(y1, 'b.-'), plot(y2, 'r.-')

--
Oli

On Jul 9, 12:13 am, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> I'm not entirely sure what you're trying to show with your example.
> Could you expand on this a little?

The reason I raised the downsample-by-12 example is that I didn't know
your condition was "you're given two filters H1 and H2".  Using the
downsample-by-12 example, you can have TWO sets of filters: one
decimates by 4 first and then by 3, and the other decimates by 3 first
and then by 4.  Due to the anti-aliasing requirements, they are not
equal in general.  However, since you said you're given two filters H1
and H2, I know you meant "let's use the down-by-4 followed by down-by-3
filter set".  This part is clear now.

I agree with these:

* the order of H1(z)*H2(z^L1) is approximately (N1 + L1.N2).  In fact it
  should be (N1 + L1.N2 + 1), but that's not a big deal in this context.
* the number of operations per input sample of a polyphase decimation
  filter is N/L.

Along those lines, you can reach (with some approximations)

(N1 / L1) + (N2 / L1.L2) for the two-stage approach    (1)
(N1 / L1.L2) + (N2 / L2) for the combined approach     (2)

So far, so good.  Note that for the combined approach (2), there is only
one filter and one decimator.  Also note that if you swap N1 and N2 in
(2), you get the form of (1) with the ordering of the factors reversed
(decimate by L2 first, then by L1).  What does this "swap" mean when
your inequality holds?  Using my down-by-12 example, it means you should
decimate by 4 first and then by 3, and shouldn't decimate by 3 first and
then by 4, given that "you're given H1 and H2".  In other words, the
given H1 and H2 aren't efficient compared with the other set.

When I say "efficient", I mean the performance of the two systems should
be comparable.  But it is difficult to compare them accurately, I think,
as these two sets of filters aren't exactly the same.