Reply by Vladimir Vassilevsky ● July 9, 2009
Oli Charlesworth wrote:
> Hello all,
>
> This is a question which probably boils down to "what am I missing?".
>
> It seems to be commonly stated that for interpolation or decimation,
> it's generally more computationally efficient to perform the operation
> in multiple stages, i.e. factorise the rate-change factor, and design a
> lower-order filter for each stage. I completely accept that designing
> the individual filters in this way achieves a lower complexity, but I'm
> beginning to question whether one actually needs to implement the rate
> change in separate stages.
>
> Consider a system where we need to decimate by L = L1.L2. (For the sake
> of argument, let's assume that neither L1 nor L2 equals 2, so that we
> can't "cheat" by implementing half-band filters.) Using the typical
> approach, we might come up with two decimation filters, H1 and H2, of
> length N1 and N2 respectively. i.e. our system is:
>
> --> H1 --> v L1 --> H2 --> v L2 -->
>
> If implemented as polyphase filters, the number of operations per input
> sample is (N1 / L1) + (N2 / L1.L2).
>
> But we could construct an equivalent overall filter H as the convolution
> of H1 and H2', where H2' is H2 simply upsampled by L1 by zero-stuffing.
> The equivalent overall system would be:
>
> --> H --> v L1.L2 -->
>
> The order of H is now approximately N1 + L1.N2. Therefore, the number
> of operations per input sample in this approach is (N1 + L1.N2) / L1.L2
> = (N1 / L1.L2) + (N2 / L2).
>
> It would seem, therefore, that depending on the particular values being
> used, the single stage approach may be as efficient or more efficient
> than the multi-stage approach.
>
> What am I missing? (Or, where have I made a mistake?)
The number of taps per output sample for the two-stage polyphase filter:
N2 + L2·N1
The number of taps per output sample for the filter in one stage:
N1 + L1·N2
Substitute the formulas for estimating N1 and N2 as functions of L1 and L2,
and you will see that the filter in one stage is going to take more.
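This comparison can be sketched numerically. The lengths below are illustrative assumptions (not designed filters), and the single-stage count uses the order estimate N1 + L1·N2 from the thread; in a typical multistage design H2 runs at the lower rate and is the longer filter, which is what tips the balance:

```python
# Taps per output sample for the two arrangements, with assumed lengths.
L1, L2 = 3, 4
N1, N2 = 20, 60   # assumption: the second-stage filter H2 is the longer one

two_stage = N2 + L2 * N1   # polyphase two-stage, taps per output sample
one_stage = N1 + L1 * N2   # combined single-stage, taps per output sample
print(two_stage, one_stage)
```

With these numbers the single stage costs 200 taps per output against 140 for two stages; the crossover depends on the ratio N1/N2, as discussed elsewhere in the thread.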
Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Reply by Verictor ● July 9, 2009
On Jul 9, 12:13 am, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> Verictor wrote:
>
> > Not sure why you call this a "single stage" approach, because you
> > effectively have to design H1 and H2 before you can implement this
> > "single stage" approach.
>
> Agreed. What I'm questioning is, once one has the two filters, does one
> actually need to bother with two separate filters and two separate
> decimators.
>
> > In fact, they are the same in terms of efficiency. Consider the
> > conventional approach, i.e.,
>
> > --> H1 --> v L1 --> H2 --> v L2 -->
>
> > If doing a "noble identity" conversion just using
>
> > --> v L1 --> H2
>
> > You can get this:
>
> > --> H1 --> H2(z^L1) --> v L1 --> v L2 -->      (H2(z^L1) means H2
> > upsampled by L1)
>
> > Thus, H1(z)*H2(z^L1) is the H filter in your approach. So that you can
> > have vL1.L2 too.
>
> Indeed. I completely agree with you up to this point.
>
> > I don't see why it can be "more efficient" than the
> > multirate design.
>
> Would you agree that:
> * the order of H1(z)*H2(z^L1) is approximately (N1+L1.N2)? (where the
> difference is negligible for large L1 and L2).
> * the number of operations per input sample of a polyphase decimation
> filter is N/L?
>
> From that, it is simple to obtain the number of operations per input
> sample of the two approaches (as before):
>
> (N1 / L1) + (N2 / L1.L2) for the two-stage approach
> (N1 / L1.L2) + (N2 / L2) for the combined approach
>
> Setting these as an inequality and rearranging, I find that the combined
> approach will have fewer operations when (N1/N2) exceeds (L1 - 1)/(L2 - 1).
>
> > As an example, say you need to achieve a
> > downsample of 12 = 3 * 4 (can't be 2*6, as you specified). You can either
> > let H1 correspond to L1=3 or L1=4, so that L2 is determined
> > respectively. That is,
>
> > L1=3:    --> H1 --> v 3 --> H2 --> v 4 -->
> > L1=4:    --> H2' --> v 4 --> H1' --> v 3 -->
>
> > Note that H2' doesn't equal H2, because the decimation filters need
> > to be alias-free. Is this anti-alias feature what you mean by "more
> > efficient"?
>
> I'm not entirely sure what you're trying to show with your example.
> Could you expand on this a little?
>
> --
> Oli
The reason I raised the downsample-by-12 example is that I didn't know
your condition was "you're given two filters H1 and H2". Using the
downsample-by-12 example, you can have TWO sets of filters: one is
down-by-4 first followed by down-by-3, and the other is down-by-3 first
followed by down-by-4. Due to anti-alias requirements, they are not
equal in general. However, since you said you're given two filters H1 and
H2, I know you meant "let's use the down-by-4 followed by down-by-3
filter set". This part is clear now.
I agree to these:
* the order of H1(z)*H2(z^L1) is approximately (N1+L1.N2). In fact it
should be (N1 + L1*N2 + 1), but that's not a big deal in this context.
* the number of operations per input sample of a polyphase decimation
filter is N/L.
Along the lines, you can reach (with some approximations)
(N1 / L1) + (N2 / L1.L2) for the two-stage approach (1)
(N1 / L1.L2) + (N2 / L2) for the combined approach (2)
So far, so good. Note that for the combined approach (2), there is
only one filter and one decimator. Also note that if you swap N1 and
N2 in (2), you get exactly (1). What does "swap" mean when the
inequality does hold? Using my down-by-12 example, it means you
should use down-by-4 followed by down-by-3, and shouldn't use down-by-3
followed by down-by-4, as you have said "you're given H1 and H2". In
other words, the given H1 and H2 aren't efficient compared to the other
set. When I say efficient, I mean the performance of the two systems
should be comparable. But it is difficult, I think, to compare them
accurately, as the two sets of filters aren't exactly the same.
Reply by Oli Charlesworth ● July 9, 2009
On Jul 9, 10:53 am, dbd <d...@ieee.org> wrote:
> On Jul 9, 12:11 am, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> > dbd wrote:
> >
> > > The zero-stuffed H2 does not have the correct response for the single
> > > stage filter design scheme given.
> >
> > Yes, but when convolved with H1, it does.
>
> No. The H1 from an efficiently designed two stage version does not
> have a narrow enough transition band to work with your H2' to form a
> single stage decimator with the same pass and stop bands.
I'm sorry, but yes it does work (try it!). The order of operations
can quite clearly be rearranged as I suggested in my previous post (as
did Verictor), with the result being mathematically equivalent. If
you disagree, then which of the following steps is incorrect? (i.e.
which transformation no longer results in a mathematically equivalent
system?)
1.) --> H1(z) --> v L1 --> H2(z) --> v L2 -->
2.) --> H1(z) --> H2(z^L1) --> v L1 --> v L2 -->
3.) --> H1(z) --> H2(z^L1) --> v L1.L2 -->
4.) --> H1(z).H2(z^L1) --> v L1.L2 -->
If you want to see it work in practice, here's some Matlab code:
% Random filters
h1 = randn(1, 10);
h2 = randn(1, 15);
% Decimation rates
L1 = 2;
L2 = 3;
% Random samples
x = randn(1, 1000);
%%%%%%%%%%%%%%%%%%%
% Approach 1 - separate stages
%%%%%%%%%%%%%%%%%%%
% Filter by h1
t1 = filter(h1, 1, x);
% Decimate
t2 = t1(1:L1:end);
% Filter by h2
t3 = filter(h2, 1, t2);
% Decimate
y1 = t3(1:L2:end);
%%%%%%%%%%%%%%%%%%%
% Approach 2 - combine
%%%%%%%%%%%%%%%%%%%
% Combined filter
h = conv(h1, kron(h2, [1 zeros(1, L1-1)]));
% Filter samples by h
t4 = filter(h, 1, x);
% Combined decimate
y2 = t4(1:(L1*L2):end);
%%%%%%%%%%%%%%%%%%%
% Compare
%%%%%%%%%%%%%%%%%%%
cla, hold on, plot(y1, 'b.-'), plot(y2, 'r.-')
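For a numeric rather than visual comparison, here is an equivalent sketch in Python/NumPy. The truncation to the input length mimics Matlab's filter(), which keeps the first len(x) output samples of the convolution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random filters and input, mirroring the Matlab example above
h1 = rng.standard_normal(10)
h2 = rng.standard_normal(15)
L1, L2 = 2, 3
x = rng.standard_normal(1000)

# Approach 1 - separate stages (convolve truncated to the input length
# behaves like Matlab's filter(h, 1, x))
t1 = np.convolve(h1, x)[:len(x)]
t2 = t1[::L1]
t3 = np.convolve(h2, t2)[:len(t2)]
y1 = t3[::L2]

# Approach 2 - combined filter: h1 convolved with h2 zero-stuffed by L1
h2_up = np.zeros(L1 * (len(h2) - 1) + 1)
h2_up[::L1] = h2
h = np.convolve(h1, h2_up)
t4 = np.convolve(h, x)[:len(x)]
y2 = t4[::(L1 * L2)]

# The two outputs agree to floating-point rounding
print(np.max(np.abs(y1 - y2)))
assert np.allclose(y1, y2)
```

The maximum difference is at the level of floating-point rounding, consistent with the two systems being mathematically identical.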
--
Oli
Reply by dbd ● July 9, 2009
On Jul 9, 12:11 am, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> dbd wrote:
>
> > The zero-stuffed H2 does not have the correct response for the single
> > stage filter design scheme given.
>
> Yes, but when convolved with H1, it does.
No. The H1 from an efficiently designed two stage version does not
have a narrow enough transition band to work with your H2' to form a
single stage decimator with the same pass and stop bands.
> ...
> The problem I face is why the example I've given would appear to
> demonstrate otherwise!
You have compared the operations in a two stage filter that works (by
definition) for the decimation task and a single stage filter that
doesn't meet specs.
> Oli
If you have the information to, try actually specifying the filters
for an example before you compare. Or try it with one of the examples
in the references.
Dale B. Dalrymple
Reply by Oli Charlesworth ● July 9, 2009
Vladimir Vassilevsky wrote:
>
>
> Oli Charlesworth wrote:
>
>> This is a question which probably boils down to "what am I missing?".
>
> This boils down to N versus log(N) operations, exactly like with the
> FFTs. Look into the classic books such as "Multirate Signal Processing"
> by Rabiner.
I have "Multirate Signal Processing" by Harris, and I've thoroughly read
"Multirate Systems and Filter Banks" by Vaidyanathan in the past. Would
you recommend Rabiner's book as a useful addition?
>> It seems to be commonly stated that for interpolation or decimation,
>> it's generally more computationally efficient to perform the operation
>> in multiple stages, i.e. factorise the rate-change factor, and design a
>> lower-order filter for each stage.
>
> Although this is generally true, it doesn't account for the overhead
> associated with the setup of each filtering stage. The overhead may
> or may not be substantial.
Agreed. In particular, two separate filters require significantly less
storage for delay lines and filter coefficients than a combined approach.
--
Oli
Reply by Oli Charlesworth ● July 9, 2009
Verictor wrote:
>
> Not sure why you call this a "single stage" approach, because you
> effectively have to design H1 and H2 before you can implement this
> "single stage" approach.
Agreed. What I'm questioning is once one has the two filters, does one
actually need to bother with two separate filters and two separate
decimators.
> In fact, they are the same in terms of efficiency. Consider the
> conventional approach, i.e.,
>
> --> H1 --> v L1 --> H2 --> v L2 -->
>
> If doing a "noble identity" conversion just using
>
> --> v L1 --> H2
>
> You can get this:
>
> --> H1 --> H2(z^L1) --> v L1 --> v L2 -->      (H2(z^L1) means H2
> upsampled by L1)
>
> Thus, H1(z)*H2(z^L1) is the H filter in your approach. So that you can
> have vL1.L2 too.
Indeed. I completely agree with you up to this point.
> I don't see why it can be "more efficient" than the
> multirate design.
Would you agree that:
* the order of H1(z)*H2(z^L1) is approximately (N1+L1.N2)? (where the
difference is negligible for large L1 and L2).
* the number of operations per input sample of a polyphase decimation
filter is N/L?
From that, it is simple to obtain the number of operations per input
sample of the two approaches (as before):
(N1 / L1) + (N2 / L1.L2) for the two-stage approach
(N1 / L1.L2) + (N2 / L2) for the combined approach
Setting these as an inequality and rearranging, I find that the combined
approach will have fewer operations when (N1/N2) exceeds (L1 - 1)/(L2 - 1).
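The crossover condition can be checked numerically. This is a minimal sketch with illustrative lengths (not designed filters), using the per-input-sample operation counts given above:

```python
# Operations per input sample for the two arrangements, checked against
# the crossover condition N1/N2 > (L1-1)/(L2-1).  Lengths are assumptions.
def ops_two_stage(N1, N2, L1, L2):
    # polyphase: stage 1 runs at the input rate, stage 2 at the reduced rate
    return N1 / L1 + N2 / (L1 * L2)

def ops_combined(N1, N2, L1, L2):
    # single filter of order ~N1 + L1*N2, single decimator by L1*L2
    return N1 / (L1 * L2) + N2 / L2

L1, L2 = 3, 4
for N1, N2 in [(30, 60), (60, 30), (45, 50)]:
    combined_wins = ops_combined(N1, N2, L1, L2) < ops_two_stage(N1, N2, L1, L2)
    predicted = (N1 / N2) > (L1 - 1) / (L2 - 1)
    assert combined_wins == predicted
```

For example, with N1 = 60 and N2 = 30 the combined approach costs 12.5 operations per input sample against 22.5 for two stages, as the condition predicts.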
> As an example, say you need to achieve a
> downsample of 12 = 3 * 4 (can't be 2*6, as you specified). You can either
> let H1 correspond to L1=3 or L1=4, so that L2 is determined
> respectively. That is,
>
> L1=3:    --> H1 --> v 3 --> H2 --> v 4 -->
> L1=4:    --> H2' --> v 4 --> H1' --> v 3 -->
>
> Note that H2' doesn't equal H2, because the decimation filters need
> to be alias-free. Is this anti-alias feature what you mean by "more
> efficient"?
I'm not entirely sure what you're trying to show with your example.
Could you expand on this a little?
--
Oli
Reply by Oli Charlesworth ● July 9, 2009
dbd wrote:
>
> The zero-stuffed H2 does not have the correct response for the single
> stage filter design scheme given.
Yes, but when convolved with H1, it does. Would you not agree that by a
Noble identity, the following are equivalent:
--> H1 --> v L1 --> H2 --> v L2 -->
--> H1 --> H2' --> v L1 --> v L2 -->
i.e. we can push H2 through the decimator, where formally H2'(z) = H2(z^L1).
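The Noble identity itself is easy to check numerically. Here is a minimal sketch in Python/NumPy with random data, using the same causal truncation as filter():

```python
import numpy as np

rng = np.random.default_rng(1)
h2 = rng.standard_normal(5)
L1 = 3
x = rng.standard_normal(300)

# Left side:  decimate by L1, then filter by H2
lhs = np.convolve(h2, x[::L1])[:len(x[::L1])]

# Right side: filter by H2'(z) = H2(z^L1) (h2 zero-stuffed by L1),
# then decimate by L1
h2_up = np.zeros(L1 * (len(h2) - 1) + 1)
h2_up[::L1] = h2
rhs = np.convolve(h2_up, x)[:len(x)][::L1]

# The two sides are sample-for-sample identical
assert np.allclose(lhs, rhs)
```

With the identity verified, combining the two adjacent filters and the two adjacent decimators follows immediately.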
If so, it is then a trivial step to combine the two adjacent
filters by convolution, and also to combine the two adjacent decimators
into one. The resulting system will be mathematically identical to the
original two-stage implementation. (In fact, what I've just written
mirrors precisely what Verictor has written in another reply.)
> Combining two filters as in the single stage scheme is not an
> efficient filter design method for comparison to a multistage design.
>
> You post enough to know how to query 'multistage decimation' to Google
> or search comp.dsp for a recommended text. There are a lot of
> explanations and examples available.
I'm aware that designing a single filter straight off is less efficient
than designing a separate filter for each stage, and I understand the
reason for this is the inverse relationship between normalised
transition bandwidth and filter length, and that normalised transition
bandwidth increases with the decimation ratio. In that respect, I've
done plenty of reading in the past.
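That inverse relationship can be made concrete with the common rule-of-thumb FIR length estimate N ≈ A/(22·Δf), i.e. stopband attenuation in dB over 22 times the normalized transition width; the numbers below are assumptions for illustration:

```python
# Rule-of-thumb FIR length estimate: halving the normalized transition
# width roughly doubles the length, which is why one filter designed at
# the full input rate (very narrow normalized transition band) needs far
# more taps than two relaxed per-stage filters.  Numbers are assumptions.
def fir_length_estimate(atten_db, trans_width):
    """Estimate FIR length as attenuation / (22 * normalized width)."""
    return atten_db / (22.0 * trans_width)

A = 60.0  # assumed stopband attenuation, dB
for df in (0.2, 0.1, 0.05, 0.01):
    print(f"df = {df:5.2f}  ->  N ~ {fir_length_estimate(A, df):6.1f}")
```

This is only a design-length estimate; it says nothing about whether a given pair of already-designed filters can be merged, which is the separate question at issue here.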
The problem I face is why the example I've given would appear to
demonstrate otherwise!
--
Oli
Reply by dbd ● July 9, 2009
On Jul 8, 5:10 pm, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> Hello all,
>
> This is a question which probably boils down to "what am I missing?".
> ...

The following statement is false:

> But we could construct an equivalent overall filter H as the convolution
> of H1 and H2', where H2' is H2 simply upsampled by L1 by zero-stuffing.
> The equivalent overall system would be:
> ...
> What am I missing? (Or, where have I made a mistake?)
>
> --
> Oli
The multistage design of H1 has far too wide a transition band for the
single stage scheme to work. (This relaxation in transition band is
one of the major reasons for multistage implementation efficiency.)
The zero-stuffed H2 does not have the correct response for the single
stage filter design scheme given.
Combining two filters as in the single stage scheme is not an
efficient filter design method for comparison to a multistage design.
You post enough to know how to query 'multistage decimation' to Google
or search comp.dsp for a recommended text. There are a lot of
explanations and examples available.
Good luck
Dale B. Dalrymple
Reply by Vladimir Vassilevsky ● July 8, 2009
Oli Charlesworth wrote:
> Hello all,
>
> This is a question which probably boils down to "what am I missing?".
This boils down to N versus log(N) operations, exactly like with the
FFTs. Look into the classic books such as "Multirate Signal Processing"
by Rabiner.
> It seems to be commonly stated that for interpolation or decimation,
> it's generally more computationally efficient to perform the operation
> in multiple stages, i.e. factorise the rate-change factor, and design a
> lower-order filter for each stage.
Although this is generally true, it doesn't account for the overhead
associated with the setup of each filtering stage. The overhead may
or may not be substantial.
> I completely accept that designing
> the individual filters in this way achieves a lower complexity, but I'm
> beginning to question whether one actually needs to implement the rate
> change in separate stages.
That entirely depends on the particularities of your application.
Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Reply by Verictor ● July 8, 2009
On Jul 8, 5:10 pm, Oli Charlesworth <ca...@olifilth.co.uk> wrote:
> Hello all,
>
> This is a question which probably boils down to "what am I missing?".
>
> It seems to be commonly stated that for interpolation or decimation,
> it's generally more computationally efficient to perform the operation
> in multiple stages, i.e. factorise the rate-change factor, and design a
> lower-order filter for each stage. I completely accept that designing
> the individual filters in this way achieves a lower complexity, but I'm
> beginning to question whether one actually needs to implement the rate
> change in separate stages.
>
> Consider a system where we need to decimate by L = L1.L2. (For the sake
> of argument, let's assume that neither L1 nor L2 equals 2, so that we
> can't "cheat" by implementing half-band filters.) Using the typical
> approach, we might come up with two decimation filters, H1 and H2, of
> length N1 and N2 respectively. i.e. our system is:
>
> --> H1 --> v L1 --> H2 --> v L2 -->
>
> If implemented as polyphase filters, the number of operations per input
> sample is (N1 / L1) + (N2 / L1.L2).
>
> But we could construct an equivalent overall filter H as the convolution
> of H1 and H2', where H2' is H2 simply upsampled by L1 by zero-stuffing.
> The equivalent overall system would be:
>
> --> H --> v L1.L2 -->
>
> The order of H is now approximately N1 + L1.N2. Therefore, the number
> of operations per input sample in this approach is (N1 + L1.N2) / L1.L2
> = (N1 / L1.L2) + (N2 / L2).
>
> It would seem, therefore, that depending on the particular values being
> used, the single stage approach may be as efficient or more efficient
> than the multi-stage approach.
>
> What am I missing? (Or, where have I made a mistake?)
>
> --
> Oli
Not sure why you call this a "single stage" approach, because you
effectively have to design H1 and H2 before you can implement this
"single stage" approach.
In fact, they are the same in terms of efficiency. Consider the
conventional approach, i.e.,
--> H1 --> v L1 --> H2 --> v L2 -->
If doing a "noble identity" conversion just using
--> v L1 --> H2
You can get this:
--> H1 --> H2(z^L1) --> v L1 --> v L2 -->      (H2(z^L1) means H2
upsampled by L1)
Thus, H1(z)*H2(z^L1) is the H filter in your approach, so you can
have v L1.L2 too. I don't see why it can be "more efficient" than the
multirate design. As an example, say you need to achieve a
downsample of 12 = 3 * 4 (can't be 2*6, as you specified). You can either
let H1 correspond to L1=3 or L1=4, so that L2 is determined
respectively. That is,
L1=3:    --> H1 --> v 3 --> H2 --> v 4 -->
L1=4:    --> H2' --> v 4 --> H1' --> v 3 -->
Note that H2' doesn't equal H2, because the decimation filters need
to be alias-free. Is this anti-alias feature what you mean by "more
efficient"?