comp.dsp | Cascading FIR

Hello

I have two DSP connected in serial mode.
Have FIR with 2000 coefficients .
Need to recalc and divide coefficients of FIR by 1000 for 2 DSP, so that
in serial application give result of FIR with 2000 points.
Delay of FIR here dont work, bcs FIR of each DSP is 1000 points maximum. 


Is this possible?

WBR, Vasilius.


---------------------------------------
Posted through http://www.DSPRelated.com

Reply by Evgeny Filatov ●September 25, 20162016-09-25

On 25.09.2016 17:47, Vasilius wrote:
> Hello
>
> I have two DSP connected in serial mode.
> Have FIR with 2000 coefficients .
> Need to recalc and divide coefficients of FIR by 1000 for 2 DSP, so that
> in serial application give result of FIR with 2000 points.
> Delay of FIR here dont work, bcs FIR of each DSP is 1000 points maximum.
>
>
> Is this possible?
>
> WBR, Vasilius.
>
>
> ---------------------------------------
> Posted through http://www.DSPRelated.com
>

Assuming you have H(z) = sum (in k from 1 to 2000) a_k * z^(-k), you can 
split it in multiple ways. The simplest would be
H(z) = H1(z) + H2(z)*z^(-1000), where
H1(z) = sum (in k from 1 to 1000) a_k * z^(-k), and
H2(z) = sum (in k from 1 to 1000) a_{k+1000} * z^(-k)

It's not exactly "serial", but you just need to add the output of filter 
H1 with the delayed output of filter H2.

I'm afraid to ask why do you need a FIR with 2000 coefficients...

Gene

Reply by Evgeny Filatov ●September 25, 20162016-09-25

On 25.09.2016 18:50, Evgeny Filatov wrote:
> On 25.09.2016 17:47, Vasilius wrote:
>> Hello
>>
>> I have two DSP connected in serial mode.
>> Have FIR with 2000 coefficients .
>> Need to recalc and divide coefficients of FIR by 1000 for 2 DSP, so that
>> in serial application give result of FIR with 2000 points.
>> Delay of FIR here dont work, bcs FIR of each DSP is 1000 points maximum.
>>
>>
>> Is this possible?
>>
>> WBR, Vasilius.
>>
>>
>> ---------------------------------------
>> Posted through http://www.DSPRelated.com
>>
>
> Assuming you have H(z) = sum (in k from 1 to 2000) a_k * z^(-k), you can
> split it in multiple ways. The simplest would be
> H(z) = H1(z) + H2(z)*z^(-1000), where
> H1(z) = sum (in k from 1 to 1000) a_k * z^(-k), and
> H2(z) = sum (in k from 1 to 1000) a_{k+1000} * z^(-k)
>
> It's not exactly "serial", but you just need to add the output of filter
> H1 with the delayed output of filter H2.
>
> I'm afraid to ask why do you need a FIR with 2000 coefficients...
>
> Gene
>

Perhaps I have misunderstood your question. If you have H(z) = sum (in k 
from 0 to 1999) a_k * z^(-k), and you want to partition it as
H(z) = H1(z) * H2(z), where
H1(z) = sum (in k from 0 to 1000) b_k * z^(-k), and
H2(z) = sum (in k from 0 to 999) c_k * z^(-k).
Where you know coefficients a_k, and want to determine coefficients b_k 
and c_k.

Then it's a pretty straightforward task! According to the fundamental 
theorem of algebra the polynomial H(z) can be partitioned as a 
multiplication of first-order complex polynomials, or second-order real 
polynomials. So you just need to find the pairs of complex zeros of 
H(z), and then split them in any convenient way to generate two 
polynomials of less order.

There would be lots of ways to do that, so perhaps you'd like to choose 
some which minimize largest values of coefficients in the resulting filters.

However, the question was whether it's possible or not, and yes, it is 
possible.

HTH,
Gene

Reply by Evgeny Filatov ●September 25, 20162016-09-25

On 25.09.2016 19:24, Evgeny Filatov wrote:
> On 25.09.2016 18:50, Evgeny Filatov wrote:
>> On 25.09.2016 17:47, Vasilius wrote:
>>> Hello
>>>
>>> I have two DSP connected in serial mode.
>>> Have FIR with 2000 coefficients .
>>> Need to recalc and divide coefficients of FIR by 1000 for 2 DSP, so that
>>> in serial application give result of FIR with 2000 points.
>>> Delay of FIR here dont work, bcs FIR of each DSP is 1000 points maximum.
>>>
>>>
>>> Is this possible?
>>>
>>> WBR, Vasilius.
>>>
>>>
>>> ---------------------------------------
>>> Posted through http://www.DSPRelated.com
>>>
>>
>> Assuming you have H(z) = sum (in k from 1 to 2000) a_k * z^(-k), you can
>> split it in multiple ways. The simplest would be
>> H(z) = H1(z) + H2(z)*z^(-1000), where
>> H1(z) = sum (in k from 1 to 1000) a_k * z^(-k), and
>> H2(z) = sum (in k from 1 to 1000) a_{k+1000} * z^(-k)
>>
>> It's not exactly "serial", but you just need to add the output of filter
>> H1 with the delayed output of filter H2.
>>
>> I'm afraid to ask why do you need a FIR with 2000 coefficients...
>>
>> Gene
>>
>
> Perhaps I have misunderstood your question. If you have H(z) = sum (in k
> from 0 to 1999) a_k * z^(-k), and you want to partition it as
> H(z) = H1(z) * H2(z), where
> H1(z) = sum (in k from 0 to 1000) b_k * z^(-k), and
> H2(z) = sum (in k from 0 to 999) c_k * z^(-k).
> Where you know coefficients a_k, and want to determine coefficients b_k
> and c_k.
>
> Then it's a pretty straightforward task! According to the fundamental
> theorem of algebra the polynomial H(z) can be partitioned as a
> multiplication of first-order complex polynomials, or second-order real
> polynomials. So you just need to find the pairs of complex zeros of
> H(z), and then split them in any convenient way to generate two
> polynomials of less order.
>
> There would be lots of ways to do that, so perhaps you'd like to choose
> some which minimize largest values of coefficients in the resulting
> filters.
>
> However, the question was whether it's possible or not, and yes, it is
> possible.
>
> HTH,
> Gene
>

Or perhaps (if it's some standard case like low-pass or band-pass) you'd 
have more luck if you go to the frequency domain, take a square root of 
the frequency response, and try to design a lesser filter with 1000 
coefficients satisfying the lesser requirements of a square-root response.

Gene

Reply by ●September 25, 20162016-09-25

On Sun, 25 Sep 2016 09:47:29 -0500, "Vasilius" <116196@DSPRelated>
wrote:

>Hello
>
>I have two DSP connected in serial mode.
>Have FIR with 2000 coefficients .
>Need to recalc and divide coefficients of FIR by 1000 for 2 DSP, so that
>in serial application give result of FIR with 2000 points.
>Delay of FIR here dont work, bcs FIR of each DSP is 1000 points maximum. 
>
>
>Is this possible?

Is the goal really to just spread the processing across two DSPs, or
is there some contraint that requires the filter computation be split
up like you're describing?

Are the filter coefficients symmetric?

Depending on the answers to the above, there are still lots of ways to
split a long filter between two DSPs.   Some of it depends on what
communication capabilities there are between the two DSPs, but if you
have a blank slate then a lot is possible.

1.  I think the most direct answer to your question is to split the
partial product calculations between the two processors.   In other
words, the sum of the products of the input with the first 1000
coeffiicents is computed in one, and the second 1000 in the other,
and, somewhere those two partial products get added together to to
form the result.   This is definitely do-able and seems to most
directly answer your question if I understand it correctly.  It may
not be the "best" way to do it depending on the system, though.

2.  Calculate every other output of the entire filter in each DSP.
e.g., calculate the even outputs in one and the odd outputs in the
other.   This is a 50% load reduction in each processor compared to
using one DSP.

3.  If the coefficients are symmetric, there is a very common trick to
fold the filter and achieve a 50% reduction in the number of
multiplies required.   If you still need to split load across two
processors, either of the tricks in 1 or 2 will still work on the
folded filter.

There are lots more variations on the above, but I hope you get the
idea.

>WBR, Vasilius.
>
>
>---------------------------------------
>Posted through http://www.DSPRelated.com

Reply by Evgeny Filatov ●September 25, 20162016-09-25

On 25.09.2016 21:27, eric.jacobsen@ieee.org wrote:
>
> 2.  Calculate every other output of the entire filter in each DSP.
> e.g., calculate the even outputs in one and the odd outputs in the
> other.   This is a 50% load reduction in each processor compared to
> using one DSP.
>

I'm afraid polyphase partitioning only works that way in case (in the 
original design) filtering is followed by a decimation by two. If 
there's no decimation, that would be still a couple of 2000-coefficient 
FIR filters each with a half of coefficients equal to zero.

Gene

Reply by ●September 25, 20162016-09-25

On Sun, 25 Sep 2016 21:43:10 +0300, Evgeny Filatov
<filatov.ev@mipt.ru> wrote:

>On 25.09.2016 21:27, eric.jacobsen@ieee.org wrote:
>>
>> 2.  Calculate every other output of the entire filter in each DSP.
>> e.g., calculate the even outputs in one and the odd outputs in the
>> other.   This is a 50% load reduction in each processor compared to
>> using one DSP.
>>
>
>I'm afraid polyphase partitioning only works that way in case (in the 
>original design) filtering is followed by a decimation by two. If 
>there's no decimation, that would be still a couple of 2000-coefficient 
>FIR filters each with a half of coefficients equal to zero.

I think you misunderstood somehow.   Each filter computes half the
output samples.   The full stream is recovered by interleaving the
output of both processors back to the full stream.

This is easily done in a FIR by advancing the samples two stages
through the delay line between output samples rather than one, which
is a pretty simple thing to do in software.

Reply by Evgeny Filatov ●September 25, 20162016-09-25

On 25.09.2016 23:47, eric.jacobsen@ieee.org wrote:
> On Sun, 25 Sep 2016 21:43:10 +0300, Evgeny Filatov
> <filatov.ev@mipt.ru> wrote:
>
>> On 25.09.2016 21:27, eric.jacobsen@ieee.org wrote:
>>>
>>> 2.  Calculate every other output of the entire filter in each DSP.
>>> e.g., calculate the even outputs in one and the odd outputs in the
>>> other.   This is a 50% load reduction in each processor compared to
>>> using one DSP.
>>>
>>
>> I'm afraid polyphase partitioning only works that way in case (in the
>> original design) filtering is followed by a decimation by two. If
>> there's no decimation, that would be still a couple of 2000-coefficient
>> FIR filters each with a half of coefficients equal to zero.
>
> I think you misunderstood somehow.   Each filter computes half the
> output samples.   The full stream is recovered by interleaving the
> output of both processors back to the full stream.
>
> This is easily done in a FIR by advancing the samples two stages
> through the delay line between output samples rather than one, which
> is a pretty simple thing to do in software.
>
>

Thanks for clarification, Eric. I didn't have any idea about whether the 
OP's FIRs are software or hardware.

Gene

Reply by ●September 25, 20162016-09-25

If you use the normal forward FIR structure, you have to send the delayed-by-1000 input data as well as the partial sum-of-products from dsp1 to dsp2, unless you don't care about wasting memory in dsp2. However if you use the transposed form of FIR filtering then you only have to send one number across the boundary. 

Another consideration is that in fixed-point math, the accumulator often overflows or under-flows many times during the sequence of multiply-accumulates but as long as it comes back within range at the end, you don't care. So when you break the FIR in two, you may need to transmit all the bits of the accumulator from dsp1 to dsp2, which might require multiple data transactions, assuming the dsp allows all bits of the accumulator to be sent to the serial port, which is not always the case. 

Bob

Reply by ●September 25, 20162016-09-25

On Sun, 25 Sep 2016 23:58:18 +0300, Evgeny Filatov
<filatov.ev@mipt.ru> wrote:

>On 25.09.2016 23:47, eric.jacobsen@ieee.org wrote:
>> On Sun, 25 Sep 2016 21:43:10 +0300, Evgeny Filatov
>> <filatov.ev@mipt.ru> wrote:
>>
>>> On 25.09.2016 21:27, eric.jacobsen@ieee.org wrote:
>>>>
>>>> 2.  Calculate every other output of the entire filter in each DSP.
>>>> e.g., calculate the even outputs in one and the odd outputs in the
>>>> other.   This is a 50% load reduction in each processor compared to
>>>> using one DSP.
>>>>
>>>
>>> I'm afraid polyphase partitioning only works that way in case (in the
>>> original design) filtering is followed by a decimation by two. If
>>> there's no decimation, that would be still a couple of 2000-coefficient
>>> FIR filters each with a half of coefficients equal to zero.
>>
>> I think you misunderstood somehow.   Each filter computes half the
>> output samples.   The full stream is recovered by interleaving the
>> output of both processors back to the full stream.
>>
>> This is easily done in a FIR by advancing the samples two stages
>> through the delay line between output samples rather than one, which
>> is a pretty simple thing to do in software.
>>
>>
>
>Thanks for clarification, Eric. I didn't have any idea about whether the 
>OP's FIRs are software or hardware.

I think usually a DSP means a DSP processor, like a 56k or a Sharc or
something, which means software.

Either way, it's also easy in hardware.

Previous12 3 Next

Cascading FIR

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group