# Parallel implementation of Multirate filter

Started by November 26, 2012
```
Dear users,

I want to implement a multirate filter with the ratio 7/8 (Interpolate by
7, decimate by 8). The multirate should have 8 input lines and 8 output
lines.
At each clock cycle I get 8 parallel samples in space, while they are 8
consecutive samples in time.
What is the architecture that I can use, or where can I look to find out
some clues how to design it?

I found this paper describing the implementation of such filters
http://www.rfel.com/Files/Documents/W11013_Resampling_white_paper.pdf

but I found out that this method will give me 8 input lines and 7 output
lines.

Any help would be appreciated!

Hassan

```
On 11/26/12 11:13 AM, Hassans wrote:
>
> I want to implement a multirate filter with the ratio 7/8 (Interpolate by
> 7, decimate by 8). The multirate should have 8 input lines and 8 output
> lines.
> At each clock cycle I get 8 parallel samples in space, while they are 8
> consecutive samples in time.

i am still trying to decode the meaning of this.

is this image processing and you get 8 pixels every clock?  or is this a
case of what i might call "block processing" where your clock ticks once
every 8 samples (in time)?

```
On Mon, 26 Nov 2012 10:13:18 -0600, Hassans wrote:

> Dear users,
>
> I want to implement a multirate filter with the ratio 7/8 (Interpolate
> by 7, decimate by 8). The multirate should have 8 input lines and 8
> output lines.
> At each clock cycle I get 8 parallel samples in space, while they are 8
> consecutive samples in time.
> What is the architecture that I can use, or where can I look to find out
> some clues how to design it?
>
> I found this paper describing the implementation of such filters
> http://www.rfel.com/Files/Documents/W11013_Resampling_white_paper.pdf
>
> but I found out that this method will give me 8 input lines and 7 output
> lines.
>
> Any help would be appreciated!

This is worded like homework -- what are you really doing?

Why eight lines out when the filter is decimating at a 7:8 ratio?

What do you mean by "eight samples in space, eight samples in time"?
What's your source data?  Are you doing the decimation in both time and
space, meaning you're going from 8x8 to 7x7?

```
```
>> I want to implement a multirate filter with the ratio 7/8 (Interpolate
>> by 7, decimate by 8).

Would using 8 independent filters in parallel solve the "8 input 8 output"
part?

```
```
>>> I want to implement a multirate filter with the ratio 7/8 (Interpolate
>>> by 7, decimate by 8).
>
>Would using 8 independent filters in parallel solve the "8 input 8 output"
>part?
>
>

I have an ADC that samples at a frequency (Fs1). This ADC has 8 parallel
output lines with a clock (Fs1/8) each. Using an FPGA I read 8 samples at
each clock cycle with a rate of (Fs1/8).

It is not possible to rearrange the 8 samples in serial and process them
with a clock of (Fs1) due to hardware limits. What I want to do is to
change the rate of the data to (Fs2) which has a ratio of (Fs2 = 7/8*Fs1).

So I have to build a multirate filter that has 8 parallel inputs. And
because the required output sampling rate (Fs2) is still too high for the
FPGA to handle as a single serial stream, I also need 8 output lines with a
rate of (Fs2/8) each.

Hopefully this clarifies my problem.

```
```
what I'd do is to start with a conventional polyphase interpolate-by-7 FIR
filter:

y1 = x1 c1 + x2  c8 + x3 c15 + x4 c22 + x5 c29 + ...
y2 = x1 c2 + x2  c9 + x3 c16 + x4 c23 + x5 c30 + ...
y3 = x1 c3 + x2 c10 + x3 c17 + x4 c24 + x5 c31 + ...

then decimate by 8. That is, calculate only y1, y9, y17, y25, etc.

At this point I've got a single-input single-output 7 up 8 down polyphase
resampler.
Next, take the remaining equations for eight consecutive (decimated) y
samples, and implement them in parallel. What remains is to distribute the
parallel inputs, a mere formality.

This is just a quick "lunch break study", maybe someone else comes up with
a better solution.
```
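For anyone who wants to check the indexing in the equations above, here is a minimal NumPy sketch. It is only an illustration under some assumptions: indices are 0-based (so c1 in the post is `c[0]`), the filter starts from zero state, and the function names `resample_reference` and `resample_polyphase` are made up for this example. The first function does the plain "stuff zeros, filter, throw away 7 of 8" version; the second computes only the outputs that survive decimation, each from one polyphase branch.

```python
import numpy as np

UP, DOWN = 7, 8  # interpolate by 7, decimate by 8

def resample_reference(x, c):
    """Zero-stuff by 7, FIR filter with c, keep every 8th sample."""
    u = np.zeros(len(x) * UP)
    u[::UP] = x
    v = np.convolve(u, c)[:len(u)]
    return v[::DOWN]

def resample_polyphase(x, c):
    """Compute only the decimated outputs, one polyphase branch per output."""
    taps = -(-len(c) // UP)                 # taps per branch (ceiling division)
    cp = np.zeros(taps * UP)
    cp[:len(c)] = c                         # pad so every branch has equal length
    phases = cp.reshape(taps, UP).T         # phases[p, i] == c[p + 7*i]
    n_out = -(-(len(x) * UP) // DOWN)
    y = np.zeros(n_out)
    for m in range(n_out):
        # Output m sits at position 8*m on the interpolated grid:
        # n0 is the newest input sample involved, p selects the branch.
        n0, p = divmod(DOWN * m, UP)
        for i in range(taps):
            if n0 - i >= 0:                 # samples before time 0 are taken as zero
                y[m] += phases[p, i] * x[n0 - i]
    return y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(64)
    c = rng.standard_normal(35)             # 5 taps per branch, arbitrary values
    print(np.allclose(resample_polyphase(x, c), resample_reference(x, c)))
```

Note that the branch index works out to `p = m mod 7`, so the branches are visited in a fixed cyclic order, which is what makes a parallel implementation with fixed coefficients possible.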
```
>what I'd do is to start with a conventional polyphase interpolate-by-7 FIR
>filter:
>
>y1 = x1 c1 + x2  c8 + x3 c15 + x4 c22 + x5 c29 + ...
>y2 = x1 c2 + x2  c9 + x3 c16 + x4 c23 + x5 c30 + ...
>y3 = x1 c3 + x2 c10 + x3 c17 + x4 c24 + x5 c31 + ...
>
>then decimate by 8. That is, calculate only y1, y9, y17, y25, etc.
>
>At this point I've got a single-input single-output 7 up 8 down polyphase
>resampler.
>Next, take the remaining equations for eight consecutive (decimated) y
>samples, and implement them in parallel. What remains is to distribute the
>parallel inputs, a mere formality.
>
>This is just a quick "lunch break study", maybe someone else comes up with
>a better solution.
>

Good lunch break study - this is precisely how it should be done.  Make
sure to design your filter with a convenient number of coefficients.
-Doug

```
```
Still, for a structure that generates 7 output samples by consuming 8 input
samples, it seems more straightforward to design for 7 parallel outputs.

You'd get 7 simple filters with fixed coefficients, and have a common clock
rate at input and output.
What remains to be done for 8 outputs is to shuffle whole samples from 7 to
8 parallel registers at a slightly lower rate.

Both approaches will do the same, but I think I'd prefer the second one for
implementation.
Apparently, this is the approach taken in the paper (haven't read it yet -
has to wait until dinner :-)

```
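A rough software model of this second approach, as a sketch only. It assumes 0-based indexing, zero initial state, and a coefficient count that is padded up to a multiple of 7; the names `resample_blocks` and `gearbox_7_to_8` are invented for the example and are not taken from the paper. Each "clock" consumes one word of 8 inputs and produces 7 outputs, where output lane r always uses branch r of the polyphase filter (fixed coefficients). A small FIFO then repacks the 7-wide words into 8-wide words, which come out on 7 of every 8 clocks.

```python
import numpy as np

def resample_blocks(x_blocks, c):
    """x_blocks has shape (n_clocks, 8); returns 7 outputs per clock."""
    taps = -(-len(c) // 7)                   # taps per branch
    cp = np.zeros(taps * 7)
    cp[:len(c)] = c
    phases = cp.reshape(taps, 7).T           # phases[r, i] == c[r + 7*i]
    hist = np.zeros(taps - 1)                # samples saved from earlier clocks
    out = np.zeros((len(x_blocks), 7))
    for b, blk in enumerate(x_blocks):
        window = np.concatenate([hist, blk]) # oldest sample first
        for r in range(7):
            # Lane r: newest sample is input r of this word, then walk back in time.
            seg = window[r:r + taps][::-1]
            out[b, r] = phases[r] @ seg
        hist = window[len(window) - (taps - 1):]
    return out

def gearbox_7_to_8(blocks7):
    """Repack 7-sample words into 8-sample words through a small FIFO."""
    fifo, words = [], []
    for blk in blocks7:
        fifo.extend(blk)
        if len(fifo) >= 8:                   # an output word is ready on this clock
            words.append(fifo[:8])
            fifo = fifo[8:]
    return np.array(words)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(64)              # 8 clocks of 8 samples
    c = rng.standard_normal(35)
    y7 = resample_blocks(x.reshape(-1, 8), c)
    y8 = gearbox_7_to_8(y7)
    print(y7.shape, y8.shape)
```

In hardware the FIFO would become a handful of registers plus a data-valid flag on the output, but the bookkeeping is the same as in this model.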
On Tue, 27 Nov 2012 04:37:27 -0600, mnentwig wrote:

> what I'd do is to start with a conventional polyphase interpolate-by-7
> FIR filter:
>
> y1 = x1 c1 + x2  c8 + x3 c15 + x4 c22 + x5 c29 + ...
> y2 = x1 c2 + x2  c9 + x3 c16 + x4 c23 + x5 c30 + ...
> y3 = x1 c3 + x2 c10 + x3 c17 + x4 c24 + x5 c31 + ...
>
> then decimate by 8. That is, calculate only y1, y9, y17, y25, etc.
>
> At this point I've got a single-input single-output 7 up 8 down
> polyphase resampler.
> Next, take the remaining equations for eight consecutive (decimated) y
> samples, and implement them in parallel. What remains is to distribute
> the parallel inputs, a mere formality.
>
> This is just a quick "lunch break study", maybe someone else comes up
> with a better solution.

That looks right to me, once I got my morning-impaired brain wrapped
around it.

Hassans:  Don't be surprised if you need to save the previous eight
samples or perhaps more: that's kind of a requirement for any filtering.
Internally you'll probably want a step in your pipeline that has a vector
of the current N * 8 samples all lined up and ready to go into whatever
the next step in the pipeline is.

If the data is going by too fast to be put into a serial stream then even
after you get the algorithm ironed out it's still going to be challenging
to get the filter working.  I foresee a lot of pipelining, a big FPGA,
and a lot of picky book-keeping to get the filter to execute fast
enough.

I strongly suggest that you make sure that you have a very firm grasp of
the algorithm you're trying to implement before you start trying to make
it work at speed and with the 7:8 decimation in your output clock.  I'd
probably want to make sure that I had an accurate behavioral
representation simulated in the HDL of my choice that was rock-solid (and
with no attempt to make it synthesizable).  Then when I'd made the "real"
filter I'd make sure to test against the "bone-head" implementation in
simulation.  Failing an HDL test article, I'd make sure to simulate the
filter action in Matlab or Scilab (or Excel or whatever) and test the
input/output behavior of the synthesizable model against that.
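
As a sketch of what such a check could look like outside the HDL simulator, here is a small NumPy harness. Everything in it is an assumption made for illustration: `reference_model` plays the "bone-head" filter, `polyphase_dut` is only a stand-in for the output of the real design (in practice you would load samples dumped from the HDL simulation instead), and the `phase_bug` argument exists purely to show the harness catching an indexing error.

```python
import numpy as np

def reference_model(x, c, up=7, down=8):
    """'Bone-head' model: zero-stuff by `up`, FIR filter, keep every `down`-th sample."""
    u = np.zeros(len(x) * up)
    u[::up] = x
    return np.convolve(u, c)[:len(u)][::down]

def first_mismatch(dut, x, c, tol=1e-9):
    """Index of the first output where `dut` disagrees with the reference, else None."""
    want = reference_model(x, c)
    got = np.asarray(dut(x, c), dtype=float)
    n = min(len(got), len(want))
    bad = np.flatnonzero(np.abs(got[:n] - want[:n]) > tol)
    if bad.size:
        return int(bad[0])
    return None if len(got) == len(want) else n   # a length mismatch also counts

def polyphase_dut(x, c, up=7, down=8, phase_bug=0):
    """Stand-in for the 'real' filter; phase_bug != 0 injects a wrong branch index."""
    n_out = -(-(len(x) * up) // down)
    y = np.zeros(n_out)
    for m in range(n_out):
        n0, p = divmod(down * m, up)
        p = (p + phase_bug) % up
        for i, j in enumerate(range(p, len(c), up)):
            if n0 - i >= 0:
                y[m] += c[j] * x[n0 - i]
    return y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c = rng.standard_normal(35)
    x = rng.standard_normal(64)
    print(first_mismatch(polyphase_dut, x, c))
    print(first_mismatch(lambda x, c: polyphase_dut(x, c, phase_bug=1), x, c))
```

An impulse is also a handy stimulus: with a single 1 at the first input sample, the reference output is just every 8th coefficient, which makes a wrong branch or a shifted sample easy to spot by eye.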

```