# Half-band filter on Xilinx FPGA

November 30, 2010

### 1. DSP48 Slice in Xilinx FPGA

Most Xilinx® FPGAs contain many DSP48 slices. Figure 1 shows one DSP48A1 slice in a Spartan®-6 FPGA; the structure differs from device to device, but is broadly similar.

Figure 1: A whole DSP48A1 Slice in Spartan6 (www.xilinx.com)

### 2. Symmetric Systolic Half-band FIR

Figure 2: Symmetric Systolic Half-band FIR Filter

### 3. Two-channel Symmetric Systolic Half-band FIR

Figure 3: 2-Channel Symmetric Systolic Half-band FIR Filter

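For example, a 15-tap half-band filter (center tap h7, all other odd-indexed taps zero) computes:

y1[n] = (x[n] + x[n-14])·h0 + (x[n-2] + x[n-12])·h2 + (x[n-4] + x[n-10])·h4 + (x[n-6] + x[n-8])·h6

y2[n] = x[n-7]·h7

y[n] = y1[n] + y2[n]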
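The folded form above can be checked against a plain direct-form FIR. The following is a minimal Python sketch, not the author's implementation: the coefficient values and the test signal are made up for illustration, and the helper names `direct_fir` and `folded_halfband` are hypothetical.

```python
def direct_fir(x, h, n):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n-k], with x[m] = 0 outside the record."""
    return sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))


def folded_halfband(x, h, n):
    """Folded form of the equations above for a 15-tap half-band filter."""
    def s(m):  # sample lookup with zero padding outside the record
        return x[m] if 0 <= m < len(x) else 0

    # Pre-add the two samples that share a coefficient, then multiply once.
    y1 = sum((s(n - k) + s(n - (14 - k))) * h[k] for k in (0, 2, 4, 6))
    y2 = s(n - 7) * h[7]          # center tap
    return y1 + y2


# Hypothetical integer coefficients: symmetric, odd taps zero except the center.
half = [-3, 0, 11, 0, -37, 0, 157]      # h0..h6
H = half + [256] + half[::-1]           # h7 is the center tap, h8..h14 mirror h6..h0
X = [5, -2, 7, 1, 0, -4, 9, 3, -8, 6, 2, -1, 4, -5, 10, 0, 3, -7, 8, 1]

mismatches = [n for n in range(len(X) + 14)
              if direct_fir(X, H, n) != folded_halfband(X, H, n)]
print("mismatching output samples:", mismatches)
```

The folded form needs 5 multiplies per output instead of 15, which is why each DSP48 slice uses its pre-adder before the multiplier.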

The symmetric systolic FIR filter is considered an optimal solution for parallel filter architectures.

The advantages to using the Systolic FIR filter are:

• Highest Performance: Maximum performance can be achieved with this structure because there is no high fanout input signal. Larger filters can be routing-limited if the number of coefficients exceeds the number of DSP slices in a column on a device.
• Efficient Mapping to the DSP48 Slice: Mapping is enabled by the adder chain structure of the Systolic FIR Filter. This extendable structure supports large and small FIR filters.
• No External Logic: No external FPGA fabric is required, enabling the highest possible performance.

The disadvantage to using the Systolic FIR filter is:

• Higher Latency: The latency of the filter is a function of how many coefficients are in the filter. The larger the filter, the higher the latency: it grows from about log2(N) stages for an adder tree to N stages for the systolic structure, where N is the number of multipliers.
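To put numbers on this trade-off, here is a small Python sketch. It only counts summing stages and ignores the fixed pipeline registers inside each slice, so treat it as an order-of-magnitude comparison; the function names are hypothetical.

```python
def adder_tree_levels(n_mult):
    """Depth of a binary adder tree that sums n_mult products: ceil(log2(N))."""
    return (n_mult - 1).bit_length()


def systolic_stages(n_mult):
    """Systolic chain: one accumulate stage per multiplier, so N stages."""
    return n_mult


for n in (4, 12, 24, 64):
    print(f"N={n:2d}  adder tree: {adder_tree_levels(n)}  systolic: {systolic_stages(n)}")
```

For the 12 multipliers used in the listing below, this gives 4 adder-tree levels against 12 systolic stages.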

### 5. Rounding

The number of bits on the output of the filter must be reduced to a manageable width. Truncation introduces an undesirable DC shift because of the nature of two's complement numbers: dropping LSBs always rounds toward negative infinity. The DC shift can be reduced by using symmetric rounding, where positive numbers are rounded up and negative numbers are rounded down.

The rounding is achieved in the following manner:

For positive numbers: Binary Data Value + 0.10000..., and then truncate

For negative numbers: Binary Data Value + 0.01111..., and then truncate
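The effect of the two rules can be seen in a short Python model. This is a sketch under the assumption that the value is an integer and that `drop_bits` LSBs are discarded; the function names and the choice of 4 dropped bits are for illustration only.

```python
def truncate(value, drop_bits):
    """Drop LSBs of a two's complement integer (arithmetic shift, rounds toward -inf)."""
    return value >> drop_bits


def symmetric_round(value, drop_bits):
    """Add 0.1000... to non-negative values and 0.0111... to negative ones, then truncate."""
    half = 1 << (drop_bits - 1)                 # 0.1000... in units of the dropped LSBs
    offset = half if value >= 0 else half - 1   # 0.0111... for negative values
    return (value + offset) >> drop_bits


DROP = 4
values = range(-1000, 1001)                     # test set symmetric around zero

# Mean error relative to the exact quotient shows the DC shift of each method.
trunc_mean = sum(truncate(v, DROP) - v / 2**DROP for v in values) / len(values)
round_mean = sum(symmetric_round(v, DROP) - v / 2**DROP for v in values) / len(values)
print("mean error, truncation        :", trunc_mean)
print("mean error, symmetric rounding:", round_mean)
```

With 4 bits dropped, 1.5 (the integer 24) rounds to 2 and -1.5 rounds to -2, so the errors of positive and negative inputs cancel, whereas plain truncation is biased by almost half an LSB.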

/**************************************************************
module name  : Half_band_fir_down2
version   : 0.3
author   : Lyons Zhang ( fpgaplayer@gmail.com )
description  : 2-channel interleaved input/output half-band filter, downsample by 2. The filter is a little long because I'd like to show the original appearance of the practical application.

revision history:
---------------------------------------------------------------
1. 2010-11-11, initial version

2. 2010-11-30, second version, revised on the advice of cfelton

3. 2011-03-28, third version, symmetric rounding used to improve the DC shift
---------------------------------------------------------------
//coefficient of Half Band Filter : ( 1,14,14 )
//output data of Half Band Filter : ( 2,28,24 ) ---> ( 1,14,10 )
***************************************************************/

`timescale  1ns/1ps

module  Half_band_fir_down2
(
//input signals
Clk,                        //FPGA's master clock
Reset,                    //Global reset
Data_in_0,              //Data_in_0 rate is Clk/2
Data_in_1,              //Data_in_1 rate is Clk/2
Index,                    //Index toggles every Clk period

//output signals
Data_out_0,            //Data_out_0 rate is Clk/4
Data_out_1,            //Data_out_1 rate is Clk/4
Hb_overflow_0,        //Indicates filter overflow, useful in practice
Hb_overflow_1         //Indicates filter overflow, useful in practice
);

/*****************************************************/
/*--- Input and Output Ports declaration -----*/
/*****************************************************/
input                  Clk;
input                  Reset;
input   [14:0]      Data_in_0;
input   [14:0]      Data_in_1;
input                  Index;

output  [13:0]      Data_out_0;
output  [13:0]      Data_out_1;
output                Hb_overflow_0;
output                Hb_overflow_1;
/*****************************************************/
/*-------  Ports Type declaration            --------*/
/*****************************************************/
reg     [13:0]      Data_out_0;
reg     [13:0]      Data_out_1;
reg                   Hb_overflow_0;
reg                   Hb_overflow_1;
/*****************************************************/
/*------- parameter declaration              --------*/
/*****************************************************/

//Coefficients of half band filter
parameter   Coef_c0_c46  = -17'd8;
parameter   Coef_c2_c44  =  17'd27;
parameter   Coef_c4_c42  = -17'd68;
parameter   Coef_c6_c40  =  17'd146;
parameter   Coef_c8_c38  = -17'd281;
parameter   Coef_c10_c36 =  17'd499;
parameter   Coef_c12_c34 = -17'd837;
parameter   Coef_c14_c32 =  17'd1353;
parameter   Coef_c16_c30 = -17'd2161;
parameter   Coef_c18_c28 =  17'd3547;
parameter   Coef_c20_c26 = -17'd6561;
parameter   Coef_c22_c24 =  17'd20727;
parameter   Coef_c23       =  17'd32768;
/*****************************************************/
/*------    Variable declaration                 --------*/
/*****************************************************/

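wire        [14:0]          Add_a[11:0];            //Pre-adder input A, same width as Data_in_reg
wire        [14:0]          Add_d[11:0];            //Pre-adder input D, same width as Data_in_reg
wire        [32:0]          Add_c[11:0];            //Cascade input C, fed from Dsp_r[Num-1][32:0]
wire        [16:0]          Mult_b[11:0];
wire        [33:0]          Dsp_r[11:0];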

reg         [14:0]          Data_in_reg[92:0];

reg         [33:0]          Data_result;             //Widened to 34 bits, not only to avoid overflow
wire        [33:0]          Data_result_carry;    //but also to allow different gains by cutting different LSBs

reg                            Index_1;
reg         [13:0]          Data_out_0_reg;

/*****************************************************/
/*-------               Main Code            --------*/
/*****************************************************/

always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Data_in_reg[0] <= 15'b0;
else if ( Index == 1'b1 )
Data_in_reg[0] <= Data_in_0;
else
Data_in_reg[0] <= Data_in_1;
end

genvar Numd;
generate
for (Numd = 1; Numd <= 92; Numd = Numd + 1)
begin : U_data_in_reg
always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Data_in_reg[Numd] <= 15'b0;
else
Data_in_reg[Numd] <= Data_in_reg[Numd-1];
end
end

endgenerate

//adder A between DSPs has 5 registers
assign Add_a[0]   = Data_in_reg[0];
assign Add_a[1]   = Data_in_reg[5];
assign Add_a[2]   = Data_in_reg[10];
assign Add_a[3]   = Data_in_reg[15];
assign Add_a[4]   = Data_in_reg[20];
assign Add_a[5]   = Data_in_reg[25];
assign Add_a[6]   = Data_in_reg[30];
assign Add_a[7]   = Data_in_reg[35];
assign Add_a[8]   = Data_in_reg[40];
assign Add_a[9]   = Data_in_reg[45];
assign Add_a[10]  = Data_in_reg[50];
assign Add_a[11]  = Data_in_reg[55];

//adder D between DSPs has 3 registers
assign Add_d[0]   = Data_in_reg[92];
assign Add_d[1]   = Data_in_reg[89];
assign Add_d[2]   = Data_in_reg[86];
assign Add_d[3]   = Data_in_reg[83];
assign Add_d[4]   = Data_in_reg[80];
assign Add_d[5]   = Data_in_reg[77];
assign Add_d[6]   = Data_in_reg[74];
assign Add_d[7]   = Data_in_reg[71];
assign Add_d[8]   = Data_in_reg[68];
assign Add_d[9]   = Data_in_reg[65];
assign Add_d[10]  = Data_in_reg[62];
assign Add_d[11]  = Data_in_reg[59];

assign Add_c[0]   = 33'b0;
assign Add_c[1]   = Dsp_r[0][32:0];
assign Add_c[2]   = Dsp_r[1][32:0];
assign Add_c[3]   = Dsp_r[2][32:0];
assign Add_c[4]   = Dsp_r[3][32:0];
assign Add_c[5]   = Dsp_r[4][32:0];
assign Add_c[6]   = Dsp_r[5][32:0];
assign Add_c[7]   = Dsp_r[6][32:0];
assign Add_c[8]   = Dsp_r[7][32:0];
assign Add_c[9]   = Dsp_r[8][32:0];
assign Add_c[10]  = Dsp_r[9][32:0];
assign Add_c[11]  = Dsp_r[10][32:0];

//Mult_b are filter coefficient
assign Mult_b[0]  = Coef_c0_c46;
assign Mult_b[1]  = Coef_c2_c44;
assign Mult_b[2]  = Coef_c4_c42;
assign Mult_b[3]  = Coef_c6_c40;
assign Mult_b[4]  = Coef_c8_c38;
assign Mult_b[5]  = Coef_c10_c36;
assign Mult_b[6]  = Coef_c12_c34;
assign Mult_b[7]  = Coef_c14_c32;
assign Mult_b[8]  = Coef_c16_c30;
assign Mult_b[9]  = Coef_c18_c28;
assign Mult_b[10] = Coef_c20_c26;
assign Mult_b[11] = Coef_c22_c24;

//* (A+D)*B+C
//14 pipelines

genvar Num;
generate
for (Num = 0; Num <= 11; Num = Num + 1)
begin : U_DDC_HF2_DSP
DDC_HF2_DSP     DDC_HF2_DSP
(
//* Inputs
.clk        ( Clk         ),
.a          ( Add_a[Num]  ),
.d          ( Add_d[Num]  ),
.b          ( Mult_b[Num] ),
.c          ( Add_c[Num]  ),
//* Outputs
.p          ( Dsp_r[Num]  )
);
end
endgenerate

//Center tap: Data_in_reg[57] delayed by 3 more pipeline stages, i.e. Data_in_reg[60]
always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Data_result <= 34'd0;
else
Data_result <= Dsp_r[11] + { {4{Data_in_reg[60][14]}},Data_in_reg[60],15'b0 };
end

assign  Data_result_carry = ( Data_result[33]== 1'b0 ) ? ( Data_result + 34'h000010000 ) : ( Data_result + 34'h00000ffff );     //Symmetric Rounding

always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Index_1 <= 1'b0;
else if ( Index == 1'b0 )
Index_1 <= ~Index_1;
else
;
end

always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Data_out_0_reg <= 14'b0;
else if ( ( Index == 1'b1 ) && ( Index_1 == 1'b1 ) )
begin
if ( ( Data_result_carry[33:30] == 4'b0000 ) || ( Data_result_carry[33:30] == 4'b1111 ) )
Data_out_0_reg <= Data_result_carry[30:17];
else if ( Data_result_carry[33] == 1'b0 )
Data_out_0_reg <= 14'h1fff;
else
Data_out_0_reg <= 14'h2000;
end
else
;
end

always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Data_out_0 <= 14'b0;
else
Data_out_0 <= Data_out_0_reg;
end

always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Data_out_1 <= 14'b0;
else if ( ( Index == 1'b0 ) && ( Index_1 == 1'b1 ) )
begin
if ( ( Data_result_carry[33:30] == 4'b0000 ) || ( Data_result_carry[33:30] == 4'b1111 ) )
Data_out_1 <= Data_result_carry[30:17];
else if ( Data_result_carry[33] == 1'b0 )
Data_out_1 <= 14'h1fff;
else
Data_out_1 <= 14'h2000;
end
else
;
end

always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Hb_overflow_0 <= 1'b0;
else if ( ( Index == 1'b1 ) && ( Index_1 == 1'b1 ) )
begin
if ( ( Data_result_carry[33:30] == 4'b0000 ) || ( Data_result_carry[33:30] == 4'b1111 ) )
Hb_overflow_0 <= 1'b0;
else
Hb_overflow_0 <= 1'b1;
end
else
;
end

always @ ( posedge Clk or posedge Reset )
begin
if ( Reset == 1'b1 )
Hb_overflow_1 <= 1'b0;
else if ( ( Index == 1'b0 ) && ( Index_1 == 1'b1 ) )
begin
if ( ( Data_result_carry[33:30] == 4'b0000 ) || ( Data_result_carry[33:30] == 4'b1111 ) )
Hb_overflow_1 <= 1'b0;
else
Hb_overflow_1 <= 1'b1;
end
else
;
end

endmodule

/*****************************************************/
/*-------         the end              --------*/
/*****************************************************/
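The rounding and saturation logic at the end of the listing can be modeled bit for bit in a few lines of Python. This is only a sketch for checking the arithmetic: `output_stage` and `to_signed` are hypothetical helper names, and the model covers just the combinational path from `Data_result` to the 14-bit output and overflow flag, not the `Index` timing.

```python
MASK34 = (1 << 34) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def output_stage(data_result):
    """Round, drop 17 LSBs and saturate a 34-bit accumulator to 14 bits.

    Returns (data_out, overflow), mirroring Data_out_x and Hb_overflow_x.
    """
    acc = data_result & MASK34                       # 34-bit register contents
    sign = acc >> 33                                 # Data_result[33]
    carry = (acc + (0x10000 if sign == 0 else 0x0FFFF)) & MASK34
    top = carry >> 30                                # Data_result_carry[33:30]
    if top in (0b0000, 0b1111):                      # value fits in bits [30:0]
        return to_signed(carry >> 17, 14), False     # Data_result_carry[30:17]
    if carry >> 33 == 0:
        return to_signed(0x1FFF, 14), True           # positive saturation, +8191
    return to_signed(0x2000, 14), True               # negative saturation, -8192


for value in (3 << 17, -(3 << 17), 1 << 31, -(1 << 31)):
    print(value, output_stage(value))
```

Checking the four guard bits [33:30] before slicing [30:17] is what lets the filter clip instead of wrapping around when the accumulator exceeds the 14-bit output range.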

In the code, the DSP48 slice is used as shown in Figure 4.

Figure 4: DDC_HF2_DSP (generated from a DSP48 slice) as used in the code
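The tap positions assigned to `Add_a` and `Add_d` in the listing can be cross-checked against the coefficient pairs with a short Python script. It is a sketch that only restates the indices from the code; the reading that the midpoint moves by one register per stage to compensate one clock of cascade delay is my interpretation, not something the listing states.

```python
N_DSP = 12        # DSP stages, Num = 0..11 in the generate loop
N_TAPS = 47       # coefficients c0..c46, center tap c23
CHANNELS = 2      # samples of the two channels alternate in the delay line

add_a = [5 * k for k in range(N_DSP)]         # Data_in_reg[0], [5], ..., [55]
add_d = [92 - 3 * k for k in range(N_DSP)]    # Data_in_reg[92], [89], ..., [59]

for k in range(N_DSP):
    lo, hi = 2 * k, (N_TAPS - 1) - 2 * k      # coefficient pair c(2k), c(46-2k)
    spacing_clocks = CHANNELS * (hi - lo)     # clocks between the two paired samples
    assert add_d[k] - add_a[k] == spacing_clocks

# Midpoint of each pair in the delay line: 46, 47, ..., 57.
midpoints = [(a + d) // 2 for a, d in zip(add_a, add_d)]
print("pair midpoints:", midpoints)
```

The last midpoint is 57, which agrees with the comment on the center tap in the listing: `Data_in_reg[57]` plus 3 pipeline stages gives the `Data_in_reg[60]` that is added to `Dsp_r[11]`.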

References:

[1] www.xilinx.com, DSP: Designing for Optimal Results, 2005.

Comment by November 29, 2010
The length of the code could be reduced by using a generate for the delay taps as well. It would make the HDL more readable in a blog type format.

    reg [14:0] Data_in_reg[0:92];
    ...
    genvar Numd;
    generate
        for (Numd = 1; Numd <= 92; Numd = Numd + 1)
        begin : G_DELAY
            always @ ( posedge Clk or posedge Reset )
            begin
                if ( Reset == 1'b1 )
                    Data_in_reg[Numd] <= 15'b0;
                else
                    Data_in_reg[Numd] <= Data_in_reg[Numd-1];
            end
        end
    endgenerate
Comment by November 29, 2010
Hi cfelton, thank you very much for your advice. There are many good guys on dsprelated.com, and making the code better is also what I have in mind.
Comment by December 9, 2010
Hi, You guys are doing two things I have been wanting to do but need guidance. 1. I have been trying to write a fft/dft module that I can call and pass 90 values to and produce an output spectrum represented by a flat file of values that I can then, display graphically. 2. I purchased my Spartan FPGA board last year and am waiting for a class to be developed at a college near here. But, what I wanted to do is analyze a quadrature signal (two audio channels) and extract a single frequency. Example of this is done in pc's. (see http://wb4mak.com) I have the receivers that output the quadrature signal and the (small) fpga. Can we pack a "radio tuning" system using fpga into one of these trainers? I want to eliminate the need for a PC to run the fft for softrock radios. I think you guys have the talent to point me in the right direction. Thanks Tom - Huntsville, Alabama USA Amateur Radio - WA4FYN
Comment by December 12, 2010
tomm, There are FFT/IFFT cores that are part of the Xilinx and Altera design software (coregen and SOPC builder). I have a basic FFT core here, http://www.myhdl.org/doku.php/users:cfelton:projects:recursivefft , but it is not very optimized. I have run the FFT on boards like, http://www.dsptronics.com/buy.html.
Comment by December 13, 2010
Hi tomm, about "fft/dft can call and pass 90 values": I think you may find some information, although there are some caveats. After some time I will share some Verilog code for FFT/IFFT. To "analyze a quadrature signal (two audio channels) and extract a single frequency" is not a piece of cake. C. Richard Johnson Jr. introduced a method called "PLL Analysis" to achieve this goal in his book "Telecommunication Breakdown". Can you tell me your email? I can send you an article from the book about the topic.
