DSPRelated.com
Forums

Re: DSP Trick: Fixed Point DC Blocking Filter with Noise-Shaping

Started by Darrell March 7, 2008
r b-j posted a trick back in 1999 to remove the DC of a signal, which
I recently ran across in DSP guru.  It was basically a differentiator
followed by a leaky integrator with some tricks to remove the
quantization noise.  I was wanting to give it a try but I saw
something in the 56K assembly code which looked incorrect.  The
original post can be found at http://groups.google.com/group/comp.dsp/msg/9ee3ca0606fbcb27.

        do      #CHUNK_SIZE,filterLoop
        sub     -x0,a           x:(r1),x0       a,y0            ;
truncated state y[n-1] -> y0, a = y[n-1] - x[n-1]
        add      x0,a           y0,x:(r1)+                      ;
store y[n-1] to output,       a = y[n-1] - x[n-1] + x[n]
        mac
y1,y0,a                                        ;
a = y[n-1] - x[n-1] + x[n] + (pole-1)*y[n-1]
                                                                ;
= (x[n]-x[n-1]) + pole*y[n-1] = y[n]
filterLoop                                                      ; now
x[n] -> x[n-1], y[n] -> y[n-1] implicitly

From what I know of 56K assembly, the "sub -x0,a" is the same as "add
x0,a", which doesn't fit the algorithm so far as I can tell.  I
realize it has been eight (!) years, but I figured someone might still
be hanging around who has used this seemingly neat trick before :-).

Thanks,
Darrell
On Mar 7, 3:09 pm, Darrell <darrel...@gmail.com> wrote:
> r b-j posted a trick back in 1999 to remove the DC of a signal, which > I recently ran across in DSP guru. It was basically a differentiator > followed by a leaky integrator with some tricks to remove the > quantization noise. I was wanting to give it a try but I saw > something in the 56K assembly code which looked incorrect. The > original post can be found athttp://groups.google.com/group/comp.dsp/msg/9ee3ca0606fbcb27. > > do #CHUNK_SIZE,filterLoop > sub -x0,a x:(r1),x0 a,y0 ; > truncated state y[n-1] -> y0, a = y[n-1] - x[n-1] > add x0,a y0,x:(r1)+ ; > store y[n-1] to output, a = y[n-1] - x[n-1] + x[n] > mac > y1,y0,a ; > a = y[n-1] - x[n-1] + x[n] + (pole-1)*y[n-1] > ; > = (x[n]-x[n-1]) + pole*y[n-1] = y[n] > filterLoop ; now > x[n] -> x[n-1], y[n] -> y[n-1] implicitly > > From what I know of 56K assembly, the "sub -x0,a" is the same as "add > x0,a", which doesn't fit the algorithm so far as I can tell. I > realize it has been eight (!) years, but I figured someone might still > be hanging around who has used this seemingly neat trick before :-).
i think i made a "transcription" mistake in posting. the original file is: (watch out for word-wrap on the end-line comments that google groups wraps) ; this DC blocking filter is a differentiator (that kills the DC) ; followed by a "leaky" integrator. the differentiator has a ; zero placed right on DC (z = 1) and the integrator has its pole ; very close to DC (z = pole = approx. 1). since the integrator ; can itself generate DC because of a limit cycle problem, first ; order noise shaping is used (with a zero at DC) to eliminate ; any DC output from the integrator in the case where the input ; that goes to silence. for pipelining and efficiency reasons, ; the output is delayed one sample with this code. ; ; it is coded in a way to process the samples in-place but that ; is easily changed. ; CHUNK_SIZE equ 4 NUM_INPUTS equ 8 sampleArray equ $50 stateArray equ $80 pole equ 0.9999 ; 0 <= pole < 1 ; pole = [1 - sin(w)]/cos(w) ; where w = 2*pi*F_cutoff/Fs ; w = arccos[2*pole/(pole^2 + 1)] org p:$40 move #<sampleArray,r1 move #<stateArray,r6 move #<2,n6 move #>(pole-1.0),y1 move #>(pole+1.0)/2.0,x1 ; forward coefficient do #NUM_INPUTS,channelLoop ; move y:(r6)+,r1 ; get pointer to input/output CHUNK move y:(r6)+,x0 ; get feedforward state move y:(r6)+,a ; get feedback state (hi word) move y:(r6)-n6,a0 ; get feedback state (lo word) do #CHUNK_SIZE,filterLoop ; truncated state y[n-1] -> y0, a = y[n-1] - (pole+1)/2*x[n-1] mac -x1,x0,a x:(r1),x0 a,y0 ; store y[n-1] to output, a = y[n-1] - (pole+1)/2*x[n-1] + (pole+1)/ 2*x[n] mac x1,x0,a y0,x:(r1)+ ; a = y[n-1] - (pole+1)/2*x[n-1] + (pole+1)/2*x[n] + (pole-1)*y[n-1] ; = (pole+1)/2*(x[n]-x[n-1]) + pole*y[n-1] = y[n] mac y1,y0,a filterLoop ; now x[n] -> x[n-1], y[n] -> y[n-1] implicitly move x0,y:(r6)+ ; save feedforward state move a,y:(r6)+ ; save feedback state (hi word) move a0,y:(r6)+ ; save feedback state (lo word) channelLoop anyway, there is a slight scaling done to the input because this DC block filter used BLT transform to be designed. since x1 is so close to 1, a simpler one to look at would be (with comments removed): CHUNK_SIZE equ 4 NUM_INPUTS equ 8 sampleArray equ $50 stateArray equ $80 pole equ 0.9999 ; 0 <= pole < 1 ; pole = [1 - sin(w)]/cos(w) ; where w = 2*pi*F_cutoff/Fs ; w = arccos[2*pole/(pole^2 + 1)] org p:$40 move #<sampleArray,r1 move #<stateArray,r6 move #<2,n6 move #>(pole-1.0),y1 do #NUM_INPUTS,channelLoop move y:(r6)+,x0 move y:(r6)+,a move y:(r6)-n6,a0 do #CHUNK_SIZE,filterLoop sub x0,a x:(r1),x0 a,y0 add x0,a y0,x:(r1)+ mac y1,y0,a filterLoop move x0,y:(r6)+ move a,y:(r6)+ move a0,y:(r6)+ channelLoop i am not sure, but Tim Wescott made it one instruction less and i redid (speculatively) the 56K code. try this: http://groups.google.com/group/comp.dsp/msg/3759bc4014951a03 if you get that one (with 2 instructions in the inner-most loop) to work on a 56K or derivative, please lettuce know. L8r, r b-j
On Mar 7, 5:54&#4294967295;pm, robert bristow-johnson <r...@audioimagination.com>
wrote:
> if you get that one (with 2 instructions in the inner-most loop) to > work on a 56K or derivative, please lettuce know.
Thanks a bunch! Will definitely let the group know when I get it going :-). Darrell
robert bristow-johnson wrote:



I've also used it, adapted for an FPGA of course.
On Mar 10, 9:43 pm, Ray Andraka <r...@andraka.com> wrote:
> robert bristow-johnson wrote:
??
> > I've also used it, adapted for an FPGA of course.
Tim's Quicker DC Blocker? (or, if this precedes Tim's post, the equivalent by another name?) r b-j
robert bristow-johnson wrote:
> On Mar 10, 9:43 pm, Ray Andraka <r...@andraka.com> wrote: > >>robert bristow-johnson wrote: > > ?? > >>I've also used it, adapted for an FPGA of course. > > > Tim's Quicker DC Blocker? (or, if this precedes Tim's post, the > equivalent by another name?) > > r b-j
No, the one you posted on DSP_guru: http://www.dspguru.com/comp.dsp/tricks/alg/dc_block.htm
On Mar 11, 4:05 pm, Ray Andraka <r...@andraka.com> wrote:
> robert bristow-johnson wrote: > > On Mar 10, 9:43 pm, Ray Andraka <r...@andraka.com> wrote: > > >>robert bristow-johnson wrote: > > > ?? > > >>I've also used it, adapted for an FPGA of course. > > > Tim's Quicker DC Blocker? (or, if this precedes Tim's post, the > > equivalent by another name?) > > > r b-j > > No, the one you posted on DSP_guru:http://www.dspguru.com/comp.dsp/tricks/alg/dc_block.htm
oh, that's too bad. i haven't done anything for real with Tim's Quicker DC Blocker and am still curious if some other guinea pig has tried it out. r b-j
On Mar 7, 1:09 pm, Darrell <darrel...@gmail.com> wrote:
> r b-j posted a trick back in 1999 to remove the DC of a signal, which > I recently ran across in DSP guru. It was basically a differentiator > followed by a leaky integrator with some tricks to remove the > quantization noise. I was wanting to give it a try but I saw > something in the 56K assembly code which looked incorrect. The > original post can be found athttp://groups.google.com/group/comp.dsp/msg/9ee3ca0606fbcb27. > > do #CHUNK_SIZE,filterLoop > sub -x0,a x:(r1),x0 a,y0 ; > truncated state y[n-1] -> y0, a = y[n-1] - x[n-1] > add x0,a y0,x:(r1)+ ; > store y[n-1] to output, a = y[n-1] - x[n-1] + x[n] > mac > y1,y0,a ; > a = y[n-1] - x[n-1] + x[n] + (pole-1)*y[n-1] > ; > = (x[n]-x[n-1]) + pole*y[n-1] = y[n] > filterLoop ; now > x[n] -> x[n-1], y[n] -> y[n-1] implicitly
Interesting. For years now I've been using a somewhat different DC blocking structure in my chip designs. Something like this: - DC Estimate is subtracted from input signal - subtracter output goes to perfect integrator where residual DC accumulates - Output of integrator is scaled by shifting to control loop time constant & residual error - Output of shifter is DC estimate which feeds back to subtracter Any thoughts on the comparative advantages of r b-j's approach vs this one? EB
On Mar 10, 9:43&#4294967295;pm, Ray Andraka <r...@andraka.com> wrote:
> I've also used it, adapted for an FPGA of course.
Just out of curiosity, did you essentially do what Steve did to convert the multiply to a shift? Other than that I really don't see anything that isn't FPGA friendly. I have all three algorithms working (in Matlab, using the fixed point toolbox). With the data sets I'm using, I see about a 12 dB improvement in suppression using Robert's initial approach, and about a 9-10 dB improvement using the Tim/Steve approach. As someone (Robert?) observed, Steve's version is the same as Tim's, converted to feed-forward form and using a shift for the multiply. I really like it for a HW implementation, since you are using 1-R you get a great selection of R values. There is nothing stopping one from doing the same thing with Robert's version :-). I haven't had time to dig in too deeply on what is going on with the estimate and why Robert's seems to work better. Just as an FYI, the data set I am using has a DC that is varying slowly (by my standards!) over time. The total error is about 16 dB above the noise floor. I'm also modifying the R value to go from a fast attack to steady state value (on the order of 0.995 for the initial 50 ms down to 0.9999 by 500 ms). Darrell
On Mar 13, 8:44 am, Darrell <darrel...@gmail.com> wrote:
> On Mar 10, 9:43 pm, Ray Andraka <r...@andraka.com> wrote: > > > I've also used it, adapted for an FPGA of course. > > Just out of curiosity, did you essentially do what Steve did to > convert the multiply to a shift? Other than that I really don't see > anything that isn't FPGA friendly. > > I have all three algorithms working (in Matlab, using the fixed point > toolbox). With the data sets I'm using, I see about a 12 dB > improvement in suppression using Robert's initial approach, and about > a 9-10 dB improvement using the Tim/Steve approach. As someone > (Robert?) observed, Steve's version is the same as Tim's, converted to > feed-forward form and using a shift for the multiply. I really like > it for a HW implementation, since you are using 1-R you get a great > selection of R values. There is nothing stopping one from doing the > same thing with Robert's version :-). > > I haven't had time to dig in too deeply on what is going on with the > estimate and why Robert's seems to work better. Just as an FYI, the > data set I am using has a DC that is varying slowly (by my standards!) > over time. The total error is about 16 dB above the noise floor. I'm > also modifying the R value to go from a fast attack to steady state > value (on the order of 0.995 for the initial 50 ms down to 0.9999 by > 500 ms). > > Darrell
Here is my VHDL version of the dc blocker (used in an FPGA). Notice that it uses only shifts to implement multiplication. entity dc_block_if is Port ( data_in : in signed(13 downto 0); clk : in STD_LOGIC; reset : in STD_LOGIC; data_out : out signed(13 downto 0)); end dc_block_if; architecture arch1 of dc_block_if is signal error : unsigned(6 downto 0); signal data_in_dly : signed(13 downto 0); signal dat_out : signed(13 downto 0); constant MAX : signed(22 downto 0) := "00011111111111111111111"; constant MIN : signed(22 downto 0) := "11100000000000000000000"; begin -- Single pole, high-pass filter data_out <= dat_out; dcBlockProc : process (clk, reset) variable vDiff: signed(20 downto 0); variable vProd: signed(20 downto 0); variable vOutScaled: signed(22 downto 0); begin if (reset = '1') then error <= "0000000"; data_in_dly <= "00000000000000"; dat_out <= "00000000000000"; elsif rising_edge (clk) then data_in_dly <= data_in; vDiff := (data_in - data_in_dly) & "0000000"; -- scale data up by 128 because pole is scaled by 128 -- multiply by pole value of 127 vProd := dat_out & "0000000"; -- multiply by 128 vProd := vProd - dat_out; -- subtract dat_out to produce a multiply of 127 vOutScaled := (vProd(20) & vProd(20) & vProd) + vDiff; vOutScaled := vOutScaled + signed('0' & error); -- saturate on overflow if (vOutScaled > MAX) then vOutScaled := MAX; elsif (vOutScaled < MIN) then vOutScaled := MIN; else vOutScaled := vOutScaled; end if; error <= unsigned(vOutScaled(6 downto 0)); -- keep fractional part by taking remainder of integer division dat_out <= vOutScaled(20 downto 7); -- divide by 128 end if; end process; end arch1; Darol Klawetter