DSPRelated.com
Forums

DC Blocking Filter with Noise Shaping

Started by Sinclair June 12, 2012
>That doesn't justify storing it as a long. >-- >Randy Yates >Digital Signal Labs >http://www.digitalsignallabs.com >
To me it does.
On 6/19/2012 10:20 PM, robert bristow-johnson wrote:
> On 6/19/12 8:43 PM, Sinclair wrote:
...
>> Finally, a big thanks to Robert for putting his C implementation in >> the public domain. As has already been clarified, you can find it here >> >> http://www.dspguru.com/dsp/tricks/fixed-point-dc-blocking-filter-with-noise-shaping >> >> >> I also found an interesting mini-paper on the subject here (warning >> link to - PDF): >> >> http://www.digitalsignallabs.com/dcblock.pdf >> >> which includes a signal flow graph that produces an identical >> difference equation to that used by Robert. Suffice to say that I have >> learned a great deal from both - and from the replies to date. > > note 5: > > Tim, i just don't know what else to do. everytime this DC-blocking with > noise-shaping issue comes up, i *always* point them to your algorithm, > *including* writing the code so it should be easy for them to see that > your alg is so much simpler and more efficient. > > then i *always* reiterate that "Tim's method is better."
...
> is *better* and *more* efficient than the code i supplied to dspguru. > that's what you should be using, Rodney. > > maybe i'll give up plugging your new improved alg, Tim. but it does > sorta grate me to see it passed over without comment.
Robert, Why don't you ask Grant to replace your code with Tim's appropriately credited, or at least a link to it? Jerry -- Engineering is the art of making what you want from things you can get. �����������������������������������������������������������������������
"niarn" <niaren9@n_o_s_p_a_m.gmail.com> writes:

>>That doesn't justify storing it as a long. >>-- >>Randy Yates >>Digital Signal Labs >>http://www.digitalsignallabs.com >> > > To me it does.
So let me see if I've got this right: You would just as soon waste memory as conserve it? That is, even though changing this 32-bit integer to a 16-bit integer affects nothing except memory usage, you'd just as soon use more memory? Keep in mind the question here is not whether or not this is a lot of memory, but rather given that everything else is equal (which it is), which way would be more optimum. Even a little, tiny bit more optimum. It may indeed be too small to fire up your text editor, make the change, recompile, and re-FLASH, but is it more optimum? -- Randy Yates Digital Signal Labs http://www.digitalsignallabs.com
On 6/20/12 10:35 AM, Randy Yates wrote:
> "niarn"<niaren9@n_o_s_p_a_m.gmail.com> writes: > >>> That doesn't justify storing it as a long. >>> >> >> To me it does. > > So let me see if I've got this right: You would just as soon waste > memory as conserve it? > > That is, even though changing this 32-bit integer to a 16-bit integer > affects nothing except memory usage, you'd just as soon use more memory? > > Keep in mind the question here is not whether or not this is a lot of > memory, but rather given that everything else is equal (which it is), > which way would be more optimum. Even a little, tiny bit more optimum. > > It may indeed be too small to fire up your text editor, make the change, > recompile, and re-FLASH, but is it more optimum?
niarn and i see it the same way. there are different dimensions of efficiency. code compactness, memory use, *and* execution time. so, Randy, you would rather waste the execution time to do this short A = (short)(32768.0*(1.0 - pole)); ... acc -= (long)A*prev_y; or implicitly: short A = (short)(32768.0*(1.0 - pole)); ... acc -= A*prev_y; and save a measley 2 bytes? the problem is, in my opinion, that the C language errs in its type promotion. they should have defined from the very beginning that a short times a short is a long. but they didn't do that. so when you want to do a MAC instruction that is so simple in a DSP, you have to waste the bits (and possibly some time) to promote everybody to long before doing a long times long. (also, now that type "long long" exists, it should be the natural product of two longs getting multiplied.) it's not as bad as MATLAB's oversight regarding indices, but it is one of about 4 or 5 mistakes that Dennis Ritchie made from the beginning. (the others were the lack of an integer power operator, the use of "^" for EOR, that the shift operators ">>" and "<<" have lower precedence than "+" and "*", and that there is no basic "complex" type. but the worst is that the implicit word size in the middle of an equations should increase when words are multiplied, *unless* there is an explicit cast or it goes into a word that is not made wider.) -- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
On 6/20/12 8:51 AM, Jerry Avins wrote:
> On 6/19/2012 10:20 PM, robert bristow-johnson wrote: >> >> Tim, i just don't know what else to do. everytime this DC-blocking with >> noise-shaping issue comes up, i *always* point them to your algorithm, >> *including* writing the code so it should be easy for them to see that >> your alg is so much simpler and more efficient. >> >> then i *always* reiterate that "Tim's method is better." > > ... > >> is *better* and *more* efficient than the code i supplied to dspguru. >> that's what you should be using, Rodney. >> >> maybe i'll give up plugging your new improved alg, Tim. but it does >> sorta grate me to see it passed over without comment. > > Robert, > > Why don't you ask Grant to replace your code with Tim's appropriately > credited, or at least a link to it? >
i think i did long ago when Tim first put it up. i can't remember if i got a response and Grant does not seem to hang out here anymore. i'll write him again. On 6/20/12 7:40 AM, Randy Yates wrote:
> robert bristow-johnson<rbj@audioimagination.com> writes: >> [...] >> so i repeat, this alg: >> >> ______________ >> >> /* >> >> y[n] = x[n] - Quantize{ w[n] } >> >> = x[n] - (w[n] + e[n]) >> >> w[n+1] = w[n] + (1-pole)*y[n] >> */ >> >> short x[], y[]; >> double pole = 0.9999; >> long w, A, curr_y; >> unsigned long n, num_samples; >> >> A = (long)(32768.0*(1.0 - pole)); >> w = 0; >> >> for (n=0; n<num_samples; n++) >> { >> curr_y = (long)x[n] - (w>>15); // quantization happens here >> w += A*curr_y; >> y[n] = (short)curr_y; >> } >> >> ______________ >> >> is *better* and *more* efficient than the code i supplied to dspguru. >> that's what you should be using, Rodney. > > I would agree it is more efficient. I don't agree it is "better" in all > ways.
it's more efficient, and it's mathematically equivalent. right down to the bit.
> > Q15 is not the best use of 16 bits to represent A since A has lots of > zeros to the right of the binary point. Something like Q19, etc. > (depends on the value of A) would be more accurate. The alternative is > to subtract the y[n-1] term separately and add the term p*y[n-1], where > representing p by a Q15 is a very good use of all 16 bits.
can't speak for Tim, but i imagine that neither Tim nor i want to waste the execution time to sign extend either A or curr_y just to make absolutely damn sure that the sign extension to A*curr_y is not done too late. it would be fatal if A*curr_y and the result was merely a short (because both A and curr_y are short) and then that short is sign extended to add to w. that's what we *really* do not want to do. maybe curr_y does not have to be a long and you get: curr_y = x[n] - (short)(w>>15); // quantization happens here w += A*curr_y; y[n] = curr_y; if A is a long, then the multiplication A*curr_y should be a long. i dunno if the compiler would find a faster MAC instruction knowing that one of the multiplicands is a short, but i doubt it.
> However, grouping your (1-p) this way does save computations.
the purpose was so that the accumulator we start with at the beginning of the current sample is precisely the same as the accumulator we ended up with in the previous sample. that makes it efficient. for a biquad (Direct Form 1) it would look like y[n] = Quantize{ y[n-1] + (a1-1)*y[n-1] + a2*y[n-2] + b0*x[n] + b1*x[n-1] + b2*x[n-2] } so, if you store y[n] as a double-wide word, that is the real output y[n] in the high bits and the negative of the added error, -e[n], in the low bits, you can do this fraction saving without bothering to do a zero-extension of the truncated bits you're feeding back. but you have to adjust one coefficient. -- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
robert bristow-johnson <rbj@audioimagination.com> writes:

> On 6/20/12 10:35 AM, Randy Yates wrote: >> "niarn"<niaren9@n_o_s_p_a_m.gmail.com> writes: >> >>>> That doesn't justify storing it as a long. >>>> >>> >>> To me it does. >> >> So let me see if I've got this right: You would just as soon waste >> memory as conserve it? >> >> That is, even though changing this 32-bit integer to a 16-bit integer >> affects nothing except memory usage, you'd just as soon use more memory? >> >> Keep in mind the question here is not whether or not this is a lot of >> memory, but rather given that everything else is equal (which it is), >> which way would be more optimum. Even a little, tiny bit more optimum. >> >> It may indeed be too small to fire up your text editor, make the change, >> recompile, and re-FLASH, but is it more optimum? > > niarn and i see it the same way.
Then you're both making errors.
> there are different dimensions of efficiency. code compactness, > memory use, *and* execution time.
There are, yes.
> so, Randy, you would rather waste the execution time to do this > > short A = (short)(32768.0*(1.0 - pole)); > ... > acc -= (long)A*prev_y; > > or implicitly: > > short A = (short)(32768.0*(1.0 - pole)); > ... > acc -= A*prev_y; > > and save a measley 2 bytes? > > the problem is, in my opinion, that the C language errs in its type > promotion. they should have defined from the very beginning that a > short times a short is a long. but they didn't do that. so when you > want to do a MAC instruction that is so simple in a DSP, you have to > waste the bits (and possibly some time) to promote everybody to long > before doing a long times long. (also, now that type "long long" > exists, it should be the natural product of two longs getting > multiplied.)
You are making a mistake. prev_y is already long. Your typecast (in the first case) of A does nothing. C's arithmetic conversion conventions ensure A will be converted to long int before multiplying. -- Randy Yates Digital Signal Labs http://www.digitalsignallabs.com
Randy Yates <yates@digitalsignallabs.com> writes:

> robert bristow-johnson <rbj@audioimagination.com> writes: > >> On 6/20/12 10:35 AM, Randy Yates wrote: >>> "niarn"<niaren9@n_o_s_p_a_m.gmail.com> writes: >>> >>>>> That doesn't justify storing it as a long. >>>>> >>>> >>>> To me it does. >>> >>> So let me see if I've got this right: You would just as soon waste >>> memory as conserve it? >>> >>> That is, even though changing this 32-bit integer to a 16-bit integer >>> affects nothing except memory usage, you'd just as soon use more memory? >>> >>> Keep in mind the question here is not whether or not this is a lot of >>> memory, but rather given that everything else is equal (which it is), >>> which way would be more optimum. Even a little, tiny bit more optimum. >>> >>> It may indeed be too small to fire up your text editor, make the change, >>> recompile, and re-FLASH, but is it more optimum? >> >> niarn and i see it the same way. > > Then you're both making errors. > >> there are different dimensions of efficiency. code compactness, >> memory use, *and* execution time. > > There are, yes. > >> so, Randy, you would rather waste the execution time to do this >> >> short A = (short)(32768.0*(1.0 - pole)); >> ... >> acc -= (long)A*prev_y; >> >> or implicitly: >> >> short A = (short)(32768.0*(1.0 - pole)); >> ... >> acc -= A*prev_y; >> >> and save a measley 2 bytes? >> >> the problem is, in my opinion, that the C language errs in its type >> promotion. they should have defined from the very beginning that a >> short times a short is a long. but they didn't do that. so when you >> want to do a MAC instruction that is so simple in a DSP, you have to >> waste the bits (and possibly some time) to promote everybody to long >> before doing a long times long. (also, now that type "long long" >> exists, it should be the natural product of two longs getting >> multiplied.) > > You are making a mistake. prev_y is already long. Your typecast (in > the first case) of A does nothing. C's arithmetic conversion conventions > ensure A will be converted to long int before multiplying.
Or are you saying that the process of doing the promotion/conversion is going to take time (whether implicit or not)? Ummm, maybe in some strange CPU in another universe. In no machine I can think of will it make a difference. -- Randy Yates Digital Signal Labs http://www.digitalsignallabs.com
On 6/20/12 12:11 PM, Randy Yates wrote:

> Or are you saying that the process of doing the promotion/conversion is > going to take time (whether implicit or not)? > > Ummm, maybe in some strange CPU in another universe. In no machine I can > think of will it make a difference.
really? sign-extension is an instruction that costs no CPU clocks? -- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
>On 6/20/12 12:11 PM, Randy Yates wrote: > >> Or are you saying that the process of doing the promotion/conversion is >> going to take time (whether implicit or not)? >> >> Ummm, maybe in some strange CPU in another universe. In no machine I
can
>> think of will it make a difference. > >really? sign-extension is an instruction that costs no CPU clocks? > >-- > >r b-j rbj@audioimagination.com > >"Imagination is more important than knowledge." > > >
This is when I realize that although hardware design can be a pain - mostly because Verilog is a horrible language on all aspects - it's very nice to not have to bother about this 'CPU' instruction stuff:)
robert bristow-johnson <rbj@audioimagination.com> writes:

> On 6/20/12 8:51 AM, Jerry Avins wrote: >> On 6/19/2012 10:20 PM, robert bristow-johnson wrote: >>> >>> Tim, i just don't know what else to do. everytime this DC-blocking with >>> noise-shaping issue comes up, i *always* point them to your algorithm, >>> *including* writing the code so it should be easy for them to see that >>> your alg is so much simpler and more efficient. >>> >>> then i *always* reiterate that "Tim's method is better." >> >> ... >> >>> is *better* and *more* efficient than the code i supplied to dspguru. >>> that's what you should be using, Rodney. >>> >>> maybe i'll give up plugging your new improved alg, Tim. but it does >>> sorta grate me to see it passed over without comment. >> >> Robert, >> >> Why don't you ask Grant to replace your code with Tim's appropriately >> credited, or at least a link to it? >> > > i think i did long ago when Tim first put it up. i can't remember if > i got a response and Grant does not seem to hang out here anymore. > > i'll write him again. > > > On 6/20/12 7:40 AM, Randy Yates wrote: >> robert bristow-johnson<rbj@audioimagination.com> writes: >>> [...] >>> so i repeat, this alg: >>> >>> ______________ >>> >>> /* >>> >>> y[n] = x[n] - Quantize{ w[n] } >>> >>> = x[n] - (w[n] + e[n]) >>> >>> w[n+1] = w[n] + (1-pole)*y[n] >>> */ >>> >>> short x[], y[]; >>> double pole = 0.9999; >>> long w, A, curr_y; >>> unsigned long n, num_samples; >>> >>> A = (long)(32768.0*(1.0 - pole)); >>> w = 0; >>> >>> for (n=0; n<num_samples; n++) >>> { >>> curr_y = (long)x[n] - (w>>15); // quantization happens here >>> w += A*curr_y; >>> y[n] = (short)curr_y; >>> } >>> >>> ______________ >>> >>> is *better* and *more* efficient than the code i supplied to dspguru. >>> that's what you should be using, Rodney. >> >> I would agree it is more efficient. I don't agree it is "better" in all >> ways. > > it's more efficient, and it's mathematically equivalent. right down > to the bit.
Yup, I made a mistake. Although now I would say it's not equivalent but rather superior! The whole equivalent thing hinges on the fact that round(n + x) = n + round(x), when n is an integer and x is a real. The fact is, you could use a better representation for A than (16,15). However, if you stick with (16,15), it isn't any worse than doing the update with P. And if you use a better representation, such as (12,19), you could get a bit better precision by doing something like acc -= (A * prev_y) >> 4; --RY -- Randy Yates Digital Signal Labs http://www.digitalsignallabs.com