DSPRelated.com
Forums

Conceptual model for pointer arithmetic in FIR filter implementation?

Started by Chris Bore October 13, 2004
glen herrmannsfeldt wrote:

  ...

> Well, this was on comp.arch not so long ago. RISC processors
> don't tend to have autoincrement mode, and, except for a few
> special instructions, x86 doesn't, either. Then there is the
> question of which notation the optimizers recognize.
I would imagine that Chris Bore's course would be heavily into DSPs.
http://www.bores.com/ has one of the better on-line DSP courses.

Jerry
--
Engineering is the art of making what you want from things you can get.

Jerry Avins wrote:

> glen herrmannsfeldt wrote:
>> Well, this was on comp.arch not so long ago. RISC processors
>> don't tend to have autoincrement mode, and, except for a few
>> special instructions, x86 doesn't, either. Then there is the
>> question of which notation the optimizers recognize.
> I would imagine that Chris Bore's course would be heavily into DSPs.
> http://www.bores.com/ has one of the better on-line DSP courses.
The discussion on comp.arch may not have emphasized DSPs, so it might
be different. Though much has changed in compiler technology over the
years since PDP-11s were new, it is likely that, if it matters, the
compiler will generate autoincrement code for array-reference source.
It might even do it better than for pointer-increment source.

Another consideration is human readability. It is easier for most
people to read indexed source, knowing that index variables are
changing and array names are not, than to try to remember which
pointer is pointing where. Especially bad are the off-by-one errors
that are easy to make in these problems. In addition, the two samples
in the original post both used indexing.

-- glen

Jerry Avins wrote:

(after I wrote)

>> Well, this was on comp.arch not so long ago. RISC processors
>> don't tend to have autoincrement mode, and, except for a few
>> special instructions, x86 doesn't, either. Then there is the
>> question of which notation the optimizers recognize.
> I would imagine that Chris Bore's course would be heavily into DSPs.
> http://www.bores.com/ has one of the better on-line DSP courses.
There is even a free web course, which looks pretty nice. It does
mention the autoincrement mode, and that some compilers will generate
it given index-mode source.

-- glen
Chris Bore wrote:
> In preparing a class on DSP, with programming, I want to explain the
> following piece of code that implements the inner loop of an FIR:
>
>     *(y+n) = 0;
>     for (k = 0; k <= n; k++) {
>         *(y+n) += *(h+k)*(*(x+n-k));
>
> An equivalent implementation using arrays would be:
>
>     y[n] = 0;
>     for (k = 0; k <= n; k++) {
>         y[n] += h[k] * x[n - k];
Chris, if this is to be run on a DSP, I would suggest coding the inner
loop in assembler instead of C. There are two advantages to this:

1. If you program a DSP in C, every now and then you will be faced
   with having to write a certain part of the code (inner loops,
   initialization of peripherals or DMAs, etc.) in assembler. The
   ability to include inline assembler code in a C program is crucial
   for programming DSPs, and this is a good chance to teach it.

2. In SHARC assembler, the FIR loop looks like this:

       lcntr=FIR_TAPS-1, do (pc,1) until lce;
           f12=f0*f4, f8=f8+f12, f4=dm(i1,m6), f0=pm(i9,m14);

   What could be clearer? And you are guaranteed the most efficient
   implementation (without having to bother about the compiler's
   optimization levels), taking advantage of the DSP's special
   capabilities (which I'm sure are also a topic in your class).

Regards,
Andor
This has already been mentioned a few times, but the two methods are by 
DEFINITION equivalent.  The "array notation" was created to make code more 
readable and *(y+n) is exactly the same as y[n].  This is more of a C
concept than a DSP concept, so I would not even mention the "pointer method"
in your class.  If you ask me it's just poorly written code!
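
(To make the equivalence concrete, here is a minimal stand-alone
illustration; the array, its contents and the index are mine, not taken
from the thread:)

    #include <stdio.h>

    int main(void)
    {
        float y[4] = { 0.0f, 1.0f, 2.0f, 3.0f };
        int n = 2;

        /* The C standard defines y[n] to mean *(y + n), so the two
           expressions designate the same array element. */
        printf("%d\n", &y[n] == &*(y + n));   /* prints 1 */
        printf("%g %g\n", y[n], *(y + n));    /* prints 2 2 */
        return 0;
    }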

Now as Jerry was saying you can sometimes get better efficiencies by using 
an autoincrement mode.  However, neither "method" is using autoincrement. 
If you wanted to try optimizing the code you could do something like y[n] = 
x[n-k]*h[k++].  However, any decent compiler should take advantage of the 
processor's addressing modes in order to generate efficient code for you.
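
(As a concrete sketch of what such autoincrement-friendly code might
look like - the function and variable names here are mine, not from the
original code, and it assumes at least ntaps samples of history are
available at and before x[n]:)

    /* Sketch only.  The coefficient pointer walks forward and the
       sample pointer walks backward, which is the access pattern a
       DSP's post-increment/decrement addressing modes map onto. */
    float fir_tap_sum(const float *x, const float *h, int n, int ntaps)
    {
        const float *xp = x + n;   /* newest sample; assumes n >= ntaps - 1 */
        const float *hp = h;       /* first coefficient                     */
        float acc = 0.0f;
        int k;

        for (k = 0; k < ntaps; k++)
            acc += *hp++ * *xp--;  /* h[k] * x[n - k] */

        return acc;
    }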

By the way, as Dirk pointed out, why is the compare in this loop
k <= n; shouldn't it be k < NUM_TAPS?

Brad

"Chris Bore" <chris_bore@yahoo.co.uk> wrote in message 
news:468debd2.0410130644.3ce25dcf@posting.google.com...
> In preparing a class on DSP, with programming, I want to explain the
> following piece of code that implements the inner loop of an FIR:
>
>     *(y+n) = 0;
>     for (k = 0; k <= n; k++) {
>         *(y+n) += *(h+k)*(*(x+n-k));
>
> An equivalent implementation using arrays would be:
>
>     y[n] = 0;
>     for (k = 0; k <= n; k++) {
>         y[n] += h[k] * x[n - k];
>
> Now, the array approach is clearly closer to the math (it is very
> close to the difference equation).
>
> The pointer approach is because the programmer declares y as a pointer:
>
>     fractional* y = dstSamps;
>
> and dynamically allocates memory using a function that returns a pointer:
>
> So, the programmer chose to keep his code consistent with his
> declaration (ie to continue to use pointers).
>
> Apart from the dynamic allocation, the two ways of writing result in
> identical code produced by the compiler (as they should..).
>
> So my initial reaction is that the pointer code is
> implementation-oriented, while the array code is math-oriented. I can
> show how the array code arises, then discuss the implementation
> details and so explain the programmer's choices.
>
> But then I tried to think of a conceptual model that would let me
> directly explain the pointer code. For example, modelling not from
> difference equations but from a diagram showing the lists of
> coefficients, samples etc. But here I kept finding I had to come back
> to implementation details - "it points to an address in memory" -
> whereas I want to explain in terms of the math (or more correctly the
> application).
>
> Question: can anyone suggest a pointer to discussion or explanation of
> DSP that directly uses a conceptual model close to pointers?
>
> Second question: is the programmer right to keep his code consistent
> with declaration, or is there a higher imperative that means he should
> have used a more easily understood conceptual model using arrays?
>
> Thanks,
>
> Chris
> ==============================
> Chris Bore
> BORES Signal Processing
> chris@bores.com
> www.bores.com
glen herrmannsfeldt wrote:

> There is discussion over which came first, the PDP-11
> autoincrement addressing mode or C's ++ operator, but it
> does make sense on the PDP-11, especially with only eight
> registers including SP and PC.
Here is what Dennis Ritchie says about it:

  [++ and --] were not in the earliest versions of B, but appeared
  along the way. People often guess that they were created to use the
  auto-increment and auto-decrement address modes provided by the DEC
  PDP-11 on which C and Unix first became popular. This is
  historically impossible, since there was no PDP-11 when B was
  developed. The PDP-7, however, did have a few `auto-increment'
  memory cells, with the property that an indirect memory reference
  through them incremented the cell. This feature probably suggested
  such operators to Thompson [...]

http://www.cs.bell-labs.com/who/dmr/chist.html

Martin
--
Quidquid latine dictum sit, altum viditur.
Martin Eisenberg wrote:

> glen herrmannsfeldt wrote:
>
>> There is discussion over which came first, the PDP-11
>> autoincrement addressing mode or C's ++ operator, but it
>> does make sense on the PDP-11, especially with only eight
>> registers including SP and PC.
>
> Here is what Dennis Ritchie says about it:
>
>   [++ and --] were not in the earliest versions of B, but appeared
>   along the way. People often guess that they were created to use the
>   auto-increment and auto-decrement address modes provided by the DEC
>   PDP-11 on which C and Unix first became popular. This is
>   historically impossible, since there was no PDP-11 when B was
>   developed. The PDP-7, however, did have a few `auto-increment'
>   memory cells, with the property that an indirect memory reference
>   through them incremented the cell. This feature probably suggested
>   such operators to Thompson [...]
>
> http://www.cs.bell-labs.com/who/dmr/chist.html
>
> Martin
There were auto-increment cells and auto-decrement cells in the Nova
1200 (with which I did my first processor-based A/D conversion), but
they were distinct. Stack pointers as we know them were not yet common
coin.

Jerry
--
Engineering is the art of making what you want from things you can get.
I remember how good a newsgroup comp.dsp was - I drifted away a few
years ago but am glad to be back. Thank you for the informed and
helpful responses.

Yes, the two ways of writing the code are equivalent. When I questioned
the author, he gave a plausible explanation. He allocates the memory
dynamically, so he declares y (for instance) as a pointer, to which
malloc then returns the address of the allocated memory - and he
decided it was consistent to stick with the way he had declared it. I
am not convinced - Kernighan has many examples where he declares a
pointer but uses it like an array, presumably precisely because the two
are the same - but there is a certain logic to it.

I used to advocate writing C that 'matched the architecture' - hence
the explanations of this on our web course - but I am less convinced
now. Compilers will do this, and I think most now do - in any case, the
programmer should not assume that a given piece of C will result in
particular assembler. I am also now much more aware that not all DSP is
done on 'DSP Processors', nor even in C.

I'm more interested now because I have been dealing with DSP on
non-traditional DSPs (e.g. VLIW, conventional RISC) for some time, and
in environments where code readability and separation of code from
architecture (implementation) are MUCH more highly valued than
(sometimes only imagined) efficiency. So my question was to gain the
views of a community I respect (comp.dsp), and the responses remind me
that I should participate here more often.

There was one other part to the question, though - is there an
intuitively compelling, and 'math/application-oriented', conceptual
model of DSP that would use pointers to lists rather than elements of
arrays as its basic model?

Thanks very much,

Chris
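
Regarding the question above about a math/application-oriented model
built on pointers: one candidate is the tapped delay line itself - the
samples stay put in a buffer while a "newest sample" pointer moves past
them, which is exactly what a pointer expresses and an index tends to
hide. A minimal sketch, in which the type, the names, the tap count and
the circular-buffer choice are all assumptions of mine rather than
anything from the thread:

    #define NTAPS 8

    typedef struct {
        float taps[NTAPS];   /* sample history (the delay line)  */
        float *newest;       /* where the most recent sample is  */
    } delay_line;

    void dl_init(delay_line *d)
    {
        int i;
        for (i = 0; i < NTAPS; i++)
            d->taps[i] = 0.0f;
        d->newest = d->taps;
    }

    /* Push one input sample, return one output sample. */
    float dl_fir(delay_line *d, const float h[NTAPS], float in)
    {
        const float *xp;
        float acc = 0.0f;
        int k;

        /* Advance the write pointer, wrapping at the end of the buffer,
           and store the new sample; older samples are never moved. */
        if (++d->newest == d->taps + NTAPS)
            d->newest = d->taps;
        *d->newest = in;

        /* Walk backwards from the newest sample: h[0] pairs with x[n],
           h[1] with x[n-1], and so on. */
        xp = d->newest;
        for (k = 0; k < NTAPS; k++) {
            acc += h[k] * *xp;
            if (xp == d->taps)            /* wrap the read pointer too */
                xp = d->taps + NTAPS;
            xp--;
        }
        return acc;
    }

Seen this way, the pointer stands for the "now" of the signal rather
than for an address in memory.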


"Brad Griffis" <bradgriffis@hotmail.com> wrote in message news:<putbd.7870$5b1.7808@newssvr17.news.prodigy.com>...
> This has already been mentioned a few times, but the two methods are by
> DEFINITION equivalent.  The "array notation" was created to make code
> more readable and *(y+n) is exactly the same as y[n].  This is more of
> a C concept than a DSP concept, so I would not even mention the "pointer
> method" in your class.  If you ask me it's just poorly written code!
>
> Now as Jerry was saying you can sometimes get better efficiencies by
> using an autoincrement mode.  However, neither "method" is using
> autoincrement.  If you wanted to try optimizing the code you could do
> something like y[n] = x[n-k]*h[k++].  However, any decent compiler
> should take advantage of the processor's addressing modes in order to
> generate efficient code for you.
>
> By the way, as Dirk pointed out, why is the compare in this loop
> k <= n; shouldn't it be k < NUM_TAPS?
>
> Brad
>
> "Chris Bore" <chris_bore@yahoo.co.uk> wrote in message
> news:468debd2.0410130644.3ce25dcf@posting.google.com...
> > [Chris's original post, quoted in full earlier in the thread, snipped]
Thanks, Jerry.

I do not think your thinking has gone astray. But is it thinking from
just one side of the problem? The pointer notation is in fact close to
the hardware model of what happens (provided the architecture is
indeed what the programmer thinks it is), which is helpful for a
programmer asking 'how do I implement this?'. But it is not close to
the way the math would be written. So if you are coming up from the
hardware implementation the pointers make sense, whereas if you are
coming down from the math they may not be so immediately obvious.
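
(For concreteness, the difference equation that the array form mirrors
is the standard FIR sum y[n] = h[0]*x[n] + h[1]*x[n-1] + ... +
h[N-1]*x[n-N+1], i.e. a sum of products of coefficients with
progressively older samples.)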

A second point is perhaps, that the C code here is written with
assumptions about the underlying architecture (the programmer who
wrote this explained to me that 'all DSP processors work like this..').
For example a particular processor may not offer an autoincrement
register - VLIWs typically don't. So the assumption may be too narrow:
'all DSP programs are written for DSP processors and all DSP
processors have this simple architecture so this code will result in
efficient use of the machine....'.

Chris
============================
Chris Bore
BORES Signal Processing
www.bores.com
chris@bores.com

Jerry Avins <jya@ieee.org> wrote in message news:<2t5lg5F1rgltnU1@uni-berlin.de>...
> Jon Harris wrote:
>
> > I tend to agree with JaaC and Glen.  To me, the pointer notation
> > obfuscates the math/DSP concepts.  If you are teaching a class on the
> > C language, then show the pointer method.  But for understanding DSP,
> > go with the array method.  If you want to throw in the pointer method
> > at the end as an alternate coding method that _might_ be more
> > efficient in some cases, then do so.  I wouldn't make a big deal
> > about it, but it might be nice to mention it because the students may
> > come across it when looking at other's code in the future.
>
> I don't want to be seen as arguing against this, but it doesn't really
> make sense to me. I'd like to say why not, and be educated.
>
> It seems to me that the natural way to write that code in assembler
> (which is my model of "what the machine does") is by loading an address
> into an autoincrement register and fetching successive elements as
> needed. "x = *pointer++" is practically a paraphrase of that. The array
> approach would put the base address on a fixed register and use indexed
> addressing with an autoincrementing index register. That's two
> registers instead of one, with the possibility that the compiler will
> actually calculate an effective address at run time for each reference.
> That would not only be slow, but conceptually cumbersome.
>
> Where has my thinking gone astray?
>
> Jerry
The question of 'DSP is always done with DSP processors' is
interesting. What about VLIWs? FPGAs? I've been very interested for
the past few years in a lot of work being done using fast parallel
architectures to do DSP-like things (specifically, video processing,
audio compression and de-compression, etc). These devices do not have
MAC units or autoincrement registers but they do run at 9 GOPS and
they are used routinely (in set top boxes and TVs) to do real time
processing on audio, video and other signals at rates up to 108 MHz.
At the other end of the scale, much low-power audio DSP is in fact
done on ARM processors. So I am less sure than I used to be that DSP
is about the rather narrow range of DSP processors from the main
stables.

Chris
=====================================
Chris Bore
BORES Signal Processing
www.bores.com
chris@bores.com

Tim Wescott <tim@wescottnospamdesign.com> wrote in message news:<10mrbdnh6ld0927@corp.supernews.com>...
> glen herrmannsfeldt wrote:
>
> > Jerry Avins wrote:
> >
> > (mostly previously snipped discussion about array and pointer
> > notation in C)
> >
> >> I don't want to be seen as arguing against this, but it doesn't really
> >> make sense to me. I'd like to say why not, and be educated.
> >
> >> It seems to me that the natural way to write that code in assembler
> >> (which is my model of "what the machine does") is by loading an address
> >> into an autoincrement register and fetching successive elements as
> >> needed. "x = *pointer++" is practically a paraphrase of that.
> >
> > Well, that assumes that you have an autoincrement register.
> > There is discussion over which came first, the PDP-11
> > autoincrement addressing mode or C's ++ operator, but it
> > does make sense on the PDP-11, especially with only eight
> > registers including SP and PC. VAX has autoincrement mode,
> > and also good indexed modes. You probably don't use either
> > of those very often.
>
> On a DSP? Oh yes you do. If it doesn't have a MAC instruction that
> lets you multiply two numbers, add the result to an accumulator, load
> their temporary registers, increment their pointers (and possibly wrap
> them in hardware) all in one clock cycle -- then it's a poor excuse for
> a DSP.
>
> >> The array
> >> approach would put the base address on a fixed register and use indexed
> >> addressing with an autoincrementing index register. That's two
> >> registers instead of one, with the possibility that the compiler will
> >> actually calculate an effective address at run time for each reference.
> >> That would not only be slow, but conceptually cumbersome.
> >
> > Well, first of all the OP gave two programs that will both
> > compile the same way, though one looked pointer like and one
> > array like. Many processors now have indexed addressing modes
> > that scale by the size of the item addressed, and have enough
> > registers that it doesn't matter so much. The loop variable
> > needs to be in a register already, so it really isn't another
> > register. Most likely any extra computation can be done
> > during the time needed to do the memory load/store.
> >
> > Well, this was on comp.arch not so long ago. RISC processors
> > don't tend to have autoincrement mode, and, except for a few
> > special instructions, x86 doesn't, either. Then there is the
> > question of which notation the optimizers recognize.
> >
> > -- glen
>
> But digital signal processors do, and this is comp.dsp.