DSPRelated.com
Forums

horrible C indexing

Started by Grant Griffin April 1, 2006
Philip Martel wrote:
> "Clay S. Turner" <Physics@Bellsouth.net> wrote in message > news:r4xXf.1558$ki.1418@bignews1.bellsouth.net... > > Hello Grant, > > > > It is great to see your wit has not waned. Thanks for writing on this > > April Fool's Day. Maybe one day we will get indexing based on the > > irrational numbers - but that uncountable thing will likely get in the way > > ;-) > > > I would like to propose indexing based on imaginary numbers....
as long as we can define the base index, why not? r b-j
Praetorian wrote:
> Grant Griffin wrote: >> Am I the only person who's noticed that C's zero-based indexing is >> nothing short of...well...horrible?
...
> > Hahahaha, > Thanks Grant, that just made my day, I actually thought you were being > serious until got to the April 0th part. Brilliant!!!
Thanks, Praetorian, and to all my other comp.dsp friends who responded so positively! Actually, I was hoping that someone would take it seriously, or at least challenge its factual errors. But I guess I'll settle for kind words instead. ;-) The old-timers here could probably have smelled that one even before reading it due to the strangely familiar title. If not, the moment they saw me saying something positive about Matlab, that would have been a dead give-away to them. (BTW, I wonder how many Lear jets Clive has bought with all his Matloot? ;-) I'm not usually much of an April Fooler, but I was inspired to this mischief by Slashdot (http://www.slashdot.org), which turned the most awful pink color yesterday in a purported attempt to increase the proportion of female readers. (My monitor is pretty awful, so when I saw that color my first thought was that it had fritzed completely!) =g2 _____________________________________________________________________ Grant R. Griffin Publisher of dspGuru http://www.dspguru.com Iowegian International Corporation http://www.iowegian.com See http://www.iowegian.com/img/contact.gif for e-mail address
Erik de Castro Lopo wrote:
...
> Here where I am, this arrived on April 2nd, so I initially didn't > recognise it for what it was :-).
Look, if you wanna discuss 2-based indexing, Erik, that's fine, but please start a new thread. ;-) =g2 _____________________________________________________________________ Grant R. Griffin Publisher of dspGuru http://www.dspguru.com Iowegian International Corporation http://www.iowegian.com See http://www.iowegian.com/img/contact.gif for e-mail address
robert bristow-johnson wrote:
...
> and after TMW gets that down, perhaps they can learn how to count from > -32 or -512.
Heck, if they ever succeed at that, they might even be able to do stuff like this: ActivePython 2.4 Build 244 (ActiveState Corp.) based on Python 2.4 (#60, Feb 9 2005, 19:03:27) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> x = {} >>> x[0] = 'zero' >>> x[1] = 'one' >>> x[2] = 2 >>> x[1+2j] = 1+2j >>> x['three'] = 3 >>> x[-512] = -512 >>> print x {0: 'zero', 1: 'one', 2: 2, (1+2j): (1+2j), 'three': 3, -512: -512} (no kidding!) =g2 _____________________________________________________________________ Grant R. Griffin Publisher of dspGuru http://www.dspguru.com Iowegian International Corporation http://www.iowegian.com See http://www.iowegian.com/img/contact.gif for e-mail address
Grant Griffin wrote:
> robert bristow-johnson wrote: > ... > >> and after TMW gets that down, perhaps they can learn how to count from >> -32 or -512. > > > Heck, if they ever succeed at that, they might even be able to do stuff > like this: > > ActivePython 2.4 Build 244 (ActiveState Corp.) based on > Python 2.4 (#60, Feb 9 2005, 19:03:27) [MSC v.1310 32 bit (Intel)] > on win32 > Type "help", "copyright", "credits" or "license" for more > information. > >>> x = {} > >>> x[0] = 'zero' > >>> x[1] = 'one' > >>> x[2] = 2 > >>> x[1+2j] = 1+2j > >>> x['three'] = 3 > >>> x[-512] = -512 > >>> print x > {0: 'zero', 1: 'one', 2: 2, (1+2j): (1+2j), 'three': 3, > -512: -512} > > (no kidding!) > > =g2
The obvious solution is to index from -infinity. Then you avoid the need to perform a bounds check for attempts to access elements before the start of the array. :-) Steve
Press - Release

April II, MMVI
(April 1, 2006 for those barbarian 0 index DSP guys)

After careful consideration, Mathwork's Clive Moler had this announcement 
to make:

It's been brought to our attention that many of our DSP customers are 
unhappy with our choice of one-based indexing. We were relieved when dsp-
guru, Grant Grifffn made such a fine argument supporting our long held 
convention. Since Mr Griffin is clearly the leader of all the DSP 
community, being the dsp-guru and all, we believe we can now put this 
matter to rest.

We also did some historical research. The Romans created many engineering 
marvels during the height of their empire and they did not use 0 in their 
numbering system. It is our belief that DSP barbarians (with their 
heretical 0 based numbering notions), invaded from the north and set back 
Western Civilization for over a thousand years.

Therefore to protect our current civilization from a similar fate, 
Mathworks has decided that we have a responsibility to change our numbering 
system to respond only  to Roman Numerals. We know that this might be a 
small inconvenience to a few of you but we are sure that you will all go 
along when you consider the bigger issues.

Supporting Argument Follows: 


Grant Griffin <nospam@yahoo.com> wrote in news:4a503$442e975f$4088dbc7$1304
@EVERESTKC.NET:

> Am I the only person who's noticed that C's zero-based indexing is > nothing short of...well...horrible? And it doesn't end there: C has > been so influential that its insanity has spread to tons of other > languages: C++, Java, C#, Python, Perl--heck, even LISP. > > Am I the only one here who's suffered a needless off-by-one bug as a > result of C's horrible zero-based indexing? Let's see a show of hands... > > But more important, to those of us in DSP, is the fact that C's indexing > doesn't conform to Matlab--which wisely employs one-based indexing in > accordance with the centurys-old convention of linear algebra. Now, one > might ask, "who died and left Matlab in charge?" Well, nobody, I guess. > But if the Matlab folks hadn't gotten this right, why would so many > DSP folks shell out *thousands* for Matlab, when zero-based alternatives > like SciPy (http://scipy.org/) are absolutely, 100%, free? There can be > only one answer: one-based indexing. > > OK, I can already hear some of you about to cite the old "legacy code" > chestnut as a reason C can't be changed. True, there's a lot of C code > out there. But there's absolutely no reason that C can't switch to > one-based with a zero-based compatibility mode. OK, I know, I know: C++ > programmers can easily create their own one-based array object. But > what sense does it make for countless programmers to home-brew a > solution when Dennis Ritchie could just get off his duff and fix the > problem once-and-for-all? After all, Grace Hopper did it for FORTRAN. > > Sound radical? I don't think so: let's face it, we live in a one-based > world. For example, ask a child to count their fingers, and they'll > invariably assign "one" to the first (er, "zeroth") finger. > > Still not convinced? OK, what about the calendar?--what if today was > "April 0th"? > >=g2 > _____________________________________________________________________ > > Grant R. Griffin > Publisher of dspGuru http://www.dspguru.com > Iowegian International Corporation http://www.iowegian.com > See http://www.iowegian.com/img/contact.gif for e-mail address >
-- Al Clark Danville Signal Processing, Inc. -------------------------------------------------------------------- Purveyors of Fine DSP Hardware and other Cool Stuff Available at http://www.danvillesignal.com
:-)

(hey guyz, pleeeze crosspost these to c.s-s.m.  otherwize it's just our
private pleasure.)

Al Clark wrote:
> Press - Release > > April II, MMVI > (April 1, 2006 for those barbarian 0 index DSP guys) > > After careful consideration, Mathwork's Clive Moler had this announcement > to make: > > It's been brought to our attention that many of our DSP customers are > unhappy with our choice of one-based indexing. We were relieved when dsp- > guru, Grant Grifffn made such a fine argument supporting our long held > convention. Since Mr Griffin is clearly the leader of all the DSP > community, being the dsp-guru and all, we believe we can now put this > matter to rest. > > We also did some historical research. The Romans created many engineering > marvels during the height of their empire and they did not use 0 in their > numbering system. It is our belief that DSP barbarians (with their > heretical 0 based numbering notions), invaded from the north and set back > Western Civilization for over a thousand years. > > Therefore to protect our current civilization from a similar fate, > Mathworks has decided that we have a responsibility to change our numbering > system to respond only to Roman Numerals. We know that this might be a > small inconvenience to a few of you but we are sure that you will all go > along when you consider the bigger issues. > > Supporting Argument Follows: > > > Grant Griffin <nospam@yahoo.com> wrote in news:4a503$442e975f$4088dbc7$1304 > @EVERESTKC.NET: > > > Am I the only person who's noticed that C's zero-based indexing is > > nothing short of...well...horrible? And it doesn't end there: C has > > been so influential that its insanity has spread to tons of other > > languages: C++, Java, C#, Python, Perl--heck, even LISP. > > > > Am I the only one here who's suffered a needless off-by-one bug as a > > result of C's horrible zero-based indexing? Let's see a show of hands... > > > > But more important, to those of us in DSP, is the fact that C's indexing > > doesn't conform to Matlab--which wisely employs one-based indexing in > > accordance with the centurys-old convention of linear algebra. Now, one > > might ask, "who died and left Matlab in charge?" Well, nobody, I guess. > > But if the Matlab folks hadn't gotten this right, why would so many > > DSP folks shell out *thousands* for Matlab, when zero-based alternatives > > like SciPy (http://scipy.org/) are absolutely, 100%, free? There can be > > only one answer: one-based indexing. > > > > OK, I can already hear some of you about to cite the old "legacy code" > > chestnut as a reason C can't be changed. True, there's a lot of C code > > out there. But there's absolutely no reason that C can't switch to > > one-based with a zero-based compatibility mode. OK, I know, I know: C++ > > programmers can easily create their own one-based array object. But > > what sense does it make for countless programmers to home-brew a > > solution when Dennis Ritchie could just get off his duff and fix the > > problem once-and-for-all? After all, Grace Hopper did it for FORTRAN. > > > > Sound radical? I don't think so: let's face it, we live in a one-based > > world. For example, ask a child to count their fingers, and they'll > > invariably assign "one" to the first (er, "zeroth") finger. > > > > Still not convinced? OK, what about the calendar?--what if today was > > "April 0th"? > > > >=g2 > > _____________________________________________________________________ > > > > Grant R. Griffin > > Publisher of dspGuru http://www.dspguru.com > > Iowegian International Corporation http://www.iowegian.com > > See http://www.iowegian.com/img/contact.gif for e-mail address > > > > > > -- > Al Clark > Danville Signal Processing, Inc. > -------------------------------------------------------------------- > Purveyors of Fine DSP Hardware and other Cool Stuff > Available at http://www.danvillesignal.com
On Sun, 02 Apr 2006 00:14:10 -0600, Grant Griffin wrote:
> I'm not usually much of an April Fooler, but I was inspired to this > mischief by Slashdot (http://www.slashdot.org), which turned the most > awful pink color yesterday in a purported attempt to increase the > proportion of female readers. (My monitor is pretty awful, so when I > saw that color my first thought was that it had fritzed completely!)
Slashdot often has awful colour schemes anyway (see any news on BSD, for example). The change of sub-title from "news for nerds, stuff that matters" into "OMG!!! Ponies!!!" surrounded by hearts, should have been a pretty big givaway. :-) I always thought that April fools' tricking was supposed to be over by twelve o'clock, but from where I was looking at it (Sydney), it seemed to go on for two days, at least. Re: factual errors and C addressing: It's worse than that, apparently. If you care, you can find a big argument that I had last month in comp.arch.embedded and comp.lang.c (google groups will find it here: http://groups.google.com/group/comp.lang.c/browse_frm/thread/368d3d324f757051/bb18f6cccd56427d?&hl=en ) The gist of that thread is basically that the old C "base shift" technique[*] is officially undefined, which means that an increasing number of super-conformance-checking compilers and run-time systems are most likely to increase their efforts to stop you from doing it. I've seen at least one compiler that claims to allow running C code on a JVM. I bet that will trap if you do that, too. As will AS/400 machines, but I doubt that anyone is trying to write DSP code on those. [*] base shift: often used for doing simple-minded translations from Fortran code (which has 1-based array indexing) to C. Used throughout the original editions of "Numerical Recipes in C", which is quite a famous book. The trick is to do something like this: double buf[10]; base1_routine(buf - 1); /* or base1_routine(&buf[-1]); equivalent */ void base1_routine(double a[]) { int i; for (i = 1; i <= 10; i++) a[i] = foo(i, a[i]); } An AS/400 machine, which has object-level memory checking *when pointers are formed* (not when un-allocated memory is accessed, as it isn't here), will trap on the second line, when (buf-1) is formed. Therefore (apparently), it's undefined in the C standard. There was some argument that such constructions caused problems in ia286 systems too, but I never met an example where it did... Consequently, loops backwards over arrays using *p-- are also undefined, because that last post-decrement will form the same "undefined" pointer. The standard makes a special exception for the "one past the end" case, because they didn't feel too bad about forcing the AS/400 compiler people to over-allocate all objects by one byte, to cover this situation, and because it's such a common idiom... Yeah, I reckon I'm just about as annoyed by the C standards committee as r.b-j is at the Matlab folks... What are you going to do? Go back to coding in assembler, full time? At least DSP assembly code for your favourite DSP says "this is not expected to run on an AS/400", writ large. Cheers, -- Andrew
Andrew Reilly wrote:

> Re: factual errors and C addressing: >
...
> The gist of that thread is basically that the old C "base shift" > technique[*] is officially undefined, which means that an increasing > number of super-conformance-checking compilers and run-time systems are > most likely to increase their efforts to stop you from doing it. I've > seen at least one compiler that claims to allow running C code on a JVM. > I bet that will trap if you do that, too. As will AS/400 machines, but I > doubt that anyone is trying to write DSP code on those. > > [*] base shift: often used for doing simple-minded translations from > Fortran code (which has 1-based array indexing) to C. Used throughout > the original editions of "Numerical Recipes in C", which is quite a > famous book. The trick is to do something like this: > > double buf[10]; > base1_routine(buf - 1); /* or base1_routine(&buf[-1]); equivalent */ > > void base1_routine(double a[]) > { > int i; > for (i = 1; i <= 10; i++) > a[i] = foo(i, a[i]); > }
so, Andrew, are they saying that buf-1 is not guaranteed to be an address that you could form a pointer with and index off of? perhaps buf[] is aligned to the base of some memory space and buf-1 is not defined, but doesn't ANSI C guarantee that (buf-1)+1 = buf for pointers? i guess it violate one of these? number 1 specifically: 1. p = s - 1 is valid. Not guaranteed. Careless coding. 2. cast (int) p is meaningful. Not guaranteed. 3. Use of 2's complement arithmetic. 4. ints have no trap representations or hidden bits. 5. 4 == sizeof(int) && 8 == CHAR_BIT. 6. size_t is actually int. 7. sizeof(int) is a power of 2. 8. int alignment depends on a zeroed bit field. is p = s+1 valid? because this should be valid in C: double *h, acausal_impulse_response[1024]; h = &(acausal_impulse_response[512]); // or acausal_impulse_response+512 double x[REALLY_BIG_INTEGER]; // later... for(n=0; n<REALLY_BIG_INTEGER; n++) { double accumulator = 0.0; for(int i=-512; i<512; i++) { accumulator += x[n-i]*h[i]; } y[n] = accumulator; } and even do similar offsetting for x[n] which might be zero-padded by 512 on both sides. any ANSI C should be able to handle that, no? we oughta be able to know that arrays are contiguous and point to anywhere in the middle and use that pointer name as an array name and allow negative indexing from that middle pointer. can that ever be a problem in ANSI C? r b-j
On Sun, 02 Apr 2006 22:19:15 -0700, robert bristow-johnson wrote:
> Andrew Reilly wrote: >> [*] base shift: often used for doing simple-minded translations from >> Fortran code (which has 1-based array indexing) to C. Used throughout >> the original editions of "Numerical Recipes in C", which is quite a >> famous book. The trick is to do something like this: >> >> double buf[10]; >> base1_routine(buf - 1); /* or base1_routine(&buf[-1]); equivalent */ >> >> void base1_routine(double a[]) >> { >> int i; >> for (i = 1; i <= 10; i++) >> a[i] = foo(i, a[i]); >> } > > so, Andrew, are they saying that buf-1 is not guaranteed to be an > address that you could form a pointer with and index off of? perhaps > buf[] is aligned to the base of some memory space and buf-1 is not > defined, but doesn't ANSI C guarantee that (buf-1)+1 = buf for > pointers?
Only, apparently, if buf is already pointing to at least the +1'th element of a properly allocated array object.
> i guess it violate one of these? number 1 specifically: > > 1. p = s - 1 is valid. Not guaranteed. Careless coding. 2. cast > (int) p is meaningful. Not guaranteed. 3. Use of 2's complement > arithmetic. 4. ints have no trap representations or hidden bits. 5. > 4 == sizeof(int) && 8 == CHAR_BIT. 6. size_t is actually int. 7. > sizeof(int) is a power of 2. > 8. int alignment depends on a zeroed bit field.
Aah, that's the list from the original article at the google link, isn't it? Most of the argument was about the first item, because it surprised me completely (and seemingly a few others, too.) Most of those I don't mind, one way or the other. I prefer not to use size_t specifically because it's usually unsigned, on the platforms that I use, and unsigneds cause more problems than they solve, in my book. Avoiding 2's compliment arithmetic dependencies is (IMO) too hard to bother with (CDC Cybers can get knotted, now that code runs faster on pocket telephones...). Doing alignment with logical operations is the only sensible way to do it, etc...
> is p = s+1 valid? because this should be valid in C:
Only if s points to somewhere strictly *within* an array allocation. Undefined if s is already at the "one past the end" position.
> double *h, acausal_impulse_response[1024]; > > h = &(acausal_impulse_response[512]); > > // or acausal_impulse_response+512 > > double x[REALLY_BIG_INTEGER]; > > // later... > > for(n=0; n<REALLY_BIG_INTEGER; n++) > { > double accumulator = 0.0; > for(int i=-512; i<512; i++) > { > accumulator += x[n-i]*h[i]; > } > y[n] = accumulator; > } > } > and even do similar offsetting for x[n] which might be zero-padded by > 512 on both sides. any ANSI C should be able to handle that, no?
I think that ISO C is fine with that if the outer loop is n=512; n < REALLY_BIG_INTEGER - 512; n++, but I suspect that's what you meant. You still can't try to access x outside the allocation range. Using negative indices for h should be fine. Where it annoys me is if you code an FIR like so: for (i = 0; i < N; i++) acc += h[i] * *x--; because you've got x stored in time-order for some reason, or because you haven't flipped your coefficients, then (according to the standard) you have to arrange to have allocated an extra, useless element of the state buffer, because otherwise that loop will step into "undefined" territory off the beginning of the array, and your run-time may cause a trap, or worse. Of course, the C compilers and platforms that most of us (in comp.dsp) use do no such thing. They tend to do the obvious; just leave x pointing to the "element" before the allocated array. It doesn't get accessed, that value of x is never used after being computed, so no harm, no foul, right? Well, because the standard says that's invoking undefined behavior, one day someone will build a compiler or run-time that tests for it, and a whole bunch of perfectly fine DSP code will break... [There's an exactly similar problem with any pointer-based code that walks forwards with non-unit stride, and finishes more than one "element" past the top end of the array, too.]
> we > oughta be able to know that arrays are contiguous and point to anywhere > in the middle and use that pointer name as an array name and allow > negative indexing from that middle pointer. can that ever be a problem > in ANSI C?
I don't think so. The problem (undefined behavior) only arises if you form a pointer to something outside an object, other than one element past the object, even if you don't subsequently access the memory at that location. Forming a pointer to any location that you *can* access seems to be OK. One lesson from this is that it really is better to use an index based coding style in C, than a pointer-based one. They're not equivalent after all. Cheers, -- Andrew