Possible anomaly with 21161 SHARC

Started by Jon Harris May 21, 2004
I spent a bunch of time tracking down an issue with the 21161 SHARC, so I
thought I'd mention it to you guys.  The issue has to do with the timing of the
automatic context switching on returning from interrupts, specifically with SIMD
mode and a delayed RTI instruction.

On the SHARC, when certain interrupts are received, the MODE1 and ASTAT
registers are automatically pushed onto the stack, and then popped off when the
interrupt routine is finished.  This all happens transparently to the user.
When returning from an interrupt, you can use the syntax RTI (DB); (delayed
branch) so that you can squeeze in 2 more instructions instead of suffering a
2-cycle pipeline hit.  It appears that the timing of the automatic popping of
the MODE1 register is such that the 2nd delayed instruction executes with the
incorrect SIMD mode setting.  An example will illustrate.

main code:
assume SIMD mode is always turned on here

interrupt code:
interrupt happens, MODE1 is pushed
interrupt code explicitly turns SIMD off and does processing
at the end of the ISR:
RTI (DB);
 r0 = 1234;           //first instruction
 dm(0x50000) = r0;    //second instruction
//return actually happens here

The result of this code is that the second instruction, dm(0x50000) = r0
performs the additional (aka implicit) write dm(0x50001) = s0 because SIMD mode
was on in the interrupted code.  This will corrupt location 0x50001, which is
what was happening to me.
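
The effect described above can be mimicked with a small toy model. This is only a sketch under stated assumptions, not a cycle-accurate description of the 21161: it assumes that in SIMD mode a dm() write also stores the matching s register to the next address, and that with RTI (DB) the popped MODE1 is already in effect for the second delay-slot instruction. The names dm_write and run_isr_tail, and the value placed in s0, are hypothetical.

```python
# Toy model of the reported behaviour; NOT cycle-accurate.
# dm_write, run_isr_tail and the s0 value are hypothetical names/values.

def dm_write(mem, addr, r_val, s_val, simd_on):
    """Model dm(addr) = r0; with SIMD on, s0 is also written to addr + 1."""
    mem[addr] = r_val
    if simd_on:
        mem[addr + 1] = s_val  # the implicit SIMD write


def run_isr_tail(delayed, main_simd=True, isr_simd=False):
    """Run the last two ISR instructions and return the memory they touched.

    delayed=True  : the instructions sit in the two RTI (DB) delay slots.
    delayed=False : the instructions run before a plain RTI.
    """
    mem = {}
    r0 = 1234   # first instruction: r0 = 1234
    s0 = 9999   # made-up leftover value in s0
    # Assumption taken from the post: in the second delay slot the popped
    # MODE1 (SIMD setting of the interrupted code) is already active.
    simd_for_write = main_simd if delayed else isr_simd
    # second instruction: dm(0x50000) = r0
    dm_write(mem, 0x50000, r0, s0, simd_for_write)
    return mem


delayed = run_isr_tail(delayed=True)
plain = run_isr_tail(delayed=False)
print("RTI (DB):", sorted(hex(a) for a in delayed))
print("plain RTI:", sorted(hex(a) for a in plain))
```

In this model the delayed variant touches both 0x50000 and 0x50001, while the plain RTI variant touches only 0x50000, which matches the corruption described in the post.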

Does this sound like a hardware anomaly to you guys?  If so, who should I report
it to?  The documentation seems to be silent on the timing of the popping of the
status registers and a delayed RTI.



Hi Jon,

That is not an anomaly of the chip.
The "effect latency" of the MODEx and ASTATx-registers
is one cycle. (HW-RefMan Table 3-2). So the first instr. in
the delayed branch is executed with the unchanged state of
MODE1 and ASTAT, the second instr. with the popped values.

Mattias



Jon Harris wrote:
...
> On the SHARC, when certain interrupts are received, the MODE1 and ASTAT
> registers are automatically pushed onto the stack, and then popped off when
> the interrupt routine is finished.  This all happens transparently to the
> user.  When returning from an interrupt, you can use the syntax RTI (DB);
> (delayed branch) so that you can squeeze in 2 more instructions instead of
> suffering a 2-cycle pipeline hit.

Jon, are you sure the SHARC does not behave the same way?

I would think that the automatic pop of the status stack at the RTI has the
same latency as the "pop sts" instruction, or a direct write to the mode1
register, namely one instruction cycle.

This would be consistent with you finding that the second instruction after
the RTI already reacts to the popped mode1 register.

But, as you said, the manual is silent about this.  I think I also noticed
different behaviour in the simulator as compared to the target hardware in
this issue.

Regards,
Andor

"Andor Bariska" <an2or@nospam.net> wrote in message
news:40b1c1fa$1@pfaff2.ethz.ch...
> Jon Harris wrote:
> ...
> > On the SHARC, when certain interrupts are received, the MODE1 and ASTAT
> > registers are automatically pushed onto the stack, and then popped off
> > when the interrupt routine is finished.  This all happens transparently
> > to the user.  When returning from an interrupt, you can use the syntax
> > RTI (DB); (delayed branch) so that you can squeeze in 2 more instructions
> > instead of suffering a 2-cycle pipeline hit.
>
> Jon, are you sure the SHARC does not behave the same way?

I'm confused about this.  I'm only talking about the SHARC.

> I would think
> that the automatic pop of the status stack at the RTI has the same
> latency as the "pop sts" instruction, or a direct write to the mode1
> register, namely one instruction cycle.
>
> This would be consistent with you finding that the second instruction
> after the RTI already reacts to the popped mode1 register.
>
> But, as you said, the manual is silent about this. I think I also
> noticed different behaviour in the simulator as compared to the target
> hardware in this issue.

Another odd thing I noticed is that the effect of the register/DAG
primary/secondary select bits also in MODE1 seems to be different than the SIMD
bit.  I haven't confirmed this for sure, but it sure looks like that 2nd
instruction after the RTI still uses the interrupt code's register/DAG select
setting rather than that of the interrupted code.

I guess the bottom line here is to use RTI(DB) with caution!

Jon Harris wrote:

> ...
> I guess the bottom line here is to use RTI(DB) with caution!

So do I...

I was just tracing a bug around such an RTI(db) command, and I found that it
works fine if I code it as:

 pop sts;
 rti;

while it fails if I modify it to:

 rti(db);
 pop sts;
 nop;

I watched another related issue (which meanwhile made it into the anomaly
list): the integrated status push of mode1 and astat fails under certain
circumstances.  Eventually, an additional push sts command (and the
corresponding pop sts at the return location) helped.  Newest versions of
VisualDSP should handle this correctly.

Concerning your initial topic: I checked the anomaly list, but didn't find a
corresponding entry.  However, the anomaly list for the 21160 (issue 16)
reveals a difference depending on how the mode1 register is written.  In
combination with the strange behaviour I watched, I guess that there are risks
around rti(db) in combination with the status stack recovery.

While it's recommended to either place a nop after a pop sts or avoid critical
accesses immediately after it, I assume that this is also necessary when you
deal with the implicit pop sts which rti(db) does when returning from
particular interrupts like IRQ0..2.

Problems might increase if you have nested interrupts enabled.  I'm wondering
whether it works correctly if a higher priority interrupt occurs during the
rti(db) ...??

Bernhard

Jon Harris wrote:
> "Andor Bariska" <an2or@nospam.net> wrote in message
...
> > Jon, are you sure the SHARC does not behave the same way?
>
> I'm confused about this.  I'm only talking about the SHARC.

Sorry. I thought you were differentiating between SHARC (2106x) family and
Hammerhead (21161). It was my confusion.

> > I would think
> > that the automatic pop of the status stack at the RTI has the same
> > latency as the "pop sts" instruction, or a direct write to the mode1
> > register, namely one instruction cycle.
> >
> > This would be consistent with you finding that the second instruction
> > after the RTI already reacts to the popped mode1 register.
> >
> > But, as you said, the manual is silent about this. I think I also
> > noticed different behaviour in the simulator as compared to the target
> > hardware in this issue.
>
> Another odd thing I noticed is that the effect of the register/DAG
> primary/secondary select bits also in MODE1 seems to be different than the
> SIMD bit.  I haven't confirmed this for sure, but it sure looks like that 2nd
> instruction after the RTI still uses the interrupt code's register/DAG select
> setting rather than that of the interrupted code.

I'm sure this isn't the case with the SHARC - don't know (but suspect the same)
for the Hammerhead.

> I guess the bottom line here is to use RTI(DB) with caution!

Definitely. I once had a bug very similar to yours - popping the status stack
in the second instruction after a RTI (DB) :-). The simple solution was to
exchange last with second to last instruction in the interrupt routine.

Regards,
Andor

Andor <an2or@mailcircuit.com> wrote in message
news:ce45f9ed.0405242316.a96ccef@posting.google.com...
> Jon Harris wrote:
> > "Andor Bariska" <an2or@nospam.net> wrote in message
> ...
> > > Jon, are you sure the SHARC does not behave the same way?
> >
> > I'm confused about this.  I'm only talking about the SHARC.
>
> Sorry. I thought you were differentiating between SHARC (2106x) family
> and Hammerhead (21161). It was my confusion.

I see.  In my usage, SHARC refers to anything in the 21x6x family.  It looks to
me that ADI really isn't using the Hammerhead name much anymore.  It doesn't
seem to appear in 21161 documentation, and a web search at analog.com gets only
a few hits.  I'm going to use part numbers from here on out to avoid ambiguity.

I have code for both 21065L and 21161.  The problem was originally noticed on
the 21161 in conjunction with SIMD mode.  I then went back to look at the 21065L
code, which is a bit different.  The strange thing was that it looked like it
should have had a similar problem with register and DAG select but didn't!

> > > I would think
> > > that the automatic pop of the status stack at the RTI has the same
> > > latency as the "pop sts" instruction, or a direct write to the mode1
> > > register, namely one instruction cycle.
> > >
> > > This would be consistent with you finding that the second instruction
> > > after the RTI already reacts to the popped mode1 register.
> > >
> > > But, as you said, the manual is silent about this. I think I also
> > > noticed different behaviour in the simulator as compared to the target
> > > hardware in this issue.
> >
> > Another odd thing I noticed is that the effect of the register/DAG
> > primary/secondary select bits also in MODE1 seems to be different than the
> > SIMD bit.  I haven't confirmed this for sure, but it sure looks like that
> > 2nd instruction after the RTI still uses the interrupt code's register/DAG
> > select setting rather than that of the interrupted code.
>
> I'm sure this isn't the case with the SHARC - don't know (but suspect
> the same) for the Hammerhead.

When I have a chance I'd like to re-try this on both 21161 and 21065L and see
what the behavior is.

> > I guess the bottom line here is to use RTI(DB) with caution!
>
> Definitely. I once had a bug very similar to yours - popping the
> status stack in the second instruction after a RTI (DB) :-). The
> simple solution was to exchange last with second to last instruction
> in the interrupt routine.

Glad I'm not alone!  The fact that we've both suffered from this indicates that
ADI should include some info on this in their documentation.

"Jon Harris" <goldentully@hotmail.com> wrote in message
news:2hh6p7Fd0kbfU1@uni-berlin.de...
> Andor <an2or@mailcircuit.com> wrote in message
> news:ce45f9ed.0405242316.a96ccef@posting.google.com...
>
> I have code for both 21065L and 21161.  The problem was originally noticed on
> the 21161 in conjunction with SIMD mode.  I then went back to look at the
> 21065L code which is a bit different.  The strange thing was that it looked
> like it should have had a similar problem with register and DAG select but
> didn't!
>
> > > > I would think
> > > > that the automatic pop of the status stack at the RTI has the same
> > > > latency as the "pop sts" instruction, or a direct write to the mode1
> > > > register, namely one instruction cycle.
> > > >
> > > > This would be consistent with you finding that the second instruction
> > > > after the RTI already reacts to the popped mode1 register.
> > > >
> > > > But, as you said, the manual is silent about this. I think I also
> > > > noticed different behaviour in the simulator as compared to the target
> > > > hardware in this issue.
> > >
> > > Another odd thing I noticed is that the effect of the register/DAG
> > > primary/secondary select bits also in MODE1 seems to be different than
> > > the SIMD bit.  I haven't confirmed this for sure, but it sure looks like
> > > that 2nd instruction after the RTI still uses the interrupt code's
> > > register/DAG select setting rather than that of the interrupted code.
> >
> > I'm sure this isn't the case with the SHARC - don't know (but suspect
> > the same) for the Hammerhead.
>
> When I have a chance I'd like to re-try this on both 21161 and 21065L and see
> what the behavior is.

OK, I just tested this and have a definitive answer.  The upshot is that changes
to the alternate register bits in MODE1 do not take place on the second
instruction after RTI (DB), but changes to the SIMD bit do.  This sounds like
an anomaly to me: either one or the other is wrong.

Here is my test (in a mix of pseudo code and assembler):

Test A: Main loop code is running with alternate registers selected

Interrupt code:
set r14 and r15 primary and alternate to known values
select primary registers
RTI(DB);
 dm(test1) = r14;     // instruction 1
 dm(test2) = r15;     // instruction 2

Results: test1 and test2 both receive the values in primary registers.  This
means that the change to MODE1 did NOT take effect until after instruction 2.
Behavior was the same between 21065L and 21161.

Test B: Main loop code is running with SIMD bit turned on

Interrupt code:
turn off SIMD mode
set r14/s14 and r15/s15 primary and alternate to known values
RTI(DB);
 dm(test1) = r14;     // instruction 1
 dm(test2) = r15;     // instruction 2

Results: test1 receives the value in primary r14.  Location test1+1 is not
affected (no SIMD write).  test2 receives the value in primary r15.  Location
test2+1 receives the value in primary s15 (SIMD write occurred).  This means
that the change to MODE1 took effect just before instruction 2.  This was on the
21161.  I could not try Test B on the 21065L because it doesn't have SIMD mode.

Conclusion: the effect latency for MODE1 is not consistent among various bits in
the case of an automatic pop of STS in an RTI(DB) instruction.  The SIMD bit
takes effect before the alternate register select bits.

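
The two test results can be summarised as a per-bit "effect latency" table. The sketch below is only a toy model of those observations, not a description of the silicon: the table EFFECTIVE_FROM_SLOT, the bit names and the helper functions are hypothetical, and the slot numbers simply encode what Test A and Test B reported.

```python
# Toy model of the Test A / Test B observations; not a model of the silicon.
# EFFECTIVE_FROM_SLOT, mode_seen, test_a and test_b are hypothetical names.

# First RTI (DB) delay slot in which the popped MODE1 bit is observed.
# A value of 3 means "only after both delay slots have executed".
EFFECTIVE_FROM_SLOT = {"simd": 2, "alt_regs": 3}


def mode_seen(bit, slot, isr_value, popped_value):
    """Value of a MODE1 bit as seen by delay-slot instruction `slot` (1 or 2)."""
    if slot >= EFFECTIVE_FROM_SLOT[bit]:
        return popped_value   # setting of the interrupted code
    return isr_value          # setting chosen inside the interrupt routine


def test_a():
    """Main code uses alternate registers; the ISR selects primary registers."""
    return [
        "alternate" if mode_seen("alt_regs", slot, False, True) else "primary"
        for slot in (1, 2)
    ]


def test_b():
    """Main code runs with SIMD on; the ISR turns SIMD off."""
    return [mode_seen("simd", slot, False, True) for slot in (1, 2)]


print("Test A register set per slot:", test_a())
print("Test B SIMD write per slot:", test_b())
```

With these assumed slot numbers, Test A yields the primary register set in both slots and Test B yields a SIMD write only in the second slot, mirroring the results above.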
Jon Harris wrote:

> ...
> Conclusion: the effect latency for MODE1 is not consistent among
> various bits in the case of an automatic pop of STS in an RTI(DB)
> instruction.  The SIMD bit takes effect before the alternate register
> select bits.

Thanks for this examination.  If this is true (and I see no reason to doubt
it...), it will certainly be a great help to avoid conflicts.

Bernhard

Jon,

In response to your earlier question of how to report an anomaly, send a
message to dsp.support@analog.com with the details, including your test
code.  This will get the process started.

The more people report their problems, the better.  It won't even be
considered to be fixed in new silicon if they don't know about it, and even
if it never gets fixed, at least if it's documented people shouldn't get
burned (as badly) by it.  All silicon has anomalies; some companies do
better than others at making them available to the users.  ADI does a decent
job, posting them all on their web site.

While some people are shocked to learn all products (including DSPs) have
problems, good engineers understand that everything has problems, and
knowing about them is a heck of a lot better than not.

Ron

-----------
Ron Huizen
BittWare