I spent a bunch of time tracking down an issue with the 21161 SHARC, so I thought I'd mention it to you guys. The issue has to do with the timing of the automatic context switching on returning from interrupts, specifically with SIMD mode and a delayed RTI instruction.

On the SHARC, when certain interrupts are received, the MODE1 and ASTAT registers are automatically pushed onto the stack, and then popped off when the interrupt routine is finished. This all happens transparently to the user. When returning from an interrupt, you can use the syntax RTI (DB); (delayed branch) so that you can squeeze in 2 more instructions instead of suffering a 2-cycle pipeline hit. It appears that the timing of the automatic popping of the MODE1 register is such that the 2nd delayed instruction executes with the incorrect SIMD mode setting. An example will illustrate.

main code:
    assume SIMD mode is always turned on here

interrupt code:
    interrupt happens, MODE1 is pushed
    interrupt code explicitly turns SIMD off and does processing

at the end of the ISR:
    RTI (DB);
    r0 = 1234;          //first instruction
    dm(0x50000) = r0;   //second instruction
    //return actually happens here

The result of this code is that the second instruction, dm(0x50000) = r0, performs the additional (aka implicit) write dm(0x50001) = s0 because SIMD mode was on in the interrupted code. This will corrupt location 0x50001, which is what was happening to me.

Does this sound like a hardware anomaly to you guys? If so, who should I report it to? The documentation seems to be silent on the timing of the popping of the status registers and a delayed RTI.
Possible anomaly with 21161 SHARC
Started by ●May 21, 2004
Reply by ●May 24, 2004
Hi Jon, that is not an anomaly of the chip. The "effect latency" of the MODEx and ASTATx-registers is one cycle. (HW-RefMan Table 3-2). So the first instr. in the delayed branch is executed with the unchanged state of MODE1 and ASTAT, the second instr. with the popped values. Mattias
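The one-cycle effect latency described here can be made concrete with a toy pipeline model. The Python sketch below is not SHARC code and is not taken from the hardware manual: `mode_seen_by_slot` and the slot numbering are hypothetical, and it assumes that the status pop is initiated by the RTI (DB) itself (slot 0) and that a register write is invisible to exactly one following instruction.

```python
def mode_seen_by_slot(slot, write_slot, old_mode, new_mode, effect_latency=1):
    """Return the MODE1 value in force when the instruction in `slot` executes.

    Assumption: a register write issued in `write_slot` is not visible to the
    next `effect_latency` instructions, and is visible to all later ones.
    """
    if slot > write_slot + effect_latency:
        return new_mode
    return old_mode


SIMD_OFF, SIMD_ON = "SIMD off (ISR)", "SIMD on (main code)"

# Slot 0 is the RTI (DB) itself; slots 1 and 2 are the two delayed instructions.
seen = {slot: mode_seen_by_slot(slot, 0, SIMD_OFF, SIMD_ON) for slot in (1, 2)}
for slot, mode in seen.items():
    print(f"delayed instruction {slot}: {mode}")
```

Under these assumptions the first delayed instruction still runs with the interrupt routine's MODE1, while the second already runs with the popped value, which is the behaviour reported for the SIMD bit in the original post.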
Reply by ●May 24, 2004
Jon Harris wrote:
...
> On the SHARC, when certain interrupts are received, the MODE1 and ASTAT
> registers are automatically pushed onto the stack, and then popped off when the
> interrupt routine is finished. This all happens transparently to the user.
> When returning from an interrupt, you can use the syntax RTI (DB); (delayed
> branch) so that you can squeeze in 2 more instructions instead of suffering a
> 2-cycle pipeline hit.

Jon, are you sure the SHARC does not behave the same way? I would think that the automatic pop of the status stack at the RTI has the same latency as the "pop sts" instruction, or a direct write to the mode1 register, namely one instruction cycle.

This would be consistent with you finding that the second instruction after the RTI already reacts to the popped mode1 register.

But, as you said, the manual is silent about this. I think I also noticed different behaviour in the simulator as compared to the target hardware in this issue.

Regards,
Andor
Reply by ●May 24, 2004
"Andor Bariska" <an2or@nospam.net> wrote in message news:40b1c1fa$1@pfaff2.ethz.ch...
> Jon Harris wrote:
> ...
> > On the SHARC, when certain interrupts are received, the MODE1 and ASTAT
> > registers are automatically pushed onto the stack, and then popped off when the
> > interrupt routine is finished. This all happens transparently to the user.
> > When returning from an interrupt, you can use the syntax RTI (DB); (delayed
> > branch) so that you can squeeze in 2 more instructions instead of suffering a
> > 2-cycle pipeline hit.
>
> Jon, are you sure the SHARC does not behave the same way?

I'm confused about this. I'm only talking about the SHARC.

> I would think
> that the automatic pop of the status stack at the RTI has the same
> latency as the "pop sts" instruction, or a direct write to the mode1
> register, namely one instruction cycle.
>
> This would be consistent with you finding that the second instruction
> after the RTI already reacts to the popped mode1 register.
>
> But, as you said, the manual is silent about this. I think I also
> noticed different behaviour in the simulator as compared to the target
> hardware in this issue.

Another odd thing I noticed is that the effect of the register/DAG primary/secondary select bits, also in MODE1, seems to be different from that of the SIMD bit. I haven't confirmed this for sure, but it sure looks like the 2nd instruction after the RTI still uses the interrupt code's register/DAG select setting rather than that of the interrupted code.

I guess the bottom line here is to use RTI(DB) with caution!
Reply by ●May 25, 2004
Jon Harris wrote:
> ...
> I guess the bottom line here is to use RTI(DB) with caution!

So do I... I was just tracing a bug around such an RTI(db) command, and I found that it works fine if I code it as:

    pop sts;
    rti;

while it fails if I modify it to

    rti(db);
    pop sts;
    nop;

I also observed another related issue (which has meanwhile made it into the anomaly list): the integrated status push of mode1 and astat fails under certain circumstances. Eventually, an additional push sts command (and the corresponding pop sts at the return location) helped. The newest versions of VisualDSP should handle this correctly.

Concerning your initial topic: I checked the anomaly list, but didn't find a corresponding entry. However, the anomaly list for the 21160 (issue 16) reveals a difference depending on how the mode1 register is written. In combination with the strange behaviour I observed, I guess that there are risks around rti(db) in combination with the status stack recovery.

While it's recommended to either place a nop after a pop sts or avoid critical accesses immediately after it, I assume that this is also necessary when you deal with the implicit pop sts that rti(db) performs when returning from certain interrupts like IRQ0..2.

Problems might increase if you have nested interrupts enabled. I'm wondering whether it works correctly if a higher priority interrupt occurs during the rti(db) ...??

Bernhard
Reply by ●May 25, 20042004-05-25
Jon Harris wrote:> "Andor Bariska" <an2or@nospam.net> wrote in message...> > Jon, are you sure the SHARC does not behave the same way? > > I'm confused about this. I'm only talking about the SHARC.Sorry. I thought you were differentiating between SHARC (2106x) family and Hammerhead (21161). It was my confusion.> > I would think > > that the automatic pop of the status stack at the RTI has the same > > latency as the "pop sts" instruction, or a direct write to the mode1 > > register, namely one instruction cycle. > > > > This would be consistent with you finding that the second instruction > > after the RTI already reacts to the popped mode1 register. > > > > But, as you said, the manual is silent about this. I think I also > > noticed different behaviour in the simulator as compared to the target > > hardware in this issue. > > Another odd thing I noticed is that the effect of the register/DAG > primary/secondary select bits also in MODE1 seems to be different than the SIMD > bit. I haven't confirmed this for sure, but it sure looks like that 2nd > instruction after the RTI still uses the interrupt code's register/DAG select > setting rather than that of the interrupted code.I'm sure this isn't the case with the SHARC - don't know (but suspect the same) for the Hammerhead.> > I guess the bottom line here is to use RTI(DB) with caution!Definitely. I once had a bug very similar to yours - popping the status stack in the second instruction after a RTI (DB) :-). The simple solution was to exchange last with second to last instruction in the interrupt routine. Regards, Andor
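The "keep mode-sensitive work out of the second delay slot" advice can be turned into a rough text check. The Python sketch below is only an illustration: `statements`, `risky_second_slots` and the `RISKY` pattern are hypothetical names, the list of risky statements (dm/pm accesses and pop sts) is an assumption drawn from the cases reported in this thread rather than from ADI documentation, and the parsing is far too naive for real assembly sources.

```python
import re

# Statements this sketch treats as risky in the second delay slot. The list is
# an assumption based on the cases reported in this thread, not on ADI docs.
RISKY = re.compile(r"\b(dm|pm)\s*\(|\bpop\s+sts\b", re.IGNORECASE)


def statements(source):
    """Split assembly text into statements, dropping // comments and blanks."""
    no_comments = re.sub(r"//[^\n]*", "", source)
    return [s.strip() for s in no_comments.split(";") if s.strip()]


def risky_second_slots(source):
    """Return the second delayed instruction after each RTI (DB) if it looks risky."""
    stmts = statements(source)
    hits = []
    for i, stmt in enumerate(stmts):
        is_delayed_rti = re.fullmatch(r"rti\s*\(\s*db\s*\)", stmt, re.IGNORECASE)
        if is_delayed_rti and i + 2 < len(stmts) and RISKY.search(stmts[i + 2]):
            hits.append(stmts[i + 2])
    return hits


original = """
    RTI (DB);
    r0 = 1234;          // first delayed instruction
    dm(0x50000) = r0;   // second delayed instruction
"""
rearranged = """
    r0 = 1234;
    RTI (DB);
    dm(0x50000) = r0;   // now the first delayed instruction
    nop;                // second delayed instruction
"""
print(risky_second_slots(original))
print(risky_second_slots(rearranged))
```

The rearranged tail is one possible reading of the fix in the spirit of "exchange last with second to last": because the store depends on r0, the load is moved ahead of the RTI (DB) so that the store lands in the first delay slot, which per the reports in this thread still runs with the interrupt routine's MODE1.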
Reply by ●May 25, 2004
Andor <an2or@mailcircuit.com> wrote in message news:ce45f9ed.0405242316.a96ccef@posting.google.com...
> Jon Harris wrote:
> > "Andor Bariska" <an2or@nospam.net> wrote in message
> ...
> > > Jon, are you sure the SHARC does not behave the same way?
> >
> > I'm confused about this. I'm only talking about the SHARC.
>
> Sorry. I thought you were differentiating between SHARC (2106x) family
> and Hammerhead (21161). It was my confusion.

I see. In my usage, SHARC refers to anything in the 21x6x family. It looks to me that ADI really isn't using the Hammerhead name much anymore. It doesn't seem to appear in 21161 documentation, and a web search at analog.com gets only a few hits. I'm going to use part numbers from here on out to avoid ambiguity.

I have code for both 21065L and 21161. The problem was originally noticed on the 21161 in conjunction with SIMD mode. I then went back to look at the 21065L code, which is a bit different. The strange thing was that it looked like it should have had a similar problem with register and DAG select but didn't!

> > > I would think
> > > that the automatic pop of the status stack at the RTI has the same
> > > latency as the "pop sts" instruction, or a direct write to the mode1
> > > register, namely one instruction cycle.
> > >
> > > This would be consistent with you finding that the second instruction
> > > after the RTI already reacts to the popped mode1 register.
> > >
> > > But, as you said, the manual is silent about this. I think I also
> > > noticed different behaviour in the simulator as compared to the target
> > > hardware in this issue.
> >
> > Another odd thing I noticed is that the effect of the register/DAG
> > primary/secondary select bits also in MODE1 seems to be different than the SIMD
> > bit. I haven't confirmed this for sure, but it sure looks like that 2nd
> > instruction after the RTI still uses the interrupt code's register/DAG select
> > setting rather than that of the interrupted code.
>
> I'm sure this isn't the case with the SHARC - don't know (but suspect
> the same) for the Hammerhead.

When I have a chance I'd like to re-try this on both 21161 and 21065L and see what the behavior is.

> > I guess the bottom line here is to use RTI(DB) with caution!
>
> Definitely. I once had a bug very similar to yours - popping the
> status stack in the second instruction after a RTI (DB) :-). The
> simple solution was to exchange last with second to last instruction
> in the interrupt routine.

Glad I'm not alone! The fact that we've both suffered from this indicates that ADI should include some info on this in their documentation.
Reply by ●May 25, 2004
"Jon Harris" <goldentully@hotmail.com> wrote in message news:2hh6p7Fd0kbfU1@uni-berlin.de...
> Andor <an2or@mailcircuit.com> wrote in message
> news:ce45f9ed.0405242316.a96ccef@posting.google.com...
>
> I have code for both 21065L and 21161. The problem was originally noticed on
> the 21161 in conjunction with SIMD mode. I then went back to look at the 21065L
> code which is a bit different. The strange thing was that it looked like it
> should have had a similar problem with register and DAG select but didn't!
>
> > > > I would think
> > > > that the automatic pop of the status stack at the RTI has the same
> > > > latency as the "pop sts" instruction, or a direct write to the mode1
> > > > register, namely one instruction cycle.
> > > >
> > > > This would be consistent with you finding that the second instruction
> > > > after the RTI already reacts to the popped mode1 register.
> > > >
> > > > But, as you said, the manual is silent about this. I think I also
> > > > noticed different behaviour in the simulator as compared to the target
> > > > hardware in this issue.
> > >
> > > Another odd thing I noticed is that the effect of the register/DAG
> > > primary/secondary select bits also in MODE1 seems to be different than the SIMD
> > > bit. I haven't confirmed this for sure, but it sure looks like that 2nd
> > > instruction after the RTI still uses the interrupt code's register/DAG select
> > > setting rather than that of the interrupted code.
> >
> > I'm sure this isn't the case with the SHARC - don't know (but suspect
> > the same) for the Hammerhead.
>
> When I have a chance I'd like to re-try this on both 21161 and 21065L and see
> what the behavior is.

OK, I just tested this and have a definitive answer. The upshot is that changes to the alternate register bits in MODE1 do not take effect on the second instruction after RTI (DB), but changes to the SIMD bit do. This sounds like an anomaly to me--either one or the other is wrong.

Here is my test (in a mix of pseudo code and assembler):

Test A: Main loop code is running with alternate registers selected

Interrupt code:
    set r14 and r15 primary and alternate to known values
    select primary registers
    RTI(DB);
    dm(test1) = r14;   // instruction 1
    dm(test2) = r15;   // instruction 2

Results: test1 and test2 both receive the values in primary registers. This means that the change to MODE1 did NOT take effect until after instruction 2. Behavior was the same between 21065L and 21161.

Test B: Main loop code is running with SIMD bit turned on

Interrupt code:
    turn off SIMD mode
    set r14/s14 and r15/s15 primary and alternate to known values
    RTI(DB);
    dm(test1) = r14;   // instruction 1
    dm(test2) = r15;   // instruction 2

Results: test1 receives the value in primary r14. Location test1+1 is not affected (no SIMD write). test2 receives the value in primary r15. Location test2+1 receives the value in primary s15 (SIMD write occurred). This means that the change to MODE1 took effect just before instruction 2. This was on the 21161. I could not try Test B on the 21065L because it doesn't have SIMD mode.

Conclusion: the effect latency for MODE1 is not consistent among various bits in the case of an automatic pop of STS in an RTI(DB) instruction. The SIMD bit takes effect before the alternate register select bits.
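The two test results can be summarised as a small behavioural model. The Python sketch below only encodes what the tests above report; it is not derived from ADI documentation. `FIRST_VISIBLE_SLOT`, `run_delay_slots`, the register values and the addresses `TEST1`/`TEST2` are all hypothetical, and slot 3 stands for the first instruction back in the interrupted code.

```python
# Slot at which each popped MODE1 field first becomes visible after RTI (DB),
# as reported in Tests A and B (slots 1 and 2 are the delayed instructions).
FIRST_VISIBLE_SLOT = {"simd": 2, "alt_regs": 3}


def run_delay_slots(isr_mode, main_mode, regs, stores):
    """Execute the delayed stores and return the resulting memory as a dict."""
    memory = {}
    for slot, (addr, reg) in enumerate(stores, start=1):
        mode = {
            field: main_mode[field] if slot >= first else isr_mode[field]
            for field, first in FIRST_VISIBLE_SLOT.items()
        }
        bank = "alt" if mode["alt_regs"] else "pri"
        memory[addr] = regs[bank]["r" + reg]
        if mode["simd"]:
            # SIMD store: the second processing element writes the next address.
            memory[addr + 1] = regs[bank]["s" + reg]
    return memory


TEST1, TEST2 = 0x100, 0x102  # hypothetical addresses
regs = {
    "pri": {"r14": 0x14, "s14": 0x114, "r15": 0x15, "s15": 0x115},
    "alt": {"r14": 0xA14, "s14": 0xB14, "r15": 0xA15, "s15": 0xB15},
}
stores = [(TEST1, "14"), (TEST2, "15")]
isr = {"simd": False, "alt_regs": False}

test_a = run_delay_slots(isr, {"simd": False, "alt_regs": True}, regs, stores)
test_b = run_delay_slots(isr, {"simd": True, "alt_regs": False}, regs, stores)
print("Test A:", {hex(a): hex(v) for a, v in test_a.items()})
print("Test B:", {hex(a): hex(v) for a, v in test_b.items()})
```

In this model Test A writes only primary register values, while Test B produces the extra write at `TEST2 + 1`, mirroring the asymmetry between the alternate register select bits and the SIMD bit described in the conclusion.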
Reply by ●May 26, 2004
Jon Harris wrote:
> ...
> Conclusion: the effect latency for MODE1 is not consistent among
> various bits in
> the case of an automatic pop of STS in an RTI(DB) instruction.
> The SIMD bit takes effect before the alternate register select
> bits.

Thanks for this examination. If this is true (and I see no reason to doubt it...), it will certainly be a great help to avoid conflicts.

Bernhard
Reply by ●May 26, 2004
Jon, In response to your earlier question of how to report an anomaly, send a message to dsp.support@analog.com with the details, including your test code. This will get the process started. The more people report their problems, the better. It won't even be considered to be fixed in new silicon if they don't know about it, and even if it never gets fixed, at least if it's documented people shouldn't get burned (as badly) by it. All silicon has anomalies, some companies do better than others at making them available to the users. ADI does a decent job, posting them all on their web site. While some people are shocked to learn all products (including DSPs) have problems, good engineers understand that everything has problems, and knowing about them is a heck of a lot better than not.

Ron
-----------
Ron Huizen
BittWare