Technical discussions related to Analog Devices DSPs (including Blackfin, TigerSHARC, SHARC and ADSP-21xx DSPs).
My project uses a 21161 with the following architecture:

- 3 x 16-bit SRAM (for execution of 48-bit instructions)
- 16-bit flash for NV storage of program and data
- 2 ADCs connected via SPORT0 and SPORT2
- UART connected to IRQ0
- Memory-mapped custom ARINC-429 communications PLD connected to IRQ2

We are using v3.5 of the tools. The majority of the code is written in C. The only assembly is that written by Analog Devices and slightly modified (mostly to remove unused code). We are not using the VDK. We run the majority of our code out of external SRAM. The Analog Devices libraries that we use are located in internal RAM, bank 0. We chose to use the provided interrupt() handling for all of our interrupts.

We are seeing a problem where our system will reset aperiodically. It was first seen in the field, but by increasing ARINC bus traffic we are able to make it happen more frequently. The symptoms are varied, and we are having a very difficult time tracking down the problem. Symptoms include:

- Failed CRC checks on program memory and configurable parameter memory (configurable parameters are stored in external SRAM during execution). We force a reset in these cases.
- Corrupt data being transmitted on ARINC.
- Processor execution getting 'lost' and the watchdog resetting the system.
- Changes in code (adding / removing instructions) can make the problem better or worse.

We have implemented a scheme to instrument where the code is executing (writing data to unused external RAM) and then dumping that data out upon reset. I am not seeing anything specific in the execution where the instrumentation stops after a certain ISR or function. With reduced incoming ARINC bus traffic (only essential data), I was able to run the system for 6 days without it resetting.

With the sporadic nature of the problem, we are focusing on the interrupt handling. We have already converted the UART to be polled instead of interrupt driven, but that hasn't eliminated our problem.

Questions:

1. Is anyone using the 21161 with the interrupt handling provided by Analog Devices (vectoring scheme, we have our own C handlers)?
2. Has anyone had any problems similar to this?
3. I'd like to use the HW break points to try and catch the 'culprit' code corrupting memory that is not supposed to be written to after it is loaded. Anyone familiar with using them? I tried once, but it seemed to break all the time. But when it broke, it did not appear that the break happened when I asked it to.
4. Are there any schemes for protecting areas of external RAM that don't require HW?
5. Any ideas on how to track down something that is trampling registers or memory, but not in any sort of identifiable pattern?
6. If we migrate to v4.0 of the tools, we will need to perform a significant amount of re-test and documentation and delay the delivery of our product to our customer (probably 2 months, and we are already behind schedule). Is it worth the effort to upgrade?

Thanks in advance for any comments, suggestions, wisdom, etc. that you can provide.

Regards,
Robert Allen
Senior Software Engineer
Goodrich Sensor Systems
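
The instrumentation scheme described above (markers written to unused external RAM and dumped after reset) could look roughly like the sketch below. This is only an illustration, not the poster's actual code: the names trace_mark, trace_recent and TRACE_DEPTH are hypothetical, the fixed-width types assume a toolchain that provides <stdint.h> (on the 21161 a plain unsigned int is already 32 bits), and on the target the log would be placed in the spare external SRAM through the linker description file rather than declared as an ordinary static.

```c
#include <stdint.h>

/* Hypothetical depth; a power of two so the index wraps with a mask. */
#define TRACE_DEPTH 256u

typedef struct {
    uint32_t head;                /* total number of records written so far */
    uint32_t entry[TRACE_DEPTH];  /* event id in bits 31..16, sequence in bits 15..0 */
} trace_log_t;

/* Assumption: on the target this object lives in unused external SRAM
 * and is NOT cleared by the startup code, so it survives a reset. */
static volatile trace_log_t trace_log;

/* Clear the log, e.g. after it has been dumped following a reset. */
void trace_reset(void)
{
    uint32_t i;
    for (i = 0u; i < TRACE_DEPTH; ++i) {
        trace_log.entry[i] = 0u;
    }
    trace_log.head = 0u;
}

/* Record that execution reached the checkpoint identified by event_id.
 * Not interrupt-safe as written: if an ISR calls this between the read
 * and the write of head, one record is overwritten. On the target the
 * caller would need to mask interrupts or keep one log per context. */
void trace_mark(uint32_t event_id)
{
    uint32_t n = trace_log.head;
    trace_log.entry[n & (TRACE_DEPTH - 1u)] = (event_id << 16) | (n & 0xFFFFu);
    trace_log.head = n + 1u;
}

/* Return the k-th most recent record (k = 0 is the newest),
 * or 0 if that record was never written or has been overwritten. */
uint32_t trace_recent(uint32_t k)
{
    uint32_t n = trace_log.head;
    if (k >= n || k >= TRACE_DEPTH) {
        return 0u;
    }
    return trace_log.entry[(n - 1u - k) & (TRACE_DEPTH - 1u)];
}
```

Storing a sequence number next to the event id makes gaps visible in the dump: a missing or repeated sequence value suggests that the log itself was trampled or that two contexts raced on it, which is useful information when the fault is suspected to be interrupt related.
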
On Fri, 19 May 2006, rallen_prsch911 wrote:

> We have implemented a scheme to instrument where the code is
> executing ( writing data to unused external RAM ) and then dumping
> that data out upon reset. I am not seeing anything specific in the
> execution where the instrumentation stops after a certain ISR or
> function. With reduced incoming ARINC bus traffic ( only essential
> data ), I was able to run the system for 6 days without it resetting.

Put a current meter on the system and see if it changes as a function of bus traffic. You'll be lucky if it's a simple hardware fix, but it's worth a look.

> 3. I'd like to use the HW break points to try and catch
> the 'culprit' code corrupting memory that is not supposed to be
> written to after it is loaded. Anyone familiar with using them? I
> tried once, but it seemed to break all the time. But when it broke
> it did not appear that break happened when I asked it to.

No, it has to empty the pipe. You'll stop 3 instructions after the requested break. You can set EMUN via JTAG and it will stop after that register's number of counts. For additional possible ways to break, see chapter 10 in the 21160 hardware manual.

> 4. Are there any schemes for protecting areas of external RAM that
> don't require HW?

No, you can limit access to 4 different banks, but there is no "supervisor" mode like on 68k micros.

> 5. Any ideas on how to track down something that is trampling
> registers or memory, but not in any sort of identifiable pattern?

Yeah, but they are all a pain in the butt :-) Sounds like you are already on the right track; more detailed hunting will get you there eventually. Check the stack for overflow conditions; you can only go 8 levels deep on loops. Maybe some combination of interrupts and code causes a loop counter overflow. Try doing a memory halt - it may be that writing to a certain bank causes you to load weird vectors later on.

Good luck! Sounds like a challenge that will be a great war story a year or 2 from now :-)

Patience, persistence, truth,
Dr. mike
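
Since the thread concludes that external RAM cannot be write-protected without hardware, one software-only way to do the "more detailed hunting" is to detect corruption early instead of preventing it. The sketch below is an assumption-laden illustration, not something proposed in the thread: it checksums a region that should never change after load and lets the code test it at many checkpoints (for example at ISR entry and exit), so the first failing checkpoint brackets the culprit. The names watch_region_t, watch_arm and watch_check are hypothetical, and <stdint.h> is assumed to be available.

```c
#include <stddef.h>
#include <stdint.h>

/* Describes one region that must not change after it has been loaded. */
typedef struct {
    const volatile uint32_t *base;  /* first word of the watched region */
    size_t words;                   /* region length in 32-bit words */
    uint32_t expected;              /* checksum captured when the watch was armed */
} watch_region_t;

/* Order-sensitive rotate-and-XOR checksum. Any change to a single word
 * changes the result; it is cheap, but it is not a CRC and some
 * multi-word changes can cancel out. */
static uint32_t region_sum(const volatile uint32_t *p, size_t words)
{
    uint32_t acc = 0u;
    size_t i;
    for (i = 0u; i < words; ++i) {
        acc = ((acc << 1) | (acc >> 31)) ^ p[i];
    }
    return acc;
}

/* Capture the reference checksum once the region holds its final contents. */
void watch_arm(watch_region_t *w, const volatile uint32_t *base, size_t words)
{
    w->base = base;
    w->words = words;
    w->expected = region_sum(base, words);
}

/* Return 1 if the region still matches its reference checksum, 0 otherwise. */
int watch_check(const watch_region_t *w)
{
    return region_sum(w->base, w->words) == w->expected;
}
```

A check like this costs time proportional to the region size, so in practice it would be applied to a small region (or a rotating slice of a large one) and the id of the first failing checkpoint written somewhere that survives the reset.
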