ok (Tim? :-), now I've got the PLL set to 150Mhz, all code and data located in internal RAM, I can verify actual 150 MIPS speed using RPT NOP instructions, but now when I time an actual C function such as the PID example TI supplies, it appears to be ~1/6 the speed of 150 MIPS.

e.g. timing the following C function, I get about 5 us per call. Single stepping in assembler mode, I count about 120 mouse clicks to go through one iteration. But 5 us equates to about 750 instructions at 150 MIPS, so what is going on here?

Is this due to pipeline inefficiency effects, eg stalls or address/data-bus collisions or etc? All data and code appear to be located in internal RAM afaics, I watch the AR registers and dont see any external fetches. If I relocate to external RAM then I can see a definite increase, eg 500 times slower (!), so I am pretty confident about being in internal RAM.

tia for any clues!

------------------------

TI PID example (~120 assembler instruction steps per call):

    void pid_reg3_calc(PIDREG3 *v)
    {
        v->e_reg3 = v->pid_ref_reg3 - v->pid_fdb_reg3;

        v->up_reg3 = v->Kp_reg3*v->e_reg3;

        v->uprsat_reg3 = v->up_reg3 + v->ui_reg3 + v->ud_reg3;

        if (v->uprsat_reg3 > v->pid_out_max)
            v->pid_out_reg3 = v->pid_out_max;
        else if (v->uprsat_reg3 < v->pid_out_min)
            v->pid_out_reg3 = v->pid_out_min;
        else
            v->pid_out_reg3 = v->uprsat_reg3;

        v->saterr_reg3 = v->pid_out_reg3 - v->uprsat_reg3;

        v->ui_reg3 = v->ui_reg3 + v->Ki_reg3*v->up_reg3 +
                     v->Kc_reg3*v->saterr_reg3;

        v->ud_reg3 = v->Kd_reg3*(v->up_reg3 - v->up1_reg3);

        v->up1_reg3 = v->up_reg3;
    }

linker cmd file:

    MEMORY
    {
    PAGE 0 :
       /* For this example, H0 is split between PAGE 0 and PAGE 1 */
       /* BEGIN is used for the "boot to HO" bootloader mode */
       /* RESET is loaded with the reset vector only if */
       /* the boot is from XINTF Zone 7. Otherwise reset vector */
       /* is fetched from boot ROM. See .reset section below */

       RAMM0  : origin = 0x000000, length = 0x000400
       BEGIN  : origin = 0x3F8000, length = 0x000002
       /* PRAMH0 : origin = 0x3F8002, length = 0x000FFE   internal RAM */
       /* PRAMH0 : origin = 0x100000, length = 0x03E800   external RAM */
       PRAMH0 : origin = 0x3F8002, length = 0x000FFE
       RESET  : origin = 0x3FFFC0, length = 0x000002

    PAGE 1 :
       /* For this example, H0 is split between PAGE 0 and PAGE 1 */

       RAMM1  : origin = 0x000400, length = 0x000400
       DRAMH0 : origin = 0x3f9000, length = 0x001000
    }

    SECTIONS
    {
       /* Setup for "boot to H0" mode:
          The codestart section (found in DSP28_CodeStartBranch.asm)
          re-directs execution to the start of user code.
          Place this section at the start of H0 */

       codestart : > BEGIN, PAGE = 0
       ramfuncs  : > PRAMH0 PAGE = 0
       .text     : > PRAMH0, PAGE = 0
       .cinit    : > PRAMH0, PAGE = 0
       .pinit    : > PRAMH0, PAGE = 0
       .switch   : > RAMM0, PAGE = 0
       .reset    : > RESET, PAGE = 0, TYPE = DSECT /* not used, */

       .stack    : > RAMM1, PAGE = 1
       .ebss     : > DRAMH0, PAGE = 1
       .econst   : > DRAMH0, PAGE = 1
       .esysmem  : > DRAMH0, PAGE = 1
    }
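
The cycle arithmetic in the post can be checked with a few lines of C. This is only a host-side sketch: the helper names `cycles_from_ns` and `cycles_per_step` are made up for illustration, and it assumes the 5 us and 120-step figures quoted above are accurate.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a measured duration to CPU cycles: cycles = time * clock.
   ns * MHz gives units of 1e-3 cycles, hence the divide by 1000. */
static uint32_t cycles_from_ns(uint32_t ns, uint32_t cpu_mhz)
{
    assert(cpu_mhz > 0);
    return (uint32_t)(((uint64_t)ns * cpu_mhz) / 1000u);
}

/* Average number of cycles consumed per single-stepped instruction.
   A result well above 1 suggests that some "steps" hide more work,
   for example a call that was stepped over rather than into. */
static double cycles_per_step(uint32_t cycles, uint32_t steps)
{
    assert(steps > 0);
    return (double)cycles / (double)steps;
}
```

With the numbers from the post, `cycles_from_ns(5000, 150)` gives 750 cycles and `cycles_per_step(750, 120)` gives 6.25, which matches the "~1/6 speed" observation: each counted step costs about six cycles on average.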
F2812 MIPS for C code appears low
Started by ●February 16, 2005
Reply by ●February 16, 2005
perfb@yahoo.com wrote:

> ok (Tim? :-), now I've got the PLL set to 150Mhz, all code and data
> located in internal RAM, I can verify actual 150 MIPS speed using RPT
> NOP instructions, but now when I time an actual C function such as the
> PID example TI supplies, it appears to be ~1/6 the speed of 150 MIPS.
>
> e.g. timing the following C function, I get about 5 us per call.
> Single stepping in assembler mode, I count about 120 mouse clicks to go
> through one iteration. But 5 us equates to about 750 instructions at
> 150 MIPS, so what is going on here?
>
> Is this due to pipeline inefficiency effects, eg stalls or
> address/data-bus collisions or etc? All data and code appear to be
> located in internal RAM afaics, I watch the AR registers and dont see
> any external fetches. If I relocate to external RAM then I can see a
> definite increase, eg 500 times slower (!), so I am pretty confident
> about being in internal RAM.
>
> tia for any clues!
>
> ------------------------
>
> TI PID example (~120 assembler instruction steps per call):
>
> void pid_reg3_calc(PIDREG3 *v)
> {
>     v->e_reg3 = v->pid_ref_reg3 - v->pid_fdb_reg3;
>
>     v->up_reg3 = v->Kp_reg3*v->e_reg3;
>
>     v->uprsat_reg3 = v->up_reg3 + v->ui_reg3 + v->ud_reg3;
>
>     if (v->uprsat_reg3 > v->pid_out_max)
>         v->pid_out_reg3 = v->pid_out_max;
>     else if (v->uprsat_reg3 < v->pid_out_min)
>         v->pid_out_reg3 = v->pid_out_min;
>     else
>         v->pid_out_reg3 = v->uprsat_reg3;
>
>     v->saterr_reg3 = v->pid_out_reg3 - v->uprsat_reg3;
>
>     v->ui_reg3 = v->ui_reg3 + v->Ki_reg3*v->up_reg3 +
>                  v->Kc_reg3*v->saterr_reg3;
>
>     v->ud_reg3 = v->Kd_reg3*(v->up_reg3 - v->up1_reg3);
>
>     v->up1_reg3 = v->up_reg3;
> }
>
> linker cmd file:
> - snipped -

You don't have the structure definition, but it appears to be floating point -- is this so? Did you trace down into each floating point call and count instructions there, too?

Did you start from _outside_ the function and count the clocks to get in, then back out (which should take somewhat less than 600 clocks, to be sure!)?

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
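
To illustrate why floating point matters here: the F2812 has no hardware floating-point unit, so every float multiply, add and compare in the function becomes a call into a runtime support routine. One common way around that is fixed-point arithmetic. The block below is a rough sketch of the same calculation in Q15, not TI's code: the type `q15_t`, the macro `Q15`, the struct `PidQ15` and both functions are hypothetical names invented for this example. TI's IQmath library is the usual route on this part, but it is not used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical Q15 type: real value = raw / 32768 */
typedef int32_t q15_t;
#define Q15(x) ((q15_t)((x) * 32768.0))

/* Q15 multiply: widen to 64 bits, then rescale by 2^15.
   Division (not a shift) keeps the result portable for negative values. */
static q15_t q15_mul(q15_t a, q15_t b)
{
    return (q15_t)(((int64_t)a * (int64_t)b) / 32768);
}

/* Hypothetical fixed-point counterpart of PIDREG3 */
typedef struct {
    q15_t ref, fdb;            /* reference and feedback inputs */
    q15_t kp, ki, kc, kd;      /* gains */
    q15_t out_max, out_min;    /* output limits */
    q15_t ui, ud, up1;         /* integral, derivative, previous P term */
    q15_t out;                 /* saturated output */
} PidQ15;

/* Same sequence of steps as pid_reg3_calc, using integer math only */
static void pid_q15_calc(PidQ15 *v)
{
    assert(v != NULL);

    q15_t e      = v->ref - v->fdb;
    q15_t up     = q15_mul(v->kp, e);
    q15_t uprsat = up + v->ui + v->ud;

    q15_t out = uprsat;
    if (out > v->out_max)
        out = v->out_max;
    else if (out < v->out_min)
        out = v->out_min;
    v->out = out;

    q15_t saterr = out - uprsat;
    v->ui  = v->ui + q15_mul(v->ki, up) + q15_mul(v->kc, saterr);
    v->ud  = q15_mul(v->kd, up - v->up1);
    v->up1 = up;
}
```

The sketch ignores overflow of the sums and assumes the gains and signals have been scaled to fit the chosen format, so treat it as a starting point for comparison rather than a drop-in replacement.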
Reply by ●February 16, 2005
aha, yes, it is due to floating point. I hadn't realized, as I only stepped over the calls, not through them, and the branch instruction didn't disassemble correctly in Code Composer for some reason: the branch showed as a '.word', not as an instruction.

anyway that explains it, thanks again, Tim!
Reply by ●February 16, 2005
perfb@yahoo.com wrote:

> aha, yes, it is due to floating pt, I hadnt realized as I only
> stepped-over not thru, and the branch instruction didnt disassemble
> correctly in Code Composer for some reason, the branch showed as a
> '.word' not as an instruction,
>
> anyway that explains it, thanks again, Tim!

IIRC you have to set up your 'GEL' file for the memory mode you're in -- the 28xx wakes up emulating a 24x or a 27xx and you have to tell it what it is, but then the debugger needs to know, too.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com