> aha, yes, it is due to floating point. I hadn't realized, as I only
> stepped over, not through, and the branch instruction didn't disassemble
> correctly in Code Composer for some reason: the branch showed as a
> '.word', not as an instruction.
>
> anyway that explains it, thanks again, Tim!
>
IIRC you have to set up your 'GEL' file for the memory mode you're in --
the 28xx wakes up emulating a 24x or a 27xx and you have to tell it what
it is, but then the debugger needs to know, too.
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Reply by perf...@yahoo.com ● February 16, 2005
aha, yes, it is due to floating point. I hadn't realized, as I only
stepped over, not through, and the branch instruction didn't disassemble
correctly in Code Composer for some reason: the branch showed as a
'.word', not as an instruction.

anyway that explains it, thanks again, Tim!
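
For readers following along: the thread's conclusion is that the PID arithmetic is floating point, and each float operation is a call into a runtime routine that a step-over hides. One common way around that cost on a fixed-point part is to do the same arithmetic in scaled integers. The following is only a sketch of that idea in portable C. The `q16_t` type, the `q16_mul` helper, the `PidQ16` struct and its shortened field names are all hypothetical (they are not TI's IQmath API and not the PIDREG3 definition, which the thread never shows), and sums are not guarded against overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Q16.16 fixed-point type for illustration; NOT TI's IQmath API. */
typedef int32_t q16_t;
#define Q16_ONE 65536L

/* Multiply two Q16.16 values through a 64-bit intermediate,
   truncating toward zero. */
static q16_t q16_mul(q16_t a, q16_t b)
{
    return (q16_t)(((int64_t)a * (int64_t)b) / Q16_ONE);
}

/* Assumed layout: same roles as the fields used by pid_reg3_calc in the
   thread, with shortened (made-up) names. */
typedef struct {
    q16_t ref, fdb, e;
    q16_t Kp, Ki, Kc, Kd;
    q16_t up, ui, ud, up1;
    q16_t uprsat, out_max, out_min, out, saterr;
} PidQ16;

/* Same sequence of steps as the posted pid_reg3_calc, in Q16.16.
   No overflow checks on the additions. */
void pid_q16_calc(PidQ16 *v)
{
    assert(v != 0);

    v->e = v->ref - v->fdb;                 /* error */
    v->up = q16_mul(v->Kp, v->e);           /* proportional term */
    v->uprsat = v->up + v->ui + v->ud;      /* pre-saturation output */

    if (v->uprsat > v->out_max)
        v->out = v->out_max;
    else if (v->uprsat < v->out_min)
        v->out = v->out_min;
    else
        v->out = v->uprsat;

    v->saterr = v->out - v->uprsat;         /* saturation difference */
    v->ui = v->ui + q16_mul(v->Ki, v->up) + q16_mul(v->Kc, v->saterr);
    v->ud = q16_mul(v->Kd, v->up - v->up1); /* derivative term */
    v->up1 = v->up;
}
```

Whether this is actually faster on a given target depends on how the compiler handles the 64-bit intermediate, so it would need to be timed the same way as the original.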
Reply by Tim Wescott ● February 16, 2005
perfb@yahoo.com wrote:
> ok (Tim? :-), now I've got the PLL set to 150 MHz, all code and data
> located in internal RAM, I can verify actual 150 MIPS speed using RPT
> NOP instructions, but now when I time an actual C function such as the
> PID example TI supplies, it appears to be ~1/6 the speed of 150 MIPS.
>
> e.g. timing the following C function, I get about 5 us per call.
> Single stepping in assembler mode, I count about 120 mouse clicks to go
> through one iteration. But 5 us equates to about 750 instructions at
> 150 MIPS, so what is going on here?
>
> Is this due to pipeline inefficiency effects, e.g. stalls or
> address/data-bus collisions, etc.? All data and code appear to be
> located in internal RAM afaics; I watch the AR registers and don't see
> any external fetches. If I relocate to external RAM then I can see a
> definite slowdown, e.g. 500 times slower (!), so I am pretty confident
> about being in internal RAM.
>
> tia for any clues!
>
> ------------------------
>
> TI PID example (~120 assembler instruction steps per call):
>
>
> void pid_reg3_calc(PIDREG3 *v)
> {
>     v->e_reg3 = v->pid_ref_reg3 - v->pid_fdb_reg3;
>     v->up_reg3 = v->Kp_reg3*v->e_reg3;
>     v->uprsat_reg3 = v->up_reg3 + v->ui_reg3 + v->ud_reg3;
>
>     if (v->uprsat_reg3 > v->pid_out_max)
>         v->pid_out_reg3 = v->pid_out_max;
>     else if (v->uprsat_reg3 < v->pid_out_min)
>         v->pid_out_reg3 = v->pid_out_min;
>     else
>         v->pid_out_reg3 = v->uprsat_reg3;
>
>     v->saterr_reg3 = v->pid_out_reg3 - v->uprsat_reg3;
>     v->ui_reg3 = v->ui_reg3 + v->Ki_reg3*v->up_reg3 +
>                  v->Kc_reg3*v->saterr_reg3;
>     v->ud_reg3 = v->Kd_reg3*(v->up_reg3 - v->up1_reg3);
>     v->up1_reg3 = v->up_reg3;
> }
>
>
> linker cmd file:
>
- snipped -
>
You didn't include the structure definition, but it appears to be floating
point -- is this so? Did you trace down into each floating point call
and count instructions there, too? Did you start from _outside_ the
function and count the clocks to get in, then back out (which should
take somewhat less than 600 clocks, to be sure!)?
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
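
The arithmetic behind the quoted question can be checked in a few lines of C. This is only a sketch: the helper names `cycles_from_ns` and `cycles_per_step_x100` are made up for illustration, and it assumes one cycle per instruction, as the original post does when it equates 150 MHz with 150 MIPS.

```c
#include <assert.h>

/* Convert a measured time in nanoseconds to CPU clock cycles,
   given the CPU clock in MHz. (Hypothetical helper.) */
static unsigned long cycles_from_ns(unsigned long elapsed_ns,
                                    unsigned long cpu_mhz)
{
    assert(cpu_mhz > 0);
    return (elapsed_ns * cpu_mhz) / 1000UL;
}

/* Average cycles per single-stepped instruction, scaled by 100 so that
   two decimal places survive without floating point.
   (Hypothetical helper.) */
static unsigned long cycles_per_step_x100(unsigned long cycles,
                                          unsigned long steps)
{
    assert(steps > 0);
    return (cycles * 100UL) / steps;
}
```

With the figures from the post, `cycles_from_ns(5000, 150)` gives 750 cycles, and `cycles_per_step_x100(750, 120)` gives 625, i.e. 6.25 cycles per counted step. That matches the "~1/6 the speed" observation and is what you would expect if some of the 120 steps were step-overs of calls that each run many instructions.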
Reply by perf...@yahoo.com ● February 16, 2005
ok (Tim? :-), now I've got the PLL set to 150 MHz, all code and data
located in internal RAM, I can verify actual 150 MIPS speed using RPT
NOP instructions, but now when I time an actual C function such as the
PID example TI supplies, it appears to be ~1/6 the speed of 150 MIPS.

e.g. timing the following C function, I get about 5 us per call.
Single stepping in assembler mode, I count about 120 mouse clicks to go
through one iteration. But 5 us equates to about 750 instructions at
150 MIPS, so what is going on here?

Is this due to pipeline inefficiency effects, e.g. stalls or
address/data-bus collisions, etc.? All data and code appear to be
located in internal RAM afaics; I watch the AR registers and don't see
any external fetches. If I relocate to external RAM then I can see a
definite slowdown, e.g. 500 times slower (!), so I am pretty confident
about being in internal RAM.

tia for any clues!
------------------------
TI PID example (~120 assembler instruction steps per call):
void pid_reg3_calc(PIDREG3 *v)
{
    v->e_reg3 = v->pid_ref_reg3 - v->pid_fdb_reg3;
    v->up_reg3 = v->Kp_reg3*v->e_reg3;
    v->uprsat_reg3 = v->up_reg3 + v->ui_reg3 + v->ud_reg3;

    if (v->uprsat_reg3 > v->pid_out_max)
        v->pid_out_reg3 = v->pid_out_max;
    else if (v->uprsat_reg3 < v->pid_out_min)
        v->pid_out_reg3 = v->pid_out_min;
    else
        v->pid_out_reg3 = v->uprsat_reg3;

    v->saterr_reg3 = v->pid_out_reg3 - v->uprsat_reg3;
    v->ui_reg3 = v->ui_reg3 + v->Ki_reg3*v->up_reg3 +
                 v->Kc_reg3*v->saterr_reg3;
    v->ud_reg3 = v->Kd_reg3*(v->up_reg3 - v->up1_reg3);
    v->up1_reg3 = v->up_reg3;
}
linker cmd file:
MEMORY
{
PAGE 0 :
   /* For this example, H0 is split between PAGE 0 and PAGE 1 */
   /* BEGIN is used for the "boot to H0" bootloader mode */
   /* RESET is loaded with the reset vector only if */
   /* the boot is from XINTF Zone 7. Otherwise reset vector */
   /* is fetched from boot ROM. See .reset section below */
   RAMM0  : origin = 0x000000, length = 0x000400
   BEGIN  : origin = 0x3F8000, length = 0x000002
   /* PRAMH0 : origin = 0x3F8002, length = 0x000FFE internal RAM */
   /* PRAMH0 : origin = 0x100000, length = 0x03E800 external RAM */
   PRAMH0 : origin = 0x3F8002, length = 0x000FFE
   RESET  : origin = 0x3FFFC0, length = 0x000002
PAGE 1 :
   /* For this example, H0 is split between PAGE 0 and PAGE 1 */
   RAMM1  : origin = 0x000400, length = 0x000400
   DRAMH0 : origin = 0x3f9000, length = 0x001000
}
SECTIONS
{
   /* Setup for "boot to H0" mode:
      The codestart section (found in DSP28_CodeStartBranch.asm)
      re-directs execution to the start of user code.
      Place this section at the start of H0 */
   codestart : > BEGIN, PAGE = 0
   ramfuncs  : > PRAMH0 PAGE = 0
   .text     : > PRAMH0, PAGE = 0
   .cinit    : > PRAMH0, PAGE = 0
   .pinit    : > PRAMH0, PAGE = 0
   .switch   : > RAMM0, PAGE = 0
   .reset    : > RESET, PAGE = 0, TYPE = DSECT /* not used, */
   .stack    : > RAMM1, PAGE = 1
   .ebss     : > DRAMH0, PAGE = 1
   .econst   : > DRAMH0, PAGE = 1
   .esysmem  : > DRAMH0, PAGE = 1
}