TI F2812 DSP actual MIPS < 150 ?

Started by perf...@yahoo.com February 8, 2005
Running a while() loop on the Spectrum Digital F2812 eZdsp
evaluation board with a 30 MHz clock crystal,
toggling the external XF bit after 10,000 while() iterations,
and monitoring the timing on a scope, it appears that the
6 assembler instructions that make up the while() loop
execute at a rate of ~2.3 MIPS.

This seems a little low given the advertised 150 MIPS spec
of the F2812.

Am I thinking about this wrong somehow?  Is the 150 MIPS spec one of
those artificial specs that count all internal operations,
e.g. address calculation, opcode fetching, etc.?

Hmmm, could it be because the code is running out of external RAM?
(I am running from the Code Composer window, connected via JTAG,
which I assume means the
code resides in external RAM, not internal flash.)

Even so, I wouldn't expect a speedup of x70 if the code is moved
into internal RAM, or would I?

* The external bus, by default, is set very slow -- this could slow you
  down a bunch.
* The PLL, by default, is set off -- running the master clock at 30MHz
  instead of 150MHz could slow you down a bunch.
* Branches take more than one clock -- running a hardware loop would
  speed things up.

Try making sure your clock rate really is 150MHz, run out of internal
RAM, and take the branch overhead into account -- I think you'll get
different results.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
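To put rough numbers on the points above, here is a back-of-envelope sketch in C. The helper names are made up for illustration, and the only inputs are the figures quoted in the thread (30 MHz crystal, ~2.3 MIPS observed, 150 MHz target); nothing here comes from the data sheet.

```c
#include <assert.h>

/* Average core clocks spent per instruction, given a core clock in MHz
   and a measured instruction rate in MIPS. (Hypothetical helper.) */
double clocks_per_instruction(double clock_mhz, double measured_mips)
{
    assert(measured_mips > 0.0);
    return clock_mhz / measured_mips;
}

/* Instruction rate in MIPS expected for a given core clock and an
   assumed average number of clocks per instruction. */
double expected_mips(double clock_mhz, double avg_clocks_per_instr)
{
    assert(avg_clocks_per_instr > 0.0);
    return clock_mhz / avg_clocks_per_instr;
}
```

With the PLL off, clocks_per_instruction(30.0, 2.3) comes out at roughly 13 clocks per instruction. That suggests the shortfall splits into a factor of 5 from the clock (30 MHz vs. 150 MHz) and a further factor of about 13 from wait states and branch overhead, rather than the 150 MIPS spec being artificial.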
On 8 Feb 2005 10:41:48 -0800, "perfb@yahoo.com" <perfb@yahoo.com>
wrote in comp.dsp:

> running a while() loop on the SpectrumDigital F2812 EZDSP
> evaluation board, using a 30 Mhz clock crystal,
> toggling the external XF bit after 10,000 while() loops,
> and monitoring timing on a scope, it appears that the
> 6 assembler instructions that comprise the while() loop
> are executed at a rate ~2.3MIPS.
>
> This seems a little low given the advertised 150 MIPS spec
> of the F2812.
>
> Am I thinking of this wrong somehow? Is the 150 MIPS spec one of
> those artificial specs that count all internal operations,
> eg address calculation, opcode fetching etc?
>
> hmmm, could it be because the code is running out of external RAM?
> (I am running from CodeComposer window, connected via JTAG,
> which I assume means the
> code is residing in external RAM not internal flash).
>
> Even so, I wouldnt expect a speedup of x70 if code is moved
> into internal RAM, or would I?

As Tim already said, the default timing on the 2812 is what is killing
you.  If you don't turn on the PLL, the core runs at 30 MHz, not 150
MHz.  And the default external bus interface timing is either 26 or 52
core clock cycles, I forget which exactly.  Look at the XINTF section
of the data sheet for changing external memory timing.  With a fast
static RAM (15 ns), you should be able to run four-clock external RAM
cycles.

When you run code from internal flash, it is actually faster than
external RAM if you turn on the flash pipeline feature, even though
the flash is 6 clocks.

But for the parts of your code that must run at top speed, you need to
run it from internal RAM.  One of the TI app notes for the 28xx series
shows you how to do this.  You set up the linker command file so the
block of code has different load and run addresses, and at startup
you copy it from the load address (in flash) to the run address in
internal RAM.

With the JTAG feature, you just put it in internal RAM directly.

But note that even a 150 MHz pipelined DSP with a single execution
unit can't sustain 150 MIPS for long.  You can do it during repeat MAC
operations and short stretches of code, but not indefinitely in general
code.

Changes in program execution flow like jumps, subroutine calls, and
returns flush the pipeline, some instructions are two words, and so
on.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
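The load-address/run-address copy described above can be sketched roughly as follows. This is a minimal illustration, not the code from the TI app note: the function name is hypothetical, and on the real part the three pointers would come from symbols defined in the linker command file. Here they are plain parameters so the routine can be tried on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy a block word by word from its load address (e.g. flash) to its
   run address (e.g. internal RAM). load_end points one word past the
   last word to copy. Hypothetical helper for illustration only. */
void copy_load_to_run(const uint16_t *load_start,
                      const uint16_t *load_end,
                      uint16_t *run_start)
{
    assert(load_start != NULL && load_end != NULL && run_start != NULL);
    assert(load_end >= load_start);

    while (load_start < load_end) {
        *run_start++ = *load_start++;
    }
}
```

The copy has to run once at startup, before anything calls the functions that were linked to run from RAM.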
Thanks for all the help!

As suggested, turning on the PLL and moving to internal RAM solved the
problem.
I can now verify 150 MIPS by timing the difference caused by adding
10 NOPs to a loop: 10 NOPs increase the loop time by 66 ns, i.e. about
6.6 ns per instruction, or ~150 MIPS.

There seems to be some speed penalty when writing to the
GPIO bits too; maybe there are more control registers involved?

I am contemplating using DSP/BIOS to manage 20 PID loop tasks
running at a 100 Hz update rate. Is this well within the DSP/BIOS
speed limitations / overhead penalty?

tia!
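The NOP measurement above reduces to one line of arithmetic. A small sketch in C, with a hypothetical helper name and assuming each added instruction takes a single cycle, as a NOP running from internal RAM should:

```c
#include <assert.h>

/* Instruction rate in MIPS implied by adding extra single-cycle
   instructions to a loop and measuring how much longer one pass takes.
   Instructions per nanosecond times 1000 gives instructions per
   microsecond, i.e. MIPS. */
double mips_from_delta(unsigned extra_instructions, double delta_ns)
{
    assert(delta_ns > 0.0);
    return (double)extra_instructions / delta_ns * 1000.0;
}
```

mips_from_delta(10, 66.0) gives a little over 151 MIPS, which agrees with a 150 MHz core clock to within what a scope reading of 66 ns can resolve.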



On 9 Feb 2005 16:01:15 -0800, "perfb@yahoo.com" <perfb@yahoo.com>
wrote in comp.dsp:

> thanks for all the help!
>
> As suggested, turning on the PLL and moving to internal RAM solved the
> problem.
> I can now verify 150MIPS by timing the difference in time caused by
> adding 10 NOPs to a loop. e.g. 10 NOPS increase loop timing by
> 66ns =~150 MIPS.
>
> There seems to be some speed penalty writing to the
> GPIO bits too, maybe there are more control registers involved?
>
> I am contemplating using the DSP-BIOS to manage 20 PID loop tasks
> running at 100 Hz update rate, is this well within the DSP-BIOS
> speed limitations / overhead penalty?
>
> tia!

RTFM (Read The Fine Manual).

Some of the on-chip peripherals operate at full speed, some of them
require wait states.  There are also protected areas, some of which
you can change the protection for and others which you cannot.  Read
the control and interrupts section of the manual, or the separate
expanded user guide for it.

As for how many PID loops you can run, that depends on a lot of things,
including the requirements of the PID loops.  It should be possible to
do 20 if they're not doing anything else but reading a sensor and
computing a single value.  On the other hand, it would be impossible
to run 20 simultaneous PID loops for three-phase motors using space
vector modulation.

--
Jack Klein
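For a sense of scale, here is a sketch in C of the simple case described above: one PID update that takes a sensor reading and computes a single output, plus the cycle budget per update. All names and the struct layout are hypothetical. float is used only for readability, since the F2812 is a fixed-point part with no FPU and real code would more likely use fixed-point arithmetic. The budget figure also ignores DSP/BIOS scheduling and interrupt overhead.

```c
#include <assert.h>
#include <stddef.h>

/* State and gains for one PID loop (hypothetical layout). */
typedef struct {
    float kp, ki, kd;   /* proportional, integral, derivative gains */
    float integral;     /* accumulated error * dt */
    float prev_error;   /* error from the previous update */
} pid_state;

/* One update step: returns the new controller output.
   dt is the update period in seconds (0.01 for a 100 Hz loop). */
float pid_update(pid_state *s, float setpoint, float measured, float dt)
{
    assert(s != NULL && dt > 0.0f);

    float error = setpoint - measured;
    float derivative = (error - s->prev_error) / dt;

    s->integral += error * dt;
    s->prev_error = error;

    return s->kp * error + s->ki * s->integral + s->kd * derivative;
}

/* Core clocks available per update if the loops had the CPU to
   themselves. */
unsigned long cycles_per_update(unsigned long clock_hz,
                                unsigned loops, unsigned rate_hz)
{
    assert(loops > 0 && rate_hz > 0);
    return clock_hz / ((unsigned long)loops * rate_hz);
}
```

cycles_per_update(150000000UL, 20, 100) works out to 75,000 clocks per update. That is a large margin for an update this small, but it can shrink quickly once each loop does heavier work such as the space vector modulation case mentioned above.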
"Jack Klein" <jackklein@spamcop.net> wrote in message 
news:sthl01dftd2o9nhu4gkaanshmck1lvbdg9@4ax.com...
> RTFM (Read The Fine Manual).

"F" for "Fine"?  Go ahead, Jack.  Give him the *real* meaning!  Ha ha!

Brad

Thanks so much for the suggestion. I am indeed in the process of RTFMing,
a mere few thousand pages for the F2812 at last count, all in glorious
PDF. Gone, apparently, are the days of freebie TI hardcopies ... I still
have boxes of orange C25, C33, and C40 manuals somewhere ...