----- Original Message -----
From: Yong Yang
Sent: Monday, April 26, 2004 3:03 PM
Subject: Re: [c6x] How to speed up profiling
Pls see the answers below,
What are the functions that you are trying to profile?
All major functions in the encoder.
[Ganesh] If you profile all functions, are you profiling on the simulator or
on the board? I guess the board, from your answer below. In that case you can
profile only one function at a time. You need to use either the RTS clock()
function or the CSL (Chip Support Library) TIMER_getCount() function. If you
want an estimate of the breakdown across individual functions, profile a
small-resolution image on a simulator.
[yong] I am profiling on the board. Since the profiler tool already provides
cycle counts, why do I need to use clock() or TIMER_getCount()? Do you mean
that I should not use the profiler tool, but use my own hand-written code to
measure the time consumed by each function?
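
For illustration, hand-written timing of a single function with clock() might
look like the sketch below. It uses only the standard C clock() from <time.h>;
encode_block is a hypothetical stand-in for one encoder function, and what one
clock() tick means depends on the toolchain and target, so treat the returned
value as "ticks" rather than as a guaranteed cycle count.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for one encoder function to be measured. */
static unsigned long encode_block(const unsigned char *src, size_t n)
{
    unsigned long acc = 0;
    size_t i;
    for (i = 0; i < n; i++)
        acc += (unsigned long)src[i] * (unsigned long)(i + 1);
    return acc;
}

/* Ticks consumed by two back-to-back clock() calls (measurement overhead). */
static clock_t clock_overhead(void)
{
    clock_t t0 = clock();
    clock_t t1 = clock();
    return t1 - t0;
}

/* Time one call of encode_block and subtract the clock() overhead.
   The result is clamped at zero in case the overhead estimate exceeds
   the measured interval. */
static clock_t time_encode_block(const unsigned char *src, size_t n,
                                 unsigned long *result)
{
    clock_t overhead = clock_overhead();
    clock_t start, stop, elapsed;

    assert(src != NULL && result != NULL);
    start = clock();
    *result = encode_block(src, n);
    stop = clock();
    elapsed = stop - start;
    return (elapsed > overhead) ? (elapsed - overhead) : 0;
}
```

Only one region is bracketed per run here, which matches the "one function at
a time" limitation mentioned above; on the board, the same start/stop pattern
could be applied around TIMER_getCount() reads instead.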
What are your project settings?
Function Profile Debug; Speed Most Critical; Opt Level: File; Program Level
Opt: No External Var Refs; RTS Modifications: Defns No Funcs; Memory Models:
Far Calls & Data; RTS Calls: Use Memory Model.
[Ganesh] Ideally you should run your code with file-level optimization at -o3.
When you are profiling, you should ideally run in release mode with no debug
information whatsoever.
[yong] Yes, I run my code with file-level optimization at -o3. However, the
profiler tool needs Function Profile Debug information, so I don't think I can
profile in release mode with no debug information.
What is your memory allocation pattern?
ISDRAM base:0x0, length:40000, heap size
SDRAM base:0x80000000, length:0x5000000, heap size:
All code and data are loaded to SDRAM, with 256k L2 cache enabled.
[Ganesh] You are using the DM642, which has only 256 KB of internal memory.
From your statement, I guess you aren't using L2 ISRAM, or are you? In any
case, you should think about using ISRAM judiciously.
[yong] What's your recommendation for achieving the highest performance
(speed)? 256k L2 cache plus 0 ISRAM, or another combination, such as 192k
ISRAM plus 64k cache?
What is the frequency of your DSP?
DM642, 600 MHz.
[Ganesh] Are you using the C6416 TEB or the DM642? I have this doubt because
you have specified an ISRAM address as well as claiming 256 KB of cache, which
isn't possible on the DM642.
[yong] I'm using the DM642 EVM. I specified the ISRAM address in the DSP/BIOS
config file, while claiming 256 KB of cache in code with
CACHE_setL2Mode(CACHE_256KCACHE). Maybe that's a mistake and I should make
ISRAM + L2 cache = 256k, is that right?
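
The constraint being asked about can be sketched as a small consistency check.
This is only an illustration: it assumes the 256 KB of L2 mentioned in the
thread is shared between addressable SRAM and cache, and that the cache part
can be 0, 32, 64, 128 or 256 KB (an assumption to verify against the device's
cache documentation). The helper names l2_sram_kb and l2_split_ok are
hypothetical and not part of CSL.

```c
#include <assert.h>

#define L2_TOTAL_KB 256  /* on-chip L2 size of the DM642, per the thread */

/* Returns the KB left as addressable internal SRAM for a given L2 cache
   size, or -1 if the cache size is not one of the assumed valid modes. */
static int l2_sram_kb(int cache_kb)
{
    static const int valid_kb[] = { 0, 32, 64, 128, 256 };
    unsigned i;
    for (i = 0; i < sizeof valid_kb / sizeof valid_kb[0]; i++)
        if (valid_kb[i] == cache_kb)
            return L2_TOTAL_KB - cache_kb;
    return -1;
}

/* Nonzero if an SRAM segment of sram_kb (from the config file) fits next to
   a run-time cache size of cache_kb. */
static int l2_split_ok(int sram_kb, int cache_kb)
{
    int max_sram = l2_sram_kb(cache_kb);
    assert(sram_kb >= 0);
    return max_sram >= 0 && sram_kb <= max_sram;
}
```

Under these assumptions, l2_split_ok(192, 64) holds, while a 256 KB SRAM
segment combined with a 256 KB cache, l2_split_ok(256, 256), does not.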
Have you optimized your code or are you trying to cross-compile the code?
Optimized on a Pentium 3.2G PC, speed around 80fps. Now on the DSP only 2fps;
I need real-time 15fps.
[Ganesh] You need to go through the document on optimizing C code for the
C6416, as well as hand-coding some low-level functions.
[yong] How much speed improvement do you think is possible through code
optimization such as linear assembly, software pipelining, etc.? Is it
possible to move from 2fps to 15fps? Otherwise, should I do some algorithm
optimization before code optimization?
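
To put numbers on that question, the arithmetic can be sketched as below, using
the 600 MHz clock and the 2fps / 15fps figures from this thread. The function
names are hypothetical, and the sketch assumes the work per frame is roughly
constant.

```c
#include <assert.h>

/* CPU cycles available per frame at a given clock rate and frame rate. */
static unsigned long cycles_per_frame(unsigned long cpu_hz, unsigned long fps)
{
    assert(fps > 0);
    return cpu_hz / fps;
}

/* Required speedup times ten (75 means 7.5x), using integer math only. */
static unsigned long speedup_x10(unsigned long current_fps,
                                 unsigned long target_fps)
{
    assert(current_fps > 0);
    return (target_fps * 10UL) / current_fps;
}
```

With these inputs, cycles_per_frame(600000000UL, 2) is 300,000,000 cycles per
frame today, cycles_per_frame(600000000UL, 15) is a budget of 40,000,000 cycles
per frame, and speedup_x10(2, 15) is 75, i.e. a 7.5x overall speedup is needed.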
How are you profiling? Are you using clock() functions or the TIMER module?
Using the profiler tool: under menu -> Start New Session, then select the
profile area. No clock() functions or TIMER module.
[Ganesh] Answered