>> salaria wrote:
>>> Hi
>>> I have just started using CCS for DM642 simulation. During profiling I see
>>> that 99% of the cycles are stall cycles and under 1% are CPU cycles.
>>> Could someone explain what stall cycles are and suggest a way to reduce them?
>>>
>>> regards
>>> APS
>>>
>> This sounds like 1 of 2 things:
>>
>> 1) You're seeing lots of NOPs in which case you need to turn on the
>> optimizer in the compiler options.
>>
>> 2) You're seeing CPU stalls due to cache misses in which case you need
>> to enable L2 cache and the corresponding MAR bits, or alternatively copy
>> data into internal memory, operate on it, and then copy back to external
>> memory.
>>
>> Brad
>>
>
> Hi Brad,
>
> I think it is the second reason. Could you help me with the following:
>
> 1. How do I enable L2 cache and the corresponding MAR bits?
> 2. How do I tell the compiler to copy packets of data to the internal
> memory and put them back in the external memory when processing is
> done?
>
> Thanks
> Regards
> APS
>
To increase your system performance, you want to minimize the number of
accesses to external memory.

The easy way to do this is to use the two-level cache. For example, if
you have a frame buffer in external memory, you just operate on that
buffer and the cache will boost your performance. The CCFG register
configures the L2 cache mode. You can find its address in the data
sheet and a description of the bit fields in the 64x Two-Level Memory
Reference Guide. The same goes for the MAR bits, which mark ranges of
external memory as cacheable.
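
To make the MAR step concrete, here is a minimal sketch of how to work out which MAR register covers a given external address. It assumes the usual C64x layout, with MAR0 at 0x01848000 and one 32-bit MAR per 16 MB of address space. Those constants and the helper names are assumptions for illustration, so check them against the data sheet and the reference guide before relying on them. On the target you would then set the cacheability enable bit (bit 0) of each register found this way, and select the L2 cache size through CCFG.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed C64x layout (verify against your device's documentation):
 * MAR0 sits at 0x01848000 and each MAR covers 16 MB of address space. */
#define MAR_BASE_ADDR 0x01848000u

/* Index of the MAR register that covers a given byte address. */
static uint32_t mar_index(uint32_t addr)
{
    return addr >> 24;                 /* 16 MB = 2^24 bytes per MAR */
}

/* Memory-mapped address of the MAR register covering addr. */
static uint32_t mar_reg_addr(uint32_t addr)
{
    return MAR_BASE_ADDR + 4u * mar_index(addr);
}

/* Number of consecutive MAR registers needed to cover [start, start + len). */
static uint32_t mar_count(uint32_t start, uint32_t len)
{
    assert(len > 0u);
    return mar_index(start + (len - 1u)) - mar_index(start) + 1u;
}
```

Under these assumptions, a 32 MB buffer starting at 0x80000000 needs two MAR registers, starting with index 128.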

The more difficult, but sometimes more efficient, way is to do the
copies to internal memory manually. For example, you might first copy a
frame buffer from external memory to internal memory, do all the
processing on it there, and then copy it back out to external memory.
This is not something the compiler will do for you. You would most
likely want to use DAT_copy from the CSL to do this copy for you (it
uses QDMA) while your code is doing something else.

I don't typically use the simulator, so I'm not sure how much of this
is supported. It sounds like the cache is modeled, since it's
apparently showing you hits and misses. I'm not sure about the QDMA,
though. If it's not supported, you could substitute a memcpy
temporarily and then replace it with a DAT_copy on real hardware.
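
The copy-in, process, copy-out pattern with the temporary memcpy substitute could look roughly like the sketch below. The block size, the buffer name, and the processing step (inverting each byte) are hypothetical placeholders. On the target, the working buffer would be placed in internal memory by the linker, and each memcpy would be replaced by a DAT_copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 1024u  /* hypothetical; size it to fit internal RAM */

/* Stand-in for a buffer in internal memory. On the target the linker
 * would place it there; here it is an ordinary static array. */
static uint8_t internal_buf[BLOCK_BYTES];

/* Hypothetical processing step: invert every byte of the block. */
static void process_block(uint8_t *buf, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        buf[i] = (uint8_t)(255u - buf[i]);
    }
}

/* Process a frame held in "external" memory one block at a time. */
static void process_frame(uint8_t *frame, size_t frame_bytes)
{
    size_t off = 0;
    assert(frame != NULL);
    while (off < frame_bytes) {
        size_t n = frame_bytes - off;
        if (n > BLOCK_BYTES) {
            n = BLOCK_BYTES;
        }
        memcpy(internal_buf, frame + off, n);  /* copy in (DAT_copy on target) */
        process_block(internal_buf, n);        /* work only on the internal copy */
        memcpy(frame + off, internal_buf, n);  /* copy out (DAT_copy on target) */
        off += n;
    }
}
```

This version is fully synchronous, so it only shows the data movement. To overlap the transfers with processing on hardware, you would typically use two internal buffers and wait for each transfer to finish before touching its buffer.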
Brad
Reply by salaria●January 29, 2007
>salaria wrote:
>> Hi
>> I have just started using CCS for DM642 simulation. During profiling I see
>> that 99% of the cycles are stall cycles and under 1% are CPU cycles.
>> Could someone explain what stall cycles are and suggest a way to reduce them?
>>
>> regards
>> APS
>>
>
>This sounds like 1 of 2 things:
>
>1) You're seeing lots of NOPs in which case you need to turn on the
>optimizer in the compiler options.
>
>2) You're seeing CPU stalls due to cache misses in which case you need
>to enable L2 cache and the corresponding MAR bits, or alternatively copy
>data into internal memory, operate on it, and then copy back to external
>memory.
>
>Brad
>
Hi Brad,
I think it is the second reason. Could you help me with the following:
1. How do I enable L2 cache and the corresponding MAR bits?
2. How do I tell the compiler to copy packets of data to the internal
memory and put them back in the external memory when processing is
done?
Thanks
Regards
APS
Reply by Brad Griffis●January 29, 2007
salaria wrote:
> Hi
> I have just started using CCS for DM642 simulation. During profiling I see
> that 99% of the cycles are stall cycles and under 1% are CPU cycles.
> Could someone explain what stall cycles are and suggest a way to reduce them?
>
> regards
> APS
>
This sounds like one of two things:

1) You're seeing lots of NOPs, in which case you need to turn on the
optimizer in the compiler options.

2) You're seeing CPU stalls due to cache misses, in which case you need
to enable L2 cache and the corresponding MAR bits, or alternatively copy
data into internal memory, operate on it, and then copy back to external
memory.

Brad
Reply by salaria●January 29, 2007
Hi
I have just started using CCS for DM642 simulation. During profiling I see
that 99% of the cycles are stall cycles and under 1% are CPU cycles.
Could someone explain what stall cycles are and suggest a way to reduce them?
regards
APS