Optimisation of Loops

Started by Unknown March 6, 2005

Hi,

My program needs too many cycles, so I am looking for ways to
optimise it. As part of some calculations I have to shift this
vector by one element:

for (k = 0; k < Size - 1; k++)
    Buffer[j][k] = Buffer[j][k + 1];

Does anybody know whether there is a more efficient way to do this,
maybe with a special DSP function?

Thanks

D. Puchowski




Hi,

There have already been several discussions on optimization in this group, so
you may find useful information by searching the group archive.

The 'Optimizing Compiler Manual' from www.ti.com (I don't know the exact name,
it might be spru189 or something like that) also provides a lot of
information. You can speed up loops considerably by using #pragmas
(for example MUST_ITERATE) and by switching on optimization when compiling.
There are many other 'screws' you can turn as well, such as placing data
and code in internal RAM. I think most of this was also discussed in recent
postings.

best regards,

thomas





Puchowski--
Here are some tips that TI recommends:

1] Loop partitioning in C: do the work of two loop iterations
inside one, if possible.

2] Pragmas: MUST_ITERATE: tells the compiler the minimum, maximum
and multiple of the loop's trip count to assist in loop
unrolling. This is primarily meant for C optimization.

3] Pragmas: DATA_ALIGN: instructs the compiler to align short data
on word boundaries to facilitate word-wide optimization. This is
primarily meant for C optimization.

4] Intrinsics: _mpy/_mpyh: use intrinsics instead of C operators
like * for more efficient, direct use of special-purpose assembly
features. Meant for C optimization again.

5] Allocate all code and data in internal RAM (IRAM) and use the
keywords restrict and far wherever applicable.

6] Use these build options, if possible: -o3, -pm and -mt.

If, after all of this (and while still getting correct results!), you
are unhappy with the optimization, then rewrite the algorithm in
Linear Assembly.
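
A minimal sketch of how tips 2, 3 and 5 might look on a simple loop. The function vec_add, the array names and the length 64 are hypothetical, chosen only for illustration. The TI-specific pragmas are guarded by the _TMS320C6X macro that the TI compiler predefines, so the same file should also build on a host compiler.

```c
/* Hypothetical example data; the names and the length 64 are assumptions. */
#ifdef _TMS320C6X
#pragma DATA_ALIGN(src_a, 8)   /* TI-only: align the arrays to 8 bytes */
#pragma DATA_ALIGN(src_b, 8)
#pragma DATA_ALIGN(dst, 8)
#endif
short src_a[64], src_b[64], dst[64];

/* Element-wise add. The restrict qualifiers promise the compiler that
 * the three arrays do not overlap, so it is free to reorder the loads
 * and stores of different iterations. */
void vec_add(const short *restrict a, const short *restrict b,
             short *restrict out, int n)
{
    int i;
#ifdef _TMS320C6X
    /* TI-only: n is at least 4, at most 64, and a multiple of 4. */
    #pragma MUST_ITERATE(4, 64, 4)
#endif
    for (i = 0; i < n; i++)
        out[i] = (short)(a[i] + b[i]);
}
```

Note that restrict and MUST_ITERATE are promises made by the programmer. If the arrays do overlap, or n violates the stated bounds, the optimized code may give wrong results.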

Good Luck

--Bhooshan






Hi,

Is it really necessary to copy the data? Maybe you can use a pointer instead.

There is a DSPLIB function DSPF_sp_blk_move, but I think it has the
restriction that the blocks must not overlap.
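
A minimal sketch of the pointer idea, assuming the vector is a sliding window where one new sample replaces the oldest one. Instead of moving Size - 1 elements, a head index is advanced. The type Ring, the functions ring_push and ring_get, the length 8 and the float element type are all hypothetical.

```c
#define SIZE 8   /* hypothetical window length */

typedef struct {
    float data[SIZE];
    int   head;          /* index of the oldest sample */
} Ring;

/* Drop the oldest sample and append a new one: one store instead of
 * SIZE - 1 copies. */
void ring_push(Ring *r, float sample)
{
    r->data[r->head] = sample;   /* overwrite the oldest slot */
    r->head++;
    if (r->head == SIZE)         /* wrap around without using % */
        r->head = 0;
}

/* Read logical element k, where 0 is the oldest and SIZE - 1 the newest.
 * This corresponds to Buffer[j][k] after the shift in the original loop. */
float ring_get(const Ring *r, int k)
{
    int idx = r->head + k;
    if (idx >= SIZE)
        idx -= SIZE;
    return r->data[idx];
}
```

If the data really has to stay contiguous in memory, the standard C function memmove is defined to work for overlapping source and destination blocks.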

Gustl






Hi,

Thanks a lot for these hints; I was able to speed up this loop. But when I try to optimize other loops that contain only simple arithmetic (+, *, -), the consultant complains that it can't inline the functions. What is going on here?

D. Puchowski





Hi,
Check your code: there must be a function call, or a call to a
run-time support routine (for example for "%"), inside your loop.
In that case the loop will not be software-pipelined and you will get a
warning that functions can't be inlined.
Try to remove expressions that involve such a routine call from the loop;
otherwise use the automatic inlining option (-oi) in your compiler options.
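
As an illustration of removing such an expression, here is a sketch that replaces a "%" inside a loop with a compare-and-reset index. Both function names are hypothetical, and the second version assumes 0 <= start < len.

```c
/* Version with %: as described above, the modulo may turn into a
 * run-time support call, which prevents software pipelining. */
int sum_wrapped_mod(const short *buf, int len, int start, int count)
{
    int i, sum = 0;
    for (i = 0; i < count; i++)
        sum += buf[(start + i) % len];
    return sum;
}

/* Same result without any division in the loop body.
 * Assumes 0 <= start < len. */
int sum_wrapped_nomod(const short *buf, int len, int start, int count)
{
    int i, idx = start, sum = 0;
    for (i = 0; i < count; i++) {
        sum += buf[idx];
        idx++;
        if (idx == len)   /* wrap the index by hand */
            idx = 0;
    }
    return sum;
}
```

When len is a power of two, masking the index with (len - 1) is another common way to wrap it without a division.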

~Sri


--
Regards
Sri




Are these floating-point operations?
If not, find out which function is being called for the operation: generate
and examine the optimized C code in the .lst file, which will generally
clarify the issue.

Roger Kingsley






Hello All

Is it possible to do an EDMA or QDMA transfer from SDRAM to SDRAM,
i.e. with both the SRC and DST addresses in SDRAM? Specifically on the DM642.

Rgds
Harsha




Hi,

You are right. I'm doing floating-point calculations on a 6416, so I think the floating-point arithmetic is implemented in the RTS. So it seems there is no way to pipeline this loop other than rewriting it with fixed-point arithmetic?
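
For reference, a minimal sketch of what a fixed-point rewrite of a multiplication might look like, assuming the values fit the Q15 range [-1, 1). The type q15_t and both functions are hypothetical names, and the code assumes 16-bit short, 32-bit int and an arithmetic right shift for negative values.

```c
typedef short q15_t;   /* Q15: value = integer / 32768 (assumed format) */

/* Convert a float in [-1, 1) to Q15. This still uses floating point,
 * so it belongs outside the time-critical loop. */
q15_t float_to_q15(float x)
{
    return (q15_t)(x * 32768.0f);
}

/* Multiply two Q15 values using integer arithmetic only.
 * The 32-bit product is in Q30, so shifting right by 15 gives Q15.
 * No saturation: -1 * -1 overflows and is not handled here. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int p = (int)a * (int)b;
    return (q15_t)(p >> 15);
}
```

For example, 0.5 is 16384 in Q15 and 0.25 is 8192; their Q15 product is 4096, which represents 0.125.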

D. Puchowski





Yes, it is possible to use either EDMA or QDMA to transfer from external
SDRAM to external SDRAM, with both "src" and "dst" in external
memory.

Regds
JS