|
Hi, my program needs too many cycles, so I am looking for ways to optimise it. As part of some calculations I have to shift this vector:
for (k = 0; k < Size - 1; k++)
    Buffer[j][k] = Buffer[j][k + 1];
Does anybody know whether there is a more efficient way to do this, maybe with a special DSP function? Thanks, D. Puchowski |
Optimisation of Loops
Started by ●March 6, 2005
Reply by ●March 6, 2005
|
Hi, optimization has already been discussed several times in this group, so searching the group archive may turn up useful information. The 'Optimizing Compiler Manual' on www.ti.com (I don't know the exact name; it might be spru189 or something like that) also provides a lot of information. You can speed up loops considerably by using #pragmas (for example MUST_ITERATE) and by switching on optimization when compiling. There are many other 'screws' you can turn as well, such as placing data and code in internal RAM. I think most of this was also discussed in recent postings. Best regards, thomas |
Reply by ●March 6, 2005
|
Puchowski, here are some tips that TI recommends:
1] Loop partitioning in C: do the work of two loop iterations inside one, if possible.
2] Pragma MUST_ITERATE: tells the compiler how many times the loop will run (minimum, maximum and multiple of the trip count), which assists loop unrolling. This is primarily meant for C optimization.
3] Pragma DATA_ALIGN: instructs the compiler to align short data on word boundaries to facilitate word-wide optimization. This is primarily meant for C optimization.
4] Intrinsics such as _mpy/_mpyh: use intrinsics instead of C operators like * for more efficient, direct use of special-purpose assembly features. Meant for C optimization again.
5] Allocate all code and data in IRAM, and use the keywords restrict and far wherever applicable.
6] Use these build options, if possible: -o3, -pm and -mt.
If after all this (and you still get correct results!) you are unhappy with the optimization, then rewrite the algorithm in linear assembly.
Good luck, Bhooshan |
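As a rough illustration of tips 2 and 3 applied to the shift loop from the original question, here is a minimal sketch. The function name `shift_left`, the buffer `row_buf`, the trip-count bounds given to MUST_ITERATE and the 8-byte alignment are assumptions made for illustration, not values from this thread. The pragmas are guarded with the `_TMS320C6X` macro that the TI compiler defines, so the code still builds with other compilers.

```c
#include <assert.h>

#define ROW_LEN 64  /* hypothetical row length */

#if defined(_TMS320C6X)
/* Assumption: 8-byte alignment so the compiler may use wide loads/stores. */
#pragma DATA_ALIGN(row_buf, 8)
#endif
float row_buf[ROW_LEN];

/* Shift a row left by one element; the last element keeps its old value. */
void shift_left(float *row, int size)
{
    int k;

    /* Enforce the promise made to the compiler below (1..1023 iterations). */
    assert(size >= 2 && size <= 1024);

#if defined(_TMS320C6X)
    /* Promise: the loop runs at least once and at most 1023 times. */
    #pragma MUST_ITERATE(1, 1023, 1)
#endif
    for (k = 0; k < size - 1; k++)
        row[k] = row[k + 1];
}
```

A call such as `shift_left(row_buf, ROW_LEN)` shifts the whole row. If the promised bounds are wrong for your data, the pragma should be changed or removed, because the compiler is allowed to rely on it.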
Reply by ●March 7, 2005
|
Hi, is it really necessary to copy the data? Maybe you can use a pointer instead. There is also a DSPLIB function, dspf_sp_blk_move, but I think it has the restriction that the blocks must not overlap. Gustl |
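The pointer idea above can be sketched as a circular buffer: instead of moving every sample, only a head index advances. The type `ring_t`, the functions `ring_push` and `ring_at`, and the length `RING_LEN` are hypothetical names, and the power-of-two length is an assumption that lets a bit mask replace the wrap-around test.

```c
#include <assert.h>

#define RING_LEN 8u   /* hypothetical length; assumed to be a power of two */

typedef struct {
    float    data[RING_LEN];
    unsigned head;    /* index of the oldest sample */
} ring_t;

/* Overwrite the oldest sample with a new one; no other element is moved. */
void ring_push(ring_t *r, float sample)
{
    r->data[r->head] = sample;
    r->head = (r->head + 1u) & (RING_LEN - 1u);
}

/* Read sample i, where i = 0 is the oldest and RING_LEN - 1 the newest. */
float ring_at(const ring_t *r, unsigned i)
{
    assert(i < RING_LEN);
    return r->data[(r->head + i) & (RING_LEN - 1u)];
}
```

Each read now costs an extra index calculation, so whether this is a net gain depends on how often the buffer is shifted compared with how often it is read.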
Reply by ●March 14, 2005
|
Hi, thanks a lot for these hints; I was able to speed up this loop. But when I try to optimize other loops that contain only the simplest arithmetic (+, *, -), the consultant complains that it can't inline the functions. What is the reason for this? D. Puchowski |
Reply by ●March 15, 2005
|
Hi, check your code: there must be a function call, or a call to a run-time support routine (for operators like "%"), inside your loop. In that case the loop will not be pipelined, and you get a warning that functions can't be inlined. Try to remove expressions that involve such routine calls from the loop; otherwise use the automatic inlining option (-oi) in your compiler options. Regards, Sri |
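To illustrate the point about run-time support calls: when an operator like `%` is implemented by a library routine, one common workaround is to replace it with a compare-and-reset on the index. The function names and the table layout below are hypothetical; both versions are intended to compute the same result, and only the second avoids `%` inside the loop.

```c
#include <assert.h>

/* Sum n samples from a circular table of length len, starting at start.
   Version with '%': the modulo inside the loop may become a library call. */
int sum_wrapped_mod(const int *table, int len, int start, int n)
{
    int i, sum = 0;

    assert(len > 0 && start >= 0 && start < len);
    for (i = 0; i < n; i++)
        sum += table[(start + i) % len];
    return sum;
}

/* Same result without '%': the index is reset when it reaches len. */
int sum_wrapped_cmp(const int *table, int len, int start, int n)
{
    int i, idx = start, sum = 0;

    assert(len > 0 && start >= 0 && start < len);
    for (i = 0; i < n; i++) {
        sum += table[idx];
        idx++;
        if (idx == len)
            idx = 0;
    }
    return sum;
}
```

Whether the second form actually pipelines should be confirmed in the compiler's own feedback for the loop rather than assumed.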
Reply by ●March 15, 2005
|
Are these floating-point operations? Otherwise, see which function it is using for the operation: generate and examine the optimized C code in the .lst file, and that will generally clarify the issue. Roger Kingsley |
Reply by ●March 15, 2005
|
Hello all, is it possible to do an EDMA or QDMA transfer from SDRAM to SDRAM, i.e. with both the SRC and DST addresses in SDRAM, specifically on the DM642? Regards, Harsha |
Reply by ●March 15, 2005
|
Hi, you are right. I'm doing floating-point calculations on a 6416, so I think the floating-point arithmetic is implemented in the RTS. It seems that there is no way to pipeline this loop other than rewriting it with fixed-point arithmetic? D. Puchowski |
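A minimal sketch of what a fixed-point rewrite could look like, assuming the values lie in the range [-1, 1) and can be stored in Q15 format (16-bit integers scaled by 2^15). The names `q15_t`, `float_to_q15` and `dot_q15` are hypothetical, and the conversion helper is meant for set-up code outside the time-critical loop, since it still uses floating point.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t q15_t;  /* represented value = integer / 32768 */

/* Convert a float to Q15, rounding to nearest and clamping to the
   representable range (set-up code only, not for the inner loop). */
q15_t float_to_q15(float x)
{
    long v = (long)(x * 32768.0f + (x >= 0.0f ? 0.5f : -0.5f));

    if (v > 32767)
        v = 32767;
    if (v < -32768)
        v = -32768;
    return (q15_t)v;
}

/* Dot product of two Q15 vectors using only integer multiplies and adds.
   Each 32-bit product is Q30 and is shifted back to Q15 before summing.
   Assumption: '>>' on a negative value is an arithmetic shift, which is
   implementation-defined in C. The result is Q15 and is not saturated. */
int32_t dot_q15(const q15_t *a, const q15_t *b, int n)
{
    int32_t sum = 0;
    int i;

    assert(a != 0 && b != 0 && n >= 0);
    for (i = 0; i < n; i++)
        sum += ((int32_t)a[i] * (int32_t)b[i]) >> 15;
    return sum;
}
```

Shifting each product before accumulating trades a little precision for a sum that fits comfortably in 32 bits; a design that keeps the full Q30 products would need a wider accumulator.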
Reply by ●March 15, 2005
|
Yes, it is possible to do both EDMA and QDMA transfers from external SDRAM to external SDRAM, with both "src" and "dst" in external memory. Regards, JS |






