
Technical discussions about the TI C6000 DSPs (including the c62x, c64x and c67x DSPs).
|
Hi DSPers,

I profiled a simple piece of code (it is not floating-point code, and it resides entirely in internal memory) on the 6201 and 6711 simulators. The 6711 appears to run 500 times faster than the 6201. I sent the code to TI technical support, and they got the same profiling results. They claim that the large profiling difference is purely due to the architecture of the 6201. Their reply is quoted below. What do you all think? Are they right? Please clear my doubts.

"> The large difference between 6201 and 6711 simulators in profiling results
> is mainly because of the Memory stalls. Probably allocating the arrays in
> your project such that the processor access only one memory bank in one
> read. I expect great reduction in the memory stalls and there by reduce the
> execution cycles with such type of data access. Please refer to Programmers
> guide(SPRU198) and Optimizing compiler Users guide(SPRU187i) would help you
> for getting better performance in 6201. The large number of profiling cycles
> is purely because of the 6201 architecture. I hope am clear in answering
> your query."

Thanks in advance.
Ramanan.
|
>It seems 6711 is running 500 times faster than 6201.
>[...]
>What do you all think? Are they right? Please clear my doubts.

It seems unlikely that memory stalls would cause a slow-down of that magnitude. I don't know the 6201 very well, but it has a scheme that allows dual-port access to the data memory only under certain addressing conditions (the two accesses must go to different blocks/banks). If you have one of these conflicts, you stall the entire pipeline. This potentially wastes 8 execution cycles AIUI, which still doesn't sound like a 500x slow-down!

Can they identify the lines of assembly causing the problem?

Cheers,
Martin

--
Martin Thompson BEng(Hons) CEng MIEE
TRW Conekt
Stratford Road, Solihull, B90 4AX. UK
Tel: +44 (0)121-627-3569
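The addressing condition mentioned above can be illustrated with a small host-side sketch in C. It assumes the data memory is interleaved across four banks that are each 16 bits wide, so that the bank is chosen by the low address bits; those two constants and the function names `bank_of` and `same_bank` are illustrative assumptions, not something stated in this thread, and should be checked against the device documentation.

```c
#include <stdint.h>

/* Assumed layout (not from the thread): 4 interleaved banks, each 2 bytes wide. */
#define BANK_WIDTH_BYTES 2u
#define NUM_BANKS        4u

/* Bank index selected by a byte address under the assumed interleaving. */
static unsigned bank_of(uint32_t addr)
{
    return (addr / BANK_WIDTH_BYTES) % NUM_BANKS;
}

/* Returns 1 when two addresses map to the same bank, which is the case
   that could stall if both are accessed in the same cycle. */
static int same_bank(uint32_t a, uint32_t b)
{
    return bank_of(a) == bank_of(b);
}
```

Under these assumptions, `same_bank(0x0000, 0x0008)` reports a potential conflict, while `same_bank(0x0000, 0x0002)` does not, because the second pair falls in neighbouring banks.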
|
Hi,

As directed by TI, I profiled the code with the C6000 simulator analysis. On the 6201, the profile shows 289659 memory stalls and 41 data bank conflicts. On the 6711 there are no data bank conflicts and no memory stalls. I think TI is correct that the 6x0x DSPs will run slower than the 6x1x DSPs. Do you all agree that the 6201 is slower than the 6711 even though the 6201 is clocked at 200 MHz and the 6711 at 150 MHz?

Thanks,
Ramanan.

----- Original Message -----
From: "Martin.J Thompson"
Sent: Friday, December 06, 2002 2:08 PM
Subject: Re: [c6x] Again: 6201 Vs 6711

> It seems unlikely that memory stalls would cause a slow-down of that magnitude.
> [...]
> Can they identify the lines of assembly causing the problem?
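The clock-rate question comes down to simple arithmetic: run time is cycle count divided by clock frequency, so a 200 MHz part beats a 150 MHz part only while it needs fewer than 200/150 (about 1.33) times as many cycles. A minimal sketch in C follows; the function names are made up for illustration, and the cycle counts passed in would come from your own profile rather than from anything in this thread.

```c
/* Wall-clock time in microseconds for a cycle count at a clock given in MHz. */
static double exec_time_us(double cycles, double clock_mhz)
{
    return cycles / clock_mhz;
}

/* Returns 1 when device A finishes sooner than device B. */
static int a_is_faster(double cycles_a, double mhz_a,
                       double cycles_b, double mhz_b)
{
    return exec_time_us(cycles_a, mhz_a) < exec_time_us(cycles_b, mhz_b);
}
```

For example, with a hypothetical 1000 cycles on both devices the 200 MHz part wins, but if stalls double its count to 2000 cycles it loses to the 150 MHz part running 1000 cycles.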
|
>Do you all agree that 6201 is slower than 6711 even though 6201 is 200 MHz
>clocked and 6711 is 150 MHz clocked?
>
>Thanks,
>Ramanan.

In that case, it looks like that is true. You'll need to rearrange your data to avoid the data bank conflicts. Is this a problem for you?

Cheers,
Martin
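One way to picture "rearranging the data" is to offset the second of two arrays so that elements with the same index never share a bank. The sketch below is a host-side illustration only. It assumes 16-bit elements, 2-byte-aligned base addresses, and four interleaved banks that are each 2 bytes wide; the constants and the names `bank_index` and `pad_for_second_array` are hypothetical, and on a real target the placement would be done through the linker or compiler facilities described in the TI guides cited earlier.

```c
#include <stdint.h>

/* Assumed layout (not from the thread): 4 interleaved banks, each 2 bytes wide. */
#define ASSUMED_BANK_WIDTH 2u
#define ASSUMED_NUM_BANKS  4u

/* Bank index of a byte address under the assumed interleaving. */
static unsigned bank_index(uint32_t addr)
{
    return (addr / ASSUMED_BANK_WIDTH) % ASSUMED_NUM_BANKS;
}

/* Bytes of padding to place before the second array of 16-bit elements so
   that a[i] and b[i] land in different banks for every index i.
   Both base addresses are assumed to be 2-byte aligned. */
static uint32_t pad_for_second_array(uint32_t base_a, uint32_t base_b)
{
    return (bank_index(base_a) == bank_index(base_b)) ? ASSUMED_BANK_WIDTH : 0u;
}
```

Because both arrays advance by the same stride, the bank offset between `a[i]` and `b[i]` is constant, so checking the base addresses is enough in this simplified model. Wider elements span more than one bank and would need a larger offset.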
|
Hi,

Thanks for showing interest in my problem. I have to choose between the 6201 and the 6711 for my application. If the 6711 can be programmed easily, I may opt for it instead of spending time rearranging the data for the 6201. But even if I rearrange the data, will the 6201 run faster than the 6711?

Thanks,
Ramanan.

----- Original Message -----
From: "Martin.J Thompson"
Sent: Monday, December 09, 2002 1:52 PM
Subject: Re: [c6x] Again: 6201 Vs 6711

> In that case, it looks like that is true. You'll need to rearrange your data to avoid the data bank conflicts. Is this a problem for you?
|
Ramanan-

> Thanks for showing interest in my problem. I am in a situation to choose
> either 6201 or 6711 for my application. If 6711 can be programmed easily
> then I may opt 6711 instead to spending time in rearranging the data for
> 6201. But even if I rearrange it, will it run faster than 6711?

Is cost or package size an issue? I/O throughput to the outside world? The C6201 is more or less compatible with the C6202/3/4; the latter devices are much smaller and available in BGA packages, plus they have extremely high external bus bandwidth using XBus...

If the product volume is low, and cost and I/O throughput are not (and will not become) an issue, then go with the C6711.

Jeff Brower
DSP sw/hw engineer
Signalogic