Reply by rajesh May 6, 2008
On May 6, 3:51 pm, Vladimir Vassilevsky <antispam_bo...@hotmail.com> wrote:
> rajesh wrote:
>
> > refer
> >
> > http://docs.blackfin.uclinux.org/doku.php?id=2d-dma
>
> Nothing to look at. Linuxopathological gibberish.
>
> VLV
OK, let me make it clear. There is 64 KB of SRAM for data. This 64 KB of SRAM is divided into two banks, A and B, of 32 KB each. I have configured 16 KB from each bank as cache. The remaining 16 KB from each bank I am using for my DMA. This was the configuration in which the above experiment was conducted.
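The split described in this post can be written down as a few constants. This is only a sketch of the arithmetic stated above; the names are hypothetical and are not taken from any Blackfin header or linker file.

```c
#include <assert.h>

/* Sizes as stated in the post. All identifiers here are hypothetical. */
enum {
    KB             = 1024,
    L1_DATA_TOTAL  = 64 * KB,                    /* total L1 data SRAM      */
    BANK_COUNT     = 2,                          /* banks A and B           */
    BANK_SIZE      = L1_DATA_TOTAL / BANK_COUNT, /* 32 KB per bank          */
    CACHE_PER_BANK = 16 * KB,                    /* configured as cache     */
    SRAM_PER_BANK  = BANK_SIZE - CACHE_PER_BANK  /* left for DMA buffers    */
};

/* Total bytes acting as data cache across both banks. */
int l1_cache_total(void)
{
    return BANK_COUNT * CACHE_PER_BANK;
}

/* Total bytes left as plain SRAM that DMA can target. */
int l1_dma_sram_total(void)
{
    return BANK_COUNT * SRAM_PER_BANK;
}
```

With these numbers, half of the L1 data memory acts as cache and the other half stays available as DMA-visible SRAM.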
Reply by Vladimir Vassilevsky May 6, 2008

rajesh wrote:

> refer
>
> http://docs.blackfin.uclinux.org/doku.php?id=2d-dma

Nothing to look at. Linuxopathological gibberish.

VLV
Reply by rajesh May 6, 2008
On May 6, 3:31 pm, Vladimir Vassilevsky <antispam_bo...@hotmail.com> wrote:
> rajesh wrote:
>
> > "BlackFin doesn't have any means for providing cache and DMA
> > coherency.
> > Hence you generally can't DMA to the memory areas which are covered by
> > cache. "
> >
> > who said this?
>
> Hint: cache snooping
>
> Go learn hardware, bad pupil.
>
> VLV
refer http://docs.blackfin.uclinux.org/doku.php?id=2d-dma
Reply by Vladimir Vassilevsky May 6, 2008

> I have not said that i have used the above instruction..there are a
> few other instructions which i am not able to recall.

Excuses, excuses, always excuses.

> You can invalidate the entire cache at one shot.

Bad idea anyway.

> I have used DMA and cache simultaneously and demonstrated the
> gain of DMA over cache especially when the data acess is sequencial
> in memory.

Nonsense.

> This happens when one is accessing pixels of a 2-d image.

This happens when one has a muddle head.

> And lastly I didnt get what you mean by 'Hardware'.

LOL

VLV
Reply by Vladimir Vassilevsky May 6, 2008

rajesh wrote:

> "BlackFin doesn't have any means for providing cache and DMA
> coherency.
> Hence you generally can't DMA to the memory areas which are covered by
> cache. "
>
> who said this?

Hint: cache snooping

Go learn hardware, bad pupil.

VLV
Reply by rajesh May 6, 2008
On May 6, 3:14 pm, Vladimir Vassilevsky <antispam_bo...@hotmail.com> wrote:
> rajesh wrote:
> > On May 6, 9:21 am, rajesh <getrajes...@gmail.com> wrote:
> >>On May 6, 2:13 am, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
> >>>rajesh wrote:
>
> >>>>I was working on implementation of h.264 algorithm on Blackfin a
> >>>>couple of years back. I had used the elegantly made DMA of the
> >>>>processor to move data in and out of the internal memory (specially
> >>>>during de-blocking) and i was competing with the cache in terms of
> >>>>cycles.So I had a chance to experiment with the cache.
>
> >>>BlackFin doesn't have any means for providing cache and DMA coherency.
> >>>Hence you generally can't DMA to the memory areas which are covered by
> >>>cache.
> >>>You are a muddle headed. Learn hardware.
>
> >>I have used an instruction to invalidate cache after dma
> >>transfer.
>
> > FYI
>
> > iflush [ p2 ] ; /* Invalidate cache line containing address that
> > P2 points to */
>
> FYI, muddle head:
>
> 1. iflush should be used before DMA transfer, not after.
>
> 2. iflush is useless, since it is faster to DMA to/from buffer in L1 and
> copy the data between external memory (cached) and that buffer.
>
> 3. Learn hardware.
>
> Vladimir Vassilevsky
> DSP and Mixed Signal Design Consultant
> http://www.abvolt.com

"BlackFin doesn't have any means for providing cache and DMA coherency.
Hence you generally can't DMA to the memory areas which are covered by
cache."

who said this?
Reply by rajesh May 6, 2008
On May 6, 3:14 pm, Vladimir Vassilevsky <antispam_bo...@hotmail.com> wrote:
> rajesh wrote:
> > On May 6, 9:21 am, rajesh <getrajes...@gmail.com> wrote:
> >>On May 6, 2:13 am, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
> >>>rajesh wrote:
>
> >>>>I was working on implementation of h.264 algorithm on Blackfin a
> >>>>couple of years back. I had used the elegantly made DMA of the
> >>>>processor to move data in and out of the internal memory (specially
> >>>>during de-blocking) and i was competing with the cache in terms of
> >>>>cycles.So I had a chance to experiment with the cache.
>
> >>>BlackFin doesn't have any means for providing cache and DMA coherency.
> >>>Hence you generally can't DMA to the memory areas which are covered by
> >>>cache.
> >>>You are a muddle headed. Learn hardware.
>
> >>I have used an instruction to invalidate cache after dma
> >>transfer.
>
> > FYI
>
> > iflush [ p2 ] ; /* Invalidate cache line containing address that
> > P2 points to */
>
> FYI, muddle head:
>
> 1. iflush should be used before DMA transfer, not after.
>
> 2. iflush is useless, since it is faster to DMA to/from buffer in L1 and
> copy the data between external memory (cached) and that buffer.
>
> 3. Learn hardware.
>
> Vladimir Vassilevsky
> DSP and Mixed Signal Design Consultant
> http://www.abvolt.com
I did not say that I used the above instruction; there are a few other instructions that I am not able to recall. You can invalidate the entire cache in one shot. I have used DMA and cache simultaneously and demonstrated the gain of DMA over cache, especially when the data access is sequential in memory. This happens when one is accessing the pixels of a 2-D image. And lastly, I didn't get what you mean by 'Hardware'.
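For the 2-D image case mentioned in this post, here is a small host-side model of how a 2-D DMA transfer walks a rectangular block of pixels. It is a sketch under assumptions: the descriptor struct and function names are hypothetical, pixels are taken to be one byte each, and the Y step is assumed to be the offset from the last element of one row to the first element of the next. Nothing here touches real DMA registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor in the count/modify style of a 2-D DMA channel. */
typedef struct {
    size_t    x_count;  /* elements per row of the block                   */
    ptrdiff_t x_modify; /* byte step between elements inside a row         */
    size_t    y_count;  /* number of rows in the block                     */
    ptrdiff_t y_modify; /* byte step from last element of a row to the
                           first element of the next row (assumed rule)    */
} dma2d_desc;

/* Build a descriptor for a block_w x block_h block of 1-byte pixels
   inside an image that is image_width bytes wide. */
dma2d_desc dma2d_block(size_t image_width, size_t block_w, size_t block_h)
{
    dma2d_desc d;
    d.x_count  = block_w;
    d.x_modify = 1;
    d.y_count  = block_h;
    d.y_modify = (ptrdiff_t)(image_width - block_w + 1);
    return d;
}

/* Software model of the transfer: gathers the block starting at src into a
   packed destination buffer and returns the number of bytes copied. */
size_t dma2d_gather(const uint8_t *src, uint8_t *dst, dma2d_desc d)
{
    size_t off = 0; /* current byte offset from src */
    size_t n = 0;   /* bytes written to dst so far  */
    for (size_t y = 0; y < d.y_count; ++y) {
        for (size_t x = 0; x < d.x_count; ++x) {
            dst[n++] = src[off];
            /* The last element of a row advances by y_modify, others by x_modify. */
            ptrdiff_t step = (x + 1 == d.x_count) ? d.y_modify : d.x_modify;
            off = (size_t)((ptrdiff_t)off + step);
        }
    }
    return n;
}
```

The point of the model is that the block arrives packed in the destination buffer, so the processing loop can read it linearly from L1 memory without touching the cache at all.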
Reply by Vladimir Vassilevsky May 6, 2008

rajesh wrote:

> On May 6, 9:21 am, rajesh <getrajes...@gmail.com> wrote:
>>On May 6, 2:13 am, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
>>>rajesh wrote:
>>>
>>>>I was working on implementation of h.264 algorithm on Blackfin a
>>>>couple of years back. I had used the elegantly made DMA of the
>>>>processor to move data in and out of the internal memory (specially
>>>>during de-blocking) and i was competing with the cache in terms of
>>>>cycles.So I had a chance to experiment with the cache.
>>
>>>BlackFin doesn't have any means for providing cache and DMA coherency.
>>>Hence you generally can't DMA to the memory areas which are covered by
>>>cache.
>>>You are a muddle headed. Learn hardware.
>>
>>I have used an instruction to invalidate cache after dma
>>transfer.
>
> FYI
>
> iflush [ p2 ] ; /* Invalidate cache line containing address that
> P2 points to */

FYI, muddle head:

1. iflush should be used before DMA transfer, not after.

2. iflush is useless, since it is faster to DMA to/from buffer in L1 and
copy the data between external memory (cached) and that buffer.

3. Learn hardware.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
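Whichever ordering is right for a given transfer, per-line cache maintenance needs the set of cache lines that overlap the DMA buffer. The sketch below computes that set on the host and calls a caller-supplied operation once per line. It is an illustration under assumptions: the 32-byte line size is assumed, and `for_each_cache_line` and `count_line` are hypothetical names. On a real target the callback would wrap the relevant cache instruction; as far as I recall from the Blackfin instruction set, IFLUSH acts on the instruction cache, while FLUSH and FLUSHINV act on data cache lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32u /* assumed line size */

/* Hypothetical per-line operation; on target this would issue a cache
   instruction for the line starting at line_addr. */
typedef void (*line_op)(uintptr_t line_addr, void *ctx);

/* Call op once for every cache line overlapping [addr, addr + len).
   Returns the number of lines visited. */
size_t for_each_cache_line(uintptr_t addr, size_t len, line_op op, void *ctx)
{
    if (len == 0)
        return 0;

    const uintptr_t mask = ~(uintptr_t)(CACHE_LINE_BYTES - 1);
    uintptr_t first = addr & mask;             /* line holding the first byte */
    uintptr_t last  = (addr + len - 1) & mask; /* line holding the last byte  */

    size_t n = 0;
    for (uintptr_t a = first; ; a += CACHE_LINE_BYTES) {
        op(a, ctx);
        ++n;
        if (a == last)
            break;
    }
    return n;
}

/* Host-side stand-in that only counts how many lines would be touched. */
void count_line(uintptr_t line_addr, void *ctx)
{
    (void)line_addr;
    ++*(size_t *)ctx;
}
```

A buffer that is not aligned to a line boundary shares its first or last line with neighbouring data, which is one reason DMA buffers are usually aligned and padded to whole cache lines.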
Reply by rajesh May 6, 2008
On May 6, 9:21 am, rajesh <getrajes...@gmail.com> wrote:
> On May 6, 2:13 am, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
> wrote:
>
> > rajesh wrote:
> > > Hi,
>
> > > I was working on implementation of h.264 algorithm on Blackfin a
> > > couple of years back. I had used the elegantly made DMA of the
> > > processor to move data in and out of the internal memory (specially
> > > during de-blocking) and i was competing with the cache in terms of
> > > cycles.So I had a chance to experiment with the cache.
>
> > BlackFin doesn't have any means for providing cache and DMA coherency.
> > Hence you generally can't DMA to the memory areas which are covered by
> > cache.
>
> > > I had observed very strange phenomenon occuring with the cache. I
> > > had written two different versions of codes for the same deblocking
> > > algorithm.(de-blocking is a part of h.264 algorithm). One is
> > > supposedly optimized but actually wasnt..
>
> > > The order in which i was accessing the pixel and other data is same
> > > in both the cases. This point is very important.
> > > I havent changed the order in which data is been accessed.
>
> > > Now I disable the cache, both consume the same number of cycles. Now
> > > Only if i enable the cache there is a huge difference (almost 40%, i
> > > dont remember exactly but it was considerably huge).
>
> > I can't understand what you did. BTW I compared the efficiency of the
> > data cache vs L1 data memory on my tasks. Cache appears to be somewhat
> > 10% slower, and this is what expected.
>
> > > Now tell me how does the cache bifferently with the two different
> > > versions of code for the same algorithm.
> > > Remember i havent changed the order in which i was accessing the data.
> > > the code, on both oaccasions was residing in the internal memory.
> > > there wanst much difference between the code, there was an 'if'
> > > statement which was moved to outside a 'for' loop.
> > > How can one explain the difference in cycles which occurs only when i
> > > enable the cache,when there is no change in the order in which the
> > > data being accessed.
> > > No I havent changed the cache mapping option, it was kept constant.
> > > I had obeserved the same phenomenon at another situation (in the same
> > > H.264) on blackfin (BF533).
>
> > You are a muddle headed. Learn hardware.
>
> > Vladimir Vassilevsky
> > DSP and Mixed Signal Design Consultant
> > http://www.abvolt.com
>
> > BlackFin doesn't have any means for providing cache and DMA coherency.
> > Hence you generally can't DMA to the memory areas which are covered by
> > cache.
>
> there are
>
> I have used an instruction to invalidate cache after dma
> transfer.

FYI

iflush [ p2 ] ; /* Invalidate cache line containing address that P2 points to */
Reply by rajesh May 6, 2008
On May 6, 2:13 am, Vladimir Vassilevsky <antispam_bo...@hotmail.com> wrote:
> rajesh wrote:
> > Hi,
>
> > I was working on implementation of h.264 algorithm on Blackfin a
> > couple of years back. I had used the elegantly made DMA of the
> > processor to move data in and out of the internal memory (specially
> > during de-blocking) and i was competing with the cache in terms of
> > cycles.So I had a chance to experiment with the cache.
>
> BlackFin doesn't have any means for providing cache and DMA coherency.
> Hence you generally can't DMA to the memory areas which are covered by
> cache.
>
> > I had observed very strange phenomenon occuring with the cache. I
> > had written two different versions of codes for the same deblocking
> > algorithm.(de-blocking is a part of h.264 algorithm). One is
> > supposedly optimized but actually wasnt..
>
> > The order in which i was accessing the pixel and other data is same
> > in both the cases. This point is very important.
> > I havent changed the order in which data is been accessed.
>
> > Now I disable the cache, both consume the same number of cycles. Now
> > Only if i enable the cache there is a huge difference (almost 40%, i
> > dont remember exactly but it was considerably huge).
>
> I can't understand what you did. BTW I compared the efficiency of the
> data cache vs L1 data memory on my tasks. Cache appears to be somewhat
> 10% slower, and this is what expected.
>
> > Now tell me how does the cache bifferently with the two different
> > versions of code for the same algorithm.
> > Remember i havent changed the order in which i was accessing the data.
> > the code, on both oaccasions was residing in the internal memory.
> > there wanst much difference between the code, there was an 'if'
> > statement which was moved to outside a 'for' loop.
> > How can one explain the difference in cycles which occurs only when i
> > enable the cache,when there is no change in the order in which the
> > data being accessed.
> > No I havent changed the cache mapping option, it was kept constant.
> > I had obeserved the same phenomenon at another situation (in the same
> > H.264) on blackfin (BF533).
>
> You are a muddle headed. Learn hardware.
>
> Vladimir Vassilevsky
> DSP and Mixed Signal Design Consultant
> http://www.abvolt.com

> BlackFin doesn't have any means for providing cache and DMA coherency.
> Hence you generally can't DMA to the memory areas which are covered by
> cache.
I used an instruction to invalidate the cache after the DMA transfer.