C6713B EDMA/IRQ cache problem

Started by Bernhard 'Gustl' Bauer April 22, 2008
Hi,

I use EDMA to transfer data from an internal buffer to external RAM and
vice versa. Pointers in an IRQ routine access the internal buffer too.
When the EDMA pointer and the IRQ pointer are 16 samples (512 bits)
apart, everything works fine. Unfortunately, with 16 samples the buffer
is too small for all pointers. With 8 samples (256 bits) apart I get
wrong data.

The only reason for those errors I can think of is the cache. But the
L1D line size is 256 bits, so 8 samples apart should be no problem.

Any idea?

TIA

Gustl
Hello,

Sorry, it is not at all clear where the data comes from. Does it come from the interrupt routine, i.e. is it code that writes it into internal memory?
Have you thought about L2 cache? It can be 16/32/64K in size.
And what about double buffering?
Why do you say 16 is too small and then try 8?
Hi,

thanks for the quick answer. I cannot use L2 cache because I need all
the int. memory for code/data. Here again in more detail:

I have several long delay lines in ext. RAM. I have a lot of read/write
pointers to these delays. IRQ should read the data from the delays,
process it and write it back.

The transfer from/to the delays must be done with EDMA or it would cost
too much IRQ time. Therefore I have a buffer with several slots that
both the IRQ and the EDMA access.

Here is the 1st access: the IRQ gets 16 samples from a McASP input and
writes them to the 1st slot of the buffer. The EDMA reads the 16 samples
from this slot and writes them to the ext. delay. Then there is a
transfer from external to internal memory, and so on.

There are about 200 read and 200 write operations to perform. IRQ and
EDMA must run in parallel to make it in time. At the beginning of each
IRQ I trigger an EDMA.

For this to work each slot must contain at least 16 samples for the IRQ
pointer and 16 samples for the EDMA pointer. I tried this first and
noticed that whenever EDMA and IRQ access adjacent samples at the same
time there is a problem. I blame this on the L1D cache.

So I increased the slot to 3*16 samples and separated the EDMA and IRQ
pointers by 2*16 samples. So even if the 1st pointer is at the last
sample and the second pointer is at the 1st sample, there are 16 samples
between them. All worked fine.

But unfortunately I cannot afford the 3*16 samples for each slot. I'm
short of internal memory (surprise, surprise :-). An L1D line is
32 bytes = 8 samples, so a separation of 8 samples should be enough.
But it isn't!

There is another restriction: all pointers must start at an address
that is 0 modulo 8.
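
The arithmetic above can be checked with a small sketch. It assumes 32-bit samples, a 32-byte L1D line, a buffer whose base address is line-aligned, and that each pointer addresses a block of 16 samples; the helper names are hypothetical and not taken from the original code. It only tests whether two blocks overlap or touch a common cache line, it does not model the EDMA or the cache itself.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_BYTES   4u   /* 32-bit samples */
#define L1D_LINE_BYTES 32u  /* L1D line size quoted in the thread */
#define BLOCK_SAMPLES  16u  /* assumed samples per IRQ / EDMA block */

/* L1D line index of a sample index, relative to a line-aligned buffer base. */
static uint32_t line_of_sample(uint32_t sample)
{
    assert(sample <= UINT32_MAX / SAMPLE_BYTES);
    return (sample * SAMPLE_BYTES) / L1D_LINE_BYTES;
}

/* 1 if the 16-sample blocks starting at sample indices a and b overlap. */
static int blocks_overlap(uint32_t a, uint32_t b)
{
    assert(a <= UINT32_MAX - BLOCK_SAMPLES);
    assert(b <= UINT32_MAX - BLOCK_SAMPLES);
    return a < b + BLOCK_SAMPLES && b < a + BLOCK_SAMPLES;
}

/* 1 if the two blocks touch at least one common L1D line. */
static int blocks_share_line(uint32_t a, uint32_t b)
{
    uint32_t a_first = line_of_sample(a);
    uint32_t a_last  = line_of_sample(a + BLOCK_SAMPLES - 1u);
    uint32_t b_first = line_of_sample(b);
    uint32_t b_last  = line_of_sample(b + BLOCK_SAMPLES - 1u);
    return a_first <= b_last && b_first <= a_last;
}
```

Under these assumptions, two block starts that are only 8 samples apart describe overlapping 16-sample blocks, whatever the cache does. Block starts that are 16 samples apart share a cache line only when they are not aligned to 8 samples.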

I hope this is clearer now.

Gustl

I cannot access the ext. memory direct because
It looks like a synchronisation problem.
You need to use double buffers (ping-pong buffers), otherwise you get problems when the IRQ and the DMA try to access the same space.
The IRQ needs to write into the ping buffer while the DMA reads the pong one. When that is finished they swap: the IRQ writes to pong and the DMA reads from ping.
You need to have both processes synchronised. The master clock must be the McASP, and that kicks off the DMA transfer every time.
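
A minimal sketch of the ping-pong bookkeeping described above, assuming 16-sample blocks of 32-bit samples. The type and function names are hypothetical, and the actual EDMA setup and the McASP-driven trigger are left out. The only point is that the CPU side and the DMA side never hold the same half, and that they exchange halves once per block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SAMPLES 16u  /* assumed block size */

typedef struct {
    int32_t  buf[2][BLOCK_SAMPLES]; /* buf[0] = ping, buf[1] = pong */
    unsigned cpu_half;              /* half currently owned by the IRQ */
} pingpong_t;

/* Half the IRQ routine may read or write right now. */
static int32_t *pp_cpu_block(pingpong_t *pp)
{
    assert(pp != NULL);
    return pp->buf[pp->cpu_half];
}

/* Half the DMA may transfer right now (always the other one). */
static int32_t *pp_dma_block(pingpong_t *pp)
{
    assert(pp != NULL);
    return pp->buf[pp->cpu_half ^ 1u];
}

/* Call once per block, after both the IRQ and the DMA have finished. */
static void pp_swap(pingpong_t *pp)
{
    assert(pp != NULL);
    pp->cpu_half ^= 1u;
}
```

The swap has to happen at a point where the DMA transfer for the previous block is known to be complete, for example in the interrupt that starts the next block.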

Good news is that you don't need to reserve 3 buffers.

The McASP can also be linked to the DMA directly, so you do not need the IRQ any more (or maybe just at the end of one ping buffer). Hope it helps.
Bernhard,

I can't remember, is EDMA supposed to snoop L1D ?

The IRQ is using the CPU to read from the McASP, correct ?

So the data path is McASP -> L1D -> L2SRAM, correct ?

I'm assuming a "sample" size is 32 bits ?

As I write this, I became more interested in the problem in case I run into it.

I found a paragraph in the "L2 EDMA Service" section of the TMS320C6000 Peripherals Reference Guide that basically states that a read request to L2 will snoop L1D and write a line (32 bytes) to L2 if required. So I think what you have done should work.

The only thing I can think of to try is to manually flush L1D after reading values from the McASP.
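
A cache writeback always acts on whole lines, so for a manual flush it helps to know which lines a buffer actually touches. The sketch below only does that rounding, assuming 32-byte L1D lines. `covering_lines` and `line_range_t` are hypothetical names, and the actual writeback call is deliberately left out because it depends on the chip support library in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L1D_LINE_BYTES 32u  /* L1D line size quoted in the thread */

typedef struct {
    uintptr_t start;  /* address of the first line touched */
    size_t    bytes;  /* length, a whole number of lines */
} line_range_t;

/* Smallest line-aligned range that covers [addr, addr + len). */
static line_range_t covering_lines(uintptr_t addr, size_t len)
{
    const uintptr_t mask = ~(uintptr_t)(L1D_LINE_BYTES - 1u);
    line_range_t r;
    uintptr_t end;

    assert(len > 0);
    end     = (addr + len + L1D_LINE_BYTES - 1u) & mask; /* round up */
    r.start = addr & mask;                               /* round down */
    r.bytes = (size_t)(end - r.start);
    return r;
}
```

If the resulting range is larger than the buffer itself, the flush also affects neighbouring data in the same lines, which matters when another pointer is working right next to the buffer.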

- Andrew E
Hi,

I tracked the problem down a bit.

When the EDMA writes into L2 and the IRQ reads from L2, I can bring the
pointers as close as 16 samples (= the number of processed samples).

When the IRQ writes into L2 and the EDMA reads from L2, I have to keep
them 32 samples apart. As the EDMA snoops L1D this should not be
necessary. I don't know whether it could be less than 32, but it must be
more than 16.

This solves my immediate need, so I won't spend more time on it.

Gustl

Andrew Elder wrote:
> Bernhard,
>
> I can't remember, is EDMA supposed to snoop L1D ?

Yes

> The IRQ is using the CPU to read from the McASP, correct ?

No, I'm using a different EDMA

> So the data path is McASP -> L1D -> L2SRAM, correct ?
>
> I'm assuming a "sample" size is 32 bits ?

yes
