DSPRelated.com
Forums

SDRAM access too slow

Started by zhen...@hzrad.com September 13, 2006
Hi all,
I need help with the access speed of a 100 MHz, 32-bit SDRAM through the
EMIF.

I'm using the C6727 DSP, and the EMIF input clock (SYSCLK3) is set to
100 MHz.

To copy a block of 512 32-bit integers from internal RAM to SDRAM at
address 0x80000000, the C code looks like this:

#include <string.h>

int buffer[512];                    /* source block in internal RAM */
int *pSDRAM = (int *)0x80000000;    /* destination: start of SDRAM */
memcpy(pSDRAM, buffer, 512 * sizeof(int));

Ideally a single write costs one 10 ns clock cycle, so writing 512 words
should take no more than 5.12 us. Am I right?
In practice the total time is about 21 us, roughly 4 times what I
expected.
I also probed the EM_WE, EM_CAS and EM_RAS pins with an oscilloscope
during the 512 writes issued by memcpy().
A single write takes about 40 ns, not 10 ns. Why is that? Is it because
the CPU moves the data, so burst mode is not used?

If I instead read 512 words from SDRAM into internal RAM, using:

int buffer[512];
int *pSDRAM = (int *)0x80000000;
memcpy(buffer, pSDRAM, 512 * sizeof(int));

the total time is about 90 us, much longer than the 21 us the writes
take. Why is that? Is it a problem with the EMIF settings?

Please help me figure out why. Thanks.
Zhen Qiang Zhang-

> I also probed the EM_WE, EM_CAS and EM_RAS pins with an oscilloscope
> during the 512 writes issued by memcpy().
> A single write takes about 40 ns, not 10 ns. Why is that? Is it because
> the CPU moves the data, so burst mode is not used?

If you do not see back-to-back writes on the scope, then burst mode is not
happening and, yes, access time will be *very* slow.

I suggest experimenting with DMA functions to write data from SRAM to
SDRAM, and then seeing what happens on the digital scope.

-Jeff

> If I instead read 512 words from SDRAM into internal RAM, using:
>
> int buffer[512];
> int *pSDRAM = (int *)0x80000000;
> memcpy(buffer, pSDRAM, 512 * sizeof(int));
>
> the total time is about 90 us, much longer than the 21 us the writes
> take. Why is that? Is it a problem with the EMIF settings?
>
> Please help me figure out why. Thanks.
Does the 6727 have QDMA?
You should try using the DAT_copy() routine in the chip support library.

It is my understanding that CPU access of SDRAM is not well pipelined
compared to DMA-type accesses. I don't have the exact figures in front of me.

- Andrew E.

Thanks for your help, Jeff and Andrew.

QDMA is not supported by the 6727's dMAX module, so I can't use DAT_copy().
Instead I could use the FIFO to do that kind of data movement. The dMAX
does perform better than the CPU.

I still don't understand why CPU access to the data is so slow, though.
Maybe I should describe my test results in more detail.

Here are the waveforms I captured with the oscilloscope.

As you can see, the interval between two writes is 4 clock cycles, that
is 40 ns, longer than I expected.
Timing waveform of SDRAM writes:
CLK  ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄

RAS  ̄ ̄|____| ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄

CAS  ̄ ̄ ̄ ̄ ̄ ̄ ̄|____| ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄|____| ̄ ̄ ̄ ̄ ̄ ̄ ̄|____| ̄ ̄ ̄ ̄ ̄

WE  ̄ ̄ ̄ ̄ ̄ ̄ ̄|_________| ̄ ̄ ̄ ̄ ̄|________| ̄ ̄ ̄ ̄ ̄|________| ̄ ̄ ̄
|<------40ns------->|
The waveform of SDRAM reads is even stranger.
As you can see, the interval between two reads is 110 ns. Too long!

CLK _| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄|_| ̄

RAS  ̄|____| ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄

CAS  ̄ ̄ ̄ ̄ ̄ ̄|____| ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄|____| ̄

WE  ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄|____| ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄|___
|<------------------110ns---------------------->|
Please help me figure out why this is happening.

Have you compared your timings to those in the doc below?

Do they match?

- Andrew

Zhen Qiang Zhang-

> QDMA is not supported by the 6727's dMAX module, so I can't use DAT_copy().
> Instead I could use the FIFO to do that kind of data movement. The dMAX
> does perform better than the CPU.

I would suggest confirming with TI that the C6727 allows maximum burst-rate
timing over the SDRAM interface. If TI doesn't provide an app note on how to
achieve it (dMAX? FIFO?), then you will have to go to tech support and get an
engineer to give you some specific answers.

-Jeff
