DSPRelated.com

HPI lockup problem

Started by Adolf Klemenz June 15, 2007
Dear C6x users,

we are currently testing a prototype DSP system using two C6713B DSPs,
coupled via HPI and EMIF. DSP#1 does the data acquisition and
pre-processing (filters and decimation). DSP#2 reads this pre-processed
data from DSP#1 L2RAM via the HPI.

This setup was stable until we started using the highly optimized
filters from the 67x dsplib on DSP#1. From then on, HPI accesses
sporadically failed and the HPI locked up on read accesses: HRDY stays high
and never returns to low.

We have carefully checked the HPI timing and it seems ok. The only errata
note I could find for the HPI indicates that access to an address range
reserved for HPI FIFOs may freeze the HPI. We also checked this using the
advanced event triggering and couldn't detect any access to this memory area.

To me it looks like some hidden/unwanted HPI timeout. The dsplib filter
algorithms make extensive use of all available DSP resources and data paths.
Could it be that the HPI locks up if it can't get access to a data path for
a prolonged time? Has anyone ever encountered a similar HPI problem?

Many thanks,
Adolf Klemenz, D.SignT
Adolf-

On the EMIF side, what controls HCS? What is connected to HRDY? How long is HCS
asserted before HRDY is checked and an HPI access is attempted?

-Jeff
Are you using any DMA resource? Isn't this blocked as well?

Jeff,

At 17:06 15.06.2007 -0500, Jeff Brower wrote:
> On the EMIF side, what controls HCS?
> What is connected to HRDY?
> How long is HCS asserted before HRDY is checked and an HPI access is
> attempted?

this is the interconnection:

EA2 -> HHWIL
EA3 -> HCNTL0
EA4 -> HCNTL1
EA5 -> HR/W
CE3 -> HCS
ARE -> HDS1
AWE -> HDS2
ARDY <- inverter <- HRDY
HAS pulled high
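
With this wiring the host selects the HPI register, the halfword, and the transfer direction purely through EMIF address bits. A minimal sketch of that decode in C follows. The HCNTL[1:0] encoding (00 = HPIC, 01 = HPIA, 10 = HPID with autoincrement, 11 = HPID without) and the HR/W polarity (high = read) are stated here from memory of the C6000 HPI documentation and should be checked against it. The enum, macro, and function names are hypothetical. How EA pin numbers map onto host byte addresses depends on the memory width configured for CE3, so the sketch works in EA pin bits only.

```c
#include <stdint.h>

/* HPI register selection via HCNTL[1:0] (HCNTL1 is the upper bit).
   Encoding is an assumption to be verified against the HPI guide. */
enum hpi_reg {
    HPI_HPIC         = 0, /* control register */
    HPI_HPIA         = 1, /* address register */
    HPI_HPID_AUTOINC = 2, /* data register, address autoincrement */
    HPI_HPID         = 3  /* data register, no autoincrement */
};

/* EA pin positions taken from the interconnection list in the post. */
#define EA_HHWIL  (1u << 2)
#define EA_HCNTL0 (1u << 3)
#define EA_HCNTL1 (1u << 4)
#define EA_HRW    (1u << 5)

/* Returns the EA pin pattern the host must drive for one HPI halfword access. */
static uint32_t hpi_ea_bits(enum hpi_reg reg, int second_halfword, int is_read)
{
    uint32_t ea = 0;
    if (second_halfword) ea |= EA_HHWIL;  /* HHWIL high = second halfword */
    if (reg & 1u)        ea |= EA_HCNTL0;
    if (reg & 2u)        ea |= EA_HCNTL1;
    if (is_read)         ea |= EA_HRW;    /* HR/W high = read (assumed) */
    return ea;
}
```

Because HR/W is an address bit, reads and writes of the same HPI register land at different host addresses.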

HPI access timing is determined by the CECTL3 register:
10ns setup from host control signals to HDS
40ns HDS strobe
10ns hold from HDS to host control signals

Both 6713Bs operate at 200MHz with a 100MHz EMIF clock. This timing should
fulfill the 6713B HPI timing requirements (4 CPU cycles HDS strobe, 4 CPU
cycles HDS inactive between consecutive accesses, 5ns setup, 4ns hold). The
strobe period is extended to 4 EMIF clock cycles to allow the host DSP to
recognize ARDY.
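
The per-access part of this check can be written down as a small C helper. This is only a sketch: the struct and function names are hypothetical, and the limits are the ones quoted in the post (strobe of at least 4 CPU cycles, 5ns setup, 4ns hold), not values re-derived from the data sheet.

```c
#include <stdbool.h>

/* Programmed EMIF asynchronous timing, in EMIF clock cycles. */
struct emif_async {
    int setup;
    int strobe;
    int hold;
};

/* True if setup, strobe, and hold meet the HPI limits quoted in the post. */
static bool hpi_access_ok(struct emif_async t, double emif_mhz, double cpu_mhz)
{
    double t_emif = 1000.0 / emif_mhz; /* EMIF clock period in ns */
    double p      = 1000.0 / cpu_mhz;  /* CPU clock period in ns  */

    return t.setup  * t_emif >= 5.0      /* setup  >= 5 ns          */
        && t.strobe * t_emif >= 4.0 * p  /* strobe >= 4 CPU cycles  */
        && t.hold   * t_emif >= 4.0;     /* hold   >= 4 ns          */
}
```

For the 1-4-1 setting at 100MHz EMIF and 200MHz CPU clock this gives 10ns / 40ns / 10ns against limits of 5ns / 20ns / 4ns.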

A typical HPI read sequence (with address autoincrement) has HRDY high on
the first read operation. Consecutive reads have HRDY low, or asserted for
only a very short time. If the HPI locks up, HRDY goes high during a read
sequence and stays high forever.

Many thanks,
Adolf Klemenz, D.SignT

-------------------------------
D.SignT - Digital Signalprocessing Technology GmbH & Co. KG

Adolf Klemenz

Marktstr. 10
D-47647 Kerken

phone (+49)(0)2833/570-976
fax (+49)(0)2833/3328
email mailto:a...@dsignt.de
web http://www.dsignt.de
-------------------------------
Adolf,

I suggest that you might approach this as a system
problem as opposed to an HPI problem. Your previous
post seemed to indicate that "HPI worked until you
changed your code". My experience has been that
either [1] HPI was not really working because
early development testing was less intense or [2]
there was indeed an HPI problem.

1. If you load a simple 'do nothing loop' with a few
NOPs, does the HPI function correctly??
2. Have you verified that the target DSP [with HPI
interface] is not asserting ARDY false?
3. When this condition occurs during the running of
your app, do you have CCS access to the target?? If
yes, is the PC always in the same loop??

mikedunn
Adolf-

A few comments / questions:

1) Why is HR/W controlled by an address signal? This shouldn't cause a timing problem, but it does limit your debug
capabilities -- you can't read and write the same locations from the EMIF perspective -- so it's harder to tell if the
HPI-EMIF interface is truly working.

2) How long after /CE3 asserts before ARDY stabilizes? Some problems I've had with HPI interfaces before had to do
with a need to assert /HCS longer and stabilize HRDY before checking it. I was never comfortable with TI data sheet
specs on this delay; I always felt there was some effect relative to CPU clock. Can you try asserting /HCS earlier or
just leaving it on always?

3) You say the problem shows up when you run specific, numerically intense test code. Does that code use another
external mem region? Have you considered the possibility that EMIF "getting off one chip select" and "getting on
another" has something to do with it? For example maybe /ARE or /AWE don't always go high when mem regions switch on
back-to-back EMIF accesses? That would sure mess things up for an HPI that samples read/write on the falling edge of
combined HDS. Can you run the test code without using any other external mem region (just /CE3 HPI area) and see if
the problem goes away?

-Jeff
Hi Jeff,

At 11:43 18.06.2007 -0500, Jeff Brower wrote:
>1) Why is HR/W controlled by an address signal? This shouldn't cause a
>timing problem, but it does limit your debug capabilities -- you can't
>read and write the same locations from the EMIF perspective -- so it's
>harder to tell if the HPI-EMIF interface is truly working.

We have chosen this configuration to satisfy the HPI timing constraints.
HR/W must have a 5ns setup to the HDS strobes and a 4ns hold. The HDS1 and
HDS2 strobe signals are the EMIF signals ARE and AWE in our design. Using
an address line as HR/W allows us to program this setup and hold time in
the EMIF.

>2) How long after /CE3 asserts before ARDY stabilizes? Some problems I've
>had with HPI interfaces before had to do with a need to assert /HCS longer
>and stabilize HRDY before checking it. I was never comfortable with TI
>data sheet specs on this delay; I always felt there was some effect
>relative to CPU clock. Can you try asserting /HCS earlier or just leaving
>it on always?

We can program the minimum CE3 / HCS assertion time in the EMIF of the host
DSP. We use 4 cycles (40ns). HRDY was always stable at least 20ns after HCS
low. Longer HCS times, e.g. 60ns didn't make any difference.

>3) You say the problem shows up when you run specific, numerically intense
>test code. Does that code use another external mem region? Have you
>considered the possibility that EMIF "getting off one chip select" and
>"getting on another" has something to do with it? For example maybe /ARE
>or /AWE don't always go high when mem regions switch on back-to-back EMIF
>accesses? That would sure mess things up for an HPI that samples
>read/write on the falling edge of combined HDS. Can you run the test code
>without using any other external mem region (just /CE3 HPI area) and see
>if the problem goes away?

We set up a scope to trigger on this condition (ARE or AWE still low when
HCS goes low), but we never saw such a state. The lockup always occurs
during HPI read sequences on the third or fourth read.

Meanwhile we have found a fix for the problem, but I am not really satisfied
with it:
We are using two 6713B boards for the test, each running on its own clock
oscillator. Now we run both boards synchronously from a single clock
source. Additionally we have increased the EMIF hold time for HPI access to
2 cycles. This results in 30ns idle time between consecutive HPI reads
(originally we used 20ns). Neither of these modifications alone solved
the problem; only with both applied is the system stable.
Both processors run at 200MHz. The HPI timing constraints call for a 4P
minimum strobe time, and a 4P minimum idle time - 20ns at 200MHz core
clock. A 1-4-1 EMIF timing (at 100 MHz) hence should always satisfy the HPI
timing constraints, but it seems in reality the HPI is slower ...
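
The idle-time arithmetic behind these figures can be sketched in C as follows. It assumes, as the numbers in the post imply, that the gap between consecutive HDS strobes equals the programmed hold period plus the setup period of the next access. The function name is hypothetical.

```c
/* Gap between consecutive HDS strobes minus the 4P minimum idle time, in ns.
   Assumes idle time = (hold + setup) EMIF cycles, per the figures in the post. */
static double hpi_idle_margin_ns(int setup_cycles, int hold_cycles,
                                 double emif_mhz, double cpu_mhz)
{
    double idle_ns     = (setup_cycles + hold_cycles) * (1000.0 / emif_mhz);
    double min_idle_ns = 4.0 * (1000.0 / cpu_mhz); /* 4P */
    return idle_ns - min_idle_ns;
}
```

Under that assumption the original 1-4-1 setting meets the 4P idle requirement with 0ns of nominal margin, and two hold cycles give 10ns. Zero nominal margin would be consistent with the observed sensitivity, though it does not by itself explain the lockup.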

Many thanks,
Adolf Klemenz, D.SignT

Mike,

At 05:55 18.06.2007 -0700, you wrote:
>1. If you load a simple 'do nothing loop' with a few
>NOPs, does the HPI function correctly??

Yes, bootload via HPI and communications during less demanding code always
function correctly.

>2. Have you verified that the target DSP [with HPI
>interface] is not asserting ARDY false.

Yes, ARDY is driven by HRDY. It is always according to the specs, except
for the failure case, when it gets stuck.

>3. When this condition occurs during the running of
>your app, do you have CCS access to the target?? If
>yes, is the PC always in the same loop??

No, we are not able to connect to the DSP in case of a failure (timeout error).

We have found a workaround meanwhile (please see my reply to Jeff) by
increasing the idle time between consecutive HPI reads above the required
minimum 4P constraint.

Many thanks!
Adolf Klemenz, D.SignT

Adolf-

> We can program the minimum CE3 / HCS assertion time in the EMIF of the host
> DSP. We use 4 cycles (40ns). HRDY was always stable at least 20ns after HCS
> low. Longer HCS times, e.g. 60ns didn't make any difference.

I don't understand this... are you saying you set the EMIF access to wait 40 nsec
then check ARDY?

So you're saying the entire HPI read/write cycle is at least 40 + 30 nsec (idle time
you mention below), or 70 nsec, not including strobe time?

> Meanwhile we have found a fix for the problem, but I am not really satisfied
> with it:
> We are using two 6713B boards for the test, each running on its own clock
> oscillator. Now we run both boards synchronously from a single clock
> source. Additionally we have increased the EMIF hold time for HPI access to
> 2 cycles. This results in 30ns idle time between consecutive HPI reads
> (originally we used 20ns). Neither of these modifications alone solved
> the problem; only with both applied is the system stable.
> Both processors run at 200MHz. The HPI timing constraints call for a 4P
> minimum strobe time, and a 4P minimum idle time - 20ns at 200MHz core
> clock. A 1-4-1 EMIF timing (at 100 MHz) hence should always satisfy the HPI
> timing constraints, but it seems in reality the HPI is slower ...

You shouldn't have to sync the HPI with anything -- otherwise there would be a lot of
situations where HPI couldn't be used. My guess is that another combination or
sequence of events, yet to be found, will make it unstable again.

-Jeff
Jeff,

At 12:14 19.06.2007 -0500, Jeff Brower wrote:
>I don't understand this... are you saying you set the EMIF access to wait
>40 nsec then check ARDY?

Yes, that's the way the EMIF works in asynchronous mode: ARDY is sampled on
the last programmed strobe cycle. If ARDY is found low, the bus cycle is
extended until ARDY is high.

>So you're saying the entire HPI read/write cycle is at least 40 + 30 nsec
>(idle time you mention below), or 70 nsec, not including strobe time?

10 nsec setup, at least 40ns strobe (depending on ARDY/HRDY), plus 20nsec
hold.
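
The ARDY behaviour described above, together with these setup/strobe/hold figures, can be modelled roughly as follows. This is a simplified sketch: it ignores any synchronizer latency on ARDY inside the EMIF, and the function name and the ardy_high_at parameter are hypothetical.

```c
/* EMIF cycles for one asynchronous access.
   ardy_high_at: strobe cycle (counted from 1) in which ARDY is first
   sampled high. The strobe lasts at least the programmed length and is
   extended while ARDY is still low. */
static int async_access_cycles(int setup, int strobe, int hold, int ardy_high_at)
{
    int actual_strobe = (ardy_high_at > strobe) ? ardy_high_at : strobe;
    return setup + actual_strobe + hold;
}
```

With 1 setup, 4 strobe, and 2 hold cycles at a 10ns EMIF clock, an access where ARDY is already high takes 7 cycles (70ns); each additional cycle of ARDY low adds 10ns.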

>You shouldn't have to sync the HPI with anything -- otherwise there would
>be a lot of situations where HPI couldn't be used. My guess is that another
>combination or sequence of events, yet to be found, will make it unstable
>again.

That's exactly what I am afraid of. We have used the HPI in several systems
before without problems, but these all used slower access times, and the
target DSP had never been performing so many numerically intense operations.

Many thanks,
Adolf Klemenz, D.SignT