Hello,
In my project we use Gigabit Ethernet to transfer data from our image board
(based on the 6455) to an image server (a high-performance computer). Our
circuit board is placed inside a machine.
I use NDK 2.0 and started from the NDK's helloWorld.pjt to develop my own
project (TCP/IP). The program running on the server was developed by someone
else, and I don't know much about it.
(The data rate is about 500 Mbps. Previously I tested the 6455 Ethernet with
another computer and got 700+ Mbps.)
Normally the network transfer runs smoothly for several hours. But sometimes
the transfer suddenly stalls. When that happens, I see in my DSP project that
the send() function times out (send() returns -1, network error code 35). By
default, TCP sockets are in blocking mode, and I set the send timeout to 12 s.
I have run a network program on the board (outside the working machine) against
my own computer for several days, and the problem never occurred.
Is the DSP or the server causing the problem, or is the complex electrical
environment disturbing the circuit board and Ethernet peripherals?
Thanks in advance!
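To narrow down a send() failure like the one described above, it can help to log what kind of failure each call reports. The sketch below assumes the stack uses BSD-style error numbering, under which 35 is EWOULDBLOCK; that would mean the blocking send() gave up because the socket's send buffer never drained within the time limit. The constant names, classify_send() and send_log are hypothetical helpers for illustration, not part of the NDK, and the numbering should be checked against the stack's own error header.

```c
#include <assert.h>

/* BSD-style socket error numbers. ASSUMPTION: the stack in use follows
 * this numbering; confirm against its own error header. */
enum {
    NET_EPIPE       = 32,
    NET_EWOULDBLOCK = 35,
    NET_ECONNRESET  = 54,
    NET_ENOTCONN    = 57,
    NET_ETIMEDOUT   = 60
};

typedef enum { SEND_OK, SEND_STALLED, SEND_PEER_LOST, SEND_OTHER } send_status;

/* Classify one send() call from its return value and error code. */
send_status classify_send(int rc, int err)
{
    if (rc >= 0)
        return SEND_OK;
    switch (err) {
    case NET_EWOULDBLOCK:            /* send buffer stayed full until timeout */
        return SEND_STALLED;
    case NET_EPIPE:
    case NET_ECONNRESET:
    case NET_ENOTCONN:
    case NET_ETIMEDOUT:              /* connection is gone or was torn down */
        return SEND_PEER_LOST;
    default:
        return SEND_OTHER;
    }
}

/* Running counters, meant to be inspected in the debugger after a failure. */
typedef struct {
    unsigned long ok_calls;
    unsigned long stalls;
    unsigned long peer_lost;
    unsigned long other;
} send_log;

void send_log_update(send_log *log, int rc, int err)
{
    assert(log != 0);
    switch (classify_send(rc, err)) {
    case SEND_OK:        log->ok_calls++;  break;
    case SEND_STALLED:   log->stalls++;    break;
    case SEND_PEER_LOST: log->peer_lost++; break;
    default:             log->other++;     break;
    }
}
```

A stall with no reset would point toward the receiver not reading, or acknowledgements not arriving, rather than a closed connection; a packet capture on the server side at the time of the stall would help tell those two apart.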
_____________________________________
6455 Ethernet network error
Started by ●February 10, 2011
Reply by ●February 11, 2011
Francis,
I haven't seen the code, so can only guess.
Here are my guesses.
--a table overflows
--a malloc fails
--a memory leak
I would be trying to obtain the source for the tcp/ip stack and debug it as that
sounds like where the failure is occurring.
R. Williams
---------- Original Message -----------
From: f...@gmail.com
To: c...
Sent: Thu, 10 Feb 2011 22:22:09 -0500
Subject: [c6x] 6455 Ethernet network error
> Hello,
>
> In my project we use Gigabit Ethernet to transfer data from our image
> board with 6455 to image server (a high-performance computer). Our
> circuit board is placed inside a machine.
>
> I use NDK 2.0 and the helloWorld.pjt with NDK to develop my own
> project (TCP-IP protocol). The program running on the server is
> developed by another guy, and I don't know much about it.
>
> (The data rate is about 500 Mbps. Previously I tested 6455 Ethernet
> with another computer, and got 700+ Mbps speed.)
>
> Normally the network transaction works well and reposefully for
> several hours. But sometimes, the network speed suddenly shuts down.
> When that happens, from my dsp project, I see that the network send()
> function times out (send() returns -1, network error code 35). By
> default, TCP uses BLOCK mode, and I set SEND time limit 12s.
>
> I have ran network program on the board (not in the working machine)
> with my computer for several days, and the problem never occured.
>
> Does dsp or the server causes the problem, or does the complex
> electrical environment disturb the circuit board and Ethernet peripherals?
>
> Thanks in advance!
------- End of Original Message -------
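The "malloc fails" and "memory leak" guesses above can be checked cheaply in application code before digging into the stack source. This is a minimal sketch, assuming the application allocates through the C runtime's malloc/free; tracked_malloc(), tracked_free() and heap_stats are hypothetical names, and the wrapper does not see allocations made by the TCP/IP stack's own memory manager.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counters to watch during a long run. */
typedef struct {
    size_t live_blocks;    /* allocated and not yet freed */
    size_t peak_blocks;    /* highest value of live_blocks seen */
    size_t failed_allocs;  /* malloc() returned NULL */
} heap_stats;

static heap_stats g_heap;

/* Drop-in replacement for malloc() that records failures and live blocks.
 * Note: a request for 0 bytes may legally return NULL and is then counted
 * as a failure. */
void *tracked_malloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        g_heap.failed_allocs++;
        return NULL;
    }
    g_heap.live_blocks++;
    if (g_heap.live_blocks > g_heap.peak_blocks)
        g_heap.peak_blocks = g_heap.live_blocks;
    return p;
}

/* Drop-in replacement for free(); freeing NULL is a no-op, as with free(). */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    assert(g_heap.live_blocks > 0);   /* catches an unmatched free */
    g_heap.live_blocks--;
    free(p);
}
```

If live_blocks climbs steadily over hours of steady-state traffic, that suggests a leak; a nonzero failed_allocs at the moment of the stall suggests heap exhaustion.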
_____________________________________
Reply by ●February 11, 2011
Feng Li-
> In my project we use Gigabit Ethernet to transfer data
> from our image board with 6455 to image server (a
> high-performance computer). Our circuit board is
> placed inside a machine.
>
> I use NDK 2.0 and the helloWorld.pjt with NDK to
> develop my own project (TCP-IP protocol). The program
> running on the server is developed by another guy, and
> I don't know much about it.
>
> (The data rate is about 500 Mbps. Previously I tested
> 6455 Ethernet with another computer, and got 700+ Mbps
> speed.)
>
> Normally the network transaction works well and
> reposefully for several hours. But sometimes, the
> network speed suddenly shuts down. When that happens,
> from my dsp project, I see that the network send()
> function times out (send() returns -1, network error
> code 35). By default, TCP uses BLOCK mode, and I set
> SEND time limit 12s.
I suggest running a test where you control exactly the number of packets (and their lengths). If your test is repeatable
and the time until failure is approximately the same (3 hrs? 4 hrs?), then it's probably a memory leak, a bad
pointer, or some other software issue that takes time to develop.
A couple of years ago I was involved in debugging a problem that took 22 hrs until failure. A bad write pointer was
very slowly advancing through memory until finally it overwrote a critical location -- and then 'boom', total hardware
freeze. The lock-up was so bad there was no way to debug (look at memory values, program execution trace, etc).
Initially the time until failure varied, but once we created a precisely controlled test, with all I/O data generated
exactly the same each time, it became clear that time until failure was repeatable.
-Jeff
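One way to get the precisely controlled test described above is to generate every packet from a seeded pseudo-random generator, so that each run sends byte-for-byte identical data and the receiver can verify it. The sketch below is only an illustration under that assumption: test_gen, fill_packet() and check_packet() are hypothetical names, the 4-byte sequence header is an arbitrary choice, and the packet count and lengths are left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generator state: same seed on sender and receiver gives the same stream. */
typedef struct {
    uint32_t state;  /* xorshift32 state, must never be zero */
    uint32_t seq;    /* sequence number of the next packet */
} test_gen;

void test_gen_init(test_gen *g, uint32_t seed)
{
    g->state = seed ? seed : 1u;
    g->seq = 0;
}

/* xorshift32 step, reduced to one output byte. */
static uint8_t next_byte(test_gen *g)
{
    uint32_t x = g->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    g->state = x;
    return (uint8_t)(x >> 24);
}

/* Sender side: big-endian sequence number, then pseudo-random payload. */
void fill_packet(test_gen *g, uint8_t *buf, size_t len)
{
    size_t i;
    assert(len >= 4);
    buf[0] = (uint8_t)(g->seq >> 24);
    buf[1] = (uint8_t)(g->seq >> 16);
    buf[2] = (uint8_t)(g->seq >> 8);
    buf[3] = (uint8_t)(g->seq);
    for (i = 4; i < len; i++)
        buf[i] = next_byte(g);
    g->seq++;
}

/* Receiver side: returns 1 if the packet matches the expected stream.
 * The generator always advances, so later packets can still be checked. */
int check_packet(test_gen *g, const uint8_t *buf, size_t len)
{
    size_t i;
    int ok = 1;
    assert(len >= 4);
    if (buf[0] != (uint8_t)(g->seq >> 24) || buf[1] != (uint8_t)(g->seq >> 16) ||
        buf[2] != (uint8_t)(g->seq >> 8)  || buf[3] != (uint8_t)(g->seq))
        ok = 0;
    for (i = 4; i < len; i++)
        if (buf[i] != next_byte(g))
            ok = 0;
    g->seq++;
    return ok;
}
```

With a fixed seed, a fixed packet length and a fixed packet count, the sequence number of the last packet sent before the failure can be compared across runs to see whether the time until failure is repeatable.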
_____________________________________
Reply by ●February 11, 2011
Feng Li,
On 2/11/2011 1:11 AM, Jeff Brower wrote:
>
> Feng Li-
>
> > In my project we use Gigabit Ethernet to transfer data
> > from our image board with 6455 to image server (a
> > high-performance computer). Our circuit board is
> > placed inside a machine.
> >
> > I use NDK 2.0 and the helloWorld.pjt with NDK to
> > develop my own project (TCP-IP protocol). The program
> > running on the server is developed by another guy, and
> > I don't know much about it.
> >
> > (The data rate is about 500 Mbps. Previously I tested
> > 6455 Ethernet with another computer, and got 700+ Mbps
> > speed.)
> >
> > Normally the network transaction works well and
> > reposefully for several hours. But sometimes, the
> > network speed suddenly shuts down. When that happens,
> > from my dsp project, I see that the network send()
> > function times out (send() returns -1, network error
> > code 35). By default, TCP uses BLOCK mode, and I set
> > SEND time limit 12s.
>
You did not clearly mention your network configuration. As an extension
to what Jeff said, I prefer to troubleshoot [if possible] on a simple
network that contains only the client and server. It often makes the
failures more predictable/deterministic.
mikedunn
>
> Suggest to run a test where you control exactly the number of packets
> (and their lengths). If your test is repeatable
> and the time until failure is approximately the same (3 hrs? 4 hrs?),
> then it's probably a memory leak, a bad
> pointer, or some other software issue that takes time to develop.
>
> A couple of years ago I was involved in debugging a problem that took
> 22 hrs until failure. A bad write pointer was
> very slowly advancing through memory until finally it overwrote a
> critical location -- and then 'boom', total hardware
> freeze. The lock-up was so bad there was no way to debug (look at
> memory values, program execution trace, etc).
> Initially the time was intermittent, but, once we created a precisely
> controlled test, with all I/O data generated
> exactly the same each time, it became clear that time until failure
> was repeatable.
>
> -Jeff
_____________________________________
Reply by ●February 11, 2011
Richard and Jeff,
Thanks very much for your guesses and suggestions.
I will check my code first.
Francis
_____________________________________
Reply by ●February 12, 2011
If you don't know assembly language, C, or C++, but you want to write
directly to memory space, is that possible, and what is the best way to
do it? Please give ideas.
On 2/11/11, f...@gmail.com wrote:
> Richard and Jeff,
> Thanks very much for your guesses and suggestions.
> I will check my code firstly.
> Francis
>
_____________________________________
Reply by ●February 12, 2011
mikedunn,
In my earlier post I mentioned:
"I have run a network program on the board (outside the working machine) against my own computer for several days, and the problem never occurred."
The test network program is different from the one that runs on the machine. However, both are NDK-based network projects derived from the helloWorld.pjt that the NDK provides.
One possibility is that the board and my project are OK and the program running on the server has bugs.
Francis
_____________________________________
Reply by ●February 15, 2011