comp.dsp | OT: Ariane 5 Launcher Failure | page 2

Reply by Sebastian Doht ● September 1, 2015

On 01.09.2015 at 21:01, gyansorova@gmail.com wrote:
> On Tuesday, September 1, 2015 at 11:36:05 PM UTC+12, Randy Yates wrote:
>> Folks,
>>
>> I've been in a LinkedIn discussion in which the following analysis of an
>> Ariane 5 failure is documented:
>>
>>    http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>>
>> I'm just reposting it here since I find it fascinating, and I bet there
>> are a few folks here (Tim, you come to mind especially) who might have a
>> few things to say about it.
>>
>> I find it almost laughable (if it weren't for the expense and danger
>> such a failure had or potentially had) that the root cause was a
>> conversion from float to integer! It supports a "feeling" I've had for a
>> long time that coding in float is dangerous for just such reasons.
>> --
>> Randy Yates
>> Digital Signal Labs
>> http://www.digitalsignallabs.com
>
> I thought they used ADA for such things
>

As far as I recall, they used Ada but turned all of Ada's range checks
off, which makes the use of Ada as an argument for increased safety quite
meaningless.

"A fool with a tool is still a fool"

Reply by Randy Yates ● September 1, 2015

rickman <gnuarm@gmail.com> writes:

> On 9/1/2015 7:36 AM, Randy Yates wrote:
>> Folks,
>>
>> I've been in a LinkedIn discussion in which the following analysis of an
>> Ariane 5 failure is documented:
>>
>>    http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>>
>> I'm just reposting it here since I find it fascinating, and I bet there
>> are a few folks here (Tim, you come to mind especially) who might have a
>> few things to say about it.
>>
>> I find it almost laughable (if it weren't for the expense and danger
>> such a failure had or potentially had) that the root cause was a
>> conversion from float to integer! It supports a "feeling" I've had for a
>> long time that coding in float is dangerous for just such reasons.
>
> I find this conclusion to show an immense lack of understanding of the
> cause of the failure.  Did we read the same report?
>
> The use of integers for the variable that was the float would not have
> mitigated the accident.  If you had used an N bit integer the same
> conversion to a 16 bit integer would have resulted in the same
> overflow and conversion error.
>
> The two primary causes of the accident were allowing the software for
> alignment of the strap-down inertial platform to continue to run after
> liftoff when it received invalid inputs which resulted in the out of
> range problem and the decision to shut down the processor on this
> error based on the assumption that the software was not faulty but
> rather the hardware was, which was an erroneous assumption in this
> case.

I think there are several places one could lay the "cause" (perhaps
"root cause" was too extreme a term). I certainly won't argue that
one would be the decision to leave the calibration running after it was
no longer required. That just doesn't make sense.

Another could be the generic exception-handling specification that all
exceptions were catastrophic and should result in the processor being
shut down.

Yet another could be the designers' decision to allow this to generate
an exception at all and not test for it and take other non-exceptional
action. That is essentially my argument.

Let me ask a question: what if the alignment algorithm designer had used
ONLY a 16-bit integer for the horizontal bias? Then, AT DESIGN TIME, the
algorithm designer would have been forced to consider out-of-range input
and choose the action more intelligently. For example, instead of
shutting the software down, they could have saturated the value. Granted,
this could have been done with the double value as well, but the point
is that the designer is FORCED to consider the case when thinking with
an integer frame of mind.

If a saturation had been used, then we wouldn't be talking about
exceptions in this report as it would have never happened. 
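
To make "saturated the value" concrete, here is a minimal sketch in C++
(not the Ada of the flight code, and not anything taken from the report).
The helper name saturate_to_int16 is made up for illustration, and mapping
NaN to 0 is purely an assumption; a real system would need its own policy.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: convert a double to a 16-bit signed integer,
// clamping to the representable range instead of raising an error.
inline std::int16_t saturate_to_int16(double x) {
    constexpr std::int16_t lo = std::numeric_limits<std::int16_t>::min(); // -32768
    constexpr std::int16_t hi = std::numeric_limits<std::int16_t>::max(); //  32767

    // Assumption: NaN maps to 0. This is a policy choice, not a rule.
    if (std::isnan(x)) return 0;

    // Compare in the double domain first, so the cast below is only
    // ever applied to a value that fits in 16 bits.
    if (x <= static_cast<double>(lo)) return lo;
    if (x >= static_cast<double>(hi)) return hi;

    return static_cast<std::int16_t>(x); // in range: truncates toward zero
}
```

Whether a clamped value would have been acceptable to the downstream
consumer is a separate question; the sketch only shows that the
out-of-range case can be handled without an exception.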

Ta-may-toe, ta-ma-toe..
-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com

Reply by Randy Yates ● September 1, 2015

rickman <gnuarm@gmail.com> writes:

> On 9/1/2015 10:59 AM, Tim Wescott wrote:
>>
>> Now, if I'm going to bring MY prejudices to bear on this, it was because
>> the systems engineering team was of the opinion that embedded software is
>> Black Magic, or they considered that it doesn't really have value because
>> it doesn't show up as a line item on the bill of materials.
>
> Prejudice is exactly the right word.

Call it what you want - if a different approach had been taken, as I
outlined in a post just a few minutes ago, the Europeans would be
millions of dollars and a missile launch up.
-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com

Reply by Randy Yates ● September 1, 2015

spope33@speedymail.org (Steve Pope) writes:

> Randy Yates  <yates@digitalsignallabs.com> wrote:
>
>>Let me ask a question: what if the alignment algorithm designer had used
>>ONLY a 16-bit integer for the horizontal bias? Then, AT DESIGN TIME, the
>>algorithm designer would have been forced to consider out-of-range input
>>and choose the action more intelligently. For example, instead of
>>shutting the software down, they could have saturated the value. Granted,
>>this could have been done with the double value as well, but the point
>>is that the designer is FORCED to consider the case when thinking with
>>an integer frame of mind.
>
> This is why you want to use fixed-point types.
>
> (Which is not the same as storing a value in an integer type. Very
> often fixed-point types are stored in doubles.)

Steve,

Does ADA have fixed-point types? If so, I agree violently. (I'm not up
on ADA...)
-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com

Reply by Steve Pope ● September 1, 2015

Randy Yates  <yates@digitalsignallabs.com> wrote:

>spope33@speedymail.org (Steve Pope) writes:

>> Randy Yates  <yates@digitalsignallabs.com> wrote:

>>>Let me ask a question: what if the alignment algorithm designer had used
>>>ONLY a 16-bit integer for the horizontal bias? Then, AT DESIGN TIME, the
>>>algorithm designer would have been forced to consider out-of-range input
>>>and choose the action more intelligently. For example, instead of
>>>shutting the software down, they could have saturated the value. Granted,
>>>this could have been done with the double value as well, but the point
>>>is that the designer is FORCED to consider the case when thinking with
>>>an integer frame of mind.

>> This is why you want to use fixed-point types.

>> (Which is not the same as storing a value in an integer type. Very
>> often fixed-point types are stored in doubles.)

>Does ADA have fixed-point types? If so, I agree violently. (I'm not up
>on ADA...)

I don't recall ADA having built-in fixed-point types from when I
studied ADA in fall 1980.  Almost certainly someone added them
to the language at some point.  Whether they occurred in a particular
ADA implementation ... for a particular rocket's computer ...
who knows, but "casting a float to an int" sounds like, if so,
they were not being used.

Any reasonably extensible language (ADA, C++) can be extended
with fixed-point types.  Adding the most useful SystemC fixed-point
types (and saturation / rounding modes) to C++ takes about
50 lines of header file code at most.  It may not be efficient
enough for embedded work, however.
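
As a rough illustration of the kind of header meant here, the sketch
below is a hypothetical Fixed16 template. It is not the SystemC sc_fixed
API; the class name, the 16-bit storage, the single rounding mode
(round half away from zero) and the NaN-to-zero choice are all
assumptions made for the example.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical Q-format type: 16-bit signed storage, FRAC fractional bits.
// Conversions and addition saturate instead of wrapping or trapping.
template <int FRAC>
class Fixed16 {
    static_assert(FRAC >= 0 && FRAC <= 15, "FRAC must fit in 16-bit storage");

public:
    // Scale by 2^FRAC, round half away from zero, then saturate.
    static Fixed16 from_double(double x) {
        const double scaled = std::round(std::ldexp(x, FRAC));
        if (std::isnan(scaled)) return Fixed16(0);  // assumption: NaN -> 0
        if (scaled >= 32767.0)  return Fixed16(kMax);
        if (scaled <= -32768.0) return Fixed16(kMin);
        return Fixed16(static_cast<std::int16_t>(scaled));
    }

    double to_double() const { return std::ldexp(static_cast<double>(raw_), -FRAC); }
    std::int16_t raw() const { return raw_; }

    // Saturating addition: widen to 32 bits, add, clamp back to 16 bits.
    friend Fixed16 operator+(Fixed16 a, Fixed16 b) {
        return Fixed16(saturate(std::int32_t{a.raw_} + std::int32_t{b.raw_}));
    }

private:
    static constexpr std::int16_t kMax = std::numeric_limits<std::int16_t>::max();
    static constexpr std::int16_t kMin = std::numeric_limits<std::int16_t>::min();

    explicit Fixed16(std::int16_t raw) : raw_(raw) {}

    static std::int16_t saturate(std::int32_t v) {
        if (v > kMax) return kMax;
        if (v < kMin) return kMin;
        return static_cast<std::int16_t>(v);
    }

    std::int16_t raw_;
};
```

With Fixed16<8>, from_double(1.5) stores the raw value 384, and adding
100.0 to 100.0 clamps at the largest representable value rather than
wrapping around.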

Steve

Reply by Randy Yates ● September 1, 2015

Tim Wescott <tim@seemywebsite.com> writes:

> On Tue, 01 Sep 2015 07:36:01 -0400, Randy Yates wrote:
>
>> Folks,
>> 
>> I've been in a LinkedIn discussion in which the following analysis of an
>> Ariane 5 failure is documented:
>> 
>>   http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>> 
>> I'm just reposting it here since I find it fascinating, and I bet there
>> are a few folks here (Tim, you come to mind especially) who might have a
>> few things to say about it.
>> 
>> I find it almost laughable (if it weren't for the expense and danger
>> such a failure had or potentially had) that the root cause was a
>> conversion from float to integer! It supports a "feeling" I've had for a
>> long time that coding in float is dangerous for just such reasons.
>
> Well, I don't see that as the biggest error, or even one that, given the 
> nature of the root problem, would have saved the thing if it was 
> corrected.

Why not? If the BH conversion had been protected like the other variables,
or an integer had been used that saturated, there would have been no
exception generated and thus no crash (due to this bug).
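
By "protected" I mean something along these lines, sketched in C++17
rather than Ada. The function name checked_to_int16 is hypothetical, and
nothing here is taken from the actual flight software: the range is
tested before the conversion, and the caller decides what to do when the
value does not fit.

```cpp
#include <cmath>
#include <cstdint>
#include <optional>

// Hypothetical checked conversion: returns an empty optional instead of
// trapping when the value does not fit in a 16-bit signed integer.
inline std::optional<std::int16_t> checked_to_int16(double x) {
    // Written as a negated range test so that NaN is rejected as well.
    // Values in (-32769, -32768) are rejected too, which is conservative.
    if (!(x >= -32768.0 && x < 32768.0)) return std::nullopt;
    return static_cast<std::int16_t>(x); // in range: truncates toward zero
}
```

A caller could then write checked_to_int16(bh).value_or(fallback), where
bh and fallback are placeholder names, or log the condition and carry on,
instead of letting the error take the processor down.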
-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com

Reply by Randy Yates ● September 1, 2015

spope33@speedymail.org (Steve Pope) writes:

> Randy Yates  <yates@digitalsignallabs.com> wrote:
>
>>I find it almost laughable (if it weren't for the expense and danger
>>such a failure had or potentially had) that the root cause was a
>>conversion from float to integer! It supports a "feeling" I've had for a
>>long time that coding in float is dangerous for just such reasons.
>
> This puts you with Von Neumann.

I'm not sure if that's a complement or a criticism... 

> Floats and doubles are not dangerous.  They can store integers within a
> certain range, just like any other format.

Yes, but when humans use them, they start being sloppy! And if you are
not sloppy, you might as well use integers/fixed-point (for many many
things).

> Programmers however are dangerous.

This sounds a lot like the anti-gun-control sentiment.. 
-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com

Reply by Randy Yates ● September 1, 2015

Randy Yates <yates@digitalsignallabs.com> writes:
> [...]
> I'm not sure if that's a complement or a criticism...
                           compliment
-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com

Reply by Tim Wescott ● September 1, 2015

On Tue, 01 Sep 2015 18:33:36 -0400, Randy Yates wrote:

> Tim Wescott <tim@seemywebsite.com> writes:
> 
>> On Tue, 01 Sep 2015 07:36:01 -0400, Randy Yates wrote:
>>
>>> Folks,
>>> 
>>> I've been in a LinkedIn discussion in which the following analysis of an
>>> Ariane 5 failure is documented:
>>> 
>>>   http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>>> 
>>> I'm just reposting it here since I find it fascinating, and I bet
>>> there are a few folks here (Tim, you come to mind especially) who
>>> might have a few things to say about it.
>>> 
>>> I find it almost laughable (if it weren't for the expense and danger
>>> such a failure had or potentially had) that the root cause was a
>>> conversion from float to integer! It supports a "feeling" I've had for
>>> a long time that coding in float is dangerous for just such reasons.
>>
>> Well, I don't see that as the biggest error, or even one that, given
>> the nature of the root problem, would have saved the thing if it was
>> corrected.
> 
> Why not? If the BH conversion had been protected like the other variables,
> or an integer had been used that saturated, there would have been no
> exception generated and thus no crash (due to this bug).

I think that saying that the problem was that they used floating point is 
like saying "he didn't apply the brakes early enough" about a guy who 
went driving on wet roads with bald tires.

Yes, it's _a_ correct interpretation of the evidence.  But I don't think 
it's the _most useful_ interpretation.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Reply by Gordon Sande ● September 1, 2015

On 2015-09-01 14:59:17 +0000, Tim Wescott said:

> On Tue, 01 Sep 2015 07:36:01 -0400, Randy Yates wrote:
> 
>> Folks,
>> 
>> I've been in a LinkedIn discussion in which the following analysis of an
>> Ariane 5 failure is documented:
>> 
>> http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>> 
>> I'm just reposting it here since I find it fascinating, and I bet there
>> are a few folks here (Tim, you come to mind especially) who might have a
>> few things to say about it.
>> 
>> I find it almost laughable (if it weren't for the expense and danger
>> such a failure had or potentially had) that the root cause was a
>> conversion from float to integer! It supports a "feeling" I've had for a
>> long time that coding in float is dangerous for just such reasons.
> 
> Well, I don't see that as the biggest error, or even one that, given the
> nature of the root problem, would have saved the thing if it was
> corrected.
> 
> The problem, as I see it, is that when they wrote the software for the
> Ariane 4 they were a bit sloppy (in the floating-to-integer conversion).
> Then, when they decided to reuse the software in the Ariane 5 they did
> not fully consider the impact of the change in the flight trajectory --
> i.e., they were sloppy.  Then, they didn't fully test the software --
> i.e., they were sloppy.

The story I seem to recall included the facts that the software worked
on the A4 but the A5 was a higher performance vehicle which then caused
the overflows, etc, etc as you recount.

> So they basically crashed an entire rocket system because they were
> sloppy.
> 
> Now, if I'm going to bring MY prejudices to bear on this, it was because
> the systems engineering team was of the opinion that embedded software is
> Black Magic, or they considered that it doesn't really have value because
> it doesn't show up as a line item on the bill of materials.
