DSPRelated.com

OT: Ariane 5 Launcher Failure

Started by Randy Yates September 1, 2015
Tim Wescott wrote:
> On Tue, 01 Sep 2015 07:36:01 -0400, Randy Yates wrote:
>
>> Folks,
>>
>> I've been in a LinkedIn discussion in which the following analysis of an
>> Ariane 5 failure is documented:
>>
>> http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>>
>> I'm just reposting it here since I find it fascinating, and I bet there
>> are a few folks here (Tim, you come to mind especially) who might have a
>> few things to say about it.
>>
>> I find it almost laughable (if it weren't for the expense and danger
>> such a failure had or potentially had) that the root cause was a
>> conversion from float to integer! It supports a "feeling" I've had for a
>> long time that coding in float is dangerous for just such reasons.
>
> Well, I don't see that as the biggest error, or even one that, given the
> nature of the root problem, would have saved the thing if it was
> corrected.
>
> The problem, as I see it, is that when they wrote the software for the
> Ariane 4 they were a bit sloppy (in the floating-to-integer conversion).
> Then, when they decided to reuse the software in the Ariane 5 they did
> not fully consider the impact of the change in the flight trajectory --
> i.e., they were sloppy. Then, they didn't fully test the software --
> i.e., they were sloppy.
>
> So they basically crashed an entire rocket system because they were
> sloppy.
>
> Now, if I'm going to bring MY prejudices to bear on this, it was because
> the systems engineering team was of the opinion that embedded software is
> Black Magic, or they considered that it doesn't really have value because
> it doesn't show up as a line item on the bill of materials.
>

+1

"oh, that? It's done - we don't have to budget for testing that."

I am kind of amazed that they know the flight trajectory and don't test to exhaustion.

--
Les Cargill
rickman <gnuarm@gmail.com> writes:
> On 9/1/2015 6:18 PM, Randy Yates wrote:
> [...]
>> If a saturation had been used, then we wouldn't be talking about
>> exceptions in this report as it would have never happened.
>
> Nothing you say here changes the fact that the problem was not due to
> the use of floating point.

From a computational perspective, that's right. But did you miss the part I wrote concerning an "integer frame-of-mind"? I am referring to the design process that led up to this error, not the actual chugging of compiled code.

You stated that we don't know the details about the algorithm. That's true. But we know they took a double and attempted to convert it to a signed 16-bit integer, so they knew the final value should have been representable by such an integer. However, if they had changed the internal/intermediate computations to work with an integer instead of a float, that knowledge should have rippled into the other parts of the algorithm appropriately. And had they stayed in the integer domain, they probably wouldn't have had such a computational loading problem either.

Although I don't understand why ANY conversion to an integer (e.g., whether from a double or from a wider integer) in such an application wouldn't be carefully analyzed. But it seems like when folks throw floats into their algorithms, they stop being careful.

--
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
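For readers following the saturation-versus-exception argument, here is a minimal sketch of the two policies for turning a double into a signed 16-bit value. It is written in Python purely for illustration and is not the flight code; the function names `to_int16_checked` and `to_int16_saturated` are hypothetical.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def to_int16_checked(x: float) -> int:
    """Convert to a signed 16-bit value, raising if the result does not fit.

    This mirrors the 'treat out-of-range as an error' policy.
    """
    n = int(x)  # truncates toward zero; raises on NaN or infinity
    if n < INT16_MIN or n > INT16_MAX:
        raise OverflowError(f"{x!r} does not fit in a signed 16-bit integer")
    return n


def to_int16_saturated(x: float) -> int:
    """Convert to a signed 16-bit value, clamping out-of-range input.

    This mirrors the 'saturate instead of failing' policy.
    """
    if x != x:  # NaN compares unequal to itself; there is nothing to clamp to
        raise ValueError("cannot convert NaN")
    if x >= INT16_MAX:
        return INT16_MAX
    if x <= INT16_MIN:
        return INT16_MIN
    return int(x)
```

Either function forces the out-of-range case to be decided explicitly. Which policy is right depends on what the consumer of the value can tolerate, which is the point under dispute in this thread.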
Les Cargill <lcargill99@comcast.com> writes:

> Randy Yates wrote:
>> spope33@speedymail.org (Steve Pope) writes:
>>
>>> Randy Yates <yates@digitalsignallabs.com> wrote:
>>>
>>>> I find it almost laughable (if it weren't for the expense and danger
>>>> such a failure had or potentially had) that the root cause was a
>>>> conversion from float to integer! It supports a "feeling" I've had for a
>>>> long time that coding in float is dangerous for just such reasons.
>>>
>>> This puts you with Von Neumann.
>>
>> I'm not sure if that's a compliment or a criticism...
>>
>>> Floats and doubles are not dangerous. They can store integers within a
>>> certain range, just like any other format.
>>
>> Yes, but when humans use them, they start being sloppy!
>
> Oh no no no!. You cannot trust them. Although really - integer
> saturation is just as dangerous and probably more common.
>
>> And if you are
>> not sloppy, you might as well use integers/fixed-point (for many many
>> things).
>>
>>> Programmers however are dangerous.
>>
>> This sounds a lot like the anti-gun-control sentiment..
>>
>
> "Floating point doesn't kill rockets... programmers
> kill rockets... "

Exactly!!!

True enough...

--
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
On 9/2/2015 12:22 AM, Randy Yates wrote:
> rickman <gnuarm@gmail.com> writes:
>> On 9/1/2015 6:18 PM, Randy Yates wrote:
>> [...]
>>> If a saturation had been used, then we wouldn't be talking about
>>> exceptions in this report as it would have never happened.
>>
>> Nothing you say here changes the fact that the problem was not due to
>> the use of floating point.
>
> From a computational perspective, that's right. But did you miss the
> part I wrote concerning an "integer frame-of-mind"? I am referring
> to the design process that led up to this error, not the actual chugging
> of compiled code.
Here you talk about "frame of mind"...
> You stated that we don't know the details about the algorithm. That's
> true. But we know they took a double and attempted to convert it to a
> signed 16-bit integer, so they knew the final value should have been
> representable by such an integer. However if they had changed the
> internal/intermediate computations to work with an integer instead of a
> float, that knowledge should have rippled into the other parts of the
> algorithm appropriately. And had they stayed in the integer domain, they
> probably wouldn't have had such a computational loading problem either.

Here you talk about using integers for the calculation...

> Although I don't understand why ANY conversion to an integer (e.g.,
> whether from a double or from a wider integer) in such an application
> wouldn't be carefully analyzed. But it seems like when folks throw
> floats into their algorithms, they stop being careful.

Here you conclude that they converted from float to integer without adequate consideration...

All of this ignores the fact that integers could have been used throughout the computations with no difference in result... a number that was too large for the variable it was transferred to, because... there was a system error that applied inappropriate inputs to the algorithm.

Your idea that using integers throughout would have shown the problem is specious because it would have had the exact same thought process throughout.

--
Rick
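A small sketch of the point being made here: an all-integer computation can overflow a 16-bit destination in exactly the same way when it is fed inputs outside the range it was designed for. This is Python for illustration only; `narrow_to_int16` and `scaled_reading` are hypothetical names, and the multiply stands in for whatever the real algorithm computed, which the thread does not know.

```python
def narrow_to_int16(n: int) -> int:
    """Narrow a wider integer result to 16 bits, raising if it does not fit."""
    if not -32768 <= n <= 32767:
        raise OverflowError(f"{n} does not fit in a signed 16-bit integer")
    return n


def scaled_reading(raw: int, gain: int) -> int:
    """Hypothetical all-integer computation: multiply, then narrow to 16 bits."""
    return narrow_to_int16(raw * gain)
```

With `raw = 1000`, a gain of 30 fits and a gain of 40 does not, even though no floating point is involved anywhere. The range analysis has to be done on the inputs either way.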
Tim Wescott <seemywebsite@myfooter.really> wrote:
> On Tue, 01 Sep 2015 18:33:36 -0400, Randy Yates wrote:
(snip regarding Ariane rocket)
>>>> I find it almost laughable (if it weren't for the expense and danger
>>>> such a failure had or potentially had) that the root cause was a
>>>> conversion from float to integer! It supports a "feeling" I've had for
>>>> a long time that coding in float is dangerous for just such reasons.
>
>>> Well, I don't see that as the biggest error, or even one that, given
>>> the nature of the root problem, would have saved the thing if it was
>>> corrected.
(snip)
> I think that saying that the problem was that they used floating point is
> like saying "he didn't apply the brakes early enough" about a guy who
> went driving on wet roads with bald tires.

Well, I somewhat agree with Randy on this one.

In the case of bald tires on wet roads, the driver should know he has bald tires, and definitely know the road is wet.

Floating point is great when used for what it is designed to do, but too many people use it where it shouldn't be used. It is too easy to believe that floating point will give the right answers; one forgets that one still has to think.

OK, so let's say that the guy with bald tires in the rain has ABS brakes, and so doesn't bother to worry about the rain and tires, because the brakes will save him. There are many cases where systems designed for protection don't help, because the users know the system is there and adjust accordingly.

Remember, engineers in the slide rule days had to know, to an order of magnitude, what answer to expect. With a calculator, it is so easy to forget, and use an answer that is many orders of magnitude wrong. And even more so when a computer is doing it.

> Yes, it's _a_ correct interpretation of the evidence. But I don't think
> it's the _most useful_ interpretation.
-- glen
rickman <gnuarm@gmail.com> writes:
> [...]
> Your idea that using integers throughout would have shown the problem
> is specious because it would have had the exact same thought process
> throughout.

What competent programmer uses an integer without knowing and considering its range and its appropriateness to the task at hand?

I'm not sure what your point is, Rick. It could be this: they KNEW the variable would go out of range if one or more of the inputs were outside a certain range, and they chose to consider that a catastrophic error. Yes, if you knew clearly this was the case, I guess you could say the use of floating point or integer had nothing to do with it.

--
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
On 02.09.2015 at 01:19, Tim Wescott wrote:
> On Tue, 01 Sep 2015 23:53:33 +0200, Sebastian Doht wrote:
>
>> On 01.09.2015 at 21:01, gyansorova@gmail.com wrote:
>>> On Tuesday, September 1, 2015 at 11:36:05 PM UTC+12, Randy Yates wrote:
>>>> Folks,
>>>>
>>>> I've been in a LinkedIn discussion in which the following analysis of an
>>>> Ariane 5 failure is documented:
>>>>
>>>> http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>>>>
>>>> I'm just reposting it here since I find it fascinating, and I bet
>>>> there are a few folks here (Tim, you come to mind especially) who
>>>> might have a few things to say about it.
>>>>
>>>> I find it almost laughable (if it weren't for the expense and danger
>>>> such a failure had or potentially had) that the root cause was a
>>>> conversion from float to integer! It supports a "feeling" I've had for
>>>> a long time that coding in float is dangerous for just such reasons.
>>>> --
>>>> Randy Yates
>>>> Digital Signal Labs
>>>> http://www.digitalsignallabs.com
>>>
>>> I thought they used ADA for such things
>>>
>>>
>> As far as I recall they used Ada but turned all range checks of Ada off
>> which makes the usage of Ada as an argument for increased safety quite
>> meaningless.
>>
>> "A fool with a tool is still a fool"
>
> The article said something about overflowing an integer value and popping
> an exception. Which sounds more like they DID hit a range check, and
> lost a rocket ship because of it.
>

Maybe I should just go back to the article and stop citing urban myths ;)

However, I was referring to compile-time checks, not run-time checks...
On 9/1/15 6:33 PM, Steve Pope wrote:
> Randy Yates<yates@digitalsignallabs.com> wrote:
>
>> spope33@speedymail.org (Steve Pope) writes:
>
>>> Randy Yates<yates@digitalsignallabs.com> wrote:
>
>>>> Let me ask a question: what if the alignment algorithm designer had used
>>>> ONLY a 16-bit integer for the horizontal bias. Then, AT DESIGN TIME, the
>>>> algorithm designer would have been forced to consider out-of-range input
>>>> and choose the action more intelligently. For example, instead of
>>>> shutting the software down, they could have saturated the value. Granted
>>>> this could have been done with the double value as well, but the point
>>>> is that designer is FORCED to consider the case if you are thinking with
>>>> an integer frame-of-mind.
>
>>> This is why you want to use fixed-point types.
>
>>> (Which is not the same as storing a value in an integer type. Very
>>> often fixed-point types are stored in doubles.)
>
>> Does ADA have fixed-point types? If so, I agree violently. (I'm not up
>> on ADA...)
>
> I don't recall ADA having built-in fixed-point types from when I
> studied ADA in fall, 1980. Almost certainly someone added them
> to the language at some point.

i was told that ADA was meant to "become all things to all men." (a biblical reference for those who might not recognize it.)

sounds like the mother of all bloat to me.

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
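On the fixed-point question raised in the quoted text: Ada has had fixed-point types, declared with a `delta` (step size) and a range, since Ada 83. The sketch below imitates that idea in Python for illustration only. The helpers `to_fixed` and `from_fixed` and their parameters are hypothetical, and raising on out-of-range input is an assumed policy, not a description of any real library or of Ada's exact semantics.

```python
def to_fixed(x: float, delta: float, lo: float, hi: float) -> int:
    """Return x as an integer count of `delta` steps.

    The declared range [lo, hi] is checked up front, so the designer has to
    state the legal range when the type is chosen (hypothetical helper).
    """
    if not lo <= x <= hi:  # also rejects NaN, which fails every comparison
        raise OverflowError(f"{x!r} is outside the declared range [{lo}, {hi}]")
    return round(x / delta)


def from_fixed(n: int, delta: float) -> float:
    """Convert a step count back to a real value."""
    return n * delta
```

For example, with `delta = 0.25` and a range of -8.0 to 8.0, the value 1.5 is stored as 6 steps, and 100.0 is rejected before any arithmetic is done on it.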
On Wed, 02 Sep 2015 16:11:01 -0400, robert bristow-johnson
<rbj@audioimagination.com> wrote:

>On 9/1/15 6:33 PM, Steve Pope wrote:
>> Randy Yates<yates@digitalsignallabs.com> wrote:
>>
>>> spope33@speedymail.org (Steve Pope) writes:
>>
>>>> Randy Yates<yates@digitalsignallabs.com> wrote:
>>
>>>>> Let me ask a question: what if the alignment algorithm designer had used
>>>>> ONLY a 16-bit integer for the horizontal bias. Then, AT DESIGN TIME, the
>>>>> algorithm designer would have been forced to consider out-of-range input
>>>>> and choose the action more intelligently. For example, instead of
>>>>> shutting the software down, they could have saturated the value. Granted
>>>>> this could have been done with the double value as well, but the point
>>>>> is that designer is FORCED to consider the case if you are thinking with
>>>>> an integer frame-of-mind.
>>
>>>> This is why you want to use fixed-point types.
>>
>>>> (Which is not the same as storing a value in an integer type. Very
>>>> often fixed-point types are stored in doubles.)
>>
>>> Does ADA have fixed-point types? If so, I agree violently. (I'm not up
>>> on ADA...)
>>
>> I don't recall ADA having built-in fixed-point types from when I
>> studied ADA in fall, 1980. Almost certainly someone added them
>> to the language at some point.
>
>i was told that ADA was meant to "become all things to all men." (a
>biblical reference for those who might not recognize it.)

I never heard it described that way, but it was originally developed to force a bit more discipline in many error-prone areas in order to increase code reliability, traceability, readability, etc., etc. It has since fallen out of favor a bit after a few decades of demonstrating that bad coders will write crap no matter what language you hand them.

It is still used in some areas, partly for legacy purposes. It used to be required on a lot of military projects, but not so much any more.
>sounds like the mother of all bloat to me.
I had to learn it in ancient times when I worked on airline avionics, but, fortunately, got a waiver to not actually have to develop with it.
>
>--
>
>r b-j                  rbj@audioimagination.com
>
>"Imagination is more important than knowledge."
>
>

Eric Jacobsen
Anchor Hill Communications
http://www.anchorhill.com
Eric Jacobsen <eric.jacobsen@ieee.org> wrote:

>On Wed, 02 Sep 2015 16:11:01 -0400, robert bristow-johnson
>>i was told that ADA was meant to "become all things to all men." (a
>>biblical reference for those who might not recognize it.)

>I never heard it described that way, but it was originally developed
>to force a bit more discipline in many error-prone areas in order to
>increase code reliability, traceability, readability, etc., etc. It
>has since fallen out of favor a bit after a few decades of
>demonstrating that bad coders will write crap no matter what language
>you hand them.
>
>It is still used in some areas, partly for legacy purposes. It used
>to be required on a lot of military projects, but not so much any
>more.

In my first trimester in grad school, I took a programming languages course from Susan Graham. Her words were that ADA "was an attempt at a very large language".

Seems exactly right to me.

Steve