July 10th, 2024

The Seal Failure in the SRB That Doomed Challenger

The article delves into the Challenger disaster, attributing it to seal failure in the Solid Rocket Booster due to management decisions. It stresses the importance of proper O-ring design and assembly in rocket motors.

The article discusses the seal failure in the Solid Rocket Booster (SRB) that led to the Challenger disaster in 1986. It highlights two critical management decisions at NASA that contributed to the tragedy: adopting poorly designed O-ring seal joints and flying in colder conditions than recommended. The author emphasizes the importance of proper design and assembly of O-ring seals in rocket motors, explaining the significance of a single O-ring design versus a dual O-ring design. NASA's insistence on a flawed two-O-ring design, together with putty that obstructed the pressurization path, increased the risk of O-ring failure, allowing hot gases and solids to breach the seal. The article provides detailed insights into the design requirements for O-ring seals in solid rocket motors and the consequences of deviating from established guidelines. It also touches on the challenges and complexities of engineering efforts, emphasizing the blend of science, art, and luck involved in rocket science and other engineering disciplines.

17 comments
By @Paul-Craft - 9 months
Challenger did not explode because of a "seal failure." That tragedy was entirely preventable. At least one of the engineers raised the alarm, saying that because of the recent cold weather, they couldn't guarantee that the O-ring would perform properly. But the bigwigs disregarded that warning, and seven people paid for that mistake with their lives.

No, it most certainly was not a seal that failed. It was an organization that failed. Unfortunately, it's harder to fix an organization than it is to design an O-ring that won't become brittle from sitting out a few hours on a cold night.

By @hilbert42 - 9 months
"...the crew did not die in the tank explosion and subsequent ripping-apart of the orbiter by air loads. ...The crew was still alive in the orbiter cabin until it finally hit the sea, which is about a 200-gee stop, since it hit dead broadside."

Anyone around at the time vividly remembers this horrible tragedy. My memory was unexpectedly reinforced only days later when I came across a memorial to the crew in the Smithsonian museum.

Perhaps those in NASA were aware that the crew were (or would have been assumed to be) alive until they hit the water, but if I recall correctly, that knowledge wasn't available to the general public.

I assumed, as I suppose many did, that the crew were killed outright at the time of disintegration, which would have been the most merciful outcome. That it wasn't so fills me with horror even now, and I shudder to think about it. The crew's final moments must have been sheer terror.

By @mihaaly - 9 months
The overuse of emphasis (heavy phrases, all caps, underline, bold, italic, exclamations, COMBINED!! :) ) for statements of fact that sometimes only become clear later (and that need no emphasis because they speak for themselves) makes for an irritating read. This is aggravated by tangential updates placed prominently at the start (instead of the end), derailing attention before the article has even begun. Educationally, the style is very obstructive. But it is still pretty useful writing once you push yourself through it, even after decades and thousands of articles on this topic.

By @vrinsd - 9 months
Anyone who wants a genuinely detailed treatment of this subject should read Allan McDonald's book "Truth, Lies, and O-Rings" *. I happened to finish it a few weeks ago and it goes on my list of "all engineers should read this".

This was really one of the most fascinating books I've read and likely the most definitive treatment of the subject by a subject matter expert. I only skimmed the blog article; the book explains in critical detail the issues with the original design and why the redesign (done after the disaster) was a much more robust approach.

In a nutshell, the Shuttle SRB field-joint design was taken from a Titan missile design that was deemed to be "solid engineering" because none had blown up, but Allan mentions the SRB field-joint was flawed from the start and the joints suffered rotation and physically moved / flexed. (Later, it turned out, a Titan missile exploded and the teardown showed the o-rings to be a primary point of failure.)

Allan mentions it was the blowby past the o-rings that was consistently the issue and the engineers wanted to understand and address this problem for a long time.

What was striking to me, beyond the technical aspects of making these things work, is the actual cover-up and the attempt by NASA and Thiokol to blame McDonald and others for the resulting disaster. I knew of some parts of this, but you don't realize how messed up the situation was/is until you read the book.

Personally I'd ignore any negative reviews of the book. I think non-engineers, especially those who haven't worked in an Aerospace/Defense environment or in a big company, might think Allan is arrogant or boasting, but he starts by providing the foundation for his statements before getting into the details, which is a classic "engineer's engineer" way of thinking.

* https://www.goodreads.com/author/show/2101296.Allan_J_McDona...

By @cryptonector - 9 months
This is the best explanation I've seen yet.

The summary is that you want a buffer of air between the hot gases of the running motor and the o-ring seal, such that the air gets compressed but remains between the hot gases and the o-ring, thus insulating the o-ring from the hot gases. But to avoid a pressurization test NASA went with a two-o-ring scheme where the space between them is pressurized, which forces the inner o-ring to be on the wrong side of where it should be, thus leaving little or no air buffer between that o-ring and the hot gases. That in turn can cause point failures in the inner o-ring, which will result in concentrated jets of hot gases impinging on the outer o-ring, which then cannot hold (because the pressure isn't uniform across the o-ring's circumference; instead it's concentrated on a point). Add the cold o-ring brittleness and boom.

Is SLS still using this o-ring design?? I sure hope not.

By @yodelshady - 9 months
Good article, I've seen this covered from the materials science and system engineering perspective before but not the mechanical perspective.

Ask any first-year materials science graduate how Challenger failed and they'll confidently tell you about glass transition temperatures in fluoropolymers, but if any chartered engineer gave you that answer, fire them. People in the room at the time knew about that, but somehow a clear warning became a point of uncertainty became a minor interest became a footnote.

What I find more interesting is, ask any first-year economist about 2008 and they'll tell you about Gaussian copulas. Somehow in that field, sticking with the level-one explanation, as if the PhDs in the room didn't know, is accepted.

By @incorrecthorse - 9 months
> Those two flight deck pilots had breathed-up all the oxygen in their breathing packs by the time they hit the sea, something confirmed by the empty breathing packs that were recovered. Which means they were alive when they hit the sea!

I don't understand how this follows. The best scenario is that they had their last drops of oxygen around hitting the sea; in other scenarios they died from lack of oxygen before hitting the sea.

By @simple10 - 9 months
In my college statistics class, we learned MATLAB and R using data from the Challenger to recreate why the engineers thought the o-ring might fail and raised the warning. I don't remember the specifics of all the stats, but it was fascinating and the data is publicly available.

Here's a blog that shows some of the analysis (from google search): https://byuistats.github.io/Statistics-Notebook/Analyses/Log...
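
To give a rough idea of what such a classroom exercise might look like, here is a minimal sketch in Python (not the MATLAB or R used in the class). It assumes the analysis is a logistic regression of O-ring distress against temperature, which the truncated link suggests but does not confirm. The data is invented toy data, not the real flight records, and the names `toy_flights`, `fit_logistic`, and `predict` are hypothetical.

```python
import math

# HYPOTHETICAL toy data, invented for illustration. NOT the real flight records.
# Each pair is (joint temperature in deg F, 1 if O-ring distress was seen, else 0).
toy_flights = [
    (53, 1), (57, 1), (58, 1), (63, 1), (66, 0), (67, 0), (68, 0),
    (70, 1), (70, 0), (72, 0), (75, 0), (76, 0), (79, 0), (81, 0),
]


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, steps=20000, lr=0.05):
    """Fit P(distress) = sigmoid(b0 + b1 * z) by gradient ascent on the
    log-likelihood, where z is the standardized temperature.

    Returns the slope b1 (in standardized units) and a predict function
    that takes a temperature in deg F.
    """
    temps = [t for t, _ in data]
    mean = sum(temps) / len(temps)
    std = math.sqrt(sum((t - mean) ** 2 for t in temps) / len(temps))
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for t, y in data:
            z = (t - mean) / std
            err = y - sigmoid(b0 + b1 * z)  # observed minus predicted
            g0 += err
            g1 += err * z
        b0 += lr * g0 / len(data)
        b1 += lr * g1 / len(data)

    def predict(temp_f):
        return sigmoid(b0 + b1 * (temp_f - mean) / std)

    return b1, predict


slope, predict = fit_logistic(toy_flights)
print("slope (per standard deviation of temperature):", slope)
print("estimated P(distress) at 40 F:", predict(40))
print("estimated P(distress) at 75 F:", predict(75))
```

A negative slope means the fitted probability of distress rises as temperature falls. Predicting at a temperature colder than anything in the data is an extrapolation, so the number should be read as a warning sign and not as a precise estimate.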

By @Ringz - 9 months
From the Wikipedia Article:

"Modified SR-71 Blackbird ejection seats and full pressure suits were used for the two-person crews on the first four Space Shuttle orbital test flights, but they were disabled and later removed for the operational flights." [1], p. II-7

But I think this would not have helped the astronauts on the middle deck.

[1]: Jenkins, Dennis R. (2016). Space Shuttle: Developing an Icon – 1972–2013. Specialty Press. ISBN 978-1-58007-249-6.

By @dboreham - 9 months
Interesting reading. Did not glean from multiple books and documentaries that they "added another layer of turtles" to the O-ring design.

By @lupusreal - 9 months
Good explanation. Too many people think the cold launch day and the NASA culture that allowed a launch on such a day were the only issues. It's less widely known that the design was fucked from the start and wasn't working properly for any of the Shuttle launches.

By @phongn - 9 months
A combination of events conspired to cause the seal failure and blow-through, yes. But it wasn’t just Max-Q that defeated the slag; they also experienced much stronger high-altitude winds in flight than any flight before or after.

They rolled 20 at liftoff and then a 1 at altitude and that’s all she wrote.

NASA came close to getting away with it. They were due to introduce a redesigned field joint in late 1986 based on the lightweight solid boosters planned for the USAF missions out of SLC-6. Had they launched Challenger a day earlier or later, they might have made it to the new joint with no catastrophic seal failure.

By @csours - 9 months
I just finished "The Undoing Project" - about Kahneman and Tversky. It covers quite a lot of territory, but the title is about mentally 'Undoing' disasters.

Generally speaking, people pick a proximate human action or inaction as the keystone for preventing the problem.

In the book, they give the example of going back in time to kill Hitler - but people often don't decide to go back in time to buy Adolf's art - and then one of them suggests that even something as small as another sperm or no sperm 'winning' that particular race would disrupt history just as well.

It is much more satisfying to think about killing Hitler than it is to think about throwing a rock at his parents' window.

---

It is much more satisfying to think about NASA administrators taking the warnings seriously than it is to think about all the ways the culture and incentives were messed up. You can see a particular decision that was WRONG.

Finding fault with a person is a shortcut to mental satisfaction, but it will only at best fix one problem, and at worst will find the person who 'rolled the dice' wrong, or who picked the wrong lottery numbers. That is, you can find a person who was standing next to the cause of the problem, but any other person in that same spot would have the same odds of causing the same problem.

---

I've also been thinking about learning organizations - any org that wants to accomplish really big things has to be able to learn.

I'd love to hear of any personal experience of contracts that allow for learning. I think it's possible, but usually discouraged because contracts are written defensively, and learning involves a great deal of trust.

It's very clear in this case that NASA culture was deeply cynical and brittle. As a government organization they felt they could not show any failure or waste, and this must certainly have wormed into their group and personal psychology.

In contrast, SpaceX has demonstrated what a learning organization looks like - it looks like public failure. I emphasize IT LOOKS LIKE public failure. Learning means not being embarrassed about test rockets blowing up spectacularly. It means that you collect your data and improve, and try again.

To be sure, this would not work as well (or at all?) with a publicly traded company, and it certainly would not work with a government organization.

By @WalterBright - 9 months
I'm always amazed at how those big rocket engines, with all that heat, pressure, vibration, etc. do not just blow up but are actually light enough to fly. I see the bells on the Saturn V engines glowing yellow and orange and how in hell does that hang together!

I don't know how I could design such a thing, because my spidey sense says "it'll never fly!"

As for being an astronaut, nope. I quote Gimli, the first astronaut: "Certainty of death. Small chance of success. What are we waiting for?"

I remember at the time of the Challenger disaster, some of the other teacher candidates said "but they told us it was safe!" Come on, how could that giant flaming bomb ever be considered safe by a sane person.

My favorite technical book on rockets starts out saying "things that burn and explode" which in my mind is exactly what rockets are.

By @imemyself - 9 months
Not sure if this was posted because of the book - but a book on Challenger was released a month or two ago (https://www.amazon.com/Challenger-Story-Heroism-Disaster-Spa...).

I just finished reading and would strongly recommend it to anyone interested in Challenger or aerospace in general. One of my better reads in the last few years.

And also infuriating to read...my previous impression was that there was some concern about cold weather + the o-rings, and one guy thought they shouldn't launch.

But the management mistakes were far more grievous than I realized. There was a repeated pattern of near misses on the SRBs over the years before Challenger, and most engineers working on the SRBs felt very strongly that they should not launch. The previous coldest launch was 15+ degrees warmer than Challenger's, and came very, very close to failure itself.

(And while it ended up not being what killed them, Rockwell, the folks who built the Shuttle itself, also did not want to launch, out of concerns about ice).

By @librasteve - 9 months
very good explanation. surely there must have been engineers on the team that knew the NASA design was unproven - by NOT whistleblowing they killed the crew as surely as anyone else - chickenshits