The CrowdStrike Failure Was a Warning
A systems failure at CrowdStrike led to a global IT crisis affecting various sectors, emphasizing the risks of centralized, fragile structures. The incident calls for diverse infrastructure and enhanced resilience measures.
A crucial systems failure at CrowdStrike triggered a global IT disaster affecting banks, airlines, and health-care systems, highlighting the vulnerability of hyperconnected systems designed for optimization over resilience. The incident underscores the risks posed by centralized, fragile structures in a world where small errors can lead to widespread crises. The interconnected nature of modern societies, driven by globalization and digitization, amplifies the potential for catastrophic, instantaneous risks. The CrowdStrike outage, caused by human error, serves as a stark reminder of the fragility of our digital infrastructure and the need for greater resilience. The author argues for a shift towards more diverse digital infrastructure, stringent testing protocols, and enhanced redundancy to mitigate future disasters. The incident serves as a warning that current systems prioritize optimization at the expense of resilience, and it calls for a reevaluation of how we design and manage critical infrastructure to prevent similar crises in the future.
Related
Microsoft has serious questions to answer after the biggest IT outage in history
The largest IT outage in history stemmed from a faulty software update by CrowdStrike, impacting 70% of Windows computers globally. Mac and Linux systems remained unaffected. Concerns arise over responsibility and prevention measures.
It's not just CrowdStrike – the cyber sector is vulnerable
A faulty update from CrowdStrike's Falcon Sensor caused a global outage, impacting various industries. Stock market reacted negatively. Incident raises concerns about cybersecurity reliance, industry concentration, and the need for resilient tech infrastructure.
CrowdStrike debacle provides road map of American vulnerabilities to adversaries
A national digital meltdown caused by a software bug, not a cyberattack, exposed network fragility. CrowdStrike's flawed update highlighted cybersecurity complexity. Ongoing efforts emphasize the persistent need for digital defense.
CrowdStrike fail and next global IT meltdown
A global IT outage caused by a CrowdStrike software bug prompts concerns over centralized security. Recovery may take days, highlighting the importance of incremental updates and cybersecurity investments to prevent future incidents.
Global CrowdStrike Outage Proves How Fragile IT Systems Have Become
A global software outage stemming from a faulty update by cybersecurity firm CrowdStrike led to widespread disruptions. The incident underscored the vulnerability of modern IT systems and the need for thorough testing.
CrowdStrike is probably less bad than the alternatives I have run into, which are largely developed by very low-cost engineers (cough, TrendMicro, cough), but even so, they aren't NT kernel engineers, nor do they have the NT kernel release process.
Companies need to find ways to live without this crap or this will keep happening, and it will be a lot worse one day. Self-compromising your own systems with RATs/MDMs/EDR/XDR/whatever other acronym soup is needed to please the satanic CISSPs is just a terrible idea in general.
> A Microsoft spokesman said it cannot legally wall off its operating system in the same way Apple does because of an understanding it reached with the European Commission following a complaint. In 2009, Microsoft agreed it would give makers of security software the same level of access to Windows that Microsoft gets.
https://www.wsj.com/tech/cybersecurity/microsoft-tech-outage...
> This time, the digital cataclysm was caused by well-intentioned people who made a mistake. That meant the fix came relatively quickly; CrowdStrike knew what had gone wrong. But we may not be so lucky next time. If a malicious actor had attacked CrowdStrike or a similarly essential bit of digital infrastructure, the disaster could have been much worse.
Gee, the damage from an honest mistake (what does the author even base that on) is most likely easier to fix than the damage done by a malicious actor with bad intent. I feel so enlightened!
Literally every time I see stuff like this go down, the security software had exactly zero engineering research put into it whereas everything else did.
If people did this, CrowdStrike would either not exist or look completely different.
Servitization is a clever way to consolidate your perpetual licensed customers over to perpetual service contracts, while also further obfuscating and locking down the underlying operating environment.
This is in the best interest of Microsoft's bottom line, at the expense of all private business, government, or anyone who values consumer experience really. It reduces the number of drive-by security incidents, but when WW3 happens and 75% of our economy is hosted in a whopping 12 datacenters across 3 companies, I'm sure we'll be screwed. I mean, just depth charging Google fiber today would probably take down 25% of the world economy.
No. It was just a failure. The warnings have been trumpeted for decades.
It should have been no surprise that the giant company that was trusted to secure our single source of OS software against "supply chain attacks" ended up committing the largest "supply chain attack" yet seen on Earth.
We are effectively still in the wild west. The gold rush has to end before we can truly civilize the place.
That's the real question here.
The root cause is NOT capitalism, nor is it users, Microsoft, or even CrowdStrike. You can't legislate, regulate, or "be more careful next time" your way out of this. Hell, blaming the users won't even work.
Here are 3 stories:
---
Imagine yourself as an inspector for the Army. The 17th Fortress has exploded this month, and nobody can figure out why. You've checked all the surviving off-site records, and are reasonably sure that the crates of dynamite that used to make up the foundations and structure of the fort were properly inspected, and even updated on a regular basis.
You more closely inspect the records, looking for any possible soldier or supplier who might have caused this loss. It might possibly be communist infiltration, or one of those pacifists!
You encounter an old civilian, who remembers a time when forts were built out of wood or bricks, and suggests that. But he's not a professional soldier, so what could he know?
---
Imagine you're a fire inspector. You've been to your 4th case this month of complete electrical network outage. This time, the cause seems to be that Lisa Douglas at Green Acres had Eb Dawson climb the pole, and he plugged one too many appliances into the electrical network.
If only there were a way to make sure that an overload anywhere couldn't take down the grid, and ruin so many people's days. You desperately want a day without house fires, and so many linemen being called out to test and repair circuits before connecting them back to the grid.
It will take some time before the boilers and generators get back on line from their cold restart. In the meantime, business in town has ground to a halt.
The paperwork and processes to track and certify each appliance don't seem efficient.
There's this grumpy old guy who talks about fuses and circuit breakers, but he's just a crank.
---
The United States found itself embedded in yet another foreign entanglement in Vietnam. There was a severe problem planning air strikes, because there were multiple sources required to plan them, and no single computer could be trusted with both of them. The strikes themselves were classified, but the locations of the enemy radar installations couldn't be trusted to the computers, because they were occasionally accessed by enemy sources. Thus the methods and means of locating the enemy radar equipment could become known, and thus rendered ineffective.
A study was done[1], and the problems were solved. There were systems based on the results of these studies[5], and they worked well.[2] Unfortunately, people thought that it was unnecessary to incorporate these measures, and they defaulted to the broken ambient authority model we're stuck with today. Here's some more reading, if you're interested.[3]
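To make "ambient authority" concrete, here is a rough Python sketch of the difference. It is only an illustration under loud assumptions: the function names are made up for this example, and Python is not a capability-safe language, so the restriction in the second function holds by convention rather than by enforcement.

```python
import io
import os
import tempfile


def count_lines_ambient(path):
    # Ambient authority: the function is given only a name. It opens the file
    # using whatever rights the whole process happens to have, so a wrong or
    # malicious path reaches anything the process can reach.
    with open(path) as f:
        return sum(1 for _ in f)


def count_lines_capability(readable):
    # Capability style: the caller passes an already-opened readable object.
    # The function is handed exactly the one thing it needs and never names
    # or opens a resource itself (by convention here; Python won't enforce it).
    return sum(1 for _ in readable)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "example.txt")  # hypothetical file name
        with open(path, "w") as f:
            f.write("a\nb\nc\n")
        print(count_lines_ambient(path))
        with open(path) as handle:
            print(count_lines_capability(handle))
    # Any readable object works as the capability, including in-memory text.
    print(count_lines_capability(io.StringIO("x\ny\n")))
```

In the second style, the decision about which resource may be touched stays with the caller, which is roughly the property the capability systems in the links below are built around.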
---
If you're bored... I've even got a conspiracy theory that explains how I think we actually got here, if it wasn't simply historical forces (which I think it was, 95% certainty).[4] If true, those forces would still be here today, actively suppressing any such stories.
[1] https://csrc.nist.rip/publications/history/ande72.pdf
[2] https://srl.cs.jhu.edu/pubs/SRL2003-02.pdf
[3] https://github.com/dckc/awesome-ocap
[4] https://news.ycombinator.com/item?id=40107150
[5] https://web.archive.org/web/20120919111301/http://www.albany...
Or maybe even just looking up the update online to see whether any problems had been reported before deploying it wholesale across their organizations.
Are these the same IT people whose systems all went offline in the left-pad incident because they 'accidentally' set their production servers to be dependent on a third-party repository?
I've worked at some low-budget places that didn't have much in the way of a vetting process, but even there auto-deploying unknown updates to third-party dependencies into production was always a capital N No.