July 20th, 2024

CrowdStrike broke Debian and Rocky Linux months ago

CrowdStrike's faulty update caused a global Blue Screen of Death issue on 8.5 million Windows PCs, impacting sectors like airlines and healthcare. Debian and Rocky Linux users also faced disruptions, highlighting compatibility and testing concerns. Organizations are urged to handle updates carefully.


CrowdStrike's problematic update caused a widespread Blue Screen of Death (BSOD) issue on Windows PCs, affecting sectors such as airlines, banks, and healthcare providers. The crash was triggered by a faulty channel file delivered through the update, impacting 8.5 million Windows PCs globally. Debian and Rocky Linux users had experienced similar disruptions from earlier CrowdStrike updates, indicating potential risks for daily operations. In one instance, a Debian Linux lab faced simultaneous crashes after an update, revealing compatibility issues and inadequate testing by CrowdStrike. The delayed response and the absence of Debian Linux configurations from CrowdStrike's testing raised concerns among users. CrowdStrike users on Rocky Linux 9.4 also reported server crashes due to a kernel bug, emphasizing the need for rigorous testing across all supported configurations to prevent future disruptions. Organizations are advised to approach CrowdStrike updates cautiously and have contingency plans in place to mitigate such issues.

21 comments
By @lambdaone - 3 months
What gets me is that much of the OSS/Linux ecosystem consists of thousands of lashed-together piles of code written by independent and only very loosely coordinated groups, much of it coded and lashed together by amateurs for free, and it is still more robust than software created by multi-billion dollar corporations.

Perhaps one reason is that OSS system programmers are washing their dirty linen in public; not a matter of "many eyes make bugs shallow", but that "any eyes make bad code embarrassing".

Just for example, I'm planning to make one of my commercial projects open source, and I am going to have to do a lot of fixing up before I'm willing to show the source code in public. It's not terrible code, and it works perfectly well, but it's not the sort of code I'd be willing to show to the world in general. Better documentation, TODO and FIXME fixing, checking comments still reflect the code, etc. etc.

But for all my sense of shame for this (perfectly good and working) software, I've seen the insides of several closed-source commercial code bases and seen far, far worse. I would imagine most "enterprise" software is written to a similar standard.

By @andyjohnson0 - 3 months
Relevant comment from yesterday's Crowdstrike mega-thread:

"Crowdstrike did this to our production linux fleet back on April 19th, and I've been dying to rant about it." [1]

Continues with a multi-para rant.

[1] https://news.ycombinator.com/item?id=41005936

By @paholg - 3 months
I used to work in this space, and I always had the nagging question of "is any of this stuff actually useful?"

It seems a hard question to answer, but are there any third party studies of the effectiveness of Crowdstrike et al. or are we all making our lives worse for some security theater?

By @can16358p - 3 months
Product quality is in freefall: from aircraft to software. Lack of QA is the norm nowadays as everyone just cares about the extra penny.

By @jsheard - 3 months
There was also this report of CrowdStrike injecting a buggy DLL into Windows applications, which could cause the app to crash through no fault of its own:

https://x.com/molecularmusing/status/1808756095860543916

By @neilwilson - 3 months
This is all a consequence of firms being able to contract out of consequential liability.

Perhaps we should render such clauses unenforceable, as we do with contracting out of consequential loss of life.

Or at least limit them.

By @smsm42 - 3 months
> The update proved incompatible with the latest stable version of Debian, despite the specific Linux configuration being supposedly supported.

> The analysis revealed that the Debian Linux configuration was not included in their test matrix.

This is suspiciously close to actual fraud. They declare they support configuration X, but they actually do not do any testing on configuration X. That's like telling me my car will have seatbelts, but at no point in manufacturing is it ensured that the seatbelts are actually installed and work. I think a car maker that did something like that would be prosecuted. Why isn't Crowdstrike? I mean, it's one thing if they don't support some version of Linux - ok, there are too many of them, I get it. But if you advertise support for it without even bothering to test on it - that's at best willful negligence and possibly outright fraud.

By @righthand - 3 months
“No one noticed” is a cute way to say that Crowdstrike suppressed the media noticing. The day of the bug, the HN post had comments about how people tried reporting the issue months ago.

Even the article is written as people noticing. So who didn’t notice? Or were the issues not popular enough to not be ignored?

By @yolo3000 - 3 months
Is anyone here using Crowdstrike? What does it do? I see it referred to as an 'anti-virus'. I have it installed on my work laptops and I see it as a keylogger and activity monitor. "I got nothing to hide", but it still bothers me when some corporate super users spy on me.

By @3np - 3 months
Anecdote from SWIM: Employer corp supposedly has CS deployed on all endpoints. Been getting away with just running it in a VM with restricted resources and not hearing anything about it. Did notice the VM failing around that time.

Also heard from an ex-coworker in another large corp where IT just gave up on enforcing compliance for Linux endpoints. I wouldn't be surprised if some IT admins effectively adopt a "don't ask, don't tell" policy here: If you can figure out the alternative stack by yourself without causing noise or lying about it, you're on your own. It'd certainly make sense if the motivation for enforcement is largely checkbox compliance.

I wonder just how widespread this kind of admittedly malicious compliance is and how much it contributed to the April incident not being bigger news...

By @pipes - 3 months
What is CrowdStrike's unique selling point? Genuine question, because I'd never heard of them before this.

By @kermatt - 3 months
Is it possible that these events had less impact because the damage was less / more easily fixed due to the nature of the OS?

Or perhaps because the admins of Linux systems are typically more knowledgeable about how to run their platforms, and not just install them?

Or is it due to sheer numbers of enterprise software running on Windows?

By @romaniitedomum - 3 months
Not just Debian and Rocky, but RHEL too. https://access.redhat.com/solutions/7068083

I ran into this doing CentOS7 to Alma9 upgrades. The bug was in RHEL, Alma and Rocky and any other distro derived from RHEL. I had a VM go into an endless reboot cycle and the only way to get back in was to boot to an emergency rescue console and disable falcon-sensor.

The problem was something to do with eBPF, and one of the workarounds was to tell falcon sensor to use kernel mode and not auto or user (bpf) mode.

We don't allow automatic updates on hosts, however, so thankfully this was contained, but it certainly raises the question of just what testing Crowdstrike are doing.

By @utensil4778 - 3 months
Huh, this story sounds familiar. I read an HN comment the other day telling this same story. They didn't just turn a random HN comment into a news article, did they?

Yup. They did. At least they cited it I suppose.

By @xyst - 3 months
Need a “crowdstrike.sucks” or maybe a general “it.sucks/{company}” to gather all of these company misgivings.

Avoid this company at all costs. Move to competitors which meet the same “audit passing” requirements.

By @egorfine - 3 months
> CrowdStrike should prioritize rigorous testing across all supported

They should not.

Testing costs money and they aren't selling their product to a company that needs or wants it on a competitive market. Their business model is based on shoving the product down the throat of enterprises due to compliance and therefore they have zero incentive to invest any money into quality.

By @bigcat12345678 - 3 months
I think someone noticed. And was thinking: People won't be happy to fix this, and I am not allowed to fix it either. Well, it might be just like the 3000 other rare issues that would one day break the world's IT. Who cares...

By @kayo_20211030 - 3 months
In the end, because of regulatory pressure, the only pressure that matters in a commercial environment, there will be three supported OSes: a Windows one, a Mac one, and probably one, and only one, Linux distribution, or flavor thereof. Everything else will be toast in a commercial environment. For Linux there might be this AWS one, and that Google one, but they'll be close. And, in order to satisfy regulatory requirements, they'll be very, very close. Commercial organizations have bosses and, more ominously, regulators. We, and they, need a checkbox checked. So let's not fool ourselves with thoughts of freedom and liberty. There's a real world out there.

CrowdStrike screwed up, but there's more chance that 1000 Linuxes go to one than that 1 CrowdStrike goes to zero.

By @lkdfjlkdfjlg - 3 months
> ( ... ) experienced significant disruptions as a result of CrowdStrike updates, raising serious concerns about the company's software update and testing procedures

To me the issue isn't CrowdStrike's testing procedures. To me the issue is why does Debian depend on CrowdStrike? Does anyone understand this?