July 21st, 2024

Let's blame the dev who pressed "Deploy"

The article addresses accountability in software engineering after incidents like the CrowdStrike outage. It criticizes blaming developers and advocates holding leaders accountable so that systemic issues get addressed. It emphasizes respecting software engineers' expertise.

The article discusses accountability in the software engineering industry following incidents like the CrowdStrike outage. The author argues against blaming developers for bugs and outages, pointing fingers instead at CEOs, customers, hospitals, governments, middle management, and boards. The piece criticizes the lack of accountability among leaders who prioritize profit over public service and highlights the importance of respecting software engineers' expertise and decisions. It emphasizes that holding developers responsible without addressing underlying organizational issues is counterproductive. The author draws parallels with other professions, such as structural engineers and anesthesiologists, who receive respect for their work, and suggests that software engineers should be treated similarly if they are to be held accountable. The article concludes by condemning the practice of scapegoating developers for systemic failures in the industry.

17 comments
By @Retr0id - 6 months
A slightly bewildering fact is that CrowdStrike's terms and conditions say not to use it for critical infrastructure:

> Neither the offerings nor CrowdStrike tools are for use in the operation of aircraft navigation, nuclear facilities, communication systems, weapons systems, direct or indirect life-support systems, air traffic control, or any application or installation where failure could result in death, severe physical injury, or property damage.

(Originally in all-caps, presumably to make it sound more legally binding)

https://www.crowdstrike.com/terms-conditions/

By @Aurornis - 6 months
Responsibility lies on large numbers of processes, teams, and people.

I thought this blog might have some substance about proper postmortem investigations and how to evaluate and address the circumstances that led to a failure like this, but it has none of that. It’s just a very angry rant about CEOs and middle management. The premise is that engineers can’t bear any responsibility for their actions because they don’t get “respect”.

This has to be the 10th time I’ve seen arguments that “blame” is the right action in this case, but with the key exception that we’re only allowed to blame people other than the engineers. The last article was a lengthy rant about how it’s actually QA’s fault and engineers shouldn’t be expected to ensure their own code is correct, therefore engineers are blameless.

This is empty calories for people who like ragebait, but nothing more.

By @JKCalhoun - 6 months
Sure, blame the engineer(s).

And from now on every engineer has the autonomy to refuse to use certain libraries and software stacks they are unfamiliar with, can refuse to submit a change, will have control over whether to push out software they worked on, etc.

And a huge pay bonus as well since they have all this CEO-like risk/responsibility now.

By @pylua - 6 months
The practice of building software would have to completely be flipped on its head to support professional liability. Agile would be the first casualty.

By @jmclnx - 6 months
>and that they need 3 months of development then you better shut the f** up and let them do their job.

This in a way covers the issue. Never in all my decades developing have I had an estimate accepted. All the developers get is "It must be done by ...., no exceptions".

So you end up pulling all nighters for weeks, and because you are tired, errors or bad decisions creep in. So yes, the issue is fully with upper management.

I left a company because a big project was about to start and I could see it would be a big cluster**. People who stayed told me that is what happened.

By @chrisjj - 6 months
> it makes sense to run EDR on a mission-critical machine, but on a dumb display of information

Major logic error. See how much chaos ensued when that display went down? That's why it needs protection to keep the display up. Dumb or not is irrelevant. It is mission-critical regardless.

By @softwaredoug - 6 months
OK but structural engineering and anesthesiology are sciences with a lot of hard data behind them…

… arguing for 100% test coverage or SOLID principles are more like philosophies and anecdotes without a lot of hard data supporting them.

Software engineering looks less like engineering and more like a lot of bike shedding conversations.

By @dkarter - 6 months
Weak post. Agree with others that this is just rage bait lacking substance.

Blaming the developers (or any specific individual/group for that matter) is a cop out, it’s easy and lazy and doesn’t get to the root of the problem, which is more often than not a lack of processes, tools, information and lack of time/desire from leadership to address “technical debt” (for lack of a better term), no matter how many times the devs bring that up.

When you blame an individual or a group you can close the case shut on the post mortem and not get to any substantive improvements, meaning this can and will happen again, just to somebody else.

That’s why blameless postmortem and a blameless culture is so important. This is a good article about that philosophy:

> My summary of blameless culture is: when there is an outage, incident, or escaped bug in your service, assume the individuals involved had the best of intentions, and either they did not have the correct information to make a better decision, or the tools allowed them to make a mistake.

https://www.gybe.ca/a-few-words-about-blameless-culture/

By @batch12 - 6 months
Sure, the dev is wrong, but so is the process that allowed their error to impact the product. If a single person can make this choice on accident, then a single person can make it on purpose either by being malicious themselves or being otherwise compromised. If the company has hung their entire security posture and operational success on the choices of one person, they have a problem. Especially a security company.

By @hypeatei - 6 months
It's very surprising that critical infrastructure has been "infected" by the compliance checkbox obsession rather than just being designed in a way that completely eliminates the need for any shitty off-the-shelf security product.

There is incompetence and complacency showing at every level after the CrowdStrike outage. Both the people selling CrowdStrike and the ones implementing it.

By @betaby - 6 months
There is no law requiring EDR. It's purely constructed nonsense copied by security clowns from one company to another. Now it's kind of mandatory. It has self-regulated everything to the ground.

By @potatoman22 - 6 months
As a developer, it's nice to shift the blame to other people, and other people do share some of the blame. But we also need to be responsible for our code functioning. It reads like the author is looking for anyone else to blame besides the devs.

By @gizmo - 6 months
The author argues that the CEO of CrowdStrike has failed upwards. I don't agree. CrowdStrike makes compliance software. CrowdStrike's main purpose isn't to provide protection against cyber attacks. Businesses don't care enough about that. Businesses do care about simplifying their compliance burden and limiting their liability when they get hacked. That's where CrowdStrike excels, and that's why CrowdStrike can charge so much for their services. It's not easy to build a 70 billion dollar business and CrowdStrike serves a real business need.

> I remember times when leaders had dignity and self-respect. They would go on stage and apologize

Etiquette rules change over time, but a constant throughout history is that people in power don't take responsibility voluntarily. The "good old days" where leaders had dignity and self-respect never existed.

> [...] delusional claim how software engineers should bear the responsibility for bugs and outages

The /opinion/ that people /should/ be held responsible can't be delusional. Apparently the author believes that software engineers should bear zero responsibility -- even when their software kills people -- because we don't get enough respect. I don't agree with that opinion but it's a bit rich to call other people delusional when making one unfounded claim after another.

The post as a whole is way too angry and too cynical for me.

By @giantg2 - 6 months
'But then the author engages in an absurd rant about how the entire software engineering industry is a “bit of a clusterfuck”,'

I mean, it is a bit of a cluster fuck.

Most of the issues I've seen are due to speed or cost. We don't have time for tests. We don't have time to record the business requirements in a collective place (just look through multiple JIRA stories and piece it together). We don't have money for dedicated QA roles. So of course problems happen. Luckily my team only works on a lower criticality site.

None of this is really a dev's fault. It's leadership and the culture they incentivise.

By @josephg - 6 months
You know, I sort of agree with this article while completely disagreeing with it.

The article says:

> You want software engineers to be accountable for their code, then give them the respect they deserve.

The problem is, respect is something that's taken as much as it's something given. We can't even decide for ourselves if software is worthy of respect. Can anyone learn to code from a coding bootcamp, where getting a job is all that matters? Or is it a discipline that takes years to master, where mistakes can and will bring down the global economy? Are we glorified plumbers, or are we mathematicians and civil engineers combined?

If you see yourself as a code monkey, of course you can't be "held responsible" for the results of your work. Coding bootcamps don't teach infosec. It's up to your company to set good practices and your job is just to follow them.

It's only if you personally want to take your role in society seriously that it makes sense to consider not just your job, but the effect your job has on the wider world. I'm personally of the opinion that this mindset is almost always long term positive for your career. It's less "blame the dev for hitting deploy" and more "I'm the dev. No, I won't hit deploy on that code in the state it's in."

I didn't go to a coding bootcamp. I went to university. There they forced all of us engineering & CS students to do an ethics course - which was actually fantastic. There they taught us about the Therac-25: a computer controlled radiation machine which killed a bunch of people. The engineers on the ground knew that it needed more testing, but the company insisted it was fine and pushed it out the door.

Here's the question: If you were one of those engineers, what would you do? If you knew, or suspected, that a bug in your code could bring down 911 services and hospitals, ground planes or give people a lethal dose of radiation, do you really trust your manager to make the engineering call? Do you think the CEO understands the risk that is being taken by hitting deploy?

Of course, if we're playing the blame game, the blame ultimately falls on the CEO of the company or something.

But forget the blame game. You won't be fired. You won't lose your cushy job. The question is: Who do you want to be in situations like this? Think about this question now. You won't have time in the moment when your boss tells you to hit deploy, and you have second thoughts.

Me? I want to be someone who would say no.

By @skwirl - 6 months
> If a software engineer tells you that this code needs to be 100% test covered, that AI won’t replace them, and that they need 3 months of development—then you better shut the fuck up and let them do their job.

This is so naive that it is impossible to take the author seriously.