July 2nd, 2024

DevOps Isn't Dead, but It's Not in Great Health Either

DevOps remains relevant but faces challenges. The New Stack examines its status in software development, covering collaboration between teams and the effect of talent shortages.

According to The New Stack, DevOps isn't dead, but it's not in great health either. The article discusses the state of DevOps, its relevance in the current software development landscape, and the challenges and opportunities DevOps practices face. It also touches on the need for collaboration between development and security teams, naming role definition, communication, and competing priorities as key issues, and it mentions the impact of talent shortages on cloud-native journeys. The New Stack publishes articles and newsletters aimed at software engineering leaders and developers.

Maximizing Terraform modules for platform engineering

The New Stack website focuses on Terraform modules, community engagement, DevSec challenges, software development topics, and industry trends. It offers resources, articles, and newsletters for software engineering leaders and developers.

A Eulogy for DevOps

DevOps, introduced in 2007 to improve development and operations collaboration, faced challenges like centralized risks and communication issues. Despite advancements like container adoption, obstacles remain in managing complex infrastructures.

DevOps: The Funeral

The article explores DevOps' evolution, emphasizing reproducibility in system administration. It critiques mislabeling cloud sysadmins as DevOps practitioners and questions the industry's shift towards new approaches like Platform Engineering. It warns against neglecting automation and reproducibility principles.

Bad habits that stop engineering teams from high-performance

Bad habits hold engineering teams back from high performance. The article stresses the importance of observability in software development, including Elastic's role in OpenTelemetry. It also discusses CI/CD practices, cloud-native tech updates, data management solutions, mobile testing advancements, API tools, DevSecOps, and team culture.

Developer experience: What is it and why should you care? (2023)

Developer experience (DevEx) optimizes software development by empowering behaviors naturally. It enhances productivity, satisfaction, and collaboration among developers, leading to improved business outcomes. Generative AI and continuous feedback play key roles in DevEx advancement.

5 comments

By @tetha - 10 months

Mh. One thing I observe at work is: Increasing what I call the technical relativistic speed of deployments to insane levels is either impossible or trivial. Like, we have C++ code in the company with bindings to WIN32 APIs. We're not talking about speed with those things. But for a lot of relatively modern software, it's pretty easy to implement automation that can make a fairly robust deployment take a few minutes. Containers make this easier, but you can have the same thing with ruby (capistrano), python (liberal use of venvs), Java and so on on VMs as well. Most of this is just a bit of config management or container orchestration config.

However, quite a few dev-teams make many very stupid decisions and suddenly your 2 minute long deployment without downtime requires 4 weeks of coordination with customers, because people decided to include a breaking change in their API, as opposed to some incremental evolution. Or because people implemented a big-bang database migration, which will take hours of downtime, as opposed to some incremental 2-3 step database model evolution. Or they pile 23 steps you "have to manually do after a deployment, but just maybe and not always and it's not obvious when" on top. Or because people get scared because of either bad decisions or dumb stuff just happening. Or because people don't understand different rollout strategies to hedge risk and then they get scared and then they don't deploy and then everything explodes once they deploy a crapton of stuff at once. Which, naturally, catches on fire.
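As a rough illustration of the "incremental 2-3 step database model evolution" mentioned above, here is a minimal sketch of an expand, backfill, contract sequence using Python's built-in sqlite3 module. The `users` table, its column names, and the helper functions are hypothetical and not taken from the comment; a real system would adapt the statements to its own database and batch sizes.

```python
import sqlite3


def make_demo_db():
    # Hypothetical starting schema: name split across two NOT NULL columns.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, "
        "first_name TEXT NOT NULL, "
        "last_name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO users (first_name, last_name) VALUES (?, ?)",
        [("Ada", "Lovelace"), ("Alan", "Turing"), ("Grace", "Hopper")],
    )
    conn.commit()
    return conn


def expand(conn):
    # Step 1: add the new column next to the old ones.
    # Old application code keeps working because nothing was removed.
    conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")
    conn.commit()


def backfill(conn, batch_size=2):
    # Step 2: copy data in small batches, committing after each one,
    # so no single statement holds the table for long.
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET full_name = first_name || ' ' || last_name "
            "WHERE id IN ("
            "SELECT id FROM users WHERE full_name IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


# Step 3 (contract) is a later, separate deployment: once every reader
# and writer uses full_name, the old columns can be dropped.
```

Each step can ship as its own small deployment, and the first two can be run while the old code is still serving traffic.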

The question of how to roll out a change smoothly, silently and with little coordination seems to be an entirely alien black magic to some people.
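One common way to hedge risk during a rollout is to expose a change to a fixed percentage of users first and widen it gradually. The sketch below is one possible approach, not something described in the comment: the function name `in_rollout` and the feature name are hypothetical, and the bucketing uses a stable hash so the same user always lands in the same bucket.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    # Hash feature and user together so each feature gets its own
    # independent, but stable, assignment of users to buckets 0..99.
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    # Users whose bucket is below the threshold see the new behaviour.
    return bucket < percent


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    for percent in (1, 10, 50, 100):
        enabled = sum(in_rollout(u, "new-checkout", percent) for u in users)
        print(f"{percent:>3}% target: {enabled} of {len(users)} users enabled")
```

Because the bucket does not depend on the percentage, raising the threshold only adds users; nobody who already has the change loses it.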

And from there, a bunch of our new applications and services with less baggage actually become slower than our old veteran products with loads of experience and really ugly baggage behind them.

It's just utterly strange to me that we are approaching a level in which we in ops can confidently reconfigure, failover and restart database clusters on git-push and some dev-teams are so worried about swapping stateless code in their system.

By @hodgesrm - 10 months

> “It may be that the ubiquity of DevOps practices has allowed developers and organizations to increase the complexity of projects they are involved in, counteracting the benefits to development velocity..."

This. There's a thicket of projects and products to automate software delivery that in the end are no better than a shell script. (And in some cases a lot worse.) GitHub Actions are a case in point. They are fine for relatively simple tasks but generate inscrutable errors as the workflows become more complex. My team has spent countless hours debugging issues when dependencies change unexpectedly in runners.

By @radiator - 10 months

I have seen developers do quite complex devops things, like:

- write one helm chart as a base, but then one more deployment helm chart for every environment. The deployment helm charts only differ in values.yaml, not in the templates, and they override several values -- but not all. It reminds me of Object Oriented inheritance, but using helm. For bonus points, a value in the base and deployment chart might be named slightly differently. In order to know which one is used and which one is just a footgun, you must always check the templates.

- write a shell script which creates a git commit, which changes the version of a helm chart in some branch. The shell script is triggered by a GitHub Action. Then the gitops system notices the changed version in said branch, and synchronises K8s to reflect the change.

Those things might work half the time, but they are brittle and break the rest of the time. It can be a full time job to maintain this automation. It is a question if they would be better off deploying everything manually.
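The "slightly differently named value" footgun from the first bullet can at least be detected mechanically. The following is a minimal sketch, assuming the base and per-environment values have already been loaded into plain Python dicts (for example from values.yaml files); the function names and the sample keys are hypothetical.

```python
def flatten(values, prefix=""):
    # Turn nested dicts into a set of dotted key paths,
    # e.g. {"image": {"tag": "1.0"}} -> {"image.tag"}.
    keys = set()
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            keys |= flatten(value, path)
        else:
            keys.add(path)
    return keys


def unknown_overrides(base, override):
    # Keys set in the environment override that the base chart never
    # defines are likely typos or renamed values that nothing reads.
    return sorted(flatten(override) - flatten(base))


if __name__ == "__main__":
    base = {"replicaCount": 1, "image": {"repository": "app", "tag": "1.0"}}
    staging = {"replicas": 3, "image": {"tag": "1.1"}}
    print(unknown_overrides(base, staging))
```

In this example `replicas` is reported because the base only knows `replicaCount`. A check like this only compares key names; whether a template actually uses a given value still has to be verified against the templates.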

By @mr_toad - 10 months

> Worse still, these days, 41% of users report taking more than a week to restore service. In 2020, 34% could get things back up and running in just over a week.

It’s probably just an increase in the number of people who say they’re doing DevOps but really aren’t. I’ve worked with a number of organisations who claim to be doing DevOps but are still doing everything manually.
