October 31st, 2024

Get me out of data hell

A senior engineer reflects on the chaotic experience of working with a complex data warehouse, criticizing the toxic culture and inefficiencies, while valuing the lessons learned and camaraderie among teammates.

In a reflective piece, a senior engineer describes the chaotic and frustrating experience of working with an enterprise data warehouse platform, humorously dubbed the "Pain Zone." The engineer highlights the convoluted architecture, which involves over a hundred operations for a task that should be straightforward, leading to inefficiencies and confusion. The culture within the organization is critiqued for fostering a sense of dread and disempowerment among engineers, who feel pressured to work quickly despite the poor quality of the codebase. The engineer recounts a specific day of navigating the Pain Zone, where they discover that logs are filled with nonsensical data, making it impossible to determine if data is being successfully processed. Despite the challenges, the team maintains a camaraderie that helps them cope with the absurdities of their work environment. The engineer plans to leave the company soon, viewing the experience as a painful but valuable lesson in resilience and craftsmanship. The narrative serves as a commentary on the pitfalls of poor data management practices and the impact of workplace culture on software engineering.

- The enterprise data warehouse platform is overly complex, with unnecessary operations complicating simple tasks.

- A toxic culture of fear and judgment hampers engineers' ability to work effectively and prioritize quality.

- The team relies on camaraderie to navigate the challenges of their chaotic work environment.

- The engineer plans to leave the company, viewing the experience as a lesson in resilience.

- Poor data management practices lead to significant inefficiencies and confusion in the workflow.

48 comments
By @baazaa - 6 months
People always say this guy just has had bad luck with his employers but I live in Melbourne and work in data and reckon the whole industry is a scam.

Like why didn't anyone catch the issue with the logs? Because it doesn't matter, every data team is a cost-centre that unscrupulous managers use to launch their careers by saying they're big on AI. So nothing works, no-one cares it doesn't work, most of the data engineers are incapable of coding fizzbuzz but it doesn't matter.

People always wonder why banks etc. use old mainframes. There's like a 0% success rate for new data projects. And that 0% includes projects which had launch parties etc. but no-one ever used the data or noticed how broken it was. I don't think a lot of orgs which use data as core-infra could modernize, the industry is just so broken at this point I don't think we can do what we did 30 years ago.

By @iamthepieman - 6 months
I do not use this term to refer to myself. I respect those who do and respect the meaning behind it but am just old enough that it feels alien to me 99% of the time.

But I am SO triggered by this piece. I had that intrusive feeling you sometimes get when driving where you think, "I could just close my eyes and see what happens", or "That cliff is so close and the guardrail doesn't really extend far enough."

Only for my career. Like I should just not show up on Monday. I should get in the car and drive far away and change my name and work at a nice retail joint in a mid-sized town.

I'm going to need to sit and stare into the distance for an hour and 3.

By @reverius42 - 6 months
> I've even degraded team morale because I've convinced some of the engineers that things should be better, but not management, so now some of the engineers are upset.

Oof, that hits a little close to home.

By @tokinonagare - 6 months
> At two of the four businesses I've worked at, the most highly-performing engineers have resorted to something that I think of as Pain Zone navigation. It's the practice of never working unless pair programming [...] The fear and dread comes from a culture where people feel bad that they can't work quickly enough in the terrible codebase

Exactly why I burned-out at work, worked at most 2 hours per day on a good day and finally was ejected from the project after a PM that graduated last year from school noticed and went after my head. Author is a wizard for describing the situation this well.

It's been 3 days since I've been free from the tyranny of Jira and project managers, and I've worked more on my personal projects than I did in a week at my former workplace.

By @ctippett - 6 months
> of course, we're serverless, because how can you hurt yourself without a cutting-edge?

A beautiful epigram.

By @bartread - 6 months
I’m going to read the rest of this. I’m enjoying it. But, simultaneously, part II has me so triggered - it bears striking resemblance to repeated situations I’ve encountered where the meaning and content of columns in a relational database were overloaded in varying degrees of heaviness (which is a practice I absolutely detest) - that I need to take a short break.
By @zombiwoof - 6 months
Data “engineering” is where all the cool kids go with no clue and create insane architectures to justify their incompetence
By @halfcat - 6 months
What’s the solution to wrangling these data projects?

The author’s experience is not far off from my own.

1. Any solution in place can only be understood by the person who created it

2. ”No, we can’t change that because then we’d have to validate everything from scratch again”

And therefore, as the author says:

> ”we'll continue with the work instead of fixing the critical production error”

I’m honestly not sure how to address it either. With traditional software dev we’d write tests, incorporate those into CI/CD, and start to course correct. We can use sample data to validate the code does what we think it does and that we didn’t break it.

But in these data projects, it’s not only the code that’s changing, but the data is also a moving target. You can write a test with sample data, but tomorrow your data might change because someone in sales added a custom field to the CRM, or IT upgraded the accounting software and all of the unique IDs changed, or someone upgraded their Excel version, or whatever.

And your code that works on the sample data needs to handle all of this, which obviously it can’t. You can try to validate the data somehow, check the schema, check if the number of rows hasn’t doubled or halved, and so forth, and then stop it from importing until you look into it, but also you can’t stop inbound data because an exec has a meeting in a few hours and expects their report to be updated.
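
The checks described above (compare the schema, compare the row count against the previous load, then hold the import) could be sketched roughly as follows. This is only an illustration: `check_batch`, `BatchCheck`, the column names, and the factor-of-two threshold are all made up for the example, and rows are assumed to arrive as plain dictionaries.

```python
from dataclasses import dataclass


@dataclass
class BatchCheck:
    ok: bool
    reasons: list


def check_batch(rows, expected_columns, previous_row_count, max_ratio=2.0):
    """Return whether a batch looks safe to import, with reasons if not."""
    reasons = []

    # Schema check: every row must have exactly the expected column names.
    expected = set(expected_columns)
    for i, row in enumerate(rows):
        if set(row) != expected:
            missing = sorted(expected - set(row))
            extra = sorted(set(row) - expected)
            reasons.append(f"row {i}: missing={missing} extra={extra}")
            break  # one example is enough to flag the batch

    # Volume check: flag a batch that has roughly doubled or halved.
    if previous_row_count > 0:
        ratio = len(rows) / previous_row_count
        if ratio >= max_ratio or ratio <= 1 / max_ratio:
            reasons.append(f"row count changed by a factor of {ratio:.2f}")

    return BatchCheck(ok=not reasons, reasons=reasons)


# Hypothetical usage: someone in sales added a custom field to the CRM export.
rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 5, "custom_field": "x"}]
result = check_batch(rows, ["id", "amount"], previous_row_count=2)
print(result.ok, result.reasons)
```

The function only reports; it does not raise. Whether a flagged batch is held back or imported with a warning attached is left to the caller, which is where the exec-with-a-meeting problem still has to be decided by a person.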

I heard something about “data contracts” that’s supposed to address this, but it sounds like the next in a long line of buzz words intended to get management to buy another data product.

Has anyone worked in this kind of project that went well?

By @holden_nelson - 6 months
I went down the rabbit hole of this blog after reading this post. This person's blog is amazing. I particularly appreciated this piece: https://ludic.mataroa.blog/blog/quitting-my-job-for-the-way-...
By @tofflos - 6 months
> Like why didn't anyone catch the issue with the logs?

I see questions like these a lot and every time I feel that people immensely underestimate the effort required for curating data. In my experience data can only ever be as good as what it's being used for and in this story the logs haven't been used for this purpose before so they're not going to be any good.

It's some sort of data variation on the second law of thermodynamics - entropy is winning. Going in with the expectation that things should be better will only lead to frustration.

By @jauntywundrkind - 6 months
The observability world still regards itself as a system for monitoring, but reading (and sometimes seeing) how these systems just go so bad continues to drive a conviction that perhaps their strategies and tools should become bigger. That they should converge with business pipelines.

We shouldn't just have wide events/big spans emitted... We should have those spans drive the pipeline. Rather than observability being a passive monitoring system, if we write code that reacts to events we are capturing, then we shuffle towards event sourcing.
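
As a toy illustration of that idea, the sketch below uses one in-process object that both records every event and dispatches it to handlers, so the record of what happened and the thing that drives the next step are the same stream. The `EventBus` class, the event names, and the handlers are invented for the example; a real system would sit on an actual tracing or messaging backend rather than a Python list.

```python
from collections import defaultdict


class EventBus:
    """Records every event and dispatches it to subscribed handlers."""

    def __init__(self):
        self.log = []                      # append-only record of everything emitted
        self.handlers = defaultdict(list)  # event name -> list of callables

    def subscribe(self, name, handler):
        self.handlers[name].append(handler)

    def emit(self, name, **attributes):
        event = {"name": name, **attributes}
        self.log.append(event)             # the observability side
        for handler in list(self.handlers[name]):
            handler(self, event)           # the pipeline side


def on_file_landed(bus, event):
    # Hypothetical step: pretend to parse the file, then announce the result.
    bus.emit("rows_parsed", path=event["path"], rows=event["size"] // 10)


def on_rows_parsed(bus, event):
    # Hypothetical step: pretend to load the parsed rows.
    bus.emit("rows_loaded", path=event["path"], rows=event["rows"])


bus = EventBus()
bus.subscribe("file_landed", on_file_landed)
bus.subscribe("rows_parsed", on_rows_parsed)
bus.emit("file_landed", path="in/orders.csv", size=120)

for event in bus.log:
    print(event)
```

Because each step runs only in reaction to a recorded event, the log doubles as a replayable history of the pipeline, which is the shuffle towards event sourcing described above.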

Given how badly coupled together with shoestring glue & good wishes so many systems are, how opaque these pain zones are, it feels like the centralization upon existing industry standard protocols to capture events (which imo include traces) is a clear win.

(Obvious downside, these systems become mission critical, business process & monitoring both.)

By @brianhorakh - 6 months
Wonderful prose. I am in Melbourne also. I possibly used to work at the same place but I'm not sure.

I resigned due to the night terrors caused by the cyber security issues I saw everywhere. The more I explored and understood the more sleep I lost.

By @pards - 6 months
> The word enterprise means that we do this in a way that makes people say "Dear God, why would anyone ever design it that way?"

Thank you for this phrase; I'll quote it at every opportunity.

By @salt-thrower - 6 months
Beautifully written and fun to read. Blog posts like this give me a boost of mental strength to keep going during my worst episodes of burnout.
By @pxc - 6 months
This blog post rescheduled all my appointments, tucked me in, sang me a lullaby, then woke me up with coffee and breakfast late the next morning. I am healed.

For real, a fun and refreshing read (if also a little haunting).

By @hinkley - 6 months
> The word enterprise means that we do this in a way that makes people say "Dear God, why would anyone ever design it that way?"

I feel this comment in my bones.

By @Muromec - 6 months
Great piece of writing from someone who truly cares about craft and suffers from the feeling that this craft is not what they are paid for.

Add: for people who share the feeling -- you can work in a place where velocity isn't all, managers are not assholes and you can dedicate yourself to craft.

By @hinkley - 6 months
This is the “I saved my company half a million dollars in about five minutes” person who got in trouble for saving them half a million dollars.
By @jitl - 6 months
I wonder what company they’re describing here. It sounds like so many self-inflicted problems that you could undo or set right in a couple of weeks if you had the time and latitude to make changes across the system instead of being confined to a small area of team ownership.
By @marcosdumay - 6 months
The link about Scrum has a link to this:

https://agile2.net/

Can someone, please, tell me this is a joke. Because I can't be certain, but it doesn't look like one.

By @jaygreco - 6 months
I really like the author’s writing style here. The quips about the tea especially.
By @snidane - 6 months
Looks like the classic mistake of every data team. Every single office person works with data in one way or another. Having a team called 'data' just opens a blank check for anyone in the organization to dump every issue and every piece of garbage on this team as long as they can identify it as data.

That's why you build data platforms and name your team accordingly. This is a much easier position to defend, where you and your team have a mandate to build tools for others to be efficient with data.

If upstream provides funky logs or jsons where you expect strings, that's for your downstream to worry about. They need the data and they need to chase down the right people in the org to resolve that. Your responsibility should be only to provide a unified access to that external data and ideally some governance around the access like logging and lineage.

Tldr; Open your 'data' mandate too wide and vague and you won't survive as a team. Build data platforms instead.

By @erulabs - 6 months
As a regular old “platform engineer” I fight to ignore “data platform” tasks. There’s no target to hit, it’s just moving sand around a sand box.

If you want an answer to a specific question, we can spin up a read replica and a Metabase and write a query in an afternoon, cool. I’ll get you a chart, we’ll move on. If you want “a data analytics platform to enable blah blah blah” I’m out, I can’t do it. My eyes won’t focus, my hands stop moving.

Developers sometimes tell me stuff like “Kubernetes is too complex”, “jeez React is a pain”. I send those quotes to my friends stuck writing 195 step DAGs to transform log files from s3 into s3 so they can eventually land in s3 - ah yes but they’re parquet somewhere in between, and that matters for some reason. We laugh together, but I can see it hurts them more than I intended.

Life is too short to faff about doing nothing. Go join a company with less than 100 engineers and learn to be happy again. Let the enterprises burn, we’ll all be better for it.

Anyways this was a fantastic piece, I hope this person writes their book after all.

By @LAC-Tech - 6 months
This has such strong Australian vibes.

AU dev scene is not great. Really heavy with POs, PMs and CTOs without the background.

By @lifeisstillgood - 6 months
>>> pretending that any of this is more important than hiring competent people and treating them well. I could build something superior to this with an ancient laptop, an internet connection, and spreadsheets.

Ow

By @t420mom - 6 months
I feel like there's a whole new generation of tech workers that need to read _Zen and the Art of Motorcycle Maintenance_.
By @paulsutter - 6 months
Understanding that IT projects are difficult gives us more empathy. Gartner says that 80% of corporate IT projects are considered failures. McKinsey says that 17% of large projects fail so badly that the company's existence is threatened. Standish Group says only 10% of projects succeed.
By @nathancspencer - 6 months
> of course, we're serverless, because how can you hurt yourself without a cutting-edge?

Brilliant

By @mrlonglong - 6 months
This makes me very glad I now write software that drives hardware.
By @zbyforgotp - 6 months
Bullshit jobs once again. I don’t know. These companies are complex systems.

He is writing as if the engineers all knew how to fix the systems, but were just powerless to do that. But I’ve also seen projects led by engineers that only added to the overall complexity.

There is a paradox in this - the people who seem the most confident about fixing the systems usually only make things worse. Chesterton fences and stuff.

This article triggers me because everybody who reads it will always believe that they would fix the mess if only they got the power, but in practice when they get power they would only add new complexity to the whole mess.

By @sethammons - 6 months
> Like why didn't anyone catch the issue with the logs?

Because there were no automated tests. If the company needs something to work, that thing needs a, preferably automated, test.
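
For the log case specifically, such a test might look like the sketch below. It assumes, purely for illustration, that each log line is meant to be a JSON object with `timestamp`, `job`, and `status` fields; the article does not say what the real log format was, and `malformed_lines` is a made-up helper.

```python
import json

REQUIRED_FIELDS = {"timestamp", "job", "status"}  # hypothetical log schema


def malformed_lines(lines):
    """Return (line_number, problem) for each log line that fails the checks."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            problems.append((number, f"missing {sorted(missing)}"))
    return problems


def test_malformed_lines_are_reported():
    sample = [
        '{"timestamp": "2024-10-31T09:00:00Z", "job": "load_orders", "status": "ok"}',
        '{"timestamp": "2024-10-31T09:05:00Z", "job": "load_orders"}',
        "java.lang.Object@1b2c3d",
    ]
    assert malformed_lines(sample) == [
        (2, "missing ['status']"),
        (3, "not valid JSON"),
    ]


test_malformed_lines_are_reported()
```

Run against a sample of real pipeline output in CI, a check like `malformed_lines(...) == []` would fail the first time the logs stopped carrying usable information, rather than months later when someone finally needs them.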

By @sourcepluck - 6 months
I had read and very much enjoyed the "AI silence or you'll be Piledriven into next week" post, without clicking on anything else on the author's blog. It was click link, read whole thing, love it, send it to one or two people, move on.

Very happy to see this here, realise it's the same person and that this is "a thing", and then to rollick in the author's backlog. A joy! Raucous real-life laughter has exploded from me on numerous occasions along with most articles. I think I've read 5 in a row there, and my brain is buzzing happily.

Thank you to the author for having the courage to write about real experiences. A breath of fresh air. I look forward to future books and articles, and reading more previous work, and cross my proverbial fingers hoping they can keep it real in the face of what will presumably be an avalanche of grifters looking to leech off the attention.

By @0xbadcafebee - 6 months
I've worked for like a dozen companies full-time. Most people don't know what they're doing. I always thought 'impostor syndrome' was a projection of general insecurity. But I've started to think it's actually the subconscious saying "I'm not sure what's right or wrong, please consult an expert."

I have a fantasy of quitting my job to write books on the [modern] theory and practice of information systems engineering. Not 'how to write software', that's been done; I mean all the forms of engineering around software/information systems. In my dream, I write the books, everyone reads them, and starts doing their jobs right.

But then I remember, I, a person arrogant enough to believe he knows how to do things right, still can't get shit done right. Maybe if I were a one-man company, I could 'do everything right', and feel good about the result. But I depend on an entire company of people to do the right thing, in the right way, at the right time. That's hard even with the best people. No company is made up of the best people. It's always a mix of the best, worst, and in-between.

Strangely, a company can put out a decent product, despite the company being a tire fire. This is some comfort when you get older. You realize that everything being shit is okay, as long as the bills are paid. I have PTSD from when the thing that paid the bills was on fire, every week, for years. Lately at every job I have, I internally panic and scream at how horrible everything is. Because I'm haunted by what might happen. But it's not happening yet. So I muffle the screams, smile and nod along with the stand-up-meeting-cum-status-update.

The sad thing is, I forget that it's okay that the stand-up is shit. I forget that I'm still getting a fat paycheck just to sit in meetings that could have been an e-mail. I forget that, despite the company bleeding cloud costs [no savings plans, RIs, serverless, right-sizing, etc], we seem to be making a profit. Despite the terrible designs, bad process, ineffective leadership, absentee management, lack of security, and all the rest, the bottom line is fine. The shit is fine. Currently, and probably for the unforeseeable future.

I get craftsmanship. I'm a crappy woodworker. I enjoy making things well, and getting better at it. But our jobs are not fine woodworking. Our jobs are construction. We are banging rusted nails into shitty, twisted, racked, cupped, knotty-ass studs. If we're lucky. Yeah, this building is going to be shit. But somebody's still going to pay for it. And there'll be another job after. If we really wanted fine woodworking, we never would have taken this job, and we know it. We'd be struggling to sell a cabinet that took us two 80+ hour weeks, too tired to appreciate its beauty, too defeated by flaws only we notice.

So let's stop beating ourselves up. Let's stop beating each other up. We don't, can't, won't, find meaning in this monument to mediocrity. No comfort from the pain zone. No pride to take home. But we are paying the bills, with more left over than most have. No broken backs and long hours. No lack of health care, no abuse from customers or the public. Not even that big a worry about job security. We are the lucky ones. We are blessed with a golden shovel. So let's do like those blue collar laborers we often idolize, and get to this annoying, bloody awful work that we are blessed with.

By @anitil - 6 months
I really enjoy this writing for a couple of reasons. Firstly, Australian. Secondly the acerbic wit just tickles my fancy

> we're serverless, because how can you hurt yourself without a cutting-edge?

Just perfect.

By @lucidguppy - 6 months
> I could build something superior to this with an ancient laptop, an internet connection, and spreadsheets. It would take me a month tops.

^^^ Then do it... and then strangler fig the original.

By @djoldman - 6 months
This kind of thing is pretty typical.

The older the company is, the more likely one finds this morass.

It won't change absent powerful technical leadership.

By @lucidguppy - 6 months
> because we don't have any tests,...

Right there ^^^

By @ramshanker - 6 months
Acting against your own better judgement, I have to take that fight/urge every few days. ;)
By @javajosh - 6 months
Tempting to write a response piece, "Get me into data hell, I need the money."
By @codethief - 6 months
Largely unrelated, but "data hell" reminds me of this classic from Silicon Valley: https://m.youtube.com/watch?v=YPgkSH2050k
By @kayo_20211030 - 6 months
Good Grief! Prolix?

> coated in grass which rends those who tread upon it like a legion of upraised spears,

and there's more. Get to the point, whatever it is.

By @jacobyoder - 6 months
"Suffice it to say that while people are sincerely trying their best, our leaders are not even remotely equipped to handle the volume of people just outright lying to them about IT."

I've tried to come up with some heuristic to determine whether or not a team is competent, good, or doomed. I've been exposed to all over the last... 8-10 years, and one of the key things I've noticed is the ratio of competent/skilled developers to the unskilled ones is a big ... indicator(?). Predictor?

Colleague of mine has been working with a team - dev team has ranged from 5-8 people over the last few years. Few people seem to have any grasp of programming at all. Only two people - my colleague and one other - have ever taken projects from ideas to delivery, or even taken features from requests to successful rollout of already functioning software.

The arguments that people get into there - days or weeks of people 'researching' whether or not OAuth 'really' requires 'refresh tokens' or whether it's really supposed to be a JWT. Management has some notion of 'every voice is legitimate and should be heard - we don't support bullying' and so on.

If you have a team of 10, and 1 or 2 people are simply bad at having the ability to think somewhat abstractly, you can survive.

If that number hits, say, 4-5... the team will struggle. A lot. You can keep things going, but it will be slow. And everything becomes a battle.

If that number becomes 7 or 8, and you only really have 1-2 developers who are actually competent developers... things will continue to spiral downward.

On the other side - I worked with a team of about 8-10 people on a 6 month contract. The larger org had another 40 or so folks, handling other projects, and support. Onboarding was great - I pushed production code in the first week. Everyone on the team was competent, including the juniors. I had more development experience, but they had more company experience, and it was really a relatively enjoyable engagement overall.

It was refreshing to be able to ask anyone on the team questions, and either get a workable answer, or an "I'm not sure, let's check with XYZ" to get working answers. The "oh, yeah, it's ABC" when ABC is clearly not the answer stuff never happened. People committing code and pushing to production without ever having run the code at all - I've experienced that - didn't happen - that's happening to my colleague.

The problem with a plurality of tech-incompetent folks in a tech group is that they honestly can not determine that they aren't competent. The only examples of competence are in the minority, and tend to not be trusted (even though that minority is the only portion that turns out working/functional code).

Leaving ends up being the only option in those cases. My colleague is only at his place part time, and has hung around because they've gone through some restructuring where new folks were brought in, and... you hope that things might get better in a few months, then realize they don't.

By @_jonas - 6 months
I'm excited for LLM applications that can set up, monitor/validate, and optimize data pipelines at scale. Seems possible soon given that SQL and most data records aren't intended to be human-friendly.
By @SynasterBeiter - 6 months
I hate whining like this. Just do your job and get over it. No need to be theatrical about it.