Cheating Is All You Need
Steve Yegge discusses the transformative potential of Large Language Models in software engineering, emphasizing their productivity benefits, addressing skepticism, and advocating for their adoption to avoid missed opportunities.
The blog post by Steve Yegge discusses the transformative impact of Large Language Models (LLMs) on software engineering, likening their significance to that of the World Wide Web and cloud computing. Yegge expresses concern over the skepticism prevalent among engineers regarding LLMs, which he believes could lead to missed opportunities similar to those experienced with early technologies like AWS. He recounts personal anecdotes to illustrate how initial demos often precede major technological advancements. Yegge highlights a successful demonstration of ChatGPT generating functional Emacs-Lisp code from a simple prompt, emphasizing the potential productivity gains from using LLMs in coding. He critiques the notion that LLM-generated code cannot be trusted, arguing that trust in code is a fundamental issue in software engineering regardless of the source. The post concludes with a brief history of LLMs, noting their rapid evolution and the emergence of coding assistants that leverage these models to enhance programming tasks. Yegge advocates for embracing the capabilities of LLMs, suggesting they represent a significant leap forward in the field.
- Large Language Models are seen as a major technological advancement in software engineering.
- Skepticism among engineers may hinder the adoption of LLMs, similar to past hesitations with technologies like AWS.
- LLMs can significantly boost productivity by generating code that requires minimal human adjustment.
- Trust issues with LLM-generated code reflect broader challenges in software engineering regarding code reliability.
- The rapid evolution of LLMs has led to the development of coding assistants that enhance programming efficiency.
Related
> Peeps, let’s do some really simple back-of-envelope math. Trust me, it won’t be difficult math.
> You get the LLM to draft some code for you that’s 80% complete/correct.
> You tweak the last 20% by hand.
> How much of a productivity increase is that? Well jeepers, if you’re only doing 1/5th the work, then you are… punches buttons on calculator watch… five times as productive.
The hard part is the engineering, yes, and now we can actually focus on it. 90% of the work that software engineers do isn't novel. People aren't sitting around coding complicated algorithms that require PhDs. Most of the work is doing pretty boring stuff. And even if LLMs can only write 50% of that you're still going to get a massive boost in productivity.
Of course this isn't always going to be great. Debugging this code is probably going to suck, and even with the best engineering it'll likely accelerate big balls of mud. But it's going to enable a lot of code to get written by a lot more people and it's super exciting.
> You tweak the last 20% by hand.
> How much of a productivity increase is that? Well jeepers, if you’re only doing 1/5th the work, then you are… punches buttons on calculator watch… five times as productive
Except, it’s probably four times as hard to verify the 80% right code as it is to just write the code yourself in the first place.
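Both claims fit into one toy model. This is only a back-of-envelope sketch, and the verification-cost parameter is my own assumption rather than anything from the article or either comment; the point is that the quoted 5x only materializes if reviewing the LLM's 80% costs nothing.

```python
# Toy productivity model for the "LLM drafts 80%, you tweak 20%" claim.
# Effort is measured relative to writing everything by hand; verify_cost is
# the assumed cost of reviewing the LLM-drafted share, as a fraction of what
# writing that share yourself would have cost.

def speedup(hand_fraction: float, verify_cost: float) -> float:
    llm_fraction = 1.0 - hand_fraction
    total_effort = hand_fraction + llm_fraction * verify_cost
    return 1.0 / total_effort

print(speedup(0.2, 0.0))   # 5.0   -> the quote's "five times as productive"
print(speedup(0.2, 0.5))   # ~1.67 -> if reviewing costs half of writing
print(speedup(0.2, 1.0))   # 1.0   -> reviewing as costly as writing: no gain
```

Where the real verify_cost sits is the whole disagreement; neither side of the thread offers data for it.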
As far as I can tell, this is a really unnecessarily long-winded article that is basically just "make sure you have good prompt context." And yes, when you're doing RAG, you need to make sure you pull the right data for your context.
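For what it's worth, "pull the right data for your context" is mechanically simple even in toy form. The sketch below ranks candidate chunks by keyword overlap; a real RAG setup would use embeddings and a vector store, and the chunks and question here are invented purely for illustration.

```python
# Minimal retrieve-then-prompt sketch: rank candidate chunks against the
# question and stuff only the best matches into the prompt context.

def score(query: str, chunk: str) -> int:
    query_terms = set(query.lower().split())
    return sum(1 for term in set(chunk.lower().split()) if term in query_terms)

def build_context(query: str, chunks: list[str], top_k: int = 2) -> str:
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    return "\n---\n".join(ranked[:top_k])

chunks = [
    "The billing service retries failed charges three times.",
    "Charges are marked failed after the third retry and an alert fires.",
    "The frontend uses React with server-side rendering.",
]
question = "Why do failed charges trigger an alert?"
prompt = f"Context:\n{build_context(question, chunks)}\n\nQuestion: {question}"
# The prompt now carries the two billing chunks and drops the irrelevant one.
```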
Is this just a spam blog post for Sourcegraph?
> All you crazy MFs are completely overlooking the fact that software engineering exists as a discipline because you cannot EVER under any circumstances TRUST CODE.
is straight up insulting to me, because it effectively comes down to "use my product or you're a looney".
Also, while two years after this post (which should be labeled 2023) I've still barely tried to entirely offload coding to an LLM, the few times I did try have been pretty crap. I also really, really don't want to 'chat' with my codebase or editor. 'Chatting' to me feels about as slow as writing something myself, while I also don't get a good mental model 'for free'.
I am a moderately happy user of AI autocomplete (specifically Supermaven), but I only ever accept suggestions that are trivially correct to me. If it's not trivial, it might be useful as a guide of where to look in actual documentation if relevant, but just accepting it will lead me down a wrong path more often than not.
Cool, so I'm going to make five times more money? What's that you say, I'm not? So who is going to be harvesting the profits, then?
I will admit, I skimmed. No jury would convict.
A different slice of time, and a different set of opportunities -- as Yegge says, the recollection that if you hadn't pooh-poohed some prototype, you'd have $130 million based on an undeniably successful model -- gives you a different instinct.
I really do hear the -- utterly valid -- criticisms of the early Web and Internet when I hear people's skepticism of AI, including that it is overhyped. Overhyped or not, betting on the Internet in 1994 would have gotten you far, far more of an insight (and an income) than observing all its early flaws and ignoring or decrying it.
This is somewhat orthogonal to whether those criticisms are correct!
Past discussion: https://news.ycombinator.com/item?id=35273406
So, I scrolled back up to the top, and, lo and behold...
I'm positive about coding + LLMs, personally. I like it better than going without. I also think it's ok to just use a thing and be happy about it without having to predict whether it's going to be great or awful 10 years from now. It's ok to not know the future.
While we are here, I'm currently using Cursor as my programming assistant of choice. I assume things are moving fast enough that there are already better alternatives. Any suggestions?
My thoughts:
1. Will context windows grow enough so that we can afford to be lazy and drop entire knowledge bases into our queries?
2. Will the actual models of LLMs be able to more reliably capture data in a semi-structured way, perhaps bypassing the "data moat" that the author claims?
I'm not saying there's not a money volcano to be had somewhere, built on top of LLMs. But his call to action is stupid. Every other "money volcano" he cites was a clever use of existing technology to meet a previously-unseen but very real need.
LLMs are the mirror image of that. They are something that's clearly, obviously useful and yet people are struggling to leverage them to the degree that _ought_ to be possible. There are no truly slam dunk applications yet, let alone money volcanos (except for the AI hype train itself.)
They are the ultimate hammer in search of a nail. And it's a pretty amazing hammer, I'm not a complete LLM skeptic. But at this point we should all be searching for the best nails, not just cheerleading for the hammer.
I think it holds up. Saying "LLMs are useful for code" is a lot less controversial today than it was back in March 2023 - there are still some hold-outs who will swear that all LLM-produced code is untrustworthy junk and no good engineer would dream of using them, but I think their numbers are continuing to drop.
I don't really remember how it works, probably because I didn't have to think hard about the design; I just sort of iterated on it in conversation, which probably meant it didn't stick.
While the iteration part was mostly positive, I don't know that it was worth it in the end to be left with something that I can understand but feels a bit foreign at the same time.
2. If you let someone else solve your problems, and you don't reflect on the solution, you won't learn.
3. I rarely read the fine print, but are all LLM IDEs really the same when it comes to intellectual property of the code? Or responsibility for bad code?
This is the problem. Managers think that the code is the product. It isn't. And LLMs can only produce the code.
- The parent article complains about a lot of "meh" responses by engineers. I often feel very "meh" about LLMs and don't use them very often. It's not because I hate them, but it is because the context window I'd need to feed into an LLM is just a bit too big to be tractable for having it help me code. What I need is for my LLM code assistant to be able to comfortably accept ~7-10 git repositories, each with a code+comment line count of ~100k lines of novel (not in the training set) code, and be able to answer my questions and write somewhat useful code. I'm working in big, old, corporate, proprietary codebases, and I need an LLM to be even a little helpful with that. Maybe that's possible and easy today, but I don't know how to do that, so I don't. I'm open to suggestions. (For a sense of scale, see the rough token math sketched after this list.)
- When I talk to a co-worker, even a junior engineer, I feel like I can have sensible discussions with them even if they may not know exactly what I'm talking about. I can trust that they'll say things like "I don't really know all the details, but it sounds like...", or they'll ask follow up questions. I do not have to tell my co-workers to ask follow up questions or to tell me about the bounds of their understanding. We're human, we talk. When I try to chat with an LLM about business problems, I feel more like I'm trying to chat with a 15 year old who is desperately trying to sound knowledgeable, where they will try their best to confidently answer my questions every time. They won't ever specify "I don't know about that, but tell me if this sounds right...". Instead, they'll confidently lie to my face with insanely fabricated details. I don't want to have to navigate a conversation with the background question of "is this the fabrications of a madman?" in the back of my head all the time, so I don't. Maybe that's the wrong value calculus, and if so, I'm open to being persuaded. Please, share your thoughts.
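On the first bullet's scale problem, a rough back-of-envelope (every number below is a guess, not a measurement) shows why naively pasting those repos into a prompt doesn't work, and why the tooling falls back on retrieval instead:

```python
# Rough estimate: do ~10 repos of ~100k lines each fit in one context window?
repos = 10
lines_per_repo = 100_000
tokens_per_line = 10        # assumption: code averages very roughly 8-12 tokens per line

total_tokens = repos * lines_per_repo * tokens_per_line
context_window = 200_000    # assumption: a large present-day window

print(f"~{total_tokens:,} tokens of code")                                    # ~10,000,000
print(f"~{total_tokens / context_window:.0f}x over a {context_window:,}-token window")  # ~50x
```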
Since code is one of those things you can train LLMs on with RL – not RLHF – so they can actually do goal-based AI planning, all the people who say "I don't trust LLM code" are missing how beside the point that is.
A lot of code in this world gets accepted even when it is suboptimal, and people don't realize how much money there is to be made by writing that kind of code with ML models.