January 7th, 2025

How I Program with LLMs

The author discusses the positive impact of large language models on programming productivity, highlighting their uses in autocomplete, search, and chat-driven programming, while emphasizing the importance of clear objectives.


The author shares their experiences using large language models (LLMs) in programming over the past year, highlighting their positive impact on productivity. They emphasize a proactive approach to integrating LLMs into their workflow, which has led to the development of a tool for Go programming called sketch.dev. The author identifies three primary uses for LLMs: autocomplete, search, and chat-driven programming. Autocomplete enhances productivity by reducing mundane typing, while LLMs provide better answers to complex questions compared to traditional search engines. Chat-driven programming, though challenging, offers significant value by generating initial drafts and ideas, especially when the programmer lacks the energy to start from scratch. The author notes that effective use of LLMs requires clear objectives and manageable complexity to avoid confusion. They also discuss the advantages of smaller code packages, which facilitate LLM interactions and improve code readability. The author concludes that while LLMs can produce errors, they are adept at correcting mistakes when provided with feedback. Overall, the integration of LLMs into programming practices has proven beneficial, particularly in product development contexts.

- LLMs have a net-positive effect on programming productivity.

- Key uses of LLMs include autocomplete, search, and chat-driven programming.

- Smaller code packages enhance LLM interactions and improve code readability.

- Effective use of LLMs requires clear objectives and manageable complexity.

- LLMs can produce errors but are capable of correcting them with feedback.

Related

AI: What people are saying
The discussion around the impact of large language models (LLMs) on programming productivity reveals several key themes.
  • Many users find LLMs beneficial for autocomplete and search functionalities, significantly enhancing their coding efficiency.
  • Chat-driven programming is seen as a double-edged sword; while it can provide useful starting points, it often generates buggy or incomplete code that requires substantial manual correction.
  • Some programmers express skepticism about relying on LLMs for serious projects, emphasizing the importance of understanding the code and maintaining quality.
  • Concerns about the long-term implications of LLMs on junior developers and the potential for increased technical debt are prevalent.
  • There is a recognition that LLMs serve as valuable tools for brainstorming and debugging, acting as a "thought partner" in the coding process.
56 comments

By @dewitt - 11 days
One interesting bit of context is that the author of this post is a legit world-class software engineer already (though probably too modest to admit it). Former staff engineer at Google and co-founder / CTO of Tailscale. He doesn't need LLMs. That he says LLMs make him more productive at all as a hands-on developer, especially around first drafts on a new idea, means a lot to me personally.

His post reminds me of an old idea I had of a language where all you wrote was function signatures and high-level control flow, and maybe some conformance tests around them. The language was designed around filling in the implementations for you. 20 years ago that would have been from a live online database, with implementations vying for popularity on the basis of speed or correctness. Nowadays LLMs would generate most of it on the fly, presumably.
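
A sketch of that idea in today's terms, in Go with hypothetical names: only the signature and a conformance test are hand-written, and the body is left for the tooling (now, an LLM) to fill in until the test passes.

    package shapes

    import (
        "math"
        "testing"
    )

    // Area is deliberately a stub: in the imagined language, only this
    // signature and the conformance test below are hand-written, and
    // the body is generated until the test passes.
    func Area(r float64) float64 {
        panic("unimplemented: to be generated")
    }

    // TestArea is the hand-written conformance test pinning down behavior.
    func TestArea(t *testing.T) {
        got := Area(2)
        want := 4 * math.Pi
        if math.Abs(got-want) > 1e-9 {
            t.Fatalf("Area(2) = %v, want %v", got, want)
        }
    }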

Most ideas are unoriginal, so I wouldn't be surprised if this has been tried already.

By @highfrequency - 10 days
> A lot of the value I personally get out of chat-driven programming is I reach a point in the day when I know what needs to be written, I can describe it, but I don’t have the energy to create a new file, start typing, then start looking up the libraries I need... LLMs perform that service for me in programming. They give me a first draft, with some good ideas, with several of the dependencies I need, and often some mistakes. Often, I find fixing those mistakes is a lot easier than starting from scratch.

This to me is the biggest advantage of LLMs. They dramatically reduce the activation energy of doing something you are unfamiliar with, much in the way that you're a lot more likely to try kitesurfing if you're at the beach standing next to a kitesurfing instructor.

While LLMs may not yet have human-level depth, it's clear that they already have vastly superhuman breadth. You can argue about the current level of expertise (does it have undergrad knowledge in every field? PhD-level knowledge in every field?) but you can't argue with the breadth of fields, nor with the fact that the level of expertise improves every year.

My guess is that the programmers who find LLMs useful are people who do a lot of different kinds of programming every week (and thus are constantly going from incompetent to competent in things that other people already know), rather than domain experts who do the same kind of narrow and specialized work every day.

By @mlepath - 11 days
The first rule of programming with LLMs is: don't use them for anything you don't already know how to do. If you can look at the solution and immediately see what's wrong with it, they are a time saver; otherwise...

I find chat for search really helpful (as the article states).

By @wdutch - 11 days
I no longer work in tech, but I still write simple applications to make my work life easier.

I frequently use what OP refers to as chat-driven programming, and I find it incredibly useful. My process starts by explaining a minimum viable product to the chat, which then generates the code for me. Sometimes, the code requires a bit of manual tweaking, but it’s usually a solid starting point. From there, I describe each new feature I want to add—often pasting in specific functions for the chat to modify or expand.

This approach significantly boosts what I can get done in one coding session. I can take an idea and turn it into something functional on the same day. It allows me to quickly test all my ideas, and if one doesn’t help as expected, I haven’t wasted much time or effort.

The biggest downside, however, is the rapid accumulation of technical debt. The code can get messy quickly. There's often a lot of redundancy and after a few iterations it can be quite daunting to modify.

By @nemothekid - 11 days
I think "Chat driven programming" is the most common type of the most hyped LLM-based programming I see on twitter that I just can't relate to. I've incorporated LLMs mainly as auto-complete and search; asking ChatGPT to write a quick script or to scaffold some code for which the documentation is too esoteric to parse.

But when having the LLM do things for me, I frequently run into issues where it feels like I'm wasting my time with an intern. "Chat-based LLMs do best with exam-style questions" really speaks to me; however, I find that constructing my prompts so that the LLM does what I want uses just as much brainpower as just programming the thing myself.

I do find ChatGPT (o1 especially) really good at optimizing existing code.

By @notjoemama - 11 days
Our company has a no-AI-use policy. The assumption is zero trust. We simply can't know whether a model or its framework could or would send proprietary code outside the network, so it's best to assume that any LLM/AI tool does, or will, send code or fragments of code. While I applaud the incredible work by their creators, I'm not sure how a responsible enterprise-class company could rely on "trust us bro" EULAs or repo readmes.

By @Ozzie_osman - 11 days
One mode I felt was missed was "thought partner", especially while debugging (aka rubber ducking).

We had an issue recently with a task queue seemingly randomly stalling. We were able to arrive at the root cause much more quickly than we would have otherwise because of a back-and-forth brainstorming session with Claude, which involved describing the issue we were seeing, pasting in code from the library to ask questions about it, asking it to write some code to add some missing telemetry, and then probing it for ideas on what might be going wrong. An issue that may have taken days to debug took about an hour to identify.

Think of it as rubber ducking with a very strong generalist engineer who knows about basically any technical concepts.

By @nunez - 11 days
I definitely respect David's opinion given his caliber, but pieces like this make me feel strange that I just don't have a burning desire to use them.

Like, yesterday I made some light changes to a containerized VPN proxy that I maintain. My first thought wasn't "how would Claude do this?" Same thing with an API I made a few weeks ago that scrapes a flight data website to summarize flights in JSON form.

I knew I would need to write some boilerplate and that I'd have to visit SO for some stuff, but asking Claude or o1 to write the tests or boilerplate for me wasn't something I wanted or needed to do. I guess it makes me slower, sure, but I actually enjoy the process of making the software end to end.

Then again, I do all of my programming on Vim and, technically, writing software isn't my day job (I'm in pre-sales, so, best case, I'm writing POC stuff). Perhaps I'd feel differently if I were doing this day in, day out. (Interestingly, I feel the same way about AI in this sense that I do about VSCode. I've used it; I know what's it capable of; I have no interest in it at all.)

The closest I got to "I'll use LLMs for something real" was using it in my backend app that tracks all of my expenses to parse pictures of receipts. Theoretically, this will save me 30 seconds per scan, as I won't need to add all of the transaction metadata myself. Realistically, this would (a) make my review process slower, as LLMs are not yet capable of saying "I'm not sure" and I'd have to manually check each transaction at review time, (b) make my submit API endpoint slower since it takes relatively-forever for it to analyze images (or at least it did when I experimented with this on GPT4-turbo last year), and (c) drive my costs way up (this service costs almost nothing to run, as I run it within Lambda's free tier limit).

By @bangaladore - 11 days
The killer feature of LLMs for programming, in my opinion, is autocomplete (the simple Copilot feature). I can probably be 2-3x more productive, as I'm not typing (or thinking much). It does a fairly good job of pulling in nearby context to help it, and that's even without a language server.

Using it to generate blocks of code in a chat-like manner, in my opinion, just never works well enough in the domains I use it on. I'll try to get it to generate something and then, by the time I get some functional result, realize I could've done it faster and more effectively myself.

Funny enough, other commenters here hate autocomplete but love chat.

By @LouisSayers - 10 days
The use of LLMs reminds me a bit of how people use search engines.

Some years ago I gave a task to some of my younger (but intelligent) coworkers.

They spent about 50 minutes searching in google and came back to me saying they couldn't find what they were looking for.

I then typed in a query, clicked one of the first search results and BAM! - there was the information they were unable to find.

What was the difference? It was the keywords / phrases we were using.

By @Balgair - 10 days
I'm not a 'programmer'. At best, I'm a hacker. I don't work in a team. All my code is mostly one-time usage to just get some little thing done, sometimes a bit of personal stuff too. I mostly use Excel anyway, and then Python, and even then, I hate Python because half the time I'm just dealing with library issues (not a joke, I measured it (and, no, I'm not learning another language, but thank you)). I'm in biotech, a very non-code-y section of it too.

LLMs are just a life saver. Literally.

They take my code time down from weeks to an afternoon, sometimes less. And they're kind.

I'm trying to write a baseball simulator on my own, as a stretch goal. I'm writing my own functions now, a step up for me. The code is to take in real stats, do Monte Carlo, get results. Basic stuff. Such a task was impossible for me before LLMs. I've tried it a few times. No go. Now with LLMs, I've got the skeleton working and should be good to go before opening day. I'm hoping that I can use it for some novels that I am writing to get more realistic stats (don't ask).
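
For flavor, the general shape of such a simulator, sketched here in Go (the commenter works in Python, and every number and name below is made up):

    package main

    import (
        "fmt"
        "math/rand"
    )

    // simulateSeason is a toy Monte Carlo model: each plate appearance
    // is a weighted coin flip against the batter's on-base percentage.
    // It returns the average times on base across all simulated seasons.
    func simulateSeason(obp float64, plateAppearances, trials int) float64 {
        total := 0
        for t := 0; t < trials; t++ {
            for pa := 0; pa < plateAppearances; pa++ {
                if rand.Float64() < obp {
                    total++
                }
            }
        }
        return float64(total) / float64(trials)
    }

    func main() {
        // A .350 OBP batter over a 600-PA season, averaged over 10,000 trials.
        fmt.Printf("avg times on base: %.1f\n", simulateSeason(0.350, 600, 10000))
    }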

I know a lot of HN is very dismissive of LLMs as code help. But to me, a non programmer, they've opened it up. I can do things I never imagined that I could. Is it prod ready? Hell no, please God no. But is it good enough for me to putz with and get just working? Absolutely.

I've downloaded a bunch of free ones from huggingface and Meta just to be sure they can't take them away from me. I'm never going back to that frustration, that 'Why can't I just be not so stupid?', that self-hating, that darkness. They have liberated me.

By @brabel - 11 days
What the author is asking about, a quick sketchpad where you can try out code quickly and chat with the AI, already exists in the JetBrains IDEs. It's called a scratch file[1].

As far as I know, the idea of a scratch "buffer" comes from Emacs. But in JetBrains IDEs, you get full IDE support, even with context from your current project (you can pick the "modules" you want to have in context). Given the good integration with LLMs, that's basically what the author seems to want. Perhaps give GoLand[2] a try.

Disclosure: no, I don't work for JetBrains :D just a very happy customer.

[1] https://www.jetbrains.com/help/idea/scratches.html

[2] https://www.jetbrains.com/go/

By @rafaelmn - 11 days
I disagree about search. While an LLM can give you an answer faster, good docs (e.g. the MDN article in the CSS example) will:

- be way more reliable

- probably be up to date on the latest/recommended approach to solving it

- put you in a place where you can search for adjacent tech

LLMs with search have potential, but I'd like current tools to be oriented more toward source material than AI paraphrasing.

By @charlieyu1 - 10 days
I’m a hobby programmer who never worked a programming job. Last week I was bored, I asked o1 to help me to write a Solitaire card game using React because I’m very rusty with web development.

The first few steps were great. It guided me to install things and set up a project structure. The model even generated code for a few files.

Then something went wrong: the model kept telling me what to do in vague terms but stopped outputting code. When I asked for further help, it started contradicting itself, rewriting business logic that was implemented in the first response, producing 3-4 snippets of the same file that weren't compatible with each other, etc., and it all fell apart.

By @justatdotin - 11 days
Lots of colleagues use Copilot or whatever for autocomplete; I just find that annoying.

Or writing tests: that's ... not so helpful. Worst is when a lazy dev takes the generated tests and leaves it at that: usually just a few placeholders that test the happy path but ignore obvious corner cases. (I suppose for API tests that comes down to adding test case parameters.)

But chatting about a large codebase, I've been amazed at how helpful it can be.

What software patterns can you see in this repo? How does the implementation compare to others in the organisation? What common features of the pattern are missing?

Also, like a linter on steroids, chat can help explore how my project might be refactored to better match the organisation's coding style.

By @hansvm - 10 days
That quartile reservoir sampler example is ... intriguing?

My experience with LLM code is that it can't come up with anything even remotely novel. If I say "make it run in amortized O(1)" then 99 times out of 100 I'll get a solution so wildly incorrect (but confidently asserting its own correctness) that it can't possibly be reshaped into something reasonable without a re-write. The remaining 1/100 times aren't usually "good" either.

For the reservoir sampler -- here, it did do the job. David almost certainly knows enough to know the limits of that code and is happy with its limitations. I've solved that particular problem at $WORK though (reservoir sampling for percentile estimates), and for the life of me I can't find a single LLM prompt or sequence of prompts that comes anywhere close to optimality unless that prompt also includes the sorts of insights which lead to an amortized O(1) algorithm being possible (and, even then, you still have to re-run the query many times to get a useful response).

Picking on the article's solution a bit, why on earth is `sorted` appearing in the quantile estimation phase? That's fine if you're only using the data structure once (init -> finalize), but it's uselessly slow otherwise, even ignoring splay trees or anything else you could use to speed up the final inference further.
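
For concreteness, a minimal sketch (not the article's code, and with hypothetical names throughout) of a plain Algorithm R reservoir that caches its sort, so repeated quantile queries between insertions don't pay for re-sorting:

    package sampler

    import (
        "math/rand"
        "sort"
    )

    // Reservoir keeps a fixed-size uniform sample of a stream (Algorithm R)
    // and answers approximate quantile queries over that sample.
    type Reservoir struct {
        buf   []float64
        seen  int
        dirty bool // true if buf changed since the last sort
    }

    func New(k int) *Reservoir {
        return &Reservoir{buf: make([]float64, 0, k)}
    }

    // Add offers one stream value to the sample.
    func (r *Reservoir) Add(v float64) {
        r.seen++
        if len(r.buf) < cap(r.buf) {
            r.buf = append(r.buf, v)
            r.dirty = true
            return
        }
        // Replace a uniformly random slot with probability k/seen.
        if j := rand.Intn(r.seen); j < len(r.buf) {
            r.buf[j] = v
            r.dirty = true
        }
    }

    // Quantile returns the approximate q-th quantile (0 <= q <= 1).
    // The sort runs only when the sample changed since the last query.
    func (r *Reservoir) Quantile(q float64) float64 {
        if len(r.buf) == 0 {
            return 0
        }
        if r.dirty {
            sort.Float64s(r.buf)
            r.dirty = false
        }
        return r.buf[int(q*float64(len(r.buf)-1))]
    }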

I personally find LLMs helpful for development when either (1) you can tolerate those sorts of mishaps (e.g., I just want to run a certain algorithm through Scala and don't really care how slow it is if I can run it once and hexedit the output), or (2) you can supply all the auxiliary information so that the LLM has a decent chance of doing it right: once you've solved the hard problems, the LLM can often get the boilerplate correct when framing and encapsulating your ideas.

By @wrs - 11 days
I’ve been working with Cursor’s agent mode a lot this week and am seeing where we need a new kind of tool. Because it sees the whole codebase, the agent will quickly get into a state where it’s changed several files to implement some layering or refactor something. This requires a response from the developer that’s sort of like a code review, in that you need to see changes and make comments across multiple files, but unlike a code review, it’s not finished code. It probably doesn’t compile, big chunks of it are not quite what you want, it’s not structured into coherent changesets…it’s kind of like you gave the intern the problem and they submitted a bit of a mess. It would be a terrible PR, but it’s a useful intermediate state to take another step from.

It feels like the IDE needs a new mode to deal with this state, and that SCM needs to be involved somehow too. Somehow help the developer guide this somewhat flaky stream of edits and sculpt it into a good changeset.

By @choeger - 11 days
Essentially, an LLM is a compressed database with a universal translator.

So what we can get out of it is everything that has been written (and publicly released) before, translated into any language it knows about.

This has some consequences.

1. Programmers still need to know what algorithms or interfaces or models they want.

2. Programmers do not have to know a language very well anymore to write code, but they do for bug fixing. Consequently, the rift between garbage software and quality software will grow.

3. New programming languages will face a big economic hurdle to taking off.

By @cratermoon - 11 days
But the question must be asked: At what cost?

Are the results such a paradigm shift that they're worth the hundreds of billions sunk into the hardware and data centers? Is spicy autocomplete worth the equivalent of flying from New York to London while guzzling thousands of liters of water?

It might work, for some definition of useful, but what happens when the AI companies try to claw back some of that half a trillion dollars they burnt?

By @owebmaster - 11 days
I thought his project, sketch.dev, was of very poor quality. I wouldn't ship something like this: the auth process is awful and broken, and I still can't log in. If, 14 hours after the post, the service is still hugged to death, the scalability of the app is bad too. If we are going to use LLMs to replace hours of programming, we should aim for quality too.

By @singpolyma3 - 11 days
It seems like everything I see about success using LLMs for this kind of work is for greenfield projects. What about three weeks later, when the job changes to maintenance and iteration on something that's already working? Are people applying LLMs to that space?

By @ripped_britches - 11 days
I'll say that the payoff for investing the time to learn how to do this right is huge, especially with Cursor, which allows me to easily pull context into the chat (docs, library files, etc.).

By @simondotau - 11 days
I've recently started using Cursor because it means I can now write Python where two weeks ago I couldn't. I fed it the PDF documentation and it wrote the first pass of an API implementation. I've spent a few days testing and massaging it into a well-formed, well-structured library, pair-programming style.

Then I needed to write a simple command line utility, so I wrote it in Go, even though I've never written Go before. Being able to make tiny standalone executables which do real work is incredible.

Now if I ever need to write something, I can choose the language most suited to the task, not the one I happen to have the most experience with.

That's a superpower.

By @golergka - 11 days
I have written a small fullstack app over the holidays, mostly with LLMs, to see how far they would get me. Turns out, they can easily write 90% of the code, but you still need to review everything, make the main architectural decisions, and debug stuff when the AI can't solve a bug after 2-3 iterations. I get a huge productivity boost and at the same time am not afraid that they will replace me. At least not yet.

Can't recommend aider enough. I've tried many different coding tools, but they all seem like a leaky abstraction over the LLM's medium of sequential text generation. Aider, on the other hand, leans into it in the best possible way.

By @dxuh - 11 days
Currently a lot of my work consists of looking at large codebases that are unknown to me and figuring out how certain things work. I think LLMs are currently very bad at this, and my understanding is that there are obstacles to increasing context windows to multiple millions of tokens, so I wonder if LLMs will ever get good at it.

By @sublimefire - 11 days
I've been doing that for a while as well and mostly agree. One thing I find useful, though, is building local infrastructure to collect useful prompts, plus the ability to work with files and URLs. The web interface alone is limiting.

I like gptresearcher and all of the glue put in place to extend prompts and agents, etc. Not to mention the ability to fetch resources from the web and do research-type summaries of them.

All in all, it reminds me of the work of security researchers, pentesters, and analysts. Throughout their careers they build up a set of tools and scripts to solve various problems. LLMs similarly push devs to create/select tools for themselves to ease the burden of their specific line of work. You could work without LLMs, but maybe it will be a bit more difficult to stand out in the future.

By @ianpurton - 11 days
I've been coding professionally for 30 years.

I'm probably in the same place as the author: using ChatGPT to create functions etc., then cutting and pasting them into VSCode.

I've started using cline which allows me to code using prompts inside VSCode.

e.g. "Create a new page so that users can add tasks to a tasks table."

I'm getting mixed results, but it is very promising. I create a clinerules file which gets added to the system prompt so the AI is more aware of my architecture. I'm also looking at overriding the cline system prompt, both to make it fit my architecture better and to remove stuff I don't need.

I jokingly imagine that in the future we won't get asked how long a new feature will take, but rather how many tokens it will take.

By @btbuildem - 11 days
The search part really resonates with me. I do a lot of odd/unusual/one-off things for my side projects, and I use LLMs extensively to help me find a path forward. It's like an infinitely patient, all-knowing expert that pulls together info from any and all domains. Sometimes it will have answers that I am unable to find another way (e.g., what's the difference between "busy s..." and "busy p..." AT command responses on the ESP8285?). It saves me hours of struggle, and I would not want to go back to the old ways.

By @polotics - 11 days
My main usage is in helping me approach domains and tools I don't know enough to confidently know how best to get started.

One thing that doesn't get a mention in the article but is quite significant, I think, is the long lag of knowledge-cutoff dates: even in the latest and greatest models, there is a year or more of missing information.

I would love for someone more versed than me to tell us how best to use RAG or LoRA to get the model to answer with fully up-to-date knowledge of libraries, frameworks, ...

By @ryanobjc - 10 days
I have been getting more value out of LLMs recently, and the great irony is that it's because of a few different Emacs packages and the wonderful CLI LLM chat programming tool, aider.

My workflow puts LLM chat at my fingertips, and I can control the context. Pretty much any text in Emacs can be sent to an LLM of your choice via API.

Aider is even better: it does a bunch of tricks to improve performance and is rapidly becoming a 'must have' benchmark for LLM coding. It integrates with git so each chat modification becomes a new git commit, making it easy to undo changes, redo changes, etc. It also has a bunch of hacks because, while o1 is good at reasoning, it (apparently) doesn't do code modification well. Aider will send different types of requests to different 'strengths' of LLMs, etc. Although if you can use Sonnet, you can just use that and be done with it.

It's pretty good, but ultimately it's still just a tool for transforming words into code. It won't help you think or understand.

I feel bad for new kids who won't develop the muscle and eye to read and write code. Because you still need to read and write code, and can't rely on the chat interface for everything.

By @Ygg2 - 11 days
> Search. If I have a question about a complex environment, say “how do I make a button transparent in CSS” I will get a far better answer asking any consumer-based LLM, than I do using an old fashioned web search engine.

I don't think this is about LLMs getting better, but about search becoming worse, in no small part thanks to LLMs polluting the results. Do an image search for some terms and count how many results are AI-generated.

I can say I got better results from the Google of X years ago than from the Google of today.

By @bambax - 11 days
> There are three ways I use LLMs in my day-to-day programming: 1/ Autocomplete 2/ Search 3/ Chat-driven programming

I do mostly 2/ Search, which is like a personalized Stack Overflow and sometimes feels incredible. You can ask a general question about a specific problem and then dive into some specific point to make sure you understand every part clearly. This works best for things one doesn't know enough about but has a general idea of how the solution should sound or what it should do. Or copy-pasting error messages from tools like Docker and having the LLM debug them for you; that really feels like magic.

For some reason I have always disliked autocomplete anywhere, so I don't do that.

The third way, chat-driven programming, is more difficult, because the code generated by LLMs can be large and can also be wrong. LLMs are too eager to help: they will try to find a solution even if there isn't one, and will invent it if necessary. Telling them in the prompt to say "I don't know" or "it's impossible" if need be can help.

But, like the author says, it's very helpful to get started on something.

> That is why I still use an LLM via a web browser, because I want a blank slate on which to craft a well-contained request

That's also what I do. I wouldn't like having something in the IDE trying to second guess what I write or suddenly absorbing everything into context and coming up with answers that it thinks make a lot of sense but actually don't.

But the main benefit is, like the author says, that it lets one start afresh with every new question or problem, and save focused threads on specific topics.

By @averus - 10 days
I think the author is really on the right path with his vision for LLMs as tool for software development. Last week I tried probably all of them with something like a code challenge.

I have to say that I am impressed with sketch.dev: it got me a working example on the first try, and it looked cleaner than all the others; similar, but somehow cleaner in terms of styling.

The whole time I was using those tools, I was thinking that I want exactly this: an LLM trained specifically on the official Go documentation (or whatever your favourite language is), ideally fine-tuned by the maintainers of the language.

I want the LLM to show me an idiomatic way to write an API using the standard library. I don't necessarily want it to do it instead of me, or to be trained on all of the scraped data they could find. Show me a couple of examples, maybe explain a concept, give me step-by-step guidance.

I also share his frustrations with the chat-based approach. What annoys me personally the most is the anthropomorphization of the LLMs; yesterday Gemini was even patronizing me...

By @ghostbit - 6 days
> you’re going to have days of tense back-and-forth about whether the cost of the work is worth the benefit. An LLM will do it in 60 seconds and not make you fight to get it done. Take advantage of the fact that redoing work is extremely cheap.

The fast iteration cycle of getting a baseline (even a less-than-ideal or completely wrong one) is a great point here. Redoing the work is fast and easy, but it still requires review and validation to know how to request the rework and obtain the optimal result.

By @agentultra - 11 days
It seems nice for small projects but I wouldn’t use it for anything serious that I want to maintain long term.

I would write the tests first and foremost: they are the specification. They’re for future me and other maintainers to understand and I wouldn’t want them to be generated: write them with the intention of explaining the module or system to another person. If the code isn’t that important I’ll write unit tests. If I need better assurances I’ll write property tests at a minimum.

If I’m working on concurrent or parallel code or I’m working on designing a distributed system, it’s gotta be a model checker. I’ve verified enough code to know that even a brilliant human cannot find 1-in-a-million programming errors that surface in systems processing millions of transactions a minute. We’re not wired that way. Fortunately we have formal methods. Maths is an excellent language for specifying problems and managing complexity. Induction, category theory, all awesome stuff.
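
As a tiny illustration of the property-test point, a sketch using Go's standard testing/quick (the reverse function and the property here are made up for the example):

    package demo

    import (
        "testing"
        "testing/quick"
    )

    // reverse returns a reversed copy of s.
    func reverse(s []byte) []byte {
        out := make([]byte, len(s))
        for i, b := range s {
            out[len(s)-1-i] = b
        }
        return out
    }

    // TestReverseRoundTrip checks a property over random inputs:
    // reversing twice is the identity.
    func TestReverseRoundTrip(t *testing.T) {
        prop := func(s []byte) bool {
            r := reverse(reverse(s))
            if len(r) != len(s) {
                return false
            }
            for i := range s {
                if r[i] != s[i] {
                    return false
                }
            }
            return true
        }
        if err := quick.Check(prop, nil); err != nil {
            t.Error(err)
        }
    }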

Most importantly though… you have to write the stuff and read it and interact with it to be able to keep it in your head. Programming is theory-building as Naur said.

Personally I just don't care to read a bunch of code and play "spot the error," a game that's rigged for me to be bad at. It's much more my speed to write code that obviously has no errors in it because I've thought the problem through, although I struggle with this at times. The struggle is an important part of the process of acquiring new knowledge.

Though I do look forward to algorithms that can find proofs of trivial theorems for me. That would be nice to hand off… although simp does a lot of work like that already. ;)

By @fassssst - 11 days
They're pretty great for printf debugging. Yesterday I was confounded by a bug, so I rapidly added a ton of logging that the LLM wrote instantly, then had the LLM analyze the state difference between the repro and non-repro logs. It found something instantly that would have taken me a few hours to find, which led me to a fix.

By @aerhardt - 10 days
His experience mirrors mine. I'm happy he explicitly mentions search, when people have been shouting "this is not meant for search" for a couple years now. Of course it helps with search. I also love the tech for producing first drafts, and it greatly lowers the energy and cognitive load when attacking new tasks, like others are repeating on this thread.

I think at the same time, while the author says this is the second most impressive technology he's seen in his lifetime, it's still a far cry from the bombastic claims being made by the titans of industry regarding its potential. Not uncommon to see claims here on HN of 10x improvements in productivity, or teams of dozens of people being axed, but nothing in the article or in my experience lines up with that.

By @yawnxyz - 11 days
> I could not go a week without getting frustrated by how much mundane typing I had to do before having a FIM model

For those not in the know: I just learned today that code autocomplete is formally a "fill-in-the-middle" (FIM) task.
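
Roughly, a FIM request hands the model the text before and after the cursor and asks it to generate what goes between. A sketch of the prompt shape in Go (the sentinel tokens here are placeholders; each model family defines its own):

    package main

    import "fmt"

    // buildFIMPrompt assembles a fill-in-the-middle prompt: the model
    // sees the code before and after the cursor and generates the middle.
    // The sentinel tokens are illustrative, not any specific model's.
    func buildFIMPrompt(prefix, suffix string) string {
        return "<PREFIX>" + prefix + "<SUFFIX>" + suffix + "<MIDDLE>"
    }

    func main() {
        before := "func add(a, b int) int {\n\treturn "
        after := "\n}"
        fmt.Println(buildFIMPrompt(before, after))
    }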

By @jmull - 11 days
LLM auto-complete is good — it suggests more of what I was going to type, and correctly (or close enough) often enough that it’s useful. Especially in the boilerplate-y languages/code I have to use for $dayjob.

Search has been neutral. For finding little facts it's been about the same as regular search. When digging in, I want comprehensive, dense, reasonably well-written reference documentation. That's not exactly widespread, but LLMs don't provide it either.

Chat-driven programming generates too much buggy/incomplete code to be useful, and the chat interface is seriously clunky.

By @e12e - 11 days
Interesting. I wonder what the equivalent of sketch.dev would look like if it targeted Smalltalk and were embedded in a Smalltalk image (preferably with a local LLM running in Smalltalk)?

I'd love to be able to tell my (hypothetical Smalltalk) tablet to create an app for me and work interactively, interacting with the app as it gets built...

Edit: I suppose I should just try it and see where cloud AI can take Smalltalk today:

https://github.com/rsbohn/Cuis-Smalltalk-Dexter-LLM

By @stevage - 11 days
This is a great article with lots of useful insights.

But I'm completely unconvinced by the final claim that LLM interfaces should be separate from IDEs and should be their own websites. No thanks.

By @999900000999 - 11 days
I still find most LLMs to be extremely poor programmers.

Claude will often quickly generate tons and tons of useless code, using up its limit. I often find myself yelling at it to stop.

I was just working with it last night.

"Hi Claude, can you add tabs here.": <div>

<MainContent/>

<div/>

Claude will then start generating MainContent.

DeepSeek, despite being free, does a much better job than Claude. I don't know if it's smarter, but whatever internal logic it has is much more to the point.

Claude also has a very weird bias towards a handful of UI libraries it assumes you have installed, even if those wouldn't be good for your project. I wasted hours on shadcn/ui, which requires a very particular setup to work.

LLMs are generally great at common tasks in a top-5 (by popularity) language.

Ask it to do something in a Haxe UI library and it'll make up functions that *look* correct.

Overall I like them; they definitely speed things up. I don't think most experienced software engineers have much to worry about for now. But I am really worried about juniors. Why hire a junior engineer when you can just tell your seniors they need to use Copilot to crank out more code?

By @denvermullets - 11 days
This is almost exactly how I've been using LLMs. I don't like the code completion in the IDE, personally, and prefer all LLM usage to be narrow, specific blocks of code. It helps as I bounce between a lot of side projects, projects at work, and freelance projects. Not to mention, with all the context switching it really helps keep things moving, imo.

By @justinl33 - 11 days
I've maintained several SDKs, and the 'cover everything' approach leads to nightmare dependency trees and documentation bloat. IMO, the LLM paradigm shifts this even further: why maintain a massive SDK when users can generate precisely what they need? This could fundamentally change how we think about API distribution.

By @jimmydoe - 11 days
Anyone have a good recommendation for a local LLM for autocompletion?

Most editors I use support online LLMs, but they're sometimes too slow for me.

By @dboreham - 10 days
Interesting that he initially had the same thought I did (after running a model myself on my own hardware): this is like the first time I ran a traceroute across the planet.

By @lysecret - 11 days
Funny, he starts off dismissing an AI IDE only to end up building an AI IDE :D (smells a little bit like not-invented-here syndrome). Otherwise a fascinating article!
By @theptip - 10 days
This lines up well with my experience. I've tried coming at things from both the IDE and the chat side, and I think we need to merge the tooling more to find the sweet spot. Claude is amazing at building small SPAs, and then you hit the context window cutoff and can't do anything except copy your file out. I suspect IDEs will figure this out before Claude/ChatGPT learn to be good enough at the things folks need from IDEs. But long-term, I suppose you don't want to have to drop down to code at all, and so the constraints of chat might force the exploration of the new paradigm more aggressively.

Hot take of the day, I think making tests and refactors easier is going to be revolutionary for code quality.

By @jordanmorgan10 - 10 days
The more experienced the engineer, the less CSS is on the page. This seems to be a universal truth, and I want to learn from these people, but my goodness, could we at least use margins to center content?
By @assimpleaspossi - 11 days
Since all these AI products just put together things they pull from elsewhere, I'm wondering if, eventually, there could be legal issues involving software products put together using such things.
By @EGreg - 11 days
Can’t we just use test-driven development with AI Agents?

1) Idea

2) Tests

3) Code until all tests pass (a minimal sketch of the loop follows)
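
A toy illustration of steps 2 and 3 in Go (Slugify and its test are hypothetical): the test is written first as the spec, and the implementation, whether from a human or an agent, is iterated until `go test` is green.

    package tasks

    import (
        "strings"
        "testing"
    )

    // TestSlugify is step 2: written first, it pins down the behavior.
    func TestSlugify(t *testing.T) {
        cases := map[string]string{
            "Hello World": "hello-world",
            "  Go  ":      "go",
        }
        for in, want := range cases {
            if got := Slugify(in); got != want {
                t.Errorf("Slugify(%q) = %q, want %q", in, got, want)
            }
        }
    }

    // Slugify is step 3: the candidate implementation, reworked until
    // the test above passes.
    func Slugify(s string) string {
        return strings.Join(strings.Fields(strings.ToLower(s)), "-")
    }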

By @_boffin_ - 11 days
Does anyone know of any good chat-based UI builders? No, not for building a chat app.

Does Webflow have something?

My problem is being able to describe what I want in the style I want.

By @User23 - 11 days
LLMs are, at their core, search tools. Training is indexing and prompting is querying that index. The granularity being at the n-gram rather than the document level is a huge deal though.

Properly using them requires understanding that. And just as we understand that not every query finds what we want, neither will every prompt. Iterative refinement is virtually required for nontrivial cases. Automating that process, as e.g. the Cursor agent does, is very promising.