December 12th, 2024

Program Synthesis and Large Language Models

The article argues that large language models like ChatGPT will not replace traditional programming: although LLMs can assist developers, generating correct code remains a complex problem that still requires programming skill.

Read original article

The article discusses the limitations of large language models (LLMs) like ChatGPT in the context of program synthesis and the future of programming. While some experts predict that advancements in AI will render traditional programming obsolete, the author argues against this notion. The claim that most software will be AI-generated overlooks the complexity of software development, particularly for non-trivial applications such as operating systems and game engines. The challenges of generating correct program code from specifications are well-documented in computer science, with program synthesis being a computationally hard problem. The author emphasizes that natural language, including English, is semantically ambiguous and cannot simplify the synthesis process. Although LLMs can assist in generating code and may serve as useful tools for programmers, they do not replace the need for programming skills or the study of computer science. The article concludes that while LLMs can facilitate dialogue between developers and users, they do not eliminate the fundamental challenges of program correctness and synthesis.
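
The "computationally hard" point is easier to see with a toy example. The sketch below is illustrative only, not from the article: it exhaustively enumerates a tiny arithmetic grammar until some expression reproduces a handful of input/output examples. The grammar, the example pairs, and the depth limit are assumptions made for the demo; the thing to notice is that the candidate space roughly squares with each extra level of nesting, which is the blow-up that makes synthesis-by-search expensive.

```python
# Minimal, illustrative program-synthesis-by-enumeration sketch over a toy DSL.
# The grammar, constants, and examples are assumptions for the demo only.
from itertools import product

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
LEAVES = ["x", 1, 2]  # terminals: the input variable and two constants

def expressions(depth):
    """Yield (description, function) pairs for every expression up to `depth`."""
    if depth == 0:
        for leaf in LEAVES:
            if leaf == "x":
                yield "x", (lambda x: x)
            else:
                yield str(leaf), (lambda c: (lambda x: c))(leaf)
        return
    yield from expressions(depth - 1)          # everything shallower comes first
    for (dl, fl), (dr, fr), op in product(expressions(depth - 1),
                                          expressions(depth - 1), OPS):
        yield (f"({dl} {op} {dr})",
               (lambda f, g, o: (lambda x: o(f(x), g(x))))(fl, fr, OPS[op]))

def synthesize(examples, max_depth=3):
    """Return the first expression consistent with every (input, output) pair."""
    for desc, fn in expressions(max_depth):
        if all(fn(x) == y for x, y in examples):
            return desc
    return None

# The candidate count roughly squares per nesting level (3, 30, ~2,700, ~22M, ...),
# which is why blind enumeration stops scaling almost immediately.
print(synthesize([(1, 3), (2, 5), (3, 7)]))    # first match, equivalent to 2*x + 1
```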

- The notion that programming will become obsolete due to AI advancements is challenged.

- Generating correct program code from specifications is a complex and computationally hard problem.

- Natural language programming is limited by semantic ambiguity and does not simplify program synthesis.

- LLMs can assist programmers but do not replace the need for programming skills.

- The importance of program correctness remains a central issue in computer science.

10 comments
By @bjornsing - 4 months
I’m beginning to see LLMs more as a solution to the code reuse problem: they can implement anything that’s common enough to appear a number of times in their training data.

If you’re hoping they will soon implement entirely novel complex systems based on a loose specification I think you’ll be disappointed.

By @lmeyerov - 4 months
It's frustrating whenever folks throw up proof complexity as the reason LLMs can't work. If most programs that most people want to write map onto predictable, verifiable abstractions, or if we recognize that almost no software is verified to begin with, then the world is already moving on irrespective of personal hobby horses here. Shocker: much of the world we interact with every day already runs on PHP, JavaScript, and untyped Python that is not verified, not type checked, and sits in repos overflowing with bugs and CVEs.

Prof. Dawn Song has, IMO, been articulating a more productive view. LLMs are generating half the new code on GitHub anyway, so lean in: use this as an opportunity to make it easy for new code to use formal methods where before it would have been too hard. Progress will happen either way, and at least this way we have a shot at bringing verifiability into more user code.
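
To make the "lean in" point concrete, here is a minimal sketch (an illustration, not something from the comment or the article): an LLM-generated function is treated as untrusted and gated behind machine-checked properties, using the Hypothesis property-based testing library (assumed to be installed). Property tests are much weaker than full formal verification, but the workflow of pairing generated code with an independent, machine-checked specification is the shape of what the comment is describing.

```python
# Illustrative sketch: pair generated code with machine-checked properties.
# `dedupe_sorted` stands in for an LLM-generated function; the properties act
# as an executable spec. Requires the `hypothesis` package (pip install hypothesis).
from hypothesis import given, strategies as st

def dedupe_sorted(xs):
    """Pretend this body was produced by an LLM and is untrusted."""
    return sorted(set(xs))

@given(st.lists(st.integers()))
def test_output_is_sorted_and_duplicate_free(xs):
    out = dedupe_sorted(xs)
    assert out == sorted(out)                  # property 1: output is sorted
    assert len(out) == len(set(out))           # property 2: no duplicates

@given(st.lists(st.integers()))
def test_preserves_elements(xs):
    assert set(dedupe_sorted(xs)) == set(xs)   # property 3: same underlying set

# Run with `pytest`; any failing property comes back as a concrete
# counterexample against the generated code.
```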

By @golly_ned - 4 months
This article compares the hardness of program synthesis to the utility of chatbots to show chatbots fall short.

The difference is we don’t expect chatbots to be 100% right but we do expect program synthesis to be 100% right. For chatbots, 99% is amazing in terms of utility. That other 1% is really hard to get.

Given the limitations of English in robustly specifying a program, which is exactly why program synthesis requires formal specifications as constraints, the author is committing a category error in comparing two incommensurable solutions to two distinct problems.

By @md_rumpf - 4 months
Well... How do humans synthesize programs then? We don't do exponential search over some symbols. We "somehow know" which branches get us closer to a program and which don't. It's foolish to believe it is impossible to teach an AI to "somehow know" this, too. Also: for some reason the bar is always higher for AI. Next to no human code is verifiably correct, and the code that has actually been verified is an even smaller subset.
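
As a rough illustration of that "somehow know which branches get us closer" intuition (a sketch with an assumed toy grammar and a naive error heuristic, not anything from the comment), the search below scores each candidate by how far its outputs are from the examples and always expands the most promising candidate first, instead of enumerating blindly.

```python
# Illustrative best-first search over a toy expression space, guided by a
# simple error score. Grammar, heuristic, and examples are assumptions.
import heapq
import itertools

EXAMPLES = [(1, 3), (2, 5), (3, 7)]                       # target behaviour: 2*x + 1

def score(fn):
    """Lower is better: total absolute error over the examples (0 means solved)."""
    return sum(abs(fn(x) - y) for x, y in EXAMPLES)

def expand(desc, fn):
    """Grow a candidate by one arithmetic step; these are the 'branches'."""
    yield f"({desc} + 1)", (lambda f: lambda x: f(x) + 1)(fn)
    yield f"({desc} + x)", (lambda f: lambda x: f(x) + x)(fn)
    yield f"({desc} * 2)", (lambda f: lambda x: f(x) * 2)(fn)

def guided_search(max_steps=1000):
    tie = itertools.count()                                # tie-breaker for the heap
    start = ("x", lambda x: x)
    heap = [(score(start[1]), next(tie), start)]
    for _ in range(max_steps):
        err, _, (desc, fn) = heapq.heappop(heap)
        if err == 0:
            return desc                                    # fits every example
        for child in expand(desc, fn):
            heapq.heappush(heap, (score(child[1]), next(tie), child))
    return None

print(guided_search())   # finds a fitting expression after only a few expansions
```
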
By @jiangyaokai - 4 months
I have always believed that difficulty is inherent to the problem, regardless of whether the solver is a machine or a living organism. Maybe I am tainted in the same way, since I studied complexity theory.
By @mondrian - 4 months
Tl;dr (a) program synthesis is provably hard and (b) English is too ambiguous as a specification language. Therefore we’re nowhere close.

Ultimately not a convincing argument. We could use this same argument to argue that a human cannot write a program to spec. That may be strictly true, but not interesting.

By @childintime - 4 months
The argument goes like this: a car will never be able to do all the things a horse can do.

Seems to me the car is the AI and the horse is the human.