August 2nd, 2024

Where Are Large Language Models for Code Generation on GitHub?

The study examines code generated by Large Language Models like ChatGPT and Copilot in GitHub projects, finding that it appears mostly in smaller projects, consists of short low-complexity snippets, and undergoes fewer modifications than human-written code.


The study titled "Where Are Large Language Models for Code Generation on GitHub?" investigates the use of Large Language Models (LLMs) like ChatGPT and Copilot in software development, focusing on their code generation capabilities as reflected in GitHub projects. The research finds that ChatGPT and Copilot are the most commonly used LLMs for code generation, while other models have minimal presence.

Projects utilizing these models tend to be smaller and less recognized, often led by individuals or small teams, yet they show signs of continuous development. The primary programming languages for generated code are Python, Java, and TypeScript, with a focus on data processing and transformation tasks; C/C++ and JavaScript, by contrast, are used for algorithm implementation and user interface code.

The generated code snippets are generally short and exhibit low complexity. LLM-generated code is present in only a limited number of projects and undergoes fewer modifications than human-written code, with bug-related changes being particularly rare. Additionally, comments accompanying the generated code often lack detail, typically indicating only the code's source without providing context on prompts or subsequent modifications. The findings carry implications for both researchers and practitioners seeking to understand the integration and effectiveness of LLMs in real-world coding scenarios.

4 comments
By @nickpsecurity - 6 months
This is about what I expected, extrapolating from my own experimentation. The code quality was too inconsistent to replace human labor for many uses.

However, it was good for automating boilerplate and for one-off utilities that were tedious to write. In those two categories, many new uses are similar to previous uses in the training data. So, extrapolation is easier for such jobs.

The abstract suggests that’s the kind of use they’re doing. That’s also why they’re not usually fixing the bugs. We can often ignore or work around bugs in such use cases.

By @staplung - 6 months
From the paper:

"Specifically, we first conduct keyword searches, such as “generated by ChatGPT” and “generated by Copilot”, to locate GitHub code files that include such keywords, retaining only those files that contain GPT-generated code."

This seems like a pretty serious weakness to me; presumably there is a lot of code generated by LLMs that isn't annotated as such.
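A minimal sketch makes that weakness concrete. The function name and keyword list below are illustrative assumptions, not the paper's actual pipeline (which involves GitHub search and manual validation): a pure keyword filter can only catch files whose authors explicitly annotated the code's origin.

```python
# Marker phrases to search for; the two quoted in the paper are shown here,
# but the full query list used by the authors is not reproduced.
KEYWORDS = ["generated by chatgpt", "generated by copilot"]

def mentions_llm_generation(source: str) -> bool:
    """Return True if a file's text contains any annotation marker.

    Hypothetical helper sketching the keyword-filtering step only;
    it cannot detect LLM code that was pasted in without a comment.
    """
    text = source.lower()
    return any(keyword in text for keyword in KEYWORDS)

# A file with an explicit annotation is caught...
annotated = "# This function was generated by ChatGPT\ndef add(a, b):\n    return a + b\n"
print(mentions_llm_generation(annotated))    # True

# ...but the same code without the comment is invisible to this method.
unannotated = "def add(a, b):\n    return a + b\n"
print(mentions_llm_generation(unannotated))  # False
```

The false-negative case in the second example is exactly the sampling bias raised above: unannotated LLM-generated code is indistinguishable from human-written code to a keyword search.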