October 1st, 2024

A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs

The paper analyzes package hallucinations in code-generating LLMs, reporting an average hallucination rate of 5.2% for commercial models and 21.7% for open-source models, and urges the research community to address the issue.


The paper titled "We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs" examines package hallucinations in code-generating Large Language Models (LLMs): cases where a model recommends a package that does not exist. Because languages such as Python and JavaScript depend on centralized package repositories, an attacker who registers a commonly hallucinated name can have malicious code pulled into real projects, which makes these hallucinations a threat to the integrity of the software supply chain. The authors evaluated 16 popular LLMs and generated 576,000 code samples to measure how prevalent the problem is. Commercial models hallucinated packages at an average rate of 5.2%, while open-source models reached a much higher 21.7%, yielding over 205,000 unique hallucinated package names. The study argues that the issue is systemic rather than incidental, proposes mitigation strategies that reduce hallucinations without compromising code quality, and calls for urgent attention from the research community to this persistent challenge in software engineering.

- Package hallucinations represent a new threat to software supply chains, particularly for languages that rely on centralized package repositories, such as Python and JavaScript.

- The study found that commercial LLMs have a 5.2% hallucination rate, while open-source models have a 21.7% rate.

- Over 205,000 unique hallucinated package names were identified in the analysis.

- Mitigation strategies were implemented that significantly reduced hallucinations while maintaining code quality; a simple consumer-side sanity check (not from the paper) is sketched after this list.

- The authors urge the research community to focus on addressing the challenges posed by package hallucinations.
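The sketch below is not taken from the paper and is not one of its mitigation strategies. It is only a minimal, consumer-side check: before installing a package an LLM has suggested, confirm the name is actually registered on PyPI via the registry's public JSON endpoint (https://pypi.org/pypi/<name>/json). The function name and the example package names are illustrative.

```python
import json
import urllib.error
import urllib.request

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"  # public PyPI metadata endpoint


def package_exists_on_pypi(name: str, timeout: float = 10.0) -> bool:
    """Return True if `name` is a registered PyPI project, False if it is unknown.

    A False result for an LLM-suggested dependency is a strong hint that the
    name was hallucinated and should not be blindly `pip install`ed.
    """
    try:
        with urllib.request.urlopen(PYPI_JSON_URL.format(name=name), timeout=timeout) as resp:
            json.load(resp)  # parse the body to confirm we received real project metadata
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:  # PyPI returns 404 for projects that do not exist
            return False
        raise  # other HTTP errors (rate limiting, outages) are not evidence either way


if __name__ == "__main__":
    # "requests" is a real project; "reqeusts" is a deliberately misspelled, illustrative name.
    for candidate in ["requests", "reqeusts"]:
        print(candidate, "->", package_exists_on_pypi(candidate))
```

Existence alone is a weak signal, though: as the second comment below notes, nothing stops someone from publishing a package under a commonly hallucinated name, which is exactly why the paper frames hallucinations as a supply chain threat rather than a mere inconvenience.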

2 comments
By @SirMaster - 8 months
This is a problem I run into frequently: the LLM suggests libraries that, as far as I can tell, have never existed.
By @Mathnerd314 - 8 months
> One course of action that we chose not to pursue for ethical reasons was publishing actual packages using hallucinated package names to PyPI

I mean, this makes sense from a security perspective. But from a language usage perspective, if there is a missing package that would be super-useful, then implementing and publishing that package would be a win.

I'm curious what the package names were; the authors seem to have deliberately omitted them. Maybe there are some good package ideas in the 19% of names that were hallucinated by multiple models.