July 6th, 2024

How Good Is ChatGPT at Coding, Really?

A study published in an IEEE journal evaluated ChatGPT's coding performance and found success rates ranging from 0.66% to 89%. ChatGPT excelled at older tasks but struggled with newer challenges, highlighting both strengths and vulnerabilities.

A study published in IEEE Transactions on Software Engineering evaluated OpenAI's ChatGPT on coding tasks and found success rates ranging from 0.66% to 89%. While ChatGPT can outperform humans in some cases, its output also raises security concerns. Its ability to generate correct code dropped for problems published after 2021, which suggests limited exposure to newer coding challenges: it solved pre-2021 LeetCode problems efficiently but struggled with later ones. It could fix compilation errors but had difficulty correcting its own mistakes, owing to a lack of understanding. The generated code contained vulnerabilities such as missing null tests, though these were fixable. Developers are advised to give ChatGPT additional information so that it can better understand the problem and avoid vulnerabilities. Overall, ChatGPT's performance varied with the complexity of the coding task and the familiarity of the problem, showing both the strengths and the limitations of AI-based code generation.
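To make the "missing null test" vulnerability concrete, here is a minimal Python sketch. It is a hypothetical illustration, not code from the study: `USERS`, `find_user`, `greet_unsafe`, and `greet_safe` are invented names. The first greeting function uses a lookup result without checking for `None`; the second adds the check.

```python
from typing import Optional

# Hypothetical data and helper names, invented for illustration only.
USERS = {1: "Ada", 2: "Grace"}


def find_user(user_id: int) -> Optional[str]:
    """Return the user's name, or None when the id is unknown."""
    return USERS.get(user_id)


def greet_unsafe(user_id: int) -> str:
    # Missing null test: find_user may return None, and None has no .upper(),
    # so an unknown id raises AttributeError here.
    return "Hello, " + find_user(user_id).upper()


def greet_safe(user_id: int) -> str:
    name = find_user(user_id)
    if name is None:  # the added null test
        return "Hello, guest"
    return "Hello, " + name.upper()
```

Both functions behave the same for known ids. Only the unknown-id path differs, which is why this kind of flaw is easy to miss in review and also easy to fix once spotted.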

Related

ChatGPT is biased against resumes with credentials that imply a disability

Researchers at the University of Washington found bias in ChatGPT, an AI tool for resume ranking, against disability-related credentials. Customizing the tool reduced bias, emphasizing the importance of addressing biases in AI systems for fair outcomes.

The Death of the Junior Developer – Steve Yegge

The blog post discusses how AI models like ChatGPT are affecting junior roles in law, writing, editing, and programming. Senior professionals benefit from AI assistants like GPT-4o, Gemini, and Claude 3 Opus, which enhance efficiency and productivity through Chat Oriented Programming (CHOP).

ChatGPT is hallucinating fake links to its news partners' biggest investigations

OpenAI's ChatGPT generates fake URLs for major news sites, failing to link to the correct articles despite promises to do so. Journalists have expressed concerns about its reliability and are demanding transparency about the hallucinated URLs.

OpenAI's ChatGPT Mac app was storing conversations in plain text

OpenAI's ChatGPT Mac app had a security flaw: it stored conversations in plain text, leaving them easily accessible and raising concerns about unauthorized access. OpenAI fixed the flaw by encrypting the data and emphasized user security.

ChatGPT just (accidentally) shared all of its secret rules

ChatGPT's internal guidelines were accidentally exposed on Reddit, revealing operational boundaries and AI limitations. Discussions ensued on AI vulnerabilities, personality variations, and security measures, prompting OpenAI to address the issue.

2 comments
By @nutshell89 - 3 months
The study involves AI-based code generation on LeetCode problems using GPT-3.5. Functional code was produced between 0.66 percent and 89 percent of the time, but I'd like to see the same study conducted with GPT-4 or the latest Claude models.