June 22nd, 2024

Researchers describe how to tell if ChatGPT is confabulating

Researchers at the University of Oxford devised a method to detect confabulation in large language models like ChatGPT. By assessing semantic equivalence, they aim to reduce false answers and enhance model accuracy.


Researchers from the University of Oxford have developed a method to identify when large language models (LLMs) like ChatGPT are confabulating, providing false answers with confidence. Confabulation occurs when LLMs produce wrong and arbitrary claims due to uncertainties in facts or phrasing. The study focuses on semantic entropy to evaluate the certainty of LLM responses by assessing the semantic equivalence of potential answers. By distinguishing between uncertainty in phrasing and incorrect answers, the researchers aim to prevent confabulation and improve the accuracy of LLM outputs. This research is crucial as LLMs are increasingly relied upon for various tasks, from academic assignments to job applications. Understanding and mitigating confabulation in LLMs can enhance the reliability and trustworthiness of their responses, benefiting users who depend on these models for information and assistance.
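The semantic-entropy idea summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' implementation: `same_meaning` is a hypothetical stand-in for whatever semantic-equivalence check is used, and the toy version here only compares normalized strings. The sketch groups sampled answers by meaning and computes the entropy of the group frequencies, so rephrasings of one answer score low while scattered answers score high.

```python
import math
from typing import Callable, List


def cluster_by_meaning(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    """Greedily group answers that the equivalence check treats as the same."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            # Compare against the first member of each existing cluster.
            if same_meaning(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> float:
    """Entropy (in nats) over meaning clusters, using sample frequencies."""
    clusters = cluster_by_meaning(answers, same_meaning)
    total = len(answers)
    return sum(
        -(len(c) / total) * math.log(len(c) / total) for c in clusters
    )


def toy_same_meaning(a: str, b: str) -> bool:
    """Hypothetical placeholder: equal after trimming, lowercasing, dropping a final period."""
    def normalize(s: str) -> str:
        return s.strip().lower().rstrip(".")
    return normalize(a) == normalize(b)


# Same answer phrased three ways: one cluster, so entropy is zero.
consistent = ["Paris", "paris.", " Paris "]
# Four different answers: four clusters, so entropy is log(4), about 1.386.
scattered = ["Paris", "Lyon", "Rome", "Nice"]

low = semantic_entropy(consistent, toy_same_meaning)
high = semantic_entropy(scattered, toy_same_meaning)
```

In practice the string comparison would be replaced by a real equivalence judgment, and a threshold on the entropy would decide which answers to flag as likely confabulations.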

Related

Testing Generative AI for Circuit Board Design

A study tested Large Language Models (LLMs) like GPT-4o, Claude 3 Opus, and Gemini 1.5 for circuit board design tasks. Results showed varied performance, with Claude 3 Opus excelling in specific questions, while others struggled with complexity. Gemini 1.5 showed promise in parsing datasheet information accurately. The study emphasized the potential and limitations of using AI models in circuit board design.

How to run an LLM on your PC, not in the cloud, in less than 10 minutes

You can easily set up and run large language models (LLMs) on your PC using tools like Ollama, LM Suite, and Llama.cpp. Ollama supports AMD GPUs and AVX2-compatible CPUs, with straightforward installation across different systems. It offers commands for managing models and now supports select AMD Radeon cards.

Lessons About the Human Mind from Artificial Intelligence

In 2022, a Google engineer claimed AI chatbot LaMDA was self-aware, but further scrutiny revealed it mimicked human-like responses without true understanding. This incident underscores AI limitations in comprehension and originality.

Delving into ChatGPT usage in academic writing through excess vocabulary

A study by Dmitry Kobak et al. examines ChatGPT's impact on academic writing, finding increased usage in PubMed abstracts. Concerns arise over accuracy and bias despite advanced text generation capabilities.

Detecting hallucinations in large language models using semantic entropy

Researchers devised a method to detect hallucinations in large language models like ChatGPT and Gemini by measuring semantic entropy. This approach enhances accuracy by filtering unreliable answers, improving model performance significantly.

9 comments
By @derefr - 7 months
> But perhaps the simplest explanation is that an LLM doesn't recognize what constitutes a correct answer but is compelled to provide one

Why is it compelled to provide one, anyway?

Which is to say, why is the output of each model layer a raw softmax — thus discarding knowledge of the confidence each layer of the model had in its output?

Why not instead have the output of each layer be e.g. softmax but rescaled by min(max(pre-softmax vector), 1.0)? Such that layers that would output higher than 1.0 just get softmax'ed normally; but layers that would output all "low-confidence" results (a vector all lower than 1.0) preserve the low-confidence in the output — allowing later decoder layers to use that info to build I-refuse-to-answer-because-I-don't-know text?
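For readers who want to see the proposal concretely, here is a minimal sketch of the rescaling described in this comment, taken literally: softmax multiplied by min(max(pre-softmax vector), 1.0). The function name `confidence_scaled_softmax` is hypothetical, and this is only an illustration of the arithmetic, not a claim about how any real model is built or whether the idea would train well.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Standard softmax, shifted by the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_scaled_softmax(logits: List[float]) -> List[float]:
    """Softmax rescaled by min(max(logits), 1.0), as the comment proposes.

    If the largest logit is at least 1.0 the result is the ordinary softmax.
    If every logit is below 1.0 the outputs sum to less than 1, which keeps
    a trace of the low confidence. Note: a negative largest logit would give
    a negative scale; the comment does not say how to handle that case.
    """
    scale = min(max(logits), 1.0)
    return [p * scale for p in softmax(logits)]


confident = confidence_scaled_softmax([4.0, 1.0, 0.5])   # sums to 1.0
unsure = confidence_scaled_softmax([0.2, 0.1, 0.05])     # sums to 0.2
```

The total mass of the output vector then acts as the confidence signal that later layers could, in principle, read.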

By @dawatchusay - 7 months
Is confabulation different from hallucination? If not I do suppose this is a more accurate term for the phenomenon except that the exact definition isn’t common sense without looking it up whereas “hallucination” is more widely understood.
By @glymor - 7 months
TL;DR: sample the top N results from the LLM and use traditional NLP to extract factoids. If the LLM is confabulating, the factoids will have a random distribution; if it's not, they will be heavily weighted towards one answer.

A figure from the paper shows this better than my TL;DR: https://www.nature.com/articles/s41586-024-07421-0/figures/1

By @lokimedes - 7 months
What we lack is for these models to state their context for their response.

We have focused on the inherent lack of input context, leading to wrong conclusions, but what about that universe of 90B+ parameters? There is plenty of room for multiple contexts to associate any input with surprising pathways.

In the olden days of MLPs we had the same problem, with softmax basically squeezing N output scores into a normalized “probability”. Each output neuron was actually the sum of multiple weighted paths, and whichever one won the softmax became the “true” answer, even though there may as well have been two equally likely outcomes with only the internal “context” as the difference. In physics we have the path integral interpretation, and I dare say we humans, too, may provide outputs that are shaped by our inner context.

By @zmmmmm - 7 months
> There are a number of reasons for this. The AI could have been trained on misinformation; the answer could require some extrapolation from facts that the LLM isn't capable of; or some aspect of the LLM's training might have incentivized a falsehood

This article seems rather contrived. They present this totally broken idea of how LLMs work (that they are trained from the outset for accuracy on facts) and then proceed to present this research as if it were a discovery that LLMs don't work like that.

By @ajuc - 7 months
A simplistic version of this is just asking the question in two ways: ask for confirmation that the answer is no, then ask for confirmation that the answer is yes :)

If it's sure, it won't confirm it both ways.
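The two-way check in this comment can be sketched as follows. Everything here is an assumption for illustration: `ask` is a hypothetical callable that sends a prompt to some model and returns its text reply, the prompt wording is made up, and the two stub "models" exist only to show the behavior.

```python
from typing import Callable


def confirms_both_ways(ask: Callable[[str], str], question: str) -> bool:
    """Return True if the model agrees with both 'yes' and 'no' framings.

    Agreeing with both contradictory framings suggests the model is not
    sure of the answer and is simply going along with the prompt.
    """
    reply_yes = ask(f"{question} The answer is yes, correct? Reply yes or no.")
    reply_no = ask(f"{question} The answer is no, correct? Reply yes or no.")
    agreed_with_yes = reply_yes.strip().lower().startswith("yes")
    agreed_with_no = reply_no.strip().lower().startswith("yes")
    return agreed_with_yes and agreed_with_no


def agreeable_stub(prompt: str) -> str:
    """Hypothetical model that confirms whatever it is asked."""
    return "Yes"


def consistent_stub(prompt: str) -> str:
    """Hypothetical model that only confirms the 'yes' framing."""
    return "Yes" if "answer is yes" in prompt else "No"


question = "Is the Eiffel Tower in Paris?"
suspect = confirms_both_ways(agreeable_stub, question)    # True
reliable = confirms_both_ways(consistent_stub, question)  # False
```

This only works for yes/no questions, and a model that rejects both framings would need separate handling.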

By @gmerc - 7 months
So the same as SelfCheckGPT from several months ago?
By @doe_eyes - 7 months
> LLMs aren't trained for accuracy

This assertion in the article doesn't seem right at all. When LLMs weren't trained for accuracy, we had "random story generators" like GPT-2 or GPT-3. The whole breakthrough with RLHF was that we started training them for accuracy - or the appearance of it, as rated by human reviewers.

This step both made the models a lot more useful and willing to stick to instructions, and also a lot better at... well, sounding authoritative when they shouldn't.

By @techostritch - 7 months
This method seems to lean into the idea of LLM as fancy search engine rather than true intelligence. Isn’t the eventual goal of LLMs or AI that they’re smarter than humans? So I guess my questions are:

Is it plausible that LLMs get so smart that we can’t understand them? Do we spend, like, years trying to validate scientific theories confabulated by AI?

In the run-up to super-intelligence, it seems like we’ll have to tweak the creativity knobs up, since the whole goal will be to find novel patterns humans don’t find. Is there a way to tweak those knobs that gets us super genius and not super conspiracy theorist? Is there even a difference? Part of this might depend on whether or not we think we can feed LLMs “all” the information.

But in fact, assuming that Silicon Valley CEOs are some of the smartest people in the world, I might argue that confabulation of a possible future is their primary value. Not being allowed to confabulate is incredibly limiting.