AI hallucinations: Why LLMs make things up (and how to fix it)
AI hallucinations in large language models can cause misinformation and ethical issues. A three-layer defense strategy and techniques like chain-of-thought prompting aim to enhance output reliability and trustworthiness.
Large Language Models (LLMs) can produce confident yet fictional responses, a phenomenon known as "AI hallucination." This issue is significant because it can lead to misinformation, ethical concerns, and legal exposure for organizations. Hallucinations arise from limitations in model architecture, the constraints of probabilistic generation, and gaps in training data. To mitigate them, a three-layer defense strategy is proposed: input-layer controls to optimize queries, design-layer improvements to enhance model architecture, and output-layer validations to verify responses. Techniques such as chain-of-thought prompting, retrieval-augmented generation, and fine-tuning can improve the reliability of LLM outputs. Future research aims to build on these mitigation techniques and develop new architectures that improve how LLMs understand their data. While hallucinations cannot be entirely eliminated, understanding their causes and implementing effective strategies can significantly reduce their occurrence, thereby increasing the trustworthiness of AI systems.
- AI hallucinations can lead to misinformation and reputational damage for organizations.
- Hallucinations stem from model architecture limitations, probabilistic generation issues, and training data gaps.
- A three-layer defense strategy can help mitigate hallucinations in LLMs.
- Techniques like chain-of-thought prompting and retrieval-augmented generation enhance output reliability.
- Future research focuses on improving AI understanding and developing new architectures to reduce hallucinations.
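As a rough illustration of how those three layers can fit together, here is a minimal Python sketch. Every name in it (the retriever, the LLM call, the grounding check) is a hypothetical placeholder rather than anything prescribed by the article; a real system would plug in an actual retriever, an actual LLM client, and domain-specific validation.

```python
# Minimal sketch of the three-layer idea: ground the query (input layer),
# keep model/decoding choices explicit (design layer), and validate before
# returning anything (output layer). All names are hypothetical placeholders.

def build_prompt(question, retrieve_documents):
    """Input layer: retrieval-augmented, chain-of-thought style prompt."""
    docs = retrieve_documents(question, top_k=3)       # hypothetical retriever
    context = "\n\n".join(docs)
    prompt = (
        "Answer using ONLY the context below. Think step by step, "
        "and say 'I don't know' if the context is insufficient.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return prompt, context

def validate(answer, context):
    """Output layer: cheap checks before the answer is shown. Real systems
    would add citation checks, schema validation, or a second model."""
    if not answer.strip():
        return False
    if "i don't know" in answer.lower():
        return True                                    # an honest refusal is acceptable
    # naive grounding check: require some lexical overlap with the context
    overlap = set(answer.lower().split()) & set(context.lower().split())
    return len(overlap) > 5

def answer_with_defenses(question, call_llm, retrieve_documents):
    prompt, context = build_prompt(question, retrieve_documents)
    answer = call_llm(prompt, temperature=0.2)         # design-layer choices live here
    return answer if validate(answer, context) else "No grounded answer available."
```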
Related
Large Language Models are not a search engine
Large Language Models (LLMs) from Google and Meta generate algorithmic content, causing nonsensical "hallucinations." Companies struggle to manage errors post-generation due to factors like training data and temperature settings. LLMs aim to improve user interactions but raise skepticism about delivering factual information.
Overcoming the Limits of Large Language Models
Large language models (LLMs) like chatbots face challenges such as hallucinations, lack of confidence estimates, and citations. MIT researchers suggest strategies like curated training data and diverse worldviews to enhance LLM performance.
GPTs and Hallucination
Large language models, such as GPTs, generate coherent text but can produce hallucinations, leading to misinformation. Trust in their outputs is shifting from expert validation to crowdsourced consensus, affecting accuracy.
LLMs Will Always Hallucinate, and We Need to Live with This
The paper by Sourav Banerjee and colleagues argues that hallucinations in Large Language Models are inherent and unavoidable, rooted in computational theory, and cannot be fully eliminated by improvements.
Internal representations of LLMs encode information about truthfulness
The study examines hallucinations in large language models, revealing that their internal states contain truthfulness information that can enhance error detection, though this encoding is complex and dataset-specific.
Every article on hallucinations needs to start with this fact until we've hammered that into every "AI Engineer"'s head. Hallucinations are not a bug—they're not a different mode of operation, they're not a logic error. They're not even really a distinct kind of output.
What they are is a value judgement we assign to the output of an LLM program. A "hallucination" is just output from an LLM-based workflow that is not fit for purpose.
This means that all techniques for managing hallucinations (such as the ones described in TFA, which are good) are better understood as techniques for constraining and validating the probabilistic output of an LLM to ensure fitness for purpose—it's a process of quality control, and it should be approached as such. The trouble is that we software engineers have spent so long working in an artificially deterministic world that we're not used to designing and evaluating probabilistic quality control systems for computer output.
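To make that concrete, here is a minimal sketch of a probabilistic quality-control gate: sample several completions, validate each against a fitness check, and only accept an answer that both passes and is consistent across samples. The `call_llm` and `is_fit_for_purpose` callables are hypothetical stand-ins, not anything from the article.

```python
# Quality-control gate over probabilistic output: sample, validate, vote.
from collections import Counter

def qc_gate(prompt, call_llm, is_fit_for_purpose, n_samples=5):
    candidates = []
    for _ in range(n_samples):
        answer = call_llm(prompt, temperature=0.7)   # deliberately sample, not greedy
        if is_fit_for_purpose(answer):               # domain-specific validation
            candidates.append(answer.strip())
    if not candidates:
        return None                                  # escalate to a human / fallback path
    # accept the modal answer only if it has a clear majority
    answer, count = Counter(candidates).most_common(1)[0]
    return answer if count >= (n_samples // 2 + 1) else None
```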
[0] They link to this paper: https://arxiv.org/pdf/2401.11817
It suggests a qualitative difference between desirable and undesirable operation that isn't really there. They're all hallucinations; we just happen to like some of them more than others.
The first problem was a simple numbers puzzle: a grid of boxes containing two-digit numbers. You have to add numbers together to make a trail from left to right, moving only horizontally or vertically, and the numbers must add up to 1000 when you reach the exit. People take about five minutes to figure it out. The AI couldn't get it after all 50 students each spent a full 30 minutes revising the prompt. It would just randomly add numbers and either tack extra ones on at the end to reach 1000, or simply claim the numbers added up to 1000 even when they didn't.
The second problem was writing a basic one-paragraph essay with one citation. The humans got it done, including the time to research a source, in about 10 minutes. After an additional 30 minutes, none of the students could get the AI to produce the paragraph without logic or citation errors. It would either make up fake sources or flat out lie about what the sources said. My favorite was a citation about dairy farming in an essay that was supposed to be about the dangers of smoking tobacco.
This isn't necessarily relevant to the article above, but if there are any teachers here, this is something to do with your students to teach them exactly why not to just use AI for their homework.
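For anyone curious why this stumps next-token prediction: the box puzzle above is an ordinary path-search problem. Here is a rough sketch, with a made-up grid encoding and a right-edge exit assumed, since the actual puzzle isn't reproduced in the comment.

```python
# Depth-first search for a trail of horizontal/vertical moves whose values
# sum to exactly 1000 at the right edge. The grid layout is an assumption.

def find_trail(grid, target=1000):
    rows, cols = len(grid), len(grid[0])

    def dfs(r, c, total, path, visited):
        total += grid[r][c]
        if total > target:                        # all entries are positive, so prune
            return None
        path = path + [(r, c)]
        if c == cols - 1 and total == target:     # reached the exit column with the right sum
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                found = dfs(nr, nc, total, path, visited | {(nr, nc)})
                if found:
                    return found
        return None

    for start_row in range(rows):                 # the trail may enter anywhere on the left edge
        trail = dfs(start_row, 0, 0, [], {(start_row, 0)})
        if trail:
            return trail
    return None
```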
Also, whenever I see a blog title like "how to make money in the stock market", I think: friend, if you knew the answer you wouldn't blog about it, you'd be infinitely rich.
The only downsides of this approach are that it requires a lot of tokens before the model can ascertain the correctness of its answer, and that sometimes it just gives up and concludes the puzzle is unsolvable (although that second part can be mitigated by adding something like "There is definitely a solution, keep trying until you solve it" to the prompt).
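For what it's worth, the same trick can also be wrapped in an external loop that checks the answer and re-prompts on failure. A minimal sketch, assuming hypothetical `call_llm` and `check_solution` callables:

```python
# External retry loop: keep asking until a deterministic checker accepts the
# answer or we run out of rounds. Both callables are hypothetical stand-ins.

def solve_with_retries(puzzle, call_llm, check_solution, max_rounds=5):
    prompt = puzzle + "\nThere is definitely a solution, keep trying until you solve it."
    for _ in range(max_rounds):
        answer = call_llm(prompt)
        if check_solution(answer):               # external, deterministic verification
            return answer
        prompt += f"\n\nYour previous attempt was wrong:\n{answer}\nTry again."
    return None                                  # escalate or fall back
```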
https://openreview.net/forum?id=FBkpCyujtS (min_p sampling, note extremely high review scores)
https://github.com/xjdr-alt/entropix (Entropix)
https://www.youtube.com/watch?v=nEnklxGAmak
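For readers unfamiliar with min_p sampling: the idea is to keep only tokens whose probability is at least some fraction of the most likely token's probability, so the cutoff scales with the model's confidence. A rough sketch (not the reference implementation from the linked paper):

```python
import numpy as np

def min_p_filter(probs: np.ndarray, min_p: float = 0.1) -> np.ndarray:
    """Keep tokens with probability >= min_p * (top token probability),
    then renormalise. Rough sketch, not the paper's reference code."""
    threshold = min_p * probs.max()
    filtered = np.where(probs >= threshold, probs, 0.0)
    return filtered / filtered.sum()

# usage: sample the next token from the filtered distribution
# next_token = np.random.choice(len(probs), p=min_p_filter(probs))
```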
It's not a single thing, a specific defect, but rather a failure mode, an absence of cohesive intelligence.
Any attempt to fix a non-specific ailment (schizophrenia, death, old age, hallucinations) will run into useless panaceas.
MetaAI makes stuff up reliably. You'd think it would be an ace at baseball stats, for example, but ask it "what teams did so-and-so play for" and you absolutely must check the results yourself.
When we are not sure of an answer we have two choices: say the first thing that comes to mind (like an LLM), or say "I'm not sure".
LLMs aren't easily trained to say "I'm not sure" because that requires additional reasoning and introspection (which is why CoT models do better); hence hallucinations occur when training data is vague.
So why not just measure uncertainty in the tokens themselves? Because a high-entropy answer may only reflect uncertainty over phrasing: there are many ways to say the same thing.
The referenced paper works to factor that semantic equivalence out of the entropy measurement, leaving much more useful results and showing that detecting this kind of hallucination is conceptually a simple problem.
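A rough sketch of that idea, often called semantic entropy: sample several answers, merge the ones that mean the same thing, and compute entropy over meaning clusters instead of surface strings. The `same_meaning` check here is a hypothetical stand-in for the entailment-based equivalence test used in published work.

```python
import math

def semantic_entropy(answers, probs, same_meaning):
    """Group sampled answers into meaning clusters, sum probability mass per
    cluster, and compute entropy over the clusters rather than over strings.
    `same_meaning(a, b)` is a hypothetical equivalence check."""
    clusters = []                                  # list of [representative, total_prob]
    for ans, p in zip(answers, probs):
        for cluster in clusters:
            if same_meaning(ans, cluster[0]):
                cluster[1] += p
                break
        else:
            clusters.append([ans, p])
    total = sum(p for _, p in clusters)
    return -sum((p / total) * math.log(p / total) for _, p in clusters if p > 0)
```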
Think about how a child learns colors. This crayon is red. This crayon is blue.
The adult asks: "is this crayon red?" The child responds: "no that crayon is blue." The adult then affirms or corrects the response.
This occurs over and over and over until that child understands the difference between red and blue, orange and green, yellow and black etcetera.
We then move on to more complex items and comparisons. How could we expect an AI to understand these truths without training it to understand them?
LLMs likely have a similar problem.
Because you don't know how to fix it. Only how to mitigate it.
Context is still a huge problem for AI models, and it's probably still the main reason they hallucinate.
I like the output = creative
TLDR: Hallucinations are inherent to the whole thing but as humans we can apply bubble gum, bandaids and prayers