July 18th, 2024

ChatGPT Isn't 'Hallucinating'–It's Bullshitting – Scientific American

AI chatbots like ChatGPT can generate false information, which the article's authors call "bullshitting" to clarify responsibility and prevent misconceptions. Accurate terminology is crucial for understanding the technology's impact.


The article discusses AI chatbots like ChatGPT and their tendency to generate false information, which the authors argue should be called "bullshitting" rather than "hallucinating." They explain that these chatbots, built on large language models, aim to produce human-like text without any concern for whether it is true. The authors argue that calling AI-generated falsehoods "hallucinations" is misleading and distorts public understanding of the technology. They suggest that accurate terminology is crucial for clarifying responsibility when such technology is used in critical areas like healthcare. By labeling ChatGPT a "bullshit machine," the authors aim to discourage anthropomorphizing these chatbots and to ensure accountability for the information they generate. The article underscores the role of language in shaping perceptions of technology and the need to describe AI capabilities accurately to avoid misconceptions and misplaced trust.

7 comments
By @pixelsort - 6 months
The mind of a dreaming person is a much closer analogy to how LLMs unpack representations of knowledge. Are dreams also bullshit because the dreaming mind doesn't care whether the imagery and ideas it presents are grounded in truth?

The term "hallucination" may be imprecise, but it is a lot closer to the truth than "bullshit".

By @randcraw - 6 months
Not an especially insightful article. BS is an imprecise term that's no improvement on hallucination and actually suggests a hidden agenda on the LLM's part. 'Hallucinate' captures rather well a state of blissful ignorance of how nonsensical and surreal the LLM's malapropism is, rather than the malign intent to deceive that 'BS' implies.
By @adultSwim - 6 months
Both terms are misleadingly anthropomorphizing.
By @Lockal - 6 months
ChatGPT is not bullshitting; it is outputting words that follow the same statistics as its training dataset.

The bullshit is only in the actions of marketers trying to engage technically illiterate wallets in funding a bubble out of false promises (such as "dreaming", "imagination", "hallucinations", "bullshitting", "superintelligence", "superalignment", "smarter than human", and so forth).

By @surfingdino - 6 months
Of course it is. It's time to call this BS what it is.
By @agucova - 6 months
> Now, we can see from this description that nothing about the modeling ensures that the outputs accurately depict anything in the world. There is not much reason to think that the outputs are connected to any sort of internal representation at all.

This is just wrong. Accurate modelling of language at the scale of modern LLMs requires these models to develop rich world models during pretraining, which also requires distinguishing facts from fiction. This is why bullshitting happens less with better, bigger models: the simple answer is that they just know more about the world, and can also fill in the gaps more efficiently.

We have empirical evidence here: it's even possible to peek into a model to check whether the model 'thinks' what it's saying is true or not. From “Discovering Latent Knowledge in Language Models Without Supervision” (2022) [1]:

> Specifically, we introduce a method for accurately answering yes-no questions given only unlabeled model activations. It works by finding a direction in activation space that satisfies logical consistency properties, such as that a statement and its negation have opposite truth values. (...) We also find that it cuts prompt sensitivity in half and continues to maintain high accuracy even when models are prompted to generate incorrect answers. Our results provide an initial step toward discovering what language models know, distinct from what they say, even when we don't have access to explicit ground truth labels.

So when a model is asked to generate an answer it knows is incorrect, its internal state still tracks the truth value of the statements. This doesn't mean the model can't be wrong about what it thinks is true (or that it won't try to fill in the gaps incorrectly, essentially bullshitting), but it does mean that the world models are sensitive to truth.
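
To make the "logical consistency" idea in that abstract a bit more concrete, here is a toy sketch in plain Python. Everything in it is my own illustration rather than the paper's code: the synthetic "activations", the helper names (`ccs_loss`, `make_toy_activations`, `axis`), and the crude search over coordinate axes (the paper optimizes the probe with gradient descent on real model activations). The loss rewards a probe whose probabilities for a statement and its negation sum to one, and penalizes the degenerate answer of 0.5 for both.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ccs_loss(w, b, pos, neg):
    """Unsupervised consistency loss for a linear probe (no truth labels used)."""
    total = 0.0
    for x_pos, x_neg in zip(pos, neg):
        p_pos = sigmoid(dot(w, x_pos) + b)   # probe output for the statement
        p_neg = sigmoid(dot(w, x_neg) + b)   # probe output for its negation
        consistency = (p_pos - (1.0 - p_neg)) ** 2  # want p(x) = 1 - p(not x)
        confidence = min(p_pos, p_neg) ** 2         # discourage 0.5 / 0.5
        total += consistency + confidence
    return total / len(pos)

def center(rows):
    """Subtract the per-dimension mean from every row."""
    d = len(rows[0])
    means = [sum(r[i] for r in rows) / len(rows) for i in range(d)]
    return [[r[i] - means[i] for i in range(d)] for r in rows]

def make_toy_activations(n=200, d=8, seed=0):
    """Hypothetical activations: dimension 0 encodes truth, the rest is noise."""
    rng = random.Random(seed)
    pos, neg = [], []
    for _ in range(n):
        sign = rng.choice([-1.0, 1.0])       # hidden truth value, never shown to the probe
        x_pos = [rng.gauss(0.0, 0.5) for _ in range(d)]
        x_neg = [rng.gauss(0.0, 0.5) for _ in range(d)]
        x_pos[0] += 2.0 * sign               # statement activation
        x_neg[0] -= 2.0 * sign               # negation gets the opposite shift
        pos.append(x_pos)
        neg.append(x_neg)
    return center(pos), center(neg)

def axis(i, d, scale=3.0):
    """Candidate probe direction along coordinate axis i."""
    return [scale if j == i else 0.0 for j in range(d)]

pos, neg = make_toy_activations()
losses = [ccs_loss(axis(i, 8), 0.0, pos, neg) for i in range(8)]
best_axis = min(range(8), key=lambda i: losses[i])
print("loss per candidate direction:", [round(l, 3) for l in losses])
print("direction with lowest loss:", best_axis)
```

In this toy setup the lowest loss lands on the dimension where the truth signal was planted, even though the loss never sees a label. Note that the sign of the recovered direction is ambiguous: the loss alone cannot tell "true" from "false", only that the two are opposite.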

More broadly, we do know these models have rich internal representations, and have started learning how to read them. See for example “Language Models Represent Space and Time” (Gurnee & Tegmark, 2023) [2]:

> We discover that LLMs learn linear representations of space and time across multiple scales. These representations are robust to prompting variations and unified across different entity types (e.g. cities and landmarks). In addition, we identify individual "space neurons" and "time neurons" that reliably encode spatial and temporal coordinates. While further investigation is needed, our results suggest modern LLMs learn rich spatiotemporal representations of the real world and possess basic ingredients of a world model.

For anyone curious, I can recommend the Othello-GPT paper as a good introduction to this problem (“Do Large Language Models learn world models or just surface statistics?”) [3].

[1]: https://arxiv.org/abs/2212.03827

[2]: https://arxiv.org/abs/2310.02207

[3]: https://thegradient.pub/othello/