LLMs struggle to explain themselves
Large language models can identify number patterns but struggle to provide coherent explanations. An interactive demo highlights this issue, revealing that even correct answers often come with nonsensical reasoning.
The article discusses the limitations of large language models (LLMs) in explaining their reasoning, particularly when recognizing number patterns. An interactive demo lets users generate random programs that compute number sequences; the LLM is then asked to identify a third sequence based on two examples. Although the LLM frequently selects the correct sequence, the explanations it offers are typically incoherent or inaccurate, even when the answer itself is right. The demo calls Anthropic's Claude 3.5 Sonnet through its API and lets users experiment with different settings and prompts. The author reflects on how surprisingly ineffective the LLM is at producing meaningful explanations, even after multiple rounds of interaction, and closes with a personal anecdote about the project's inspiration, expressing optimism that LLM capabilities for number-sequence recognition will improve.
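This is not the author's implementation (that lives in the linked repo), but a minimal sketch of the kind of API call such a demo involves, assuming the Anthropic Python SDK; the sequences, prompt wording, and option format here are invented for illustration.

```python
# Minimal sketch: show two example sequences from a hidden rule, offer
# candidate continuations, and ask the model to choose one and explain.
# Assumes the Anthropic Python SDK and ANTHROPIC_API_KEY in the environment;
# the sequences and wording are illustrative, not the demo's actual prompt.
import anthropic

client = anthropic.Anthropic()

prompt = (
    "Two sequences produced by the same hidden rule:\n"
    "  A: 2, 4, 8, 16, 32\n"
    "  B: 3, 6, 12, 24, 48\n"
    "Which of these is produced by the same rule?\n"
    "  1) 5, 10, 20, 40, 80\n"
    "  2) 5, 7, 9, 11, 13\n"
    "Answer with the option number, then explain your reasoning."
)

message = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=512,
    messages=[{"role": "user", "content": prompt}],
)

# The model usually picks option 1; the interesting part is whether the
# explanation that follows actually describes the hidden rule.
print(message.content[0].text)
```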
- LLMs can identify number patterns but struggle to explain their reasoning effectively.
- The interactive demo allows users to experiment with LLMs in recognizing number sequences.
- Even correct answers from LLMs are often accompanied by nonsensical explanations.
- The project was inspired by a personal observation related to number patterns.
- There is optimism for future advancements in LLM capabilities for pattern recognition.
Related
- Many users agree that LLMs primarily recognize patterns rather than genuinely reason or recall past thoughts.
- Some commenters share personal experiences with data analysis and the challenges of interpreting results, drawing parallels to LLM behavior.
- There is skepticism about claims of emergent behavior in LLMs, with some asserting that they merely fit patterns learned during training.
- Several comments highlight that both LLMs and humans can struggle to explain their decisions post hoc.
- One user expresses interest in the custom tool for generating integer sequences, preferring that kind of innovation to the focus on LLMs.
That is, they have absolutely no genuine recollection of what they were thinking at the time they said something in the past. Even with "Tree of Thought" approaches, all you're doing is recording past conversations, states, and contexts; a new inference asked for the "justification" of that will produce a similarly fake justification because, as I said, they have no memory, only context.
In my own app I can switch to a different LLM right in the middle of a conversation, and the new LLM will simply carry on as if it had said everything in the prior context, even though that's not the case.
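That point is easy to see mechanically: the "conversation" is just a list of recorded turns, and a request for justification is a fresh inference over that text, which any model can answer. A minimal sketch, again assuming the Anthropic Python SDK; the model names and messages are illustrative.

```python
# Asking "why did you say that?" is a brand-new inference over recorded
# context, so any model can be asked to produce the justification,
# including one that never gave the original answer.
import anthropic

client = anthropic.Anthropic()

history = [
    {"role": "user", "content": "Which sequence matches the pattern, 1 or 2?"},
    {"role": "assistant", "content": "Sequence 1."},  # produced earlier, by whichever model
    {"role": "user", "content": "Why did you choose sequence 1?"},
]

# Swap in a different model mid-conversation; it will still "explain" the
# earlier answer as if it were its own, because all it has is the context.
reply = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=256,
    messages=history,
)
print(reply.content[0].text)
```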
In fact, if asked for their reasons after the fact, people (and LLMs) are likely to make them up on the spot. I wonder if the same dynamic is at play here.
Not surprising, since the Fibonacci sequence will be in the text swallowed by the LLM.
A Fibonacci-style sequence starting with 3, 1 would be 3, 1, 4, 5. I think you mean the house number was 1347 (from 1, 3, 4, 7); that would work and be easier to notice.
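For reference, the arithmetic behind that correction, as a tiny sketch:

```python
def fib_like(a, b, n=4):
    """First n terms of a Fibonacci-style sequence starting from a, b."""
    seq = [a, b]
    while len(seq) < n:
        seq.append(seq[-2] + seq[-1])
    return seq

print(fib_like(3, 1))  # [3, 1, 4, 5]
print(fib_like(1, 3))  # [1, 3, 4, 7]  -> house number 1347
```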
So for a hack week I built a tool to tokenize all of our log messages, then grabbed all of our logs, built a gigantic n-dimensional vector for every five-minute chunk of two days of those logs, calculated the Pythagorean (Euclidean) distance for each of those five-minute chunks, and looked at the biggest differences, the most outlying five-minute chunks. They were all from 8-8:30 AM CET on the two days (our company and most of our customers were US-based; I was just looking at which time zones matched up to the interesting time).

I said "okay, this looks interesting, let me see what is happening in the logs then," but it was impossible to figure out what the statistics were seeing. The math thinks in ways that human brains don't: it views the entire dataset simultaneously, while a human brain just can't hold five minutes of busy log files in working memory. Humans build narratives, and the math can't understand that. So I ended up getting frustrated and giving up, because explaining the anomaly in terms I could understand and start debugging from was the whole point of the project!
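A rough sketch of that pipeline, assuming numpy and scikit-learn, a placeholder "ISO timestamp + message" log format, and Euclidean distance from the mean window vector as the "Pythagorean difference"; the commenter's actual tokenizer and distance details aren't given.

```python
# Bucket logs into five-minute windows, turn each window into a token-count
# vector, and rank windows by Euclidean distance from the average window.
from datetime import datetime
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

def five_minute_windows(lines):
    """Group 'ISO-timestamp message' lines into five-minute buckets of text."""
    buckets = {}
    for line in lines:
        ts, _, msg = line.partition(" ")
        t = datetime.fromisoformat(ts)
        key = (t.date(), t.hour, t.minute // 5)
        buckets.setdefault(key, []).append(msg)
    return {k: " ".join(v) for k, v in buckets.items()}

def outlier_windows(lines, top_k=5):
    windows = five_minute_windows(lines)
    keys = list(windows)
    counts = CountVectorizer().fit_transform(windows[k] for k in keys).toarray()
    centre = counts.mean(axis=0)
    dists = np.linalg.norm(counts - centre, axis=1)  # Euclidean distance per window
    # Biggest outliers first; explaining *why* they stand out is the hard part.
    return sorted(zip(dists, keys), reverse=True)[:top_k]
```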
To excuse the assumption of reasoning capabilities, the author's FAQ snarkily points to “research” indicating evidence of reasoning, all of which was written by OpenAI and Microsoft employees who would not be allowed to publish anything to the contrary.
It’s a shame people continue to buy into the hype cycle on new tech. Here’s a hint: if the creators of VC-backed tech make extraordinary claims about it, you should assume it’s heavily exaggerated if not an outright lie.
The source code for the demo is on GitHub: https://github.com/jyc/stackbee
Imagine telling a secretary that you're 60% yes and 40% no, and she arbitrarily writes NO in your report; a day later the board asks you why you made that decision.
You'd be confused too.
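The analogy maps onto how sampled answers work: the model produces a distribution over options, the sampler commits to one, and a later "why?" sees only the committed answer, not the distribution. A toy illustration with made-up numbers:

```python
import random

# The model's "opinion" is just a distribution over answers...
probs = {"YES": 0.6, "NO": 0.4}

# ...and the sampler, like the secretary, commits to one of them.
answer = random.choices(list(probs), weights=list(probs.values()), k=1)[0]

# A later "why did you answer NO?" prompt sees only the committed answer,
# not the 60/40 split that produced it, so any justification is invented.
print(answer)
```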