September 10th, 2024

Deductive Verification for Chain-of-Thought Reasoning in LLMs

The paper discusses limitations of Chain-of-Thought prompting in LLMs and proposes a framework called Natural Program that improves deductive reasoning accuracy and trustworthiness by decomposing verification into smaller, structured subprocesses.

The paper titled "Deductive Verification of Chain-of-Thought Reasoning" explores the limitations of Chain-of-Thought (CoT) prompting in Large Language Models (LLMs), which can lead to hallucinations and errors in complex reasoning tasks. The authors propose a method to enhance the deductive reasoning capabilities of LLMs by implementing a structured verification process. This involves breaking down the reasoning verification into smaller subprocesses that focus on specific contexts and premises. The proposed framework, termed Natural Program, allows for a more rigorous generation of reasoning steps, ensuring that each subsequent step is grounded in the previous one. This method not only improves the accuracy of answers in complex reasoning tasks but also facilitates self-verification of the reasoning process at each stage. The authors aim to enhance the trustworthiness and correctness of LLM outputs through this systematic approach. The code related to this research will be made available for further exploration and application.

- The paper addresses the challenges of hallucinations and errors in LLMs using Chain-of-Thought prompting.

- A new framework called Natural Program is proposed for structured deductive reasoning.

- The verification process is decomposed into smaller, context-specific subprocesses.

- The approach enhances the accuracy and trustworthiness of reasoning outputs.

- Code for the proposed method will be released for public use.
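
As a rough, purely illustrative sketch of the decomposition described above: each reasoning step can be checked in isolation against only the premises it cites, and the chain accepted only if every step passes. The `ReasoningStep` structure, the `call_llm` hook, and the prompt wording below are assumptions for illustration, not the authors' released code.

```python
# Sketch of step-by-step deductive verification: each step is checked against
# only the premises it cites, rather than the whole chain at once.
# `call_llm` is a placeholder for any LLM completion function.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReasoningStep:
    statement: str                                          # the deduction made at this step
    premise_ids: List[int] = field(default_factory=list)    # indices of premises/steps it relies on


def verify_chain(
    premises: List[str],
    steps: List[ReasoningStep],
    call_llm: Callable[[str], str],
) -> bool:
    """Return True only if every step is judged deducible from its cited context."""
    context_pool = list(premises)        # accepted steps become premises for later steps
    for i, step in enumerate(steps):
        cited = [context_pool[j] for j in step.premise_ids]
        prompt = (
            "Given only the following premises:\n"
            + "\n".join(f"- {p}" for p in cited)
            + f"\n\nDoes this conclusion follow deductively?\n{step.statement}\n"
            + "Answer with 'yes' or 'no'."
        )
        verdict = call_llm(prompt).strip().lower()
        if not verdict.startswith("yes"):
            print(f"Step {i + 1} failed verification: {step.statement}")
            return False
        context_pool.append(step.statement)
    return True
```

Because each check sees only the context relevant to a single step, an invalid or hallucinated deduction is harder to gloss over, which is the intuition behind the paper's smaller, context-specific verification subprocesses.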

3 comments
By @YeGoblynQueenne - 7 months
Chain of Thought prompting reminds me of Facilitated Communication:

https://en.wikipedia.org/wiki/Facilitated_communication

A long-discredited intervention in which a "facilitator" guides the hand of a non-verbal person to help them write down their thoughts and experiences. Experiments that blinded the facilitator to what the subject observed found that the written messages matched the facilitator's observations rather than the subject's, convincingly proving it was so much bunkum. It's the Clever Hans Effect by another name, with non-verbal humans rather than horses.

Chain of Thought works like that: without hand-holding by a human who understands how to answer a question, the LLM's performance drops, or drops off a cliff even. Of course this is much harder to prove for LLMs than it was for facilitated communication because LLMs don't really do anything without a prompt in the first place. Which should be a very big hint of what's really going on with CoT.

By @jawon - 7 months
People out there are trying to build some semblance of AI out of an LLM, using larger and larger networks of "agents" that generate, classify, revise and verify data with the same LLM they're building those larger and larger networks of agents upon to try and build some semblance of AI.

The end game is a brain-sized network where each neuron is an agent sending a 1M token prompt to a 10T parameter model to update their "weights".

By @Lerc - 7 months
Sometimes it looks like the computationalists are trying to sneak back into the room while no-one is looking.

There does seem to be quite a lot of independent ad-hoc effort going into custom notations for CoT. I feel like we're in a period similar to just after the first programming languages and compilers were invented but before regular expressions had arrived. In a way that's quite exciting; it's another little Cambrian explosion.

I don't think it will be a panacea though. In my observations of reasoning failures in LLMs, much of the problem isn't that they fail to follow logical steps but that they fail to notice implied premises at all. Chain of Thought is good for spotting wrong reasoning, but not for spotting that the problem is not the one it appears to be at first glance.