Automated reasoning to remove LLM hallucinations
Amazon Web Services has launched Automated Reasoning checks in Amazon Bedrock Guardrails to enhance large language model accuracy, allowing organizations to validate outputs against established facts, currently in preview in Oregon.
Amazon Web Services (AWS) has introduced Automated Reasoning checks as a new feature in Amazon Bedrock Guardrails, aimed at improving the accuracy of responses generated by large language models (LLMs) and mitigating factual errors caused by hallucinations. The feature applies mathematical, logic-based verification to ensure that the outputs of generative AI applications align with established facts. Automated Reasoning checks join a broader suite of safeguards that includes content filtering and personally identifiable information (PII) redaction.

The checks let organizations encode their rules and guidelines into structured policies, which are then used to validate the accuracy of LLM-generated content. This approach is particularly valuable where factual accuracy is critical, such as in human resources or operational workflows. The system analyzes uploaded documents to create initial reasoning policies, which can be tested and refined through a user-friendly interface. Automated Reasoning checks are currently available in preview in the US West (Oregon) AWS Region, with broader access planned. The launch positions AWS as a leader in responsible AI capabilities among major cloud providers.
- AWS has launched Automated Reasoning checks to improve the accuracy of LLM outputs.
- The feature is part of Amazon Bedrock Guardrails, which includes various content safety measures.
- Organizations can create structured policies to validate LLM-generated content.
- Automated Reasoning checks are particularly useful for applications requiring high factual accuracy.
- The feature is currently in preview in the US West (Oregon) AWS Region.
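To make the idea concrete, here is a minimal sketch of what "validating LLM output against a structured policy" can look like. This is not the Bedrock API; the policy format, rule names, and `validate` helper are invented for illustration:

```python
# Hypothetical sketch: check claims extracted from an LLM answer against
# a structured policy expressed as logical rules. Not the Bedrock API.

def validate(claims, rules):
    """Return the names of the rules the claims violate."""
    return [name for name, rule in rules.items() if not rule(claims)]

# Example (made-up) HR policy: employees with under one year of tenure
# are entitled to at most 10 PTO days.
rules = {
    "pto_cap_first_year": lambda c: not (c.get("tenure_years", 0) < 1
                                         and c.get("pto_days", 0) > 10),
}

# Claims extracted from a (hypothetical) LLM-generated answer.
claims = {"tenure_years": 0.5, "pto_days": 15}
print(validate(claims, rules))  # -> ['pto_cap_first_year']
```

The hard part in practice, which the sketch skips, is the extraction step: turning free-form model output into structured claims that rules like these can evaluate.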
Related
Leveraging AI for efficient incident response
Meta has developed an AI-assisted system for root cause analysis, achieving 42% accuracy by combining heuristic retrieval and LLM ranking, significantly improving investigation efficiency while addressing potential risks through feedback and explainability.
Looming Liability Machines (LLMs)
The use of Large Language Models for root cause analysis in cloud incidents raises concerns about undermining human expertise, leading to superficial analyses, systemic failures, and risks from unexpected automated behaviors.
AWS AI Stack – Ready-to-Deploy Serverless AI App on AWS and Bedrock
The AWS AI Stack is a boilerplate for serverless AI applications, featuring backend services, a React frontend, AI chat functionality, and easy deployment with the Serverless Framework. A live demo is available.
Apple study proves LLM-based AI models are flawed because they cannot reason
Apple's study reveals significant reasoning shortcomings in large language models from Meta and OpenAI, introducing the GSM-Symbolic benchmark and highlighting issues with accuracy due to minor query changes and irrelevant context.
Amazon Nova
Amazon has launched Amazon Nova, a suite of foundation models for generative AI, featuring understanding and creative models, customization options, and safety controls to enhance productivity and reduce costs.
And for a use-case simple enough for this system to work (e.g. regurgitate a policy), it seems like the LLM is unnecessary. After all, if your system can perfectly interpret the question and answer and see if this rule set applies, then you can likely just use the rule set to generate the answer rather than wasting resources with a giant language model.
What Amazon appears to have done here is use a transformer-based neural network (a.k.a. an LLM) to translate natural language into symbolic logic rules, which are then used together in what amounts to an expert system.
Full Circle. Hilarious.
For reference to those on the younger side: The Computer Chronicles (1984) https://www.youtube.com/watch?v=_S3m0V_ZF_Q
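For anyone who hasn't met an expert system: the core mechanism is forward chaining over if-then rules, a sketch of which fits in a few lines. The rules and facts below are made up to echo the HR example; nothing here reflects Amazon's actual implementation:

```python
# Toy expert system: forward-chaining inference over if-then rules,
# the kind of symbolic machinery the comment above alludes to.

def forward_chain(facts, rules):
    """Apply rules repeatedly until no new facts can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

rules = [
    ({"employee", "tenure_under_1_year"}, "pto_capped_at_10"),
    ({"pto_capped_at_10", "requested_15_days"}, "request_denied"),
]
derived = forward_chain(
    {"employee", "tenure_under_1_year", "requested_15_days"}, rules)
print("request_denied" in derived)  # -> True
```

Which is the commenter's point: once the rules exist in this form, the answer falls out of the rule set directly, no language model required.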
The approaches seem very different though. I'm curious if anyone here has used either or both and can share feedback.
By constraining the field it is trying to solve it makes grounding the natural language question in a knowledge graph tractable.
An analogy is type inference in a computer language: it can't solve every problem but it's very useful much of the time (actually this is a lot more than an analogy because you can view a knowledge graph as an actual type system in some circumstances).
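A tiny sketch of that analogy, with an invented schema and entities: each relation in the graph has a domain and range type, and grounding a question amounts to type-checking the entities it mentions.

```python
# "Knowledge graph as type system": relations carry a (domain, range)
# type signature, and grounding a statement is a type check.
# The schema and entities here are invented for illustration.

schema = {  # relation -> (domain type, range type)
    "employs": ("Company", "Person"),
    "located_in": ("Company", "City"),
}
entity_types = {"AWS": "Company", "Alice": "Person", "Seattle": "City"}

def well_typed(subject, relation, obj):
    dom, rng = schema[relation]
    return entity_types.get(subject) == dom and entity_types.get(obj) == rng

print(well_typed("AWS", "employs", "Alice"))      # -> True
print(well_typed("Alice", "employs", "Seattle"))  # -> False
```

As with type inference, this rejects a whole class of nonsense groundings cheaply, without claiming to resolve every ambiguity.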
I get the vibe VC money is being burned on promises of an AGI that may never eventuate and to which there's no clear path.
---
and yet, the paper that went around in March:
Paper Link: https://arxiv.org/pdf/2401.11817
Paper Title: Hallucination is Inevitable: An Innate Limitation of Large Language Models
---
Instead of trying to trick people into thinking we can ignore the flaws of post-LLM "AI" by layering on the still-flawed pre-LLM "AI", why don't we cut the salesman BS and just tell people not to use "AI" for the range of tasks it's not suited for.