Machine learning and information theory concepts towards an AI Mathematician
The paper by Yoshua Bengio and Nikolay Malkin explores AI's limitations in mathematical reasoning, proposing an information-theoretical approach to generate new conjectures, set for publication in 2024.
The paper titled "Machine learning and information theory concepts towards an AI Mathematician," authored by Yoshua Bengio and Nikolay Malkin, discusses the limitations of current artificial intelligence in mathematical reasoning compared to its proficiency in language. The authors suggest that while deep learning excels in intuitive, system 1 abilities, it falls short in system 2 abilities, which involve reasoning and uncertainty estimation. They propose an information-theoretical approach to explore what constitutes an interesting mathematical statement, aiming to guide the development of an AI mathematician. The focus is on generating new conjectures rather than merely proving existing theorems. The central hypothesis posits that a compact set of theorems can effectively summarize all provable statements, balancing brevity with proximity to numerous derivable statements. This work is set to be published in the Bulletin of the AMS in 2024.
- The paper addresses the gap in AI's mathematical reasoning capabilities.
- It distinguishes between system 1 (intuitive) and system 2 (reasoning) abilities in AI.
- The authors advocate for an information-theoretical perspective on mathematical statements.
- The goal is to develop AI that can generate new mathematical conjectures.
- The research will be published in the Bulletin of the AMS in 2024.
Related
How to Raise Your Artificial Intelligence: A Conversation
Alison Gopnik and Melanie Mitchell discuss AI complexities, emphasizing limitations of large language models (LLMs). They stress the importance of active engagement with the world for AI to develop conceptual understanding and reasoning abilities.
The Elegant Math of Machine Learning
Anil Ananthaswamy discusses his book on machine learning, emphasizing the importance of data for neural networks, their ability to approximate functions, and the mathematical elegance behind AI's potential applications.
Harmonic: Mathematical Reasoning by Vlad Tenev and Tudor Achim
Researchers are enhancing AI chatbots to reduce inaccuracies by integrating mathematical verification. Harmonic's Aristotle can prove answers, while Google DeepMind's AlphaProof shows potential in competitions, though real-world challenges persist.
We're Entering Uncharted Territory for Math
Terence Tao discusses AI's evolving role in mathematics, highlighting OpenAI's o1 series as a tool for complex tasks, enhancing collaboration, and emphasizing AI's complementary role to human mathematicians.
LLMs don't do formal reasoning
A study by Apple researchers reveals that large language models struggle with formal reasoning, relying on pattern matching. They suggest neurosymbolic AI may enhance reasoning capabilities, as current models are limited.
In my experience learning math, their claim that "The central hypothesis is that a desirable body of theorems better summarizes the set of all provable statements, for example by having a small description length" is not true. Short != better; better is whatever gets me fastest to the correct intuitive idea about the mathematical statement. For example, several times I have understood the formal definitions and proofs of a theory, but it's not until I form the correct intuitions (maybe months later) that I truly understand it. And it's not until I have those intuitions that I can successfully apply the theory to create meaningful new theorems.
Anyways, I understand that one has to start from somewhere and the point of view of the article is more tractable and explicit.
Then we have:
>> By analogy, we can now state the main hypothesis proposed in this paper: a crucial component of the usefulness of a new proven theorem t (in the context of previous theorems T(S)) is how efficiently T(S) ∪ {t} compresses the set of all provable mathematical statements M. That is, T(S) ∪ {t} is a good compression of M if many provable statements, beyond those in S, can be derived from T(S) ∪ {t}, say using at most k derivation steps.
which says "Occam's Razor, that's good isn't it?" in more words.
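Read operationally, the hypothesis suggests a scoring rule one could sketch. Here is a toy Python illustration (the names and the derivation model are my own invention, not from the paper): statements are nodes, `derives` is a one-step derivation relation, and a candidate theorem is scored by how many extra provable statements become reachable within k steps once it is added.

```python
def derivable_within_k(axioms, derives, k):
    """Statements reachable from `axioms` in at most k derivation steps,
    where derives(s) yields the statements provable from s in one step."""
    known = set(axioms)
    frontier = set(axioms)
    for _ in range(k):
        nxt = set()
        for s in frontier:
            nxt.update(derives(s))
        nxt -= known
        if not nxt:
            break
        known |= nxt
        frontier = nxt
    return known

def compression_value(theorems, candidate, provable, derives, k):
    """Usefulness of `candidate` in the sense of the quoted hypothesis:
    how many extra provable statements become reachable within k steps
    once it is added to the existing body of theorems."""
    before = derivable_within_k(theorems, derives, k) & provable
    after = derivable_within_k(theorems | {candidate}, derives, k) & provable
    return len(after) - len(before)
```

Real mathematics has an astronomically large (indeed infinite) statement space, so anything practical would need a learned approximation rather than this explicit enumeration.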
A fun detail is to be found in a footnote: "A mathematician is a person who can find analogies between theorems; a better mathematician is one who can see analogies between proofs and the best mathematician can notice analogies between theories. One can imagine that the ultimate mathematician is one who can see analogies between analogies.” (attributed to Stefan Banach in Randrianantoanina and Randrianantoanina [2007]).
And reading this, it seems to me this implies the "ultimate" mathematician is a poet.
P.S. I initially assumed "Randrianantoanina and Randrianantoanina" was the title of some nerdy speculative fiction. It turns out this is a regular paper reference. The authors are Malagasy, no doubt. They have such cool names there.
Here Bengio seems to just suggest using the surprisal `−log P(t | T(S))` of a candidate theorem (whose expectation, the estimated entropy `E_{t∼P(·|T(S))}[−log P(t | T(S))]`, measures the overall spread) to find interesting/surprising theorems to try and prove.
This reminds me of the early work in estimating LLM uncertainty/hallucinations using perplexity. It didn't work very well.
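For what it's worth, the surprisal idea is easy to sketch against any model that exposes per-token log-probabilities; the function names and data shapes below are hypothetical, not from the paper:

```python
def surprisal(token_logprobs):
    """Total surprisal −log P(t | T(S)) of a candidate statement t,
    computed from per-token log-probabilities produced by a model
    conditioned on the known theorems T(S)."""
    return -sum(token_logprobs)

def rank_conjectures(candidates):
    """candidates: list of (statement, token_logprobs) pairs.
    Sort by surprisal, highest first: the most 'surprising' statements
    are proposed as conjectures to try to prove."""
    return sorted(candidates, key=lambda c: surprisal(c[1]), reverse=True)
```

As with perplexity-based hallucination detection, high surprisal conflates "interesting" with merely out-of-distribution, which is presumably where such a scheme would struggle.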
Actually generating good "exercises" for the system to "practice" on, and theorems to try and prove, still seems like the biggest roadblock for an AI mathematician. It connects to the philosophical question of why we do math in the first place.
The community is having large debates on whether an LLM can reason outside of its training distribution. That feels ignored here.
### *Compression as a Measure of Theorem Usefulness*: The notion that a good theorem compresses provable statements is intriguing but may need more exploration in terms of practical utility. While compression aligns with Occam's Razor and Bayesian learning principles, it's not always clear whether the most "compressed" theorems are the most valuable, especially when considering the depth and complexity of many foundational theorems in mathematics.
### *Human-AI Collaboration*: The paper lightly touches on how this AI mathematician might work alongside humans, but the real power of such a system might lie in human-AI collaboration. A mathematician AI capable of generating insightful conjectures and proofs could dramatically accelerate research, but the interaction between AI and human intuition would be key.
### *Computational and Theoretical Limits*: There are also potential computational limits to the approach. The "compression" and "conjecture-making" frameworks proposed may be too complex to compute at scale, especially when considering the vast space of possible theorems and proofs. Developing approximation methods or heuristics that are effective in real-world applications will likely be necessary.
Here's how we can unpack this paper:
### *System 1 vs. System 2 Thinking*:
- *System 1* refers to intuitive, fast, and automatic thinking, such as recognizing patterns or generating fluent responses based on past experience. AI systems like GPT-4 excel in this area, as they are trained to predict and generate plausible content based on large datasets (e.g., text completion, language generation).
- *System 2* refers to deliberate, logical, and slow thinking, often involving reasoning, planning, and making sense of abstract ideas, such as solving a mathematical proof, engaging in formal logic, or synthesizing novel insights.

The claim that AI lacks System 2 abilities suggests that while AI can mimic certain behaviors associated with intelligence, it struggles with tasks that require structured, step-by-step reasoning and deep conceptual understanding.
### "Not so much in terms of mathematical reasoning"
The claim is *partially true*, but it must be put into context:
- **Progress in AI**: AI has made **tremendous advances** in recent years, and while it may still lack sophisticated mathematical reasoning, there is significant progress in related areas like automated theorem proving (e.g., systems like Lean or Coq). Specialized systems can solve well-defined, formal mathematical problems—though these systems are not general-purpose AI and operate under specific constraints.
- **Scope of Current Models**: General-purpose models like GPT-4 weren't specifically designed for deep mathematical reasoning. Their training focuses on predicting likely sequences of tokens, not on formal logic or theorem proving. However, with enough specialized training or modularity, they could improve in these domains. We’ve already seen AI systems make progress in proving mathematical theorems with reinforcement learning and imitation learning techniques.
- **Frontiers of AI**: As AI continues to develop, future systems might incorporate elements of both System 1 and System 2 thinking by combining pattern recognition with symbolic reasoning and logical processing (e.g., systems that integrate neural networks with formal logic solvers or reasoning engines).
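As a flavor of the "well-defined, formal mathematical problems" mentioned above, here is a minimal Lean 4 example (core library only, no Mathlib assumed); proof assistants check such statements mechanically, which is what makes them attractive training and verification environments for AI systems:

```lean
-- Commutativity of addition on the natural numbers, restated and
-- proved by appealing to the core library lemma Nat.add_comm.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

An AI mathematician in the paper's sense would not just check such proofs but decide which statements are worth stating in the first place.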
### Conclusion:
AI excels in tasks involving intuitive, pattern-based thinking but struggles with the deliberate, goal-oriented reasoning required for deep mathematical work. However, as research evolves, especially in hybrid models that combine deep learning with symbolic reasoning and formal logic, these limitations may become less pronounced. The future of AI may very well involve systems capable of mathematical reasoning at (or beyond) the level of human experts.