Machine Learning and Information Theory Concepts towards an AI Mathematician

Type: Article
Publication Date: 2024-05-15
Citations: 1
DOI: https://doi.org/10.1090/bull/1839

Abstract

The current state of the art in artificial intelligence is impressive, especially in terms of mastery of language, but not so much in terms of mathematical reasoning. What could be missing? Can we learn something useful about that gap from how the brains of mathematicians go about their craft? This essay builds on the idea that current deep learning mostly succeeds at system 1 abilities—which correspond to our intuition and habitual behaviors—but still lacks something important regarding system 2 abilities—which include reasoning and robust uncertainty estimation. It takes an information-theoretical posture to ask questions about what constitutes an interesting mathematical statement, which could guide future work in crafting an AI mathematician. The focus is not on proving a given theorem but on discovering new and interesting *conjectures*. The central hypothesis is that a desirable body of theorems better summarizes the set of all provable statements, for example, by having a small description length while at the same time being close (in terms of number of derivation steps) to many provable statements.

Locations

  • arXiv (Cornell University)
  • Bulletin of the American Mathematical Society

Summary

This paper explores the potential of using machine learning, particularly deep learning, to create an AI mathematician capable of discovering new and interesting mathematical conjectures, not just proving existing theorems.


Significance: The paper addresses the gap between current AI's mastery of language and its limited mathematical reasoning abilities. It proposes a novel approach to mathematical discovery, shifting the focus from proving known theorems to finding new and interesting ones.


Key Innovations:
* Information-Theoretic Objective for Mathematical Discovery: The paper introduces the idea that a good set of theorems should compress the set of all provable statements: the set should have a small description length while letting many other provable statements be derived from it in few steps. This compression criterion serves as a training objective for the AI (a sketch of one way to formalize it follows this list).
* Proofs as Trajectories for Conjecture-Making Agents: The paper frames the search for a proof as an agent navigating a state space, as in reinforcement learning: derivation steps become actions, and the AI learns to find paths (proofs) from existing knowledge to new theorems (see the search sketch after this list).
* Active Learning for Theorem Selection: The paper suggests using active learning to select the most informative conjectures to add to the AI's knowledge base, prioritizing statements that are surprising (low probability under the current knowledge) or uncertain (a minimal selection sketch also follows the list).
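One way to make the compression objective concrete is as a two-part, MDL-style cost over candidate theorem sets. This is a sketch, not the paper's formalism; the notation (L for description length, d for derivation distance, S for the provable statements, lambda for the trade-off) is introduced here:

```latex
% Sketch of an MDL-style objective: choose a theorem set T that is
% cheap to describe and that brings provable statements close in
% derivation steps.
\[
  T^{*} = \arg\min_{T} \; L(T)
        + \lambda \, \mathbb{E}_{s \sim S}\big[\, d(s \mid \mathrm{axioms} \cup T) \,\big]
\]
% L(T):     description length of T (in bits)
% d(s | K): number of derivation steps needed to reach s from K
% S:        distribution over provable statements
```

In practice both L and d would have to be approximated, for instance by a learned model's code length and a learned prover's search cost.
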
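To make the trajectory framing concrete, here is a minimal brute-force stand-in for the learned agent. All names (`find_proof`, `derive`) are hypothetical, and a reinforcement-learning policy would replace the exhaustive breadth-first search; the point is only that states are sets of proven statements and actions are derivation steps:

```python
from collections import deque

def find_proof(axioms, goal, derive, max_depth=6):
    """Breadth-first search over derivation steps (stand-in for an RL agent).

    axioms: statements taken as the start state
    goal:   the statement to reach
    derive: maps a set of known statements to the statements reachable
            in one derivation step
    Returns the trajectory (list of derived statements) or None.
    """
    start = frozenset(axioms)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        known, path = queue.popleft()
        if goal in known:
            return path
        if len(path) >= max_depth:
            continue
        for statement in derive(known):
            state = known | {statement}
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [statement]))
    return None

# Toy usage: two rewrite rules, a -> b and b -> c.
rules = {"a": "b", "b": "c"}
derive = lambda known: {rules[s] for s in known if s in rules}
print(find_proof({"a"}, "c", derive))  # -> ['b', 'c']
```
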
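And a minimal version of surprisal-based selection, assuming some model assigns log-probabilities to candidate statements given the current knowledge base. `log_prob` is a placeholder for that model, and the uncertainty criterion the paper also mentions (e.g., disagreement across an ensemble) is omitted for brevity:

```python
def select_theorems(candidates, log_prob, k=5):
    """Return the k most surprising candidate theorems.

    Surprisal is -log p(statement | knowledge), so sorting by ascending
    log-probability puts the most surprising candidates first.
    """
    return sorted(candidates, key=log_prob)[:k]

# Toy usage with hard-coded scores standing in for a learned model.
scores = {"thm_a": -0.1, "thm_b": -4.2, "thm_c": -1.3}
print(select_theorems(scores, log_prob=scores.get, k=2))  # ['thm_b', 'thm_c']
```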


Prior Ingredients Needed:
* Deep Learning and Generative Models: The paper builds on the successes of deep learning, particularly generative models like Large Language Models (LLMs), in creating realistic content.
* Information Theory: The paper leverages concepts from information theory, such as compression and description length, to define the objective function for mathematical discovery.
* Reinforcement Learning: The paper utilizes concepts from reinforcement learning to model the theorem-proving process as an agent navigating a state-space.
* Automated Theorem Proving (ATP): The paper acknowledges existing work in automated theorem proving, particularly the use of proof tactics and proof assistants such as Coq and Lean. It aims to go beyond ATP by enabling the AI to generate new conjectures.

Similar Works

  • Machine learning and information theory concepts towards an AI Mathematician (2024-03-07). Yoshua Bengio, Nikolay Malkin
  • Theorem proving in artificial neural networks: new frontiers in mathematical AI (2024-01-20). Markus Pantsar
  • Notes on a Path to AI Assistance in Mathematical Reasoning (2023-01-01). Alex Kontorovich
  • Mathematical reasoning and the computer (2025-02-11). Kevin Buzzard
  • A Simplicity Bubble Problem in Formal-Theoretic Learning Systems (2021-01-01). Felipe S. AbrahĂŁo, HĂ©ctor Zenil, FĂĄbio Porto, Michael Winter, Klaus Wehmuth, Ítala M. Loffredo D’Ottaviano
  • Mathematical reasoning and the computer (2024-02-15). Kevin Buzzard
  • Algebraic Learning: Towards Interpretable Information Modeling (2022-01-01). Tong Owen Yang
  • Towards Machine Learning Mathematical Induction (2018-12-04). Yutaka Nagashima
  • Mathematical intuition, deep learning, and Robbins' problem (2024-01-01). F. Thomas Bruss
  • A Triumvirate of AI Driven Theoretical Discovery (2024-05-30). Yang‐Hui He
  • The Theory of Mathematical Inference (1900-01-01). G. J. Stokes
  • Mathematical reasoning: induction, deduction and beyond (2006-08-25). David Sherry
  • Machine Learning for Mathematical Software (2018-01-01). Matthew England
  • A Moonshot for AI Oracles in the Sciences (2024-06-25). Bryan Kaiser, Tailin Wu, Maike Sonnewald, Colin Thackray, Skylar Callis
  • AI Reasoning Systems: PAC and Applied Methods (2018-01-01). Jeffrey Cheng
  • Proof and Computation: Perspectives for Mathematics, Computer Science, and Philosophy (2021-07-29). Klaus Mainzer
  • Symbolic Logic meets Machine Learning: A Brief Survey in Infinite Domains (2020-06-15). Vaishak Belle
  • Learning About Mathematical Proof: Conviction and Validity (1999-02-01). Judith Segal
  • Learning as inference (2012-06-05). David Barber