Tools: Powerful Mathematics In The Library Of Babel
Mathematics isn't only about saying true things. It's about asking the right questions, being confused, stumbling about, getting distracted, being wrong, recognizing when you're wrong, being stuck. Mostly being stuck. It's about clinging to a giant edifice and feeling it out until you understand some tiny piece of it. It's about finding meaning in and intuition for the texture of an object which, at first, can only be apprehended by bashing your skull into it until it imprints on your forehead. Then trying to convey some of that insight to someone else, and watching as they find their own way to it.
I started trying to get LLMs to do math in July 2020, through the game "AI Dungeon," one of the earliest applications powered by GPT-3. I first got GPT-3 to produce a correct proof (of Fermat's Little Theorem) in April 2022. At the time I did not think they would become useful for math research in the near term.
An attempt at getting GPT-3 to do arithmetic using the game "AI Dungeon" in 2020. It refused.
This changed when the first reasoning models were released:[1] on February 1, 2025, I wrote that the model o3-mini-high "clearly has passed the threshold of genuine usefulness" for research, while still making many, many mistakes. Since then, the models have improved, and ChatGPT 5.2 Pro (released in December 2025) can regularly provide reasonable proofs of lemmas that I would characterize as "involved but routine for experts," though it still makes many errors. And I have been using Codex, OpenAI's coding/computer-use agent, for scientific computing tasks I would not have considered attempting a few months ago.[2]
In public comments, I've tried to credit successes while pushing back against hype. I've talked a lot about "slop" papers on arXiv. I have worried that we are polluting the scientific commons with incorrect mathematics whose errors are enormously difficult to detect. I've tried to focus on the present. In this essay I'll talk about the future.
Since early last year I've been telling my colleagues that I expected models to be able to autonomously produce research mathematics comparable to some of the output of the best human mathematicians by 2040, but it seemed unlikely that this would occur by 2030. I expected major successes before then, including perhaps proofs of minor but interesting open conjectures[3] in the next year and more important open conjectures not too long afterwards. I thought there were likely serious obstructions to autonomous m