Erdos Problem 1196: Can AI now solve maths that no human can?

9 min

•May 16, 2026

Summary

An AI chatbot solved Erdos Problem 1196, a decades-old mathematical puzzle, in under 80 minutes. Mathematician Jared Duker Lichtman had been working on the same problem for seven years. The episode explores what this breakthrough means for mathematics and whether AI is now capable of solving problems that have eluded human mathematicians.

Insights

AI is now capable of solving long-standing mathematical problems that have resisted human effort for decades, but only because human mathematicians have established the foundational understanding and frameworks.
The future of mathematics may involve AI as a collaborative tool rather than a replacement, providing intuitions and solutions that mathematicians can verify and build upon.
Unlike fields such as photography or journalism, mathematics will always require human experts to validate, interpret, and contextualize AI-generated solutions.
The speed of AI problem-solving (80 minutes vs. 7 years) represents a fundamental shift in how mathematical research can be conducted, though the underlying human expertise remains essential.
Mathematicians' reactions to AI solutions are pragmatic rather than defensive—they care about knowing the answer regardless of who or what solves it.

Trends

AI as mathematical collaborator rather than replacement for human mathematicians
Acceleration of research timelines in pure mathematics through AI-assisted problem solving
Growing need for human expertise to validate and contextualize AI-generated mathematical proofs
Shift from individual mathematician achievement to human-AI collaborative discovery models
Increased accessibility of unsolved problems through centralized platforms (Erdos Problem website)
AI capability expansion into abstract mathematical reasoning and proof generation
Potential for AI to tackle clusters of related mathematical problems simultaneously
Risk of over-reliance on AI outputs without rigorous human verification in mathematics

Topics

AI-assisted mathematical problem solving
Erdos Problems and primitive set conjecture
Pure mathematics research and unsolved problems
AI proof verification and validation
Human-AI collaboration in academic research
Fermat's Last Theorem and mathematical history
Chatbot capabilities in abstract reasoning
Career implications of AI for mathematicians
Mathematical proof generation by AI
Riemann hypothesis and Millennium problems
Number theory and prime numbers
Academic job security in the age of AI
Prompt engineering for mathematical problem solving
Peer review and validation of AI-generated proofs
The role of human intuition in mathematics

Companies

OpenAI

Creator of GPT 5.4 Pro, the AI chatbot used by Liam Price to solve Erdos Problem 1196 in under 80 minutes.

People

Charlotte McDonald

Host of the More or Less podcast episode discussing AI solving mathematical problems.

Katie Steckles

Discussed unsolved mathematical problems including Fermat's Last Theorem and the history of mathematical puzzles.

Jared Duker Lichtman

Spent 7 years working on Erdos Problem 1196 before AI solved it; verified the AI solution and discussed implications.

Liam Price

23-year-old who used OpenAI's GPT 5.4 Pro to solve Erdos Problem 1196 with a single carefully written prompt and about 80 minutes of model thinking time.

Paul Erdos

Prolific 20th-century mathematician after whom the Erdos Problems are named; known for itinerant collaboration style.

Quotes

"I have a solution but it's too big to fit in the margin."

Pierre de Fermat (referenced by Katie Steckles)•Early in episode

"After maybe about an hour or so of just kind of reading it through and kind of sifting through what the raw output looked like. Yeah, it was pretty clear that it was correct and the idea was very nice."

Jared Duker Lichtman•Mid-episode

"As a mathematician, you just want to know something is true. If someone else solves a problem like a human solves a problem that you cared about, I don't think people would necessarily be also asking, you know, did your colleague solving the problem? Like, are you upset that there's a solution?"

Jared Duker Lichtman•Late episode

"I think right now we're in a position where AI can actually provide intuitions like a trusted colleague could."

Jared Duker Lichtman•Closing discussion

"It's not like photography or journalism where the public can make sense of the output themselves. You're always going to need mathematicians even if AI keeps getting better at the solutions."

Charlotte McDonald•Analysis segment

Full Transcript

Hello and thanks for downloading the More or Less podcast. We're the programme that looks at the numbers in the news, in life and in 50-year-old maths puzzles. I'm Charlotte McDonald. Ever since AI started creeping into our lives, people in certain professions have been worrying that it's come to steal their jobs. Software coders, insurance analysts and junior lawyers are all watching AI unfold with understandable trepidation. But recent events mean we might need to spare a thought for the select few whose job sits at the very pinnacle of academia, the brave souls who study pure mathematics. For hundreds of years, mathematicians have been straining their gigantic brains against fiendish maths problems that no one has yet been able to figure out. They ponder them, debate them, publish papers on them. Solving them can be a life's work. Then, on the 13th of April this year, AI appeared to solve a mathematical problem, known as Erdos problem 1196, that has so far eluded mere human thinkers. Experts in the field were surprised, to say the least. But what is this problem? Has AI really solved it? And what does it mean for mathematics if it has?
Unsolved maths problems have a certain mystique. The one you might have heard of is Fermat's Last Theorem, which was discovered in a handwritten note on the edge of a page in a textbook written by the 17th century mathematician. He'd written, "I have a solution but it's too big to fit in the margin."
That's mathematician Katie Steckles. And it wasn't solved until 1994.
It was several hundred years before we actually got a resolution to this, and the mathematician that proved it was using areas of maths that didn't even exist in Fermat's time.
And while you might have only heard of one, there are plenty more, with all kinds of strange names: the Collatz conjecture, the Riemann hypothesis.
Oh, there's loads and we're finding more every day. Every area of maths has its sort of big questions.
There are the Hilbert problems and the Millennium problems. But more recently, another, larger set of maths problems has been put together. These are known as the Erdos problems. And they were first posed by one of maths's most fiendishly productive minds.
So Paul Erdos was one of the most famous mathematicians in terms of being a story, right? He's a story to tell. He was what people call an itinerant mathematician. So he would spend almost all of his time travelling around visiting conferences, going to maths events. And he would stay at the houses of other mathematicians. I think there's an implication that he just kind of turned up and was like, here's my laundry. Please feed me some food, etc. People were quite happy to host him and to talk to him.
There are over 1200 Erdos problems, and in 2023 they were collected together on a website so mathematicians could see which needed solving and share their proofs.
I first heard about this one particular Erdos problem called the Erdos Primitive Set Conjecture when I was a senior in my undergraduate.
That's Jared Duker Lichtman, mathematician and number theorist at Stanford University.
And kind of completely fell in love with the problem.
The Erdos problems we're talking about today are to do with primitive sets. You probably know about prime numbers, whole numbers which only divide by themselves and one. Well, a primitive set's a bit like that. But where you pick out the numbers specifically so that the rule still works even though the numbers aren't necessarily prime. So, for example, the numbers 4, 5 and 6, considered in isolation, are a primitive set. You can't divide 4 by 5 or 6 and get a whole number. And the same's true for 5 and 6. And that's all we're going to tell you about the maths in these Erdos problems. They're something to do with primitive sets. Everything else is way too complicated. Back to Jared.
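The primitive set rule described above can be checked mechanically. What follows is a minimal Python sketch, assuming the usual definition (a set of positive whole numbers in which no number divides a different number in the set). The function name `is_primitive` is illustrative and does not come from the episode.

```python
from itertools import combinations


def is_primitive(numbers):
    """Return True if no number in `numbers` divides a different number in it.

    Assumes every entry is a positive whole number.
    """
    distinct = sorted(set(numbers))
    # After sorting, each pair (a, b) has a < b, and only the smaller
    # number can divide the larger, so one check per pair is enough.
    return all(b % a != 0 for a, b in combinations(distinct, 2))


print(is_primitive([4, 5, 6]))          # the episode's example
print(is_primitive([2, 3, 5, 7, 11]))   # any set of distinct primes
print(is_primitive([3, 4, 6]))          # fails: 3 divides 6
```

The first two calls report a primitive set and the third does not. The check compares every pair, which is fine for small hand-picked sets like the one in the episode.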
You know, kind of on my own time and at night I would still think about this problem and couldn't put it down. And I ended up, you know, solving it after four years of kind of never giving up on the problem.
What Jared solved in four years was Erdos problem 164, which he went on to use in his doctorate. But that wasn't the only problem he was interested in. There was a cluster of other related Erdos problems, including problem 1196. And he'd been thinking about that one too for all that time.
What happens if the numbers in a primitive set are all larger than x and you want to understand how the score can grow as x tends to infinity?
And he kept on thinking about it and working on it for another three years. Until last month, he woke up to find an email on his computer.
I received a message saying that this essentially amateur mathematician had run GPT 5.4 Pro on this problem and received output that he thought could be a candidate solution to this problem. And when I received the message, I was immediately very skeptical. And I started reading through the output and again, the output looked very, very raw and very unstructured.
This raw and unstructured proof had been teased from the AI chatbot ChatGPT by a 23-year-old Brit with a maths degree.
My name's Liam Price and I used an AI from OpenAI to solve Erdos problem 1196. So I came up with a clever prompt and I gave it to the AI and after about 80 minutes of thinking, it came out with a solution. So I then passed it to another version of the model to say, here's the solution, please can you look at it and verify that it's correct. To which it came out saying that it essentially couldn't find any errors in the argument.
Liam sent the solution to his mathematician friend Kevin, who then sent it on to Jared to look over. Let's get back to Jared to see how he got on with it.
After maybe about an hour or so of just kind of reading it through and kind of sifting through what the raw output looked like. Yeah, it was pretty clear that it was correct and the idea was very nice.
The problem that Jared had been thinking about for seven years had been solved in one prompt by AI in under 80 minutes.
It's very strange. It's very strange. You know, I'd just kind of be sat here at my computer trying these problems, not really expecting anything to come of it and having all of this kind of attention was unexpected, but I'm happy about it.
Yes. AI has solved Erdos problems before. Dozens of them in fact, but problem 1196 is different. Mathematicians have been working on it for decades and the proof that AI came up with has been celebrated by some of the greatest minds in the business. So was Jared upset that a chatbot had beaten him to the punch?
You know, instantly I was very happy and actually started working on these other clusters of problems.
Jared is completely relaxed about the fact that AI is solving these difficult problems.
As a mathematician, you just want to know something is true. If someone else solves a problem like a human solves a problem that you cared about, I don't think people would necessarily be also asking, you know, did your colleague solving the problem? Like, are you upset that there's a solution? So for certain problems, one really just wants to know the answer and then you can just say, you know, you just want to know the answer and have access to it regardless of how it is obtained.
At the same time, what's very clear from this story is that AI could only work because there are people like Jared who actually understand pure mathematics and care about the solutions. It's not like photography or journalism where the public can make sense of the output themselves. You're always going to need mathematicians even if AI keeps getting better at the solutions.
I think right now we're in a position where AI can actually provide intuitions like a trusted colleague could. So maybe you have someone down the hall who is coming up with crazy ideas and I think we're at the stage now that for some part of the time the output is going to actually bear fruit. And at the moment right now in 2026, we're at a point where we're kind of at a collaborator kind of feedback.
AI may not be turning up at your house in the middle of the night like Erdos and asking you to do its laundry, but it could still facilitate the kind of manic collaboration that the great man once did. That's it for this week. Thanks to Katie Steckles, Jared Duker Lichtman and Liam Price. If you've seen a number you think we should take a look at, email us on moreorless@bbc.co.uk. Until next time, goodbye.