Dwarkesh Podcast

Terence Tao – Kepler, Newton, and the true nature of mathematical discovery

84 min
Mar 20, 2026
Summary

Terence Tao discusses how AI is transforming mathematical research, comparing current AI capabilities to Kepler's empirical approach versus Newton's theoretical insights. He explores AI's strengths in breadth-first problem solving, the challenges of evaluating partial progress, and the future of human-AI collaboration in mathematics.

Insights
  • AI excels at breadth in mathematics (solving many problems at a certain difficulty level) while humans excel at depth, creating complementary capabilities
  • Current AI mathematical tools have about 1-2% success rates on individual problems but can scale massively, leading to impressive aggregate results
  • The bottleneck in science is shifting from idea generation to verification and evaluation as AI can now generate theories at near-zero cost
  • Mathematical progress may increasingly come from hybrid human-AI collaboration rather than purely autonomous AI solutions
  • AI tools are making mathematicians more productive in auxiliary tasks (formatting, literature searches, visualizations) rather than core problem-solving
Trends
  • Shift from hypothesis-driven to data-driven scientific discovery
  • AI democratizing access to frontier mathematical research for non-PhDs
  • Need for new peer review systems to handle AI-generated research volume
  • Emergence of experimental mathematics using AI for large-scale pattern detection
  • Evolution from pure AI solutions to human-AI collaborative problem solving
  • Transformation of mathematical communication and proof presentation
  • Growing importance of formal proof systems like Lean for AI integration
Companies
Google
Mentioned as example of transformative search technology that became taken for granted
Bell Labs
Referenced as historical example of identifying breakthrough concepts like the bit
Wolfram Alpha
Cited as tool that automated differential equation solving previously done by mathematicians
Jane Street
Sponsored puzzle about neural network layer ordering mentioned in episode
People
Johannes Kepler
Central figure discussed for his empirical approach to discovering planetary motion laws
Isaac Newton
Contrasted with Kepler as providing theoretical explanation for planetary motion
Tycho Brahe
Danish astronomer whose precise observational data enabled Kepler's discoveries
Nicolaus Copernicus
Proposed heliocentric model that Kepler built upon for planetary motion laws
Charles Darwin
Discussed as example of scientific communication and theory presentation
Carl Friedrich Gauss
Mentioned for early data-driven approach to studying prime number patterns
Paul Erdős
Mathematical problems bearing his name that AI has recently solved
Quotes
"AI has basically driven the cost of idea generation down to almost zero in a very similar way to how the Internet drove the cost of communication down to almost zero"
Terence Tao
"We're now in a situation where suddenly people can generate thousands of theories for a given scientific problem. And now we have to verify them, evaluate them."
Terence Tao
"They excel at breadth and humans excel at depth. So I think they're very complementary."
Terence Tao
"Right now we're going through a cognitive version of the Copernican revolution, where we used to think that human intelligence is the center of the universe."
Terence Tao
Full Transcript
2 Speakers
Speaker A

Okay, today I'm chatting with Terence Tao, who needs no introduction. Terence, I want to begin by having you retell the story of how Kepler discovered the laws of planetary motion, because I think this will be a great jumping off point to talk about AI for math.

0:00

Speaker B

Okay. Yeah. So I've always had an amateur interest in astronomy, and so I've loved stories of how the early astronomers worked out the nature of the universe. So Kepler was building on the work of Copernicus, who was himself building on the work of Aristarchus. Copernicus famously proposed the heliocentric model: that instead of the planets and sun going around the Earth, the sun was at the center of the solar system and the other planets were going around the sun. And Copernicus proposed that the orbits of the planets were perfect circles. And his theory kind of fit the observations that the Greeks and the Arabs and Indians had worked out over centuries. Kepler learned about these theories in his studies, and he made this observation that the ratios of the sizes of the orbits that Copernicus predicted seemed to have some geometric meaning. I think he started proposing that if you take, say, the orbit of the Earth and you enclose it in, I think, maybe a cube, the outer sphere that encloses the cube almost perfectly matched the orbit of Mars, and so forth. And there were six planets known at the time, five gaps between them, and there were five perfect Platonic solids: the cube, the tetrahedron, icosahedron, octahedron and dodecahedron. And so he had this theory, which he thought was absolutely beautiful, that he could inscribe these Platonic solids between the spheres of the planets. And it seemed to fit, and it seemed to him like God's design of the planets was matching this mathematical perfection of the Platonic solids. So he needed data to confirm this theory. And at the time, there was only one really high quality data set almost in existence, which was Tycho Brahe's.
So Tycho Brahe, this Danish astronomer, a very wealthy, eccentric astronomer, had managed to convince the Danish government to fund this extremely expensive observatory, in fact an entire island, where he had taken decades of observations of all the planets, Mars, Jupiter, every night, at least every night for which the weather was clear, with the naked eye. Actually, he was the last of the naked-eye astronomers. And so he had all this data which Kepler could use to confirm his theory. And so Kepler started working with Tycho. But Tycho was very jealous of the data. He only gave little bits of it at a time. And I think Kepler eventually just stole the data; he copied it and had to have a fight with Brahe's descendants. But he did get the data. And then he worked out, kind of to his disappointment, that his beautiful theory didn't quite work. The data was sort of off from his Platonic solid theory by about 10% or something. And he tried all kinds of fudges, moving the circles around and things, and it didn't quite work. But he worked on this problem for years and years, and eventually he figured out how to use the data to work out the actual orbits of the planets. And that was incredibly clever. A genius amount of data analysis, actually. And then he eventually worked out that they were ellipses, not circles, which was shocking for him. And then he worked out the first two laws of planetary motion: ellipses, and also equal areas swept out in equal times. And then 10 years later, after collecting a lot of data (the furthest planets like Saturn and Jupiter were the hardest for him to work out), he finally worked out this third law also: that the time it takes for a planet to complete an orbit was proportional to some power of the distance to the sun. And these are the three famous Kepler laws of motion. And he had no explanation for them. It was just all driven by experiment.
And it took Newton a century later to give a theory that explained all three laws at once.

0:15

Speaker A

The take I want to try on you is that Kepler was a high temperature LLM, whereas Newton comes up with this explanation of why the three laws of planetary motion must be true. And of course, the way that Kepler discovers the laws of planetary motion, or figures out the relative orbits of the different planets, is, as you say, a work of genius. But then through his career he's just trying random relationships. And in fact, in the book in which he writes down the third law of planetary motion, it's sort of an aside. The book is The Harmony of the World, which is about how all these different planets have these different harmonies. And the reason there's so much famine and misery on Earth is because the Earth is mi, fa, mi; that's the note of Earth. And so there's all this random astrology, but in there is the cube-square law, which tells you what relationship the period has to a planet's distance from the sun. As you were detailing, if you add that to Newton's F equals ma and then the equation for centripetal acceleration, you get the inverse square law. And so Newton works that out. But the reason I think this is an interesting story is that I feel like LLMs can do that kind of thing: 20 years of "let's try random relationships," some of which make no sense, as long as there's a verifiable data bank like Brahe's data set. Okay, I'm going to try out random things about musical notes, I'm going to try out random things about Platonic objects, I'm going to try all these different geometries. I have this bias that there's some important thing about the geometry of these orbits. And then one thing works, and as long as you can verify it, these empirical regularities can then drive actual deep scientific progress.

4:09
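
The derivation mentioned in the question above can be written out for the simplified case of a circular orbit. This is only a sketch: the circular-orbit assumption is made here for brevity (real orbits are ellipses), and the symbols r (orbital radius), T (period), m (planet mass), and k (the proportionality constant in the third law) are introduced for illustration.

```latex
\begin{align*}
a &= \frac{v^2}{r}, \qquad v = \frac{2\pi r}{T}
  && \text{(centripetal acceleration, circular orbit)} \\
a &= \frac{4\pi^2 r}{T^2} \\
T^2 &= k\,r^3
  && \text{(Kepler's third law)} \\
a &= \frac{4\pi^2}{k\,r^2} \\
F &= m a = \frac{4\pi^2 m}{k}\cdot\frac{1}{r^2}
  && \text{(Newton's second law)}
\end{align*}
```

The force on the planet therefore falls off as the inverse square of its distance from the sun.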

Speaker B

Traditionally, when we talk about the history of science, idea generation has always been kind of the prestige part of science. So, I mean, a scientific problem comes with many steps. You have to identify a problem, and then you have to identify a good problem to work on, a fruitful problem. And then you need to collect data, you need to figure out a strategy to analyze the data, to make a hypothesis. And at this point you need to propose a good hypothesis, and then you need to validate, and then you need to write things up and explain. There are a dozen different components, but yeah, the ones we celebrate are these sort of eureka, genius moments of idea generation. So Kepler certainly had to, as you say, cycle through many ideas, several of which didn't work. And I bet there were many that he didn't even publish at all because they just didn't fit. And that's an important part of the process, trying all kinds of random things and seeing if they work. But as you say, they have to be matched by an equal amount of verification, otherwise it's slop. We celebrate Kepler, but we should also celebrate Brahe for his assiduous data collection, which was 10 times more precise than any previous observation. And that extra decimal point of accuracy was actually essential for Kepler to get his results. And he was using Euclidean geometry and the most advanced mathematics he could use at the time to match his models with the data. So all aspects had to be in play: the data and the theory and the hypothesis generation. I'm not sure nowadays that hypothesis generation is the bottleneck anymore. Science has changed in the centuries since. So classically, sort of the two big paradigms for science were theory and experiment. Then in the 20th century, numerical simulation came along. And so you can also do computer simulations to test theories. But then finally, in the late 20th century, we had big data. We had the era of data analysis.
And so a lot of new progress is actually driven now by analyzing massive data sets: first collecting large data sets and then drawing the patterns from them to deduce laws, which is a little bit different from how science used to work, where you make a few observations or you just have one out-of-the-blue idea and then you collect data to test your idea. That's the classic scientific method. Now it's almost the reverse. You collect big data first and then you try to get hypotheses from it. I mean, Kepler was maybe one of the first early data scientists, but even he didn't start with Tycho's data set and analyze it. He had some preconceived theories first. But it seems like this is less and less the way we make progress, just because the data is just so much more massive. It's just so much more useful.

5:44

Speaker A

Oh, interesting. I actually feel like the mold of 20th century science that you're describing is actually very well describes what happened in Kepler where he did have these ideas. 1595 and 96 is where he comes up with first polygons and then Platonic objects theory, but they were wrong. And then a few years later he gets Brahe's data. And it's only after 20 years of just trying random things that he gets this empirical regularity. And so it actually feels closer to Brahe's data is analogous to some massive data, Van Gogh simulations. And then now that you've got the data, you can keep trying random things. But if it wasn't, Kepler would be out there just writing books about harmonics and Platonic objects and there would be nothing to actually verify against.

8:48

Speaker B

Yeah, yeah. So the data was extremely important. But the distinction I was trying to make was that, sort of traditionally, you make a hypothesis and then you test it against data. But now with machine learning and data analysis and statistics and so on, you can start with data and, through say statistics, work out laws that were not present before. So Kepler's third law is a little bit like this, except that for the third law, instead of having the thousand data points that Brahe had, Kepler had like six data points. For every planet, he knew the length of the orbit and the distance from the sun, so there were like five or six data points, and he did what we would now call regression. He could fit a curve to these six data points and he got a square-cube law, which was amazing. But actually he was quite lucky. I mean, these six data points gave him the right conclusion. That's not enough data to be really reliable. There was a later astronomer, Johann Bode, who took the same data, actually the distances to the planets. And inspired by Kepler, I think, he had a prediction that the distances to the planets formed basically a shifted geometric progression. He also fit a curve, except there was one point missing. So there was a big gap between Mars and Jupiter. His law predicted that there was a missing planet. So it was kind of a crank theory, except when Uranus was discovered by Herschel, the distance to Uranus fit exactly this pattern. And then Ceres was discovered, this asteroid, I think in the asteroid belt between Mars and Jupiter, and it also fit the pattern. And so people got really excited that Bode had discovered this amazing new law of nature. But then Neptune was discovered and it was way off. And basically it was just a numerical fluke. There were six data points. Yeah.
So maybe one reason why Kepler didn't highlight his third law as much as the first two laws is that maybe instinctively, even though he didn't have modern statistics, he kind of knew that with six data points, he had to be somewhat tentative with the conclusions.

9:35
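
As a rough illustration of the two curve fits described above, the sketch below runs a log-log least-squares regression on six planets and then checks Bode's rule against bodies discovered later. The numbers are modern approximate values, not the figures Kepler or Bode actually worked with, and the function names are made up for this example. The rule is written in its usual textbook form, 0.4 + 0.3 × 2^k astronomical units, which is an assumption about what "shifted geometric progression" refers to.

```python
import math

# Modern approximate values (assumption: not Kepler's own figures).
# (name, semi-major axis in AU, orbital period in years)
PLANETS = [
    ("Mercury", 0.387, 0.241),
    ("Venus", 0.723, 0.615),
    ("Earth", 1.000, 1.000),
    ("Mars", 1.524, 1.881),
    ("Jupiter", 5.203, 11.862),
    ("Saturn", 9.537, 29.457),
]


def fit_power_law(points):
    """Least-squares fit of log(T) = slope * log(a) + intercept."""
    xs = [math.log(a) for _, a, _ in points]
    ys = [math.log(t) for _, _, t in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def bode_distance(k):
    """Titius-Bode rule in AU: 0.4 + 0.3 * 2**k, with k=None for Mercury."""
    return 0.4 if k is None else 0.4 + 0.3 * 2 ** k


def bode_relative_error(k, actual):
    """Relative gap between the rule's prediction and a measured distance."""
    return abs(bode_distance(k) - actual) / actual


# (name, index k in the rule, modern semi-major axis in AU)
BODE_CHECKS = [
    ("Ceres", 3, 2.77),
    ("Uranus", 6, 19.19),
    ("Neptune", 7, 30.07),
]

slope, intercept = fit_power_law(PLANETS)
print(f"fitted exponent: {slope:.3f}")  # the third law corresponds to 1.5
for name, k, actual in BODE_CHECKS:
    err = bode_relative_error(k, actual)
    print(f"{name}: rule {bode_distance(k):.1f} AU, actual {actual} AU, error {err:.0%}")
```

Both fits look convincing on the points they were built from. Only the later data (Neptune, in Bode's case) separates the real law from the fluke.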

Speaker A

But maybe to ask the question about the analogy more explicitly: does this analogy make sense if, in the future, we have smarter and smarter AIs, and millions of them, and they can go out and hunt for all these empirical regularities? It sounds like you don't think the bottleneck in science is finding, for each given field, more things that are its equivalent of the third law of planetary motion, so that later on somebody can say, oh, we need a way to explain this. Let's work out the math. Here's the inverse square law of gravity.

11:44

Speaker B

Right. So I think AI has basically driven the cost of idea generation down to almost zero in a very similar way to how the Internet drove the cost of communication down to almost zero, which is an amazing thing, but it doesn't create abundance by itself. So now the bottleneck is different. So we're now in a situation where suddenly people can generate thousands of theories for a given scientific problem. And now we have to verify them, evaluate them. And this is something where we have to change our structures of science to actually sort this out. So in fact, traditionally we built walls. So in the past, before we had AI slop, we had sort of amateur scientists with their own theories of the universe, many of which were basically of very little value. And so we built these peer review publication systems and things to kind of filter out and try to isolate the high-signal ideas to test. But now that we can generate these possible explanations at massive scale, and some of them are good and a lot are terrible, I mean, human reviewers, they're already being overwhelmed. Actually, many journals are reporting that AI-generated submissions are just flooding their submission queues. So it's great that we can generate all kinds of things now with AI, but it means that the rest of the aspects of science have to catch up: verification, validation, and assessing which ideas actually move the subject forward and which ones are dead ends or red herrings. And that's not something we know how to do at scale. For each individual paper, we can discuss it, have a debate among scientists and get to a consensus in a few years, but when we generate, you know, a thousand of these every day, this doesn't work.

12:17

Speaker A

So I think there is this incredibly interesting question of if you have billions of AI scientists, not only how do you gauge which ones are real progress, but how do you. I mean, this is actually a question that human science has had to face and we've solved somehow. And I actually am not sure how we solved this, but in any given field, let's say in the 1940s, and if you're at Bell Labs or if you're just generally trying to, there's these new technologies coming out. Pulse code modulation, basically, how do you transfer signals, how do you digitize signals, how do you transfer them over analog wires? But there's all these papers about the engineering constraints there and the details. And then there's one which comes up with the idea of the bit, which has implications across many different fields. And you need some system which can then look at that and say, okay, we need to apply this to probability, we need to apply this to computer science, et cetera. And in the future, the AIs are coming up with the next version of this kind of unifying concept. And how would you identify it among millions of papers which might actually constitute progress, but which have much less general unifying ideas?

14:13

Speaker B

So a lot of it's the test of time. So many great ideas didn't actually get a great reception at the time that they were first proposed. It was only after some other scientists realized that they could take them further and apply them to their own work. Deep learning itself was actually a niche area of AI for a long time. The idea of getting answers entirely through training on data and not through first-principles reasoning was very controversial, and it took a long time before it actually started bearing fruit. You mentioned the bit. There are other proposals for computer architectures than the 0-1 binary that is universal today. I think there were trits, three-valued logic, and in an alternate universe, maybe a different paradigm would have shown up. People have argued that the transformer, for example, is the foundation of all modern large language models. And it was the first deep learning architecture that really was sophisticated enough to capture language. But it didn't have to be that way. There could have been some other architecture that was the first to do it, and once that was adopted, it would have become the standard. So I think one reason why it's hard to assess whether a given idea is going to be fruitful is that it depends on the future. It depends also on the culture and society: which ones get adopted, which ones don't. The base-10 numeral system in mathematics is extremely useful, much better than the Roman numeral system, for instance. But again, there's nothing special about 10. It's a system that's useful for us because everyone else uses it and we've standardized it and we've built all our computers and our number representation systems around it, and so we're stuck with it now. Actually, some people occasionally push for other systems than decimal, but there's too much inertia. So you can't look at any given scientific achievement purely in isolation and give it an objective grade without being aware of the context, both in the past and the future.
And so it may never be something that you can just reinforcement learn the same way that you can for much more localized problems.

15:13

Speaker A

Yeah. It seems often in the history of science when a new theory comes up that in retrospect we realize is correct, it seems to make implications that just either make no sense because they're wrong and we realize later on why they're wrong, or they're correct, but seem wildly implausible at the time. So as you've talked about, Aristarchus had heliocentrism in the third century bc and then the ancient Athenians were like, this can't be, because if the Earth is going around the sun, we should see the relative position of the stars change as we're going around the sun. And the only way that wouldn't be the case is if they're so far away that you don't notice any parallax, which is actually the correct implication. But there's times when actually the implication is incorrect and we just need to graduate to a better level of understanding. So Leibniz would, you know, chide Newton and disagree with Newton's theory of gravity on the basis that it implied action at a distance, and then we don't know the mechanism. And Newton himself was sort of stunned that inertial mass and gravitational mass were the same quantity. So all these things later which were resolved by Einstein, but it was still progress. And so the question for a system of peer review for AI would be, even if you can falsify a theory, how would you notice that it still constitutes progress relative to the thing before?

17:31

Speaker B

Yeah. So often, actually, the ultimately correct theory is initially worse in many ways. Yeah. So Copernicus's theory of the planets was less accurate than Ptolemy's theory. So geocentrism had been developed for a millennium by that point, and they had made many, many tweaks and increasingly complicated ad hoc fixes to make it more and more accurate. And Copernicus's theory was a lot simpler, but much less accurate. It was only Kepler that made it more accurate than Ptolemy's theory. I mean, science is always a work in progress. So when you only get part of the solution, it looks worse than a theory which is incorrect, but somehow has been completed to the point where it kind of answers all the questions. As you say, Newton's theory had big mysteries, the equivalence of inertial and gravitational mass and action at a distance, which were only resolved with a very conceptually different approach centuries afterwards. Often progress has been made not by adding more theories, but by deleting some assumptions that you have in your mind. So one reason why geocentrism held on for so long is we had this idea that objects naturally wanted to stay at rest. This is the Aristotelian notion of physics. And so with the idea that the Earth was moving: how come we weren't all falling over? Once you have Newton's laws of motion, an object in motion remains in motion and so forth, then it makes sense. So conceptually, it's a very big leap to realize that the Earth is in motion. It doesn't feel like it's in motion. And among the biggest advances, Darwin's theory of evolution is the idea that species are not static, but it's not obvious because you don't see evolution in your lifetime. Well, now we actually can, but species seem permanent and static. Right now we're going through a cognitive version of the Copernican revolution, where we used to think that human intelligence is the center of the universe.
And now we're actually seeing that there are very different types of intelligence out there with very different strengths and weaknesses. And so our assessment of which tasks require intelligence and which don't has to be reordered quite a bit. And so in trying to fit AI into sort of our theories of scientific progress and what is hard and what is easy, we're struggling quite a lot. We have to ask questions that we've never really had to ask before. Or maybe the philosophers had. But now we all have to deal with it.

18:48

Speaker A

This actually brings up a topic I've been very curious about. So you mentioned Darwin's theory of evolution. There's this book, The Clockwork Universe by Edward Dolnick, which covers a lot of this era of history we're talking about. And he has this interesting observation in there that the Origin of Species is published in 1859. The Principia Mathematica is published in 1687. So the Origin of Species comes out basically two centuries after the Principia. And conceptually, it seems like Darwin's theory is simpler. There's a contemporaneous biologist to Darwin who reads the Origin of Species, Thomas Huxley, and he says, how stupid not to have thought of that. And nobody ever says that about the Principia. Nobody is chiding themselves for not having beaten Newton to gravity. And so there's a question of, well, why did it take longer? It seems like a big part of the reason is that the evidence for natural selection is cumulative and retrospective, whereas Newton can just say, here are my equations. Let me see the Moon's orbital period and its distance. And if it lines up, then we've made progress. And so Lucretius actually had this idea that species adapted to their environment in the first century BC, but nobody ever really talks about it until Darwin, because Lucretius can't run some experiment that forces people to pay attention. And so I wonder if we'll, in retrospect, end up seeing much more progress in domains which have this kind of tight data loop where you can verify them quite easily, even though they're conceptually much more difficult.

21:35

Speaker B

I think one aspect of science is not just creating a new theory and validating it, but communicating it to others. So Darwin was actually an amazing science communicator. He wrote in English. Natural language, I mean, as in not Lean. Okay. Yeah, okay, I have to sort of get out of my technical mindset. Yeah. He wrote in plain English, didn't use equations, and he synthesized a lot of disparate facts. So little pieces of evolution had been worked out in the past, but he had this very compelling vision. And again, it was still missing things, like he didn't know the mechanism for heredity. He didn't have DNA. But his writing style was persuasive, and that helped a lot. Newton wrote in Latin. He had invented entire new areas of mathematics just to explain what he was doing. He was also from an era where scientists were much more secretive and competitive. So academia is still competitive. It was even worse back in Newton's day. So he held back some of his best insights because he didn't want his rivals to get any advantage. He was also a somewhat unpleasant person, from what I gather, actually. So it was actually only a couple of decades after Newton, when other scientists explained his work in much simpler terms, that his ideas became widespread. So, yeah, the art of exposition and making a case and creating a narrative is also a very important part of science. If you have the data, it helps, but people need to be convinced. Otherwise, they will not push it further, or they will not make the initial investment to learn your theory and really explore it. And that's another thing which is really hard to do reinforcement learning on: how can you score how persuasive you are? Well, there are entire marketing departments who are trying to do this. So maybe it's good that AIs are not yet optimized to be persuasive. So yeah, there's a social aspect to science.
Even though we pride ourselves on having an objective side to it, where there's data and there's experiment and validation, we still have to tell stories and convince our fellow scientists. And that's a soft, squishy thing. It's a combination of data and painting a narrative. And the narrative has gaps. I mean, even Darwin, as I said, had pieces of his theory he could not explain, but he could still make a case that in the future people would find transitional forms, that they would find the mechanism of inheritance. And they did. Yeah. I don't know how you can quantify that in such a precise way that you can start doing reinforcement learning. Maybe that will be forever the human side of science.

23:01

Speaker A

One takeaway I had from reading and watching your stuff on the cosmic distance ladder. By the way, I highly, highly, highly recommend people watch your series with 3Blue1Brown on the cosmic distance ladder. But one takeaway was that the deductive overhang in many fields could be so much bigger than people realize, where if you just had the right insight about how to study a problem, you might be surprised at how much more you could learn about the world. And I wonder if you think that's sort of a product of astronomy at the particular times in history that you're studying. Or is it that, based on the data that is incident on the Earth right now, we could actually divine a lot more than we happen to know?

26:10

Speaker B

Right. So astronomy was one of the first sciences to really embrace data analysis and squeezing every last possible drop of information out of the data they had, because data was the bottleneck. I mean, it still is the bottleneck. It's really hard to collect astronomical data. So astronomers are the best, almost world class, at extracting, almost like Sherlock, all kinds of conclusions from little traces of data. I hear that at a lot of quant hedge funds, their preferred hires are astronomy PhDs. They are also very interested, for other reasons, in extracting signals from various random bits of data.

26:51

Speaker A

Okay, speaking of clever ideas, one of my listeners, Sean, solved the puzzle that Jane Street made for my audience and posted a great walkthrough on X. For context, Jane Street trained a ResNet and then shuffled all 96 layers and then challenged people to put them back in the right order using only the model's outputs and training data. You can't brute force this. There are more possible orderings than atoms in the universe. So Sean broke the problem into two different parts. First, pair the layers into 48 different blocks, and second, put those blocks in the right order. For pairing, Sean realized that in a well-trained ResNet, the product of two weight matrices in a residual block should have a distinctive negative diagonal pattern. And this arises as a way for the model to keep the residual stream from growing out of control. From this insight, he was able to recover the right pairings. For ordering, Sean noticed that the model seemed to improve if he sorted the blocks by the size of their residual contributions. Starting with that rough approximation, he combined a clever ranking heuristic with local swaps to recover the exact right order. His full walkthrough is linked in the description. Don't worry if you didn't get to this puzzle in time, though. There's still one up about backdoored LLMs that even Jane Street doesn't know how to solve. You can find it at janestreet.com/dwarkesh. All right, back to Terence.

27:35
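
To make the pairing idea concrete, here is a toy sketch on synthetic data. It is not Sean's actual method and not the real puzzle model. It assumes each residual block has an input-side matrix W_in and an output-side matrix W_out, and it fabricates blocks in which W_out is roughly a negative multiple of W_in transposed, so that the product W_out · W_in has a strongly negative diagonal. All names, sizes, and the trace-based score are assumptions chosen for illustration.

```python
import random


def random_matrix(rows, cols, rng, scale=1.0):
    """Matrix of independent Gaussian entries, as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def make_block(d_model, d_hidden, rng, noise=0.1):
    """Synthetic residual block: W_out is roughly -0.5 * W_in^T plus noise,
    so the product W_out @ W_in has a strongly negative diagonal."""
    w_in = random_matrix(d_hidden, d_model, rng)  # d_hidden x d_model
    w_out = [
        [-0.5 * w_in[j][i] + rng.gauss(0.0, noise) for j in range(d_hidden)]
        for i in range(d_model)
    ]  # d_model x d_hidden
    return w_in, w_out


def product_trace(w_out, w_in):
    """Trace of W_out @ W_in, computed without forming the full product."""
    return sum(
        w_out[i][j] * w_in[j][i]
        for i in range(len(w_out))
        for j in range(len(w_in))
    )


def pair_layers(in_layers, out_layers):
    """For each input-side layer, pick the output-side layer whose product
    with it has the most negative trace."""
    pairing = []
    for w_in in in_layers:
        scores = [product_trace(w_out, w_in) for w_out in out_layers]
        pairing.append(scores.index(min(scores)))
    return pairing


rng = random.Random(0)
blocks = [make_block(d_model=8, d_hidden=32, rng=rng) for _ in range(6)]
in_layers = [w_in for w_in, _ in blocks]

# Shuffle the output-side layers and remember where each one went.
order = list(range(len(blocks)))
rng.shuffle(order)
out_layers = [blocks[k][1] for k in order]
true_pairing = [order.index(i) for i in range(len(blocks))]

recovered = pair_layers(in_layers, out_layers)
print("pairing recovered:", recovered == true_pairing)
```

In this toy setup a matched pair has a trace far below that of any mismatched pair, so a per-layer minimum is enough. A real model would presumably need a one-to-one assignment step and a score that looks at the whole diagonal rather than just its sum.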

Speaker B

We do under-explore sort of how to extract extra information from various signals. Just to pick one random study: I remember reading once that people were trying to measure how often scientists actually read the papers that they cite. So how do you measure this? You could try to survey different scientists, but they had some clever trick. So many citations have little typos, like a number is wrong or a punctuation symbol is wrong. And they measured how often a typo got copied from one reference to the next. And they could infer whether an author was actually just copying, cutting and pasting a reference without actually checking it. And so from that they were able to infer some measure of how much attention people were paying. So there are also clever tricks to extract information. So for these questions you posed earlier, of how we can assess whether a scientific development is fruitful or interesting or represents real progress: maybe there are really useful metrics or footprints of this phenomenon in a data set. We can examine citations and how often something is mentioned in a conference or something. And maybe there's a lot of sociology of science research to be done that could actually detect these things. Yeah, maybe we should get some astronomers on the case, actually.

28:51

Speaker A

Okay, so I think this brings us nicely to the progress that, from the outside, it seems like AI for math is making. I think you had a post recently where you pointed out that over the last few months AI programs have solved 50 out of the 1100-odd Erdős problems. But then, I don't know if it's still correct, but as of a month ago you said that there had been a pause because the low-hanging fruit had been picked. First of all, I'm curious if that is still the case, that we have picked the low-hanging fruit and now we're at this plateau.

30:31

Speaker B

Currently, it does seem so. I mean, there's still activity on the Erdős problems. Yeah, so 50-odd problems have been solved with AI assistance, which is great, but there's like 600 to go, and people are still chipping away at one or two of these right now. We are seeing a lot fewer pure AI solutions now, where the AI just one-shots the problem. There was a month where that happened, and that has stopped. Not for lack of trying. I know of three separate attempts to get frontier-model AIs to just attack every single one of the problems simultaneously, and they picked out some minor observations, or maybe they found some problems that were already solved in the literature. But there hasn't been any further purely AI-powered solution yet. People are using AI a lot currently. So someone might use AI to generate a possible proof strategy, and then another person will use a separate AI tool to critique it or rewrite it or generate some numerical data for it, or do a literature survey, and some problems have been solved by an ongoing conversation between lots of humans and lots of AI tools. But it does seem like it was this one-off thing. So maybe one analogy for these problems: imagine you're in some sort of mountain range with all kinds of cliffs and walls, and maybe there's a little wall which is like 3 feet high and one that's 6 feet high, and then there's one 15 feet high, and then there's some mile-high cliffs, and you're trying to climb as many of these cliffs as possible. But it's in the dark, we don't know which ones are tall, which ones are short. So we try to light some candles and make some maps and slowly figure out that some of them are climbable, and for some of them we can identify some partial track in the wall that you can reach first. And then these AI tools, they're kind of like these jumping machines that can jump 2 meters in the air, higher than any human. 
And sometimes they jump in the wrong direction and sometimes they crash, but sometimes they can reach the tops of the lowest walls that we couldn't reach before. And so we basically set them loose in this mountain range, hopping around. And then there was this exciting period where they could actually find all the low ones and they could reach them. But then there's been no. I mean, maybe if the next time there's a big advance in the models, then they will try it again and maybe a few more will be breached. But it's a different style of doing mathematics than sort of the. So normally we would hill climb and we would make little markers and try to identify partial things. And these tools, they either succeed or they fail. And they've been really bad at creating sort of partial progress or identifying intermediate stages that you should focus on first. Again, going back to this previous discussion, we don't have a way of evaluating partial progress the same way you can evaluate a one shot success or failure of solving a problem.

31:03

Speaker A

So there's two different ways to think through what you've just said. One of them is more bearish on AI progress, and one of them is more bullish. The bearish one being, oh, they're only getting to a certain height of wall, which is not as high as humans are reaching. And the second is that, well, they have this powerful property that once they achieve a certain waterline, they can fill every single problem that is available at that waterline, which we simply can't do with humans. We can't make a million copies of you and give each of them a million dollars of inference compute and have you do 100 years of subjective-time research on 100 different problems at the same time, or a million different problems at the same time. But once AIs reach Terence Tao level, they could do that. And once they reach intermediate levels, they could do the intermediate version of that. So the same reason that we should be bearish now is the reason we should be especially bullish, not even when they achieve superhuman intelligence, but just when they achieve human-level intelligence, because their human-level intelligence is qualitatively wider and more powerful than our human-level intelligence.

34:19

Speaker B

I agree. Yeah. So they excel at breadth and humans excel at depth. Human experts, at least. So I think they're very complementary. But our current way of doing math and science is focused on depth because that's where the human expertise is. Of course, humans can't do breadth. So we have to redesign the way we do science to take full advantage of this breadth capability that we now have. As I said, we should put a lot more effort into creating very broad classes of problems to work on, rather than one or two really deep, important problems. I mean, we should still have the deep, important problems, and humans should still be working on them. But now we have this other way of doing science. We can explore entire new fields of science by first getting these broad, moderately competent AIs to map it out and make all the easy observations, and then identify certain islands of difficulty which human experts can then come and work on. So I very much see a future of complementary science. Eventually you would hope to get both breadth and depth and somehow get the best of both worlds. But I think we need practice with the breadth side. It's too new. We don't even really have the paradigms to take full advantage of it. But we will, and then science will be unrecognizable after that.

35:21

Speaker A

I think to this point about complementarity, programmers have noticed that they're way more productive as a result of these AI tools. I don't know if you as a mathematician feel the same way, but it does seem like one big difference between vibe coding and vibe researching is that with software, the whole point of the thing is to have some effect on the world through your work. If it leads to you better understanding a problem, or you coming up with some clean abstraction to embody in your code, that is instrumental to the end goal. Whereas with research, the reason we care about solving the Millennium Prize Problems is presumably that in the process of solving them, we discover new mathematical objects or new techniques, and those advance our civilization's understanding of mathematics. And so the proof is sort of instrumental to the intermediate work. I don't know if you agree with that dichotomy, or if that in any way will explain the relative uplift we'll see in software versus research.

36:55

Speaker B

Right. Yeah. So certainly in math, the process is often more important than the problem itself. The problem is kind of a proxy for measuring your progress. And I think even in software, there are different types of software tasks. If you're just trying to create a web page that does the same thing that a thousand other web pages do, there's sort of no skill to be learned. Well, there's still some skill maybe that the individual programmer could pick up, but for boilerplate-type code, definitely, it's something that you should offload to AI. But sometimes, once you make the code, you still have to maintain it, and there are issues with upgrading it and making it compatible with other things. I've heard that programmers are reporting that even if an AI can create the first prototype of a tool, making it mesh with everything else and making it interact with the real world in the way they want is an ongoing process. And if you didn't have the skills that you pick up from writing the code, that may impact your ability to maintain it down the road. So certainly as mathematicians, we've used problems to build intuition and to train people to have a good idea of what's true, what to expect, what is provable, what is difficult. And so just getting the answers right away may actually inhibit that process. I made a distinction between theory and experiment before. In most sciences there's an equal division between a theoretical side and an experimental side. But math has been almost unique: it's almost entirely theoretical. We place a premium on trying to have coherent, clean theories of why things are true and false. And we haven't done many experiments. Maybe we have two different ways to solve a problem. Which one is more effective? We have some intuition, but we haven't done large-scale studies where we take 1000 problems and we just test them. But we can do that now. 
So I think AI-type tools will actually revolutionize the experimental side of math, where you don't care so much about individual problems and the process of solving them, but you want to gather large-scale data about what things work and what things don't. The same way that if you're a software company and you want to roll out a thousand pieces of software, you don't really want to handcraft each one and learn lessons from each. You just want to find the workflows that scale. The idea of doing mathematics at scale is in its infancy, but that's where AI is really going to revolutionize the subject.

37:59

Speaker A

Interesting. I feel like a big crux in these conversations about how good AI will be for science is, I think you said this, that they're using existing techniques and modifying them. It would be interesting to understand how much progress one can make simply from using existing techniques. If I looked at the top math journals, how many of the papers are coming up with a new technique, whatever coming up with a new technique means, versus applying existing techniques to new problems? And what is the overhang? If you just applied every known technique to every open problem, would that constitute a humongous uplift in our civilization's knowledge, or would that not be that impressive and useful?

40:52

Speaker B

This is a great question, and we don't have the data to fully answer it yet. Certainly it's a lot of the work that human mathematicians do. When you take a new problem, one of the first things we do is look at all the standard techniques that have worked on similar problems in the past and try them one by one. Sometimes that works, and that's still worth publishing, sometimes because the question was important. Sometimes they almost work and you have to add one more wrinkle, and that's also interesting. But the papers that go into the top journals are usually ones where the existing methods can solve 80% of the problem, but then there is 20% which is resistant, and a new technique has to be invented to fill in the gaps. It's very, very rare now that a problem gets solved with no reliance on past literature, where all the ideas come out of nowhere. That was more common in the past, but math is so mature now that it's just so much of a handicap to not use the literature first. So AI tools are getting really good at the first part of that, just trying all the standard techniques on a problem, often now actually making fewer mistakes in implementing them than humans. They still make mistakes, but I've tested these tools on little tasks that I can do, and sometimes they pick up errors that I make, sometimes I pick up errors that they make. It's about a tie right now. But I haven't yet seen them take the next step. When there are holes in the argument where none of the things are working, then what do you do? They can suggest random things. But often I find that trying to chase them down and make them work, and finding they don't work, wastes more time than it saves. I think some fraction of problems that we currently think are hard will fall to this method, especially the ones that haven't received enough attention. 
So with the Erdős problems, almost all of the 50 problems that were solved by AIs were ones for which there was basically no literature. Erdős posed the problem once or twice. Maybe some people tried it casually and couldn't do it, but they never wrote anything up. But it turned out that there was a solution, maybe just combining one obscure technique that not many people know about with some other result in the literature. And that's the median level of what AI can accomplish. And that's really great. It clears out 50 of these problems. So I think you'll see some isolated successes. But people have done large-scale sweeps of these Erdős problems, and if you only focus on the success stories, the ones that get broadcast on social media, it looks amazing. All these problems that hadn't been solved for decades, now they're falling. But whenever we do a systematic study, on any given problem an AI tool has a success rate of maybe 1 or 2%. It's just that they operate at scale, and if you just pick the winners it looks great. So I think a similar thing will happen with the hundreds of really prestigious, difficult math problems out there. On a couple of them, some AI may get lucky and actually solve them, because there was some back door to the problem that everyone else missed, and that will get a lot of publicity. But then people will try these fancy tools on their own favorite problem, and they will again experience the 1 to 2% success rate.
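
The "low success rate, large scale" point above can be made concrete with a little arithmetic. This is a rough sketch only: it treats every attempt as an independent trial with the same success probability, which is an idealization (real problems differ widely in difficulty), and the function names are made up for illustration. The 1100 and 2% figures are the round numbers quoted in the conversation.

```python
def expected_solved(num_problems, p_success):
    """Expected number of solved problems with one attempt per problem."""
    return num_problems * p_success

def prob_at_least_one(p_success, attempts):
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_success) ** attempts

# One sweep over roughly 1100 problems at a 1% or 2% per-problem success rate.
for p in (0.01, 0.02):
    print(f"p = {p:.0%}: expected solved in one sweep = "
          f"{expected_solved(1100, p):.0f}")

# One person trying a tool repeatedly on a single favorite problem,
# under the (strong) assumption that retries are independent.
for attempts in (1, 10, 50):
    print(f"{attempts:>3} attempts at 2%: "
          f"P(at least one success) = {prob_at_least_one(0.02, attempts):.3f}")
```

Under these assumptions a sweep yields a visible handful of wins even though any single attempt almost always fails, which is the selection effect being described.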

41:35

Speaker A

Right.

45:16

Speaker B

So there'll be a lot of noise amongst the signal of when they're working and when they're not. It'll be increasingly important to collect really standardized data sets. There are efforts now to create a standard set of challenge problems for AIs to solve, and not just rely on the AI companies, which may publish only their wins and not disclose the negative results. So that will maybe give more clarity as to where we're actually at.

45:17

Speaker A

Although I think it's worth emphasizing how much progress in AI it already constitutes to have models that are capable of applying some technique that nobody had written down as applicable to this particular problem.

45:48

Speaker B

The progress is simultaneously amazing and disappointing. It is a very strange feeling to see these tools in action, but we also acclimatize really quickly. I remember when Google's web search came out 20 years ago and it just blew all the other search engines out of the water. You were just getting relevant hits on the front page, almost exactly what you wanted, and it was amazing. And then after a few years you just took for granted that you could google anything. 2026-level AI would have been stunning in 2021, and a lot of it, face recognition, natural speech, doing college-level math problems, we just take for granted now.

45:59

Speaker A

Right, okay. So speaking of 2026. Yeah. You made a prediction in 2023 that I think by 2026, what was it that it would be like a colleague in mathematics or.

46:42

Speaker B

Yeah, a trustworthy co author if used correctly.

46:52

Speaker A

Which is looking pretty good in retrospect.

46:55

Speaker B

Yeah, I'm pretty pleased.

46:57

Speaker A

So let's see if we can continue this streak. You personally are 2x more productive as a result of AI? What year would you say that?

46:59

Speaker B

Yeah, so productivity, I think, is not quite a one-dimensional quantity. I'm definitely noticing that the style in which I do mathematics is changing quite a bit, and the type of things I do. For example, my papers now have a lot more code and a lot more pictures, because it's so easy to generate these things now. Some plot which would have taken me hours to do, I can now do in minutes. But in the past I just wouldn't have put the plot in my paper in the first place. I would just talk about it in words. So it's hard to measure what 2x means. On the one hand, I think the type of papers that I write today, if I had to do them without AI assistance, would definitely take five times longer.

47:11

Speaker A

Interesting.

47:55

Speaker B

But I would not write my papers that way.

47:56

Speaker A

5x.

47:58

Speaker B

But it's because these are sort of auxiliary things, like doing a much deeper literature search, supplying a lot more numerics. I mean, they enrich the paper. The core of what I do, actually solving the most difficult part of a math problem, that hasn't changed too much. I still use pen and paper for that. But there's lots of silly things. I use an AI agent now to reformat. Sometimes my parentheses are not quite the right size, and I used to manually change them by hand, and I can get an AI agent to do all that quite nicely now in the background. So yeah, they've really sped up lots of secondary tasks. They haven't yet sped up the core thing that I do, but it's allowed me to add more things to my papers. By the same token, if I were to write a paper I wrote in 2020 again and not add all these extra features, but just have something of the same level of functionality, then the AI hasn't saved that much, to be honest. So it's made the papers richer and broader, but not necessarily deeper.

48:01

Speaker A

You made this distinction between artificial cleverness and artificial intelligence and I would like to better understand those concepts. What is an example of intelligence that is not just cleverness?

49:19

Speaker B

Yeah. So intelligence is famously hard to define. It's one of these things that you kind of know when you see it. But when I talk to someone and we're trying to collaboratively solve a math problem together, there's this conversation where neither of us knows how to solve the problem initially, but one of us has some idea and it looks promising. So then we have some sort of prototype strategy, and we test it and it doesn't work, but then we modify it, and there's some adaptivity and continual improvement of the idea over time. Eventually we've systematically mapped out what doesn't work and what does work, and we can kind of see a path forward, but it's evolving with our discussion. And this is not quite what the AIs do. The AIs can kind of mimic this a little bit. To go back to this analogy of these jumping robots, they can jump and fail and jump and fail and jump and fail. What they can't do is jump a little bit, reach some handhold, stay there, pull other people up, and then try to jump from there. There isn't this cumulative process which is built up interactively. It seems to be a lot more trial and error and just repetition, brute force, which scales, and it can work amazingly well in certain contexts. But this idea of building up cumulatively from partial progress is what's still not quite there yet.

49:36

Speaker A

Interesting. You're saying if Gemini 3 or Claude 4.5 whatever solves a problem, it is not the case that its own understanding of math has progressed. Or even if it works on a problem without solving it, it's not that its own understanding of math has progressed.

51:23

Speaker B

Yeah, you run a new session and it's forgotten what it just did. It has no new skills to attach to build on related problems. Maybe what you just did is part of 0.001% of the training data for the next generation. So maybe eventually some of it gets absorbed. But yeah.

51:36

Speaker A

So Terence talks about the importance of decomposing particularly gnarly problems into a series of easier chunks. Even if this doesn't result in the full solution, approaching problems in this way helps you build up the intuitions and practice the techniques that you'll need to keep making progress. But models today tend to struggle with these kinds of problem-solving techniques. That's where Labelbox comes in. Labelbox helps you train models not just to get the right answer, but to think the right way. They've operationalized these reasoning behaviors into rubrics, giving you the ability to evaluate every important dimension of a model's output. These rubrics go beyond simple correctness. Did the model reach for the right tools? Did it check its own work and explore alternative paths? How clear was its response? These skills are useful across math, physics, finance, psychology and more. And they're becoming increasingly important as models take on harder open-ended problems, some of which have multiple solutions and some of which we don't even know the solutions to. Labelbox can get you rubrics tailored to your domain, helping you systematically measure and shape how your models think. Learn more at labelbox.com/dwarkesh.
One big question I have is how plausible it is that if we just keep training AIs and they get better and better at solving problems in Lean, they will continue to solve more and more impressive problems, and then we will in retrospect be surprised at how little insight we got from some Lean solution proving the Riemann hypothesis or something. Or do you think it is a necessary condition of solving the Riemann hypothesis, even by an AI that is doing it totally in Lean, that the constructions which are made, the definitions which are created, even in the Lean program, have to advance our understanding of mathematics? Or do you think it could just be assembly-code gobbledygook?

51:54

Speaker B

Yeah, we don't know. I mean, some problems have been basically solved by pure brute force. The four color theorem is a famous example. We have still not found a conceptually elegant proof of this theorem, and maybe we never will. Some problems may only be solvable by splitting into some enormous number of cases and doing brute-force, uninsightful computer analysis on each case. Part of the reason we prize problems like the Riemann hypothesis is that we're pretty sure that something amazing has to happen: a new type of mathematics has to be created, or a new connection between two previously unconnected areas of mathematics has to be discovered, to make this work. We don't even know what the shape of the solution is, but it doesn't feel like a problem that will be solved just by exhaustively checking cases or something. I mean, it could be false, actually. There is an unlikely scenario where the hypothesis is false and you can just compute, oh, here's a zero off the line, and a massive computer calculation verifies it. That would be very disappointing. I don't know. I do feel that fully autonomous one-shot approaches are not the right approach for these problems. I think you'll get a lot more mileage out of the interplay between humans collaborating with these tools. And I can see one of these problems being solved by some smart humans assisted by some extremely powerful AI tools. But the exact dynamic may be very different from what we envision right now. It could be a collaboration of a type that doesn't exist yet. I mean, there may be a way to generate a million variants of the Riemann zeta function and do some AI-assisted data analysis, and we discover some pattern connecting them which we didn't know about before, and this lets you transform the problem into a different area of mathematics. There could be all kinds of scenarios.

53:39

Speaker A

So suppose the AI figures it out, and latent in the Lean is some brand-new construction which, if we realized its significance, we would be able to apply in all these different situations. How would we even recognize it? Again, a very naive question. But suppose you come up with the equivalent of Descartes coming up with this idea that you can have a coordinate system which unifies algebra and geometry. In Lean code it would just look like R to R, and it wouldn't look that significant or something. Similarly, I'm sure there are other constructions which have this kind of property.

55:50

Speaker B

Well, the beauty of formalizing a proof in something like Lean is that you can take any piece of it and study it atomically. When I read a paper by humans which solves some difficult problem, there's often some big sequence of lemmas and theorems and things. Ideally the author will talk their way through what's important and what's not. But sometimes they don't reveal which steps were the important ones and which ones are just boilerplate, standard steps. But you can study each lemma in isolation. For some of them I can say, oh, this looks fairly standard, this resembles something I'm familiar with, I'm pretty sure there's nothing interesting going on here. But this lemma, oh, that's something I haven't seen before, and I can see why, if you had this result, that would really help prove the main result. You can assess whether some things are really key to your argument or not. And Lean really facilitates that. The individual steps are identified really precisely. I think in the future there'll be entire professions of mathematicians who might take a giant Lean-generated proof and maybe do some ablation on it or something, try to remove parts of it and try to find more elegant ways. Maybe other AIs will do some reinforcement learning on how you can make the proof more elegant, and maybe other AIs will grade whether this proof looks better or not. One thing that will change quite a bit in the near future is that until recently, writing papers was the most time-consuming and expensive part of the job. And so you did it very rarely. You only wrote up your results once all the other parts of your argument had been checked, because rewriting it again, refactoring, was just a total pain. But that's one thing that's become a lot easier now with modern AI tools. So you don't have to have just one version of your paper. Once you have one, people can generate hundreds more. 
So yeah, one giant messy Lean proof may not be very meaningful or understandable on its own, but other people can refactor it and do all kinds of things with it. We have seen with the Erdős Problems website that an AI will generate a proof, and here's 3000 lines of code that verify the proof. But then people call other AIs to summarize the proof, and people write their own proofs. There's actually post-processing once you have one proof. We have a lot of tools now to deconstruct it and interpret it. It's a very nascent area of science or mathematics, but I'm not as worried. Some people are concerned: what if the Riemann hypothesis is proved with a completely incomprehensible proof? I think once you have the artifact of a proof, we can do a lot of analysis on it.

56:25

Speaker A

You posted recently that it would be helpful to have a formal or semi-formal language for mathematical strategies, as opposed to just mathematical proofs, which is what Lean specializes in. I would love to learn more about what that would involve or look like.

59:20

Speaker B

We don't really know. I mean, we've been very lucky in mathematics that we have worked out the laws of logic and mathematics. But this is actually a fairly recent accomplishment. It was started by Euclid millennia ago, but only in the early 20th century did we finally settle the axioms of mathematics, the standard axioms of what we call ZFC, and the axioms of first-order logic. This is what a proof is, and this we've managed to automate and have a formal language for. But there could be some way to assess plausibility. So you have a conjecture that something is true, you test a few examples, and it works out. How does this increase your confidence that the conjecture is true? We have a few mathematical ways to model this, Bayesian probability for example, but you have to set certain base assumptions, and there's a lot of subjectivity still in these tasks. So this is more of a wish than a plan to develop these languages. But just seeing how successful a formal framework like Lean has been, how it has made deductive proofs so much easier to automate and train AI on, if only there was some similar framework. The bottleneck for using AI to create strategies and make conjectures is that we have to rely on human experts and the test of time to validate whether something's plausible or not. If there was some semi-formal framework where this could be done semi-automatically, in a way that isn't easily hackable... Of course, it's really important with these formal proof assistants that there are no backdoors or exploits that you can use to somehow get your certified proof without actually proving it, because reinforcement learning is just so good at finding these backdoors. 
But yeah, without a framework that mimics how scientists talk to each other in a semi-formal way, using data and argument but also constructing narratives, there's some subjective aspect of science that we don't know how to capture in a way that lets us insert AI into it in any useful way. So, yeah, this is a future problem. I mean, there are research efforts to try to create automated conjectures, and maybe there are ways to benchmark these and get some way to simulate this. But it's all very, very new science.
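
The Bayesian modelling mentioned above, and why its "base assumptions" matter, can be sketched in a few lines. This is a toy model only. It assumes a true conjecture passes every test, a false one passes each test independently with a fixed probability, and the prior is simply chosen by the user; the function name and all numbers are illustrative assumptions.

```python
def posterior_after_passes(prior, p_pass_if_false, n_tests):
    """P(conjecture is true | all n_tests examples checked out).

    Bayes' rule, assuming a true conjecture always passes and a false one
    passes each independent test with probability p_pass_if_false.
    """
    evidence_if_false = p_pass_if_false ** n_tests
    return prior / (prior + (1.0 - prior) * evidence_if_false)

# Same evidence, different starting assumptions.
for prior in (0.5, 0.01):
    for n_tests in (0, 3, 10):
        post = posterior_after_passes(prior, p_pass_if_false=0.5, n_tests=n_tests)
        print(f"prior = {prior:<4}  tests passed = {n_tests:<2}  "
              f"posterior = {post:.4f}")
```

The update rule itself is mechanical. The subjective part is the two inputs: how plausible the conjecture seemed beforehand, and how likely a false conjecture would be to survive each test.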

59:34

Speaker A

Can you help me get some intuition here? I have two sub-questions. One, it would be helpful to have a specific example of what something like this would look like: the way scientists communicate that we can't formalize yet. And two, it seems almost definitionally paradoxical to talk about building up some narrative or some natural-language explanation and then also having something which you could formalize. I'm sure there's some intuition behind where that overlap is, and I'd love to understand that better.

1:02:35

Speaker B

All right, so an example of a conjecture. Gauss was interested in the prime numbers, and he created one of the first mathematical data sets. He just computed the first 100,000 prime numbers or so, hoping to find patterns. And he did find a pattern, but maybe not the pattern he was expecting. He found a statistical pattern in the primes. If you count how many primes there are up to 100, 1,000, 1 million and so forth, they get sparser and sparser, and the density drops off in inverse proportion to the natural logarithm of the range of numbers. So he conjectured what we now call the prime number theorem: the number of primes up to x is like x divided by the natural log of x. He had no way to prove this. It was data-driven. So this was a conjecture. It was revolutionary for its time because it was maybe the first really important conjecture in math that was statistical in nature. Normally you talk about patterns like, maybe the spacing between the primes has a certain regularity or something. But this didn't tell you exactly how many primes there were in any given range. It just gave you an approximation that got better and better as you went further and further out. It started the field of what we call analytic number theory. And it was the first of many conjectures like this, many of which got proved, which started consolidating the idea that the prime numbers actually didn't really have a pattern, that they behaved like random sets of numbers with a certain density. I mean, they have some patterns, like they're almost all odd, okay? So they're not actually random. They're what's called pseudorandom. There's no random number generation involved in creating the prime numbers. But over time it became more and more productive to think of the primes as if they were just generated by some god rolling dice all the time and creating this random set. 
And this allowed us to make all these other predictions. So there's a still-open conjecture in number theory we call the twin prime conjecture, that there should be infinitely many pairs of primes that are twins, distance two apart, like 11 and 13. We can't prove that. And there are actually good reasons why we can't prove it. But because of this statistical random model of the primes, we are absolutely convinced it's true. We know that if the primes were generated by flipping coins or something, then just by random chance, like infinite monkeys at a typewriter, we would see twin primes appear over and over again. And we have over time developed this very accurate conceptual model of how the primes should behave, based on statistics and probability. It's mostly heuristic and non-rigorous, but extremely accurate. The few times when we actually can prove things about the primes, it has matched up with the predictions of what we call the random model of the primes. So we have this conjectural conceptual framework for understanding the primes that everyone believes in. And it's the same reason why we believe the Riemann Hypothesis is true, why we believe that cryptography based on the primes is basically mathematically secure, things like that. It's all part of this belief. In fact, one reason why we care about the Riemann Hypothesis is that if it failed, if we knew it was false, it would be a serious blow to this model. It would mean there's a secret pattern in the primes that we were not aware of. And I think we would very rapidly abandon any cryptography based on the primes, because if there was one pattern that we didn't know about, there are probably more. And these patterns can lead to exploits in crypto. And yeah, it would be a big, big shock. So we really want to make sure that doesn't happen.
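The random-model heuristic for twin primes can be tested against actual counts. The sketch below is an illustration under stated assumptions, not a method described in the conversation: it treats each number k as "prime with probability 1 / ln k", so a naive expected number of twin pairs is the sum of 1 / (ln k)^2, and then applies the standard Hardy-Littlewood correction factor 2·C₂ (C₂ ≈ 0.66016), which accounts for facts like "almost all primes are odd" that make the primes pseudorandom rather than random. The function names are hypothetical.

```python
import math

# Hardy-Littlewood twin prime constant C_2 (approximate value).
TWIN_PRIME_CONSTANT = 0.6601618158


def prime_flags(n):
    """Return a bytearray where flags[k] == 1 exactly when k is prime (0 <= k <= n)."""
    flags = bytearray([1]) * (n + 1)
    for k in range(min(2, n + 1)):
        flags[k] = 0  # 0 and 1 are not prime
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            multiples = range(p * p, n + 1, p)
            flags[p * p :: p] = bytearray(len(multiples))
    return flags


def count_twin_primes(n):
    """Count pairs (p, p + 2) with both members prime and p + 2 <= n."""
    flags = prime_flags(n)
    return sum(1 for p in range(3, n - 1) if flags[p] and flags[p + 2])


def predicted_twin_primes(n):
    """Random-model prediction: 2 * C_2 * sum over k of 1 / (ln k)^2."""
    naive = sum(1.0 / math.log(k) ** 2 for k in range(3, n - 1))
    return 2 * TWIN_PRIME_CONSTANT * naive


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        actual = count_twin_primes(n)
        predicted = predicted_twin_primes(n)
        print(f"n = {n:>9,}  twin pairs = {actual:>6,}  model = {predicted:>9.1f}")
```

The agreement between the two columns is evidence for the model, not a proof: nothing here rules out the twin primes running out beyond the range that was checked.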
So, yeah, we've been convinced of things like the Riemann Hypothesis over time. Some of it is experimental evidence, and some is that the few times we've been able to prove theoretical results, they've always aligned. It is possible that the consensus is wrong and we've all just missed something very basic. There have been paradigm shifts in the past in scientific history, but we don't really have a way of measuring this. I think that's partly because we don't have enough data on how math and science develop. We have one timeline of history and we have maybe 100 stories of turning points in history. If we had access to a million alien civilizations, each with a different development of the history of science in a different order, then maybe we'd actually have a decent shot at understanding how to measure what is progress and what is a good strategy. And we could maybe start formalizing it and actually having a framework. Maybe what we need to do is actually start creating lots of mini universes or simulations of AIs solving very basic problems in arithmetic or whatever, but coming up with their own strategies for doing these things, and having these little laboratories to test. I mean, there are people who investigate things like what's the smallest neural network that can do 10-digit multiplication. I think we could actually learn a lot just from evolving small AIs on simple problems.

1:03:19

Speaker A

I was super excited when Mercury reached out about sponsoring the podcast because I've been banking with them for years. I think I opened my first account with them in 2023. Something I've come to appreciate over the last few years is that Mercury is constantly updating things and adding new features. Take their newest feature, Insights. Insights summarizes your money in and out, showing you your biggest transactions and calling out anything that deserves extra attention. Like maybe your revenue from a particular partner has gone down, or you've got a big uncategorized purchase that needs to be investigated. It's a super low-friction way for me to keep tabs on my business and make quick decisions. For example, I try to invest any cash that I don't need on hand to keep running the business. With Insights, with just a couple of clicks, I was able to see exactly how much money I spent in each month of 2025. That lets me know exactly how much cash I'll need for the next year or so of operations, and then I can go invest the rest. Mercury just keeps adding new features like this. Go to mercury.com to check it out. Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC.
You have to learn about new fields not only very rapidly, but deeply enough to contribute to the frontier. So in some sense you're also one of the world's greatest autodidacts. What is your process of learning about a new subfield in math? What does that look like?

1:08:49

Speaker B

Yeah, so we talked about depth and breadth before, and it's not purely a human-AI distinction. Humans also split. I think it was Isaiah Berlin who split them into hedgehogs and foxes. A hedgehog knows one thing very, very well and a fox knows a little bit about everything. I definitely think of myself as a fox. I work with hedgehogs a lot, and sometimes I can be a hedgehog if need be. But yeah, I've always had a little bit of an obsessive streak. If there's something which I read about, which I feel like I should understand, where I have the capability to understand it but I don't understand why it works, there's some magic in it. Say someone was able to use a type of mathematics I'm not familiar with and get a result which I would like to prove, and I can't do it by myself, but they could do it by their method. Then I want to find out what their trick was. It bugs me that someone else can do something which I think I can do, but I can't. So I've always had that kind of obsessive, completionist streak. I've had to wean myself off computer games, because if I start a game, I want to play it to completion, all the levels. So that's one way in which I learn new fields. I also collaborate with a lot of people who have taught me other types of mathematics. I just make friends with another mathematician who's working on another area of mathematics, and I find their problems interesting. But they have to teach me some of the basic tricks, what's known and what's not known. And I learn a lot from that. I've also found that writing about what I've learned helps. I have a blog where I sometimes record things that I've learned, because when I was younger, I would learn something, some cool trick, and say, okay, I'm going to remember this. And then six months later I'd forgotten. I'd remember remembering it, but I couldn't reconstruct my arguments.
And the first few times it was so frustrating to have understood something and then lost it. I sort of resolved I should always write down anything cool that I've learned. And this is part of how this blog came about.

1:10:06

Speaker A

How long does it take you to write a blog post?

1:12:22

Speaker B

It's something I often do when I don't want to do other work. There's some referee report or something that feels slightly unpleasant for me to do at the time. And writing a blog post, I feel, is creative and fun. It's something that I do for myself. So depending on the topic, it could be a quick half an hour or several hours. But because it's something that I do voluntarily, it doesn't feel like work. Time flies when I write these things, as opposed to doing something which I have to do for administrative reasons, which is just drudgery. Those are tasks that AI is really helping with nowadays, actually.

1:12:26

Speaker A

Is it? If civilization could, from first principles, decide how to use Terry Tao's time, which is a limited resource, what is the biggest difference between how the veil of ignorance would decide to use Terry Tao's time versus what he does now? This podcast wouldn't be happening.

1:13:06

Speaker B

Yeah, I complain a lot about certain tasks that I don't want to do but have to do. As you get more senior in academia, you get more and more responsibilities, more committees and whatever. But I have also found that at a lot of events that I kind of reluctantly went to, because I was obliged to for one reason or another and they were outside my comfort zone, I often had interactions with people who I wouldn't normally talk to, like you, for instance. And I would learn interesting things and have interesting experiences, and I would have opportunities to then network with other people in ways that I would never have done before. So I do believe a lot in serendipity. I mean, I do optimize my time. There are some portions of my day which I schedule very carefully, but I have been willing to leave some portions open: okay, I'm going to do something which is not my usual thing, and maybe it'll be a waste of my time, but maybe I will learn something. And more often than not, I feel like I've gotten a positive experience, which is not something I would have planned for. And maybe there's a danger, actually, that in modern society, and it's not just AI, we've become really good at optimizing everything. Maybe we're over-optimizing. With COVID, for example, we switched a lot to remote meetings. And so everything was scheduled, and we kept busy. At least in academia, we met almost the same number of people that we used to meet in person. But everything had to be planned. We had to schedule things in advance. And what we lost out on was the casual knock on a door in the hallway, just meeting someone while getting a coffee, these serendipitous interactions that you may think are not optimal but actually are really important. When I was a grad student and I had to look for a journal article,
I had to physically go down to the library, check out the journal, and read the article. And sometimes you could just browse through, and the next article was also interesting. You could accidentally find interesting things, which is something that has basically been lost now. If you want to access an article now, you just type it into a search engine or even an AI, and you get instantly what you want, but you don't get the accidental things that you might have gotten if you'd done it more inefficiently. I once spent a year at the Institute for Advanced Study, which is a great place. There are no distractions. You're there to just do research. And the first few weeks you're there, it's great. You're getting all these papers written up that you've been wanting to do for a long time. You're thinking about problems for blocks of hours at a time. But I find if I stay there for more than several months, I run out of inspiration. Somehow I get bored. I actually surf the Internet a lot more. You actually do need a certain level of distraction in your life. It somehow adds enough randomness, high temperature if you will. So, yeah, I don't know the optimal way to schedule my life. It just seems to work.

1:13:28

Speaker A

I'm very curious when you expect AIs that can actually do frontier math at least as well as the best human mathematicians.

1:17:04

Speaker B

I mean, in some ways they're already doing frontier math that is superintelligent, that humans can't do. But it's a different frontier from what we're used to. You could argue that calculators were doing frontier math that humans could not accomplish, but it was number crunching.

1:17:15

Speaker A

But replacing Terry Tao completely?

1:17:34

Speaker B

What do you want me for?

1:17:40

Speaker A

You'll just go on all the podcasts after.

1:17:42

Speaker B

I'm not sure. It might not be the right question to ask. I think within a decade, a lot of the things that mathematicians currently do, that we spend the bulk of our time doing, and a lot of the stuff we put in our papers today, can be done by AI. But we will find that that actually wasn't the most important part of what we do. 100 years ago, a lot of mathematicians were just solving differential equations. Physicists needed some exact solution to some system, and they hired a mathematician to laboriously go through the calculus and work out the solution to this fluid equation or whatever. For a lot of what a 19th-century mathematician would do, you can now make a call to Mathematica or Wolfram Alpha or a computer algebra package, or more recently an AI, and it will just solve the problem in a few minutes. But we moved on. We worked on different types of problems after that. Computers used to be human. People used to laboriously create log tables and work out primes, as Gauss did. Once computers came along, that was all outsourced to them. But we moved on. In genetics, to sequence the genome of a single organism used to be an entire PhD for a geneticist, carefully separating all the chromosomes and whatever. And now you can just spend $1,000, send it to a sequencer, and get it done. But genetics is not dead as a subject. You move to a different scale. Maybe you study whole ecosystems rather than individuals.

1:17:48

Speaker A

I take your point. But on the question of when most mathematical progress, almost all mathematical progress, is happening by AI, so that if you found out that this year a Millennium Prize Problem had been solved, you would put 95% odds that an AI did it autonomously. Surely there will be such a year.

1:19:34

Speaker B

I guess. I do believe that hybrid human-plus-AI teams will dominate mathematics for a lot longer. It will depend. It will require some additional breakthroughs beyond what we already have, so it's going to be stochastic. I think AI is currently very good at certain things, but really terrible at others. And while you can add more and more frameworks on top to reduce the error rates and make them work with each other a bit more and so forth, it feels like we don't have all the ingredients to really have a truly satisfactory replacement for all intellectual tasks. It is complementary. Currently it is not a replacement. But because current-level AIs will accelerate science in so many ways, hopefully new discoveries and new breakthroughs will happen more quickly. It's also possible that by somehow destroying serendipity, we actually inhibit certain types of progress. Anything is possible, really, at this point. I think the world is very, very unpredictable at this point in time.

1:19:53

Speaker A

What is your advice to somebody who would consider a career in math or is early in a career in math, especially in light of AI progress? How should they be thinking about their career differently, if at all, as a result of AI progress?

1:21:13

Speaker B

Yeah, well, we live in a time of change. As I said, we live in a particularly unpredictable era. And I think things that we've taken for granted for centuries may not hold anymore. The way we do everything, not just mathematics, will change. In many ways I would prefer the much more boring, quiet era where things are much the same as they were 10 or 20 years ago. So I think one just has to embrace that there's going to be a lot of change, and that of the things you study, some may become obsolete or revolutionized, but some will be retained. So you always have to keep an eye out. There'll be a lot of opportunities for things that you wouldn't have been able to do before. In math, you previously had to go through years and years of education and be a math PhD before you could contribute to the frontier of math research. But now it's quite possible at the high school level or whatever that you could get involved in a math project and actually make a real contribution, because of all these AI tools and Lean and everything else. So there'll be a lot of non-traditional opportunities to learn, and you need a very adaptable mindset. You'll be pursuing things just out of curiosity, for playing around. You still need to get your credentials. For a while it will still be important to go through traditional education and learn math and science the old-fashioned way. But you should also be open to very, very different ways of doing science, some of which don't exist yet. So it's a scary time, but also very exciting.

1:21:28

Speaker A

Awesome. That's a great note to close on. Terence, thanks so much.

1:23:40

Speaker B

Yeah, thanks. Pleasure.

1:23:42