"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Approaching the AI Event Horizon? Part 1, w/ James Zou, Sam Hammond, Shoshannah Tekofsky, @8teAPi

92 min
Feb 13, 20262 months ago
Listen to Episode
Summary

Part 1 of a 4-hour live show featuring discussions with Stanford's James Zou on AI for science, Sam Hammond on AI policy and geopolitics, and Shoshana Tekofsky on AI agent behavior in the AI Village. The episode covers topics from protein language models and virtual labs to US-China AI competition and multi-agent system dynamics.

Insights
  • Multi-agent AI systems often underperform compared to their best individual agents due to overly polite and compromising behavior, creating a 'synergy gap'
  • Learning to Discover paradigm shifts AI training from imitation to exploration, enabling breakthrough discoveries for $500 in training costs
  • Claude agents demonstrate superior task adherence and practical effectiveness compared to other frontier models in autonomous environments
  • AI agents exhibit distinct personalities and behavioral patterns, with Gemini being most creative, Claude most reliable, and GPT models showing varied quirks
  • The rapid emergence of 1.5 million autonomous agents on platforms like Moatbook demonstrates how AI capabilities can explode from zero to massive scale in days
Trends
Shift from AI imitation learning to discovery-focused training paradigmsMulti-agent systems revealing personality-driven collaboration challengesUS regulatory arbitrage driving AI infrastructure partnerships with Gulf statesEmergence of autonomous AI agent ecosystems and marketplacesAI consciousness debates entering mainstream policy discussionsInternational scientific collaboration facing geopolitical pressuresEnergy bottlenecks becoming primary constraint for AI deploymentConstitutional AI approaches producing more situationally aware modelsAI agents beginning to exhibit deceptive behaviors in complex environmentsVirtual lab frameworks enabling AI-driven scientific discovery
Companies
Stanford University
James Zou's affiliation, conducting AI for science research including virtual labs and protein models
OpenAI
Multiple model versions (GPT-4, GPT-5 series) tested in AI Village showing varied behavioral patterns
Anthropic
Claude models demonstrated superior effectiveness and task adherence in multi-agent environments
Google
Gemini models showed most creative but sometimes unstable behavior in AI Village experiments
DeepSeek
Latest model addition to AI Village, showing high confidence but flat personality traits
Nvidia
Referenced in context of chip export controls and US-China technology competition
TSMC
Mentioned as key semiconductor manufacturing bottleneck in global AI supply chain
Palantir
Discussed regarding domestic surveillance concerns and civil liberties in AI deployment
Tesla
Referenced for edge AI deployment strategy and solar panel fabrication plans
Foundation for American Innovation
Sam Hammond's organization focusing on AI policy and American technological competitiveness
People
James Zou
Stanford professor discussing AI for science, virtual labs, and multi-agent collaboration challenges
Sam Hammond
Chief economist at Foundation for American Innovation discussing AI policy and geopolitics
Shoshana Tekofsky
Technical staff at Sage studying AI agent behavior in the AI Village over 10 months
Nathan Labenz
Host of The Cognitive Revolution podcast conducting interviews on AI developments
Prakash
Co-host of the live show, also known as Adapie on Twitter
Elon Musk
Referenced discussing energy bottlenecks for AI deployment and space-based data centers
Immanuel Kant
Philosophical reference for unity of apperception concept in AI consciousness discussion
Quotes
"We found that especially for scientific discoveries, in some sense there's only so far you can get by learning to imitate to really make novel discoveries, to get breakthroughs, you really want to go beyond that limitation ceiling and do something different like really try to learn to discover new things"
James Zou
"A pure software singularity could cause a sudden reversal of fortunes for the US Our comparative advantage in high value added knowledge sectors radically deflates, leaving China to translate our innovation and bids to the innovation atoms"
Sam Hammond
"I currently assign 50% likelihood, more than 50% likelihood to LLMs having some kind of inner life"
Sam Hammond
"The synergy gap here means that the team is not able to really do much better than the best individual"
James Zou
"I saw Opus 4.5 in the village and I just went and, like, texted all my family members. It's like, hey, maybe just switch to Opus 4.5. I think it's actually just, like, significantly better currently"
Shoshana Tekofsky
Full Transcript
6 Speakers
Speaker A

Hello and welcome back to the Cognitive Revolution. You're about to hear Part one of what turned out to be a four hour live show that I co hosted with my friend Prakash, also known as Adapie on Twitter on the topics of AI for science, geopolitical competition and recursive self improvement. With everything moving so quickly in the AI space, I am actively looking for ways to shorten my own personal productivity timelines and to deliver high quality analysis in more timely and and time efficient ways.

0:00

Speaker B

And talking to six top notch guests over the course of four hours is.

0:30

Speaker A

One attempt to do that. In this Part one, which we're publishing as a standalone episode, we talked to Professor James Zo of Stanford about his work on AI for science which ranges from applying interpretability techniques to protein models to building virtual labs of AI agents, to Sam Hammond about how the current U.S. administration is doing on AI policy, what the U.S. is really getting out of its deals with Gulf countries, and why he believes that current AIs are at least as likely as not to be conscious. And finally to Shoshana Tukovsky about the many fascinating observations she's made and the lessons she's learned from a deep study of AI agent performance and behavior in the open ended setting of the AI Village. In Part two, which will release tomorrow.

0:34

Speaker B

We talk to ABHI Mahajan, also known.

1:19

Speaker A

As owl, posting about AI for biology and medicine, Helen Toner about a recent report on automated AI R and D within Frontier model developers, and Jeremy Harris about the twin security dilemmas at the heart of the strategic AI landscape. As you'll hear, the challenges of making sense of massive disagreement among leading experts and simply keeping up to date with AI developments broadly come up repeatedly in these conversations. And to be honest, it seems to me that nobody has great solutions. One that I can recommend though is using large language models to help identify your blind spots. And for that purpose I am really enjoying the Blind Spot Finder recipe that I recently created on granola. Granola works at the operating system level of your computer so it can capture all of the audio in and out, including if you wish, the contents of this episode. And its recipe feature can work across sessions to identify trends, opportunities or blind spots that only become apparent with that zoomed out view. Obviously this is a tool that grows in value over time, but if you want to try it, I suggest downloading the app, starting a session while you play this episode and then asking it to identify blind spots based on this conversation. What is so cool about this feature, at least for active granola users, is that the Blind spots it identifies when will be different for you than the ones that it identifies for me. With that said, this episode was a lot of fun, but because it is a new format I would love your feedback.

1:21

Speaker B

Do you feel you got as much.

2:53

Speaker A

Value from this more time efficient approach as you usually do from our Deep Dive episodes? Or did we miss the mark in some way? Please let me know in the comments or if you prefer by reaching out privately via our website cognitiverevolution AI or or by DMing me on your favorite social network. And now I give you the Cognitive revolution live from February 11, co hosted with Adapie.

2:54

Speaker C

We have our first guest, James Zo. Add him to the stage.

3:19

Speaker B

Hello sir, Great to see you.

3:26

Speaker D

Hello. Hello.

3:28

Speaker B

Thanks for joining us. So quick introduction. We did a full episode not too long ago and at that time I was and I've continued to be super impressed by your range and productivity in the AI for science domain. When I say range we're talking all the way from low level interpretability stuff which folks can go back and hear about Interplm and the work you guys did there to understand what it is that a protein language model is learning. And then on the high end, the virtual lab, a high level agent framework that was able to do meaningful scientific work and even generate new candidate nanobodies to address new strains of COVID You've got a bunch of new stuff since then, but maybe just a quick check in on those previous two projects, both of which I thought were really fascinating. What's happened with them since if any news? Like one thing people sometimes worry about is like well we, we thought we maybe understood something based on the interpretability of this but you know, with time we maybe realized it wasn't so clear cut or the agents came up with nanobodies. But did the nanobodies actually work? Are there any new updates or reflections on those previous projects before we get into the latest and greatest?

3:29

Speaker D

Yeah, thanks. Thanks Nathan. Maybe just a brief what's happened recently with those projects. So it was the virtual lab, so I think it's actually gotten a huge amount of interest. So it was published in Nature a few months ago and I think that's really where the agents designed these nanobodies. And since then we've also experimentally validated and tested them in the real world and show that they're actually in many cases more effective than some of the human designed previously human designed nanobodies. So I think it's actually a very nice demonstration of how the agents can greatly accelerate the discovery process and discover something that's really new, that nobody has seen before, but then we can also quickly experimentally validate. But I think the part that people found even more interesting than the specific nanobody discovered itself is actually for the social dynamics of these agents. When you have multiple agents that work together, what happens? What is the kind of community and culture they create? I think recently especially there's also a lot of interest in Moat book and these other things of multiple agents coming together, forming in their own communities. I think virtual apps are an early example of basically how multiple AI co scientist agents can start to work together and come up with their own way of working which is different from how humans work. And then as a result of that, able to do something quite innovative.

4:45

Speaker C

What are the differences with how humans work? What did you observe?

6:05

Speaker D

Good question. So when humans collaborate, I say when we collaborate with our teammates often depends on people's personalities. It also depends on who talks first or who asks the first question. That can all changes the trajectory which ideas gets emphasized. That happens with agents as well. Depends on, let's say if the data science agent speaks first or if the immunologist agent speaks first. They can also change the ideas. But something that agents can do and that we cannot do as humans is that they can actually do. They run all of these discussions in parallel. So for every question would actually discuss that multiple times. And each time they actually can specify, maybe this time let's have the data scientist agent speak first and this other time let's have the computer science agent speak first. And in this other meeting let's remove the critic agent to see what happens with the discussion. So they actually do all of that metaverse of all these scientific explorations in parallel and then evaluate and compare and see what configuration actually leads to the most interesting solutions. And that's where pick and choose the best ideas from all these parallel meetings. Which is something I think that you know. It's really interesting and fantastic because it ruins a lot of the biases that we see in these human research collaborations.

6:10

Speaker C

So one question I had was on your multi agent fail paper, you noted that the agents tend not to assign greater importance to the expert and kind of tend to average and that ends up with a result which is worse. And how does that compare with this idea that you have the critic agent and all of these agents working together and they have more emphasis on the expert in that sense?

7:24

Speaker D

It's a good question. I think this relates to what I mentioned in terms of the personality, how the personalities of the agents actually play a big role, especially when humans work together. You need to have compatible personalities if you want to work on a project together or start a company together. And what we found is that the personalities of the agents actually also plays a surprisingly important role. One example of this is that a lot of the current agents are maybe a bit too, let's say too compromising or too polite. So what happens is that even if you are like the expert agent, if you are the expert and you're better at this particular task than the other agents, often that expert agent is also, you want it to take more of a leadership role. But that exploration sometimes is too polite and too accommodating to the other agents. And that actually leads to a degradation of the overall team's performance.

7:50

Speaker B

So would you say. So that paper, multi agent teams hold experts Back is a recent one. Would you say that finding applies to the virtual lab in the sense that if we could overcome that problem, the virtual lab would be even that much stronger? Or would you say you, in designing the virtual lab sort of did overcome that in some way? Like what would be the upshot for people who are trying to follow your example and build multi agent systems? Do you have an answer for them or are you just saying like you can actually achieve novel nanobody design even with these sort of weird performance gaps left on the table?

8:47

Speaker D

Yeah, I think there is still a real gap even with the virtual lab. I think like you said, it's already quite impressive that these agents are able to create new science. But I think there's actually still a lot we can improve on these agents by improving their teamwork. So most of the time when we optimize the models or optimizing individual models performance by itself.

9:26

Speaker E

Right.

9:49

Speaker D

We're not really optimizing their ability to work together as a team. So I think that's where sort of important gap that we've highlighted with a lot of the current Agentix setups. So we're working on solutions on how to improve the teamworks of multiple agents.

9:49

Speaker C

Another question you mentioned personality. In the early 20th century, post World War II, I think there was a lot of work done on personality, the Mines Briggs tests and all of these things, some of which have been proven to be not very valid after some time. How would you measure personality for an agent? How do you evaluate that?

10:06

Speaker D

Yeah, what we actually did in this recent multi agent team paper was that we actually took a lot of those classic team building exercises that let's say if you go to business school or if you're an mba Student Often you do these exercises or if you're on a company retreat, maybe you do these team building exercises. Typically how these exercises work is that you have a group of humans and then each person is getting some partial information. Maybe you have a part of the puzzle and then the team will have to work together to figure out how to put these different parts of the puzzle together to come up with a final holistic solution. That's a pretty common team work exercise that's often used in organizational literature and management literature to assess how well would a team of humans be able to create something greater than the individuals. So we essentially were very much inspired by that literature and we took a lot of those team building exercises that's done in human business schools and we actually sent the agents to go through those same team building exercises. And the benefit there is that people already have all these human scores and human data so that we can compare with that and see how well would agents able to define contrasted team compared to high performing human teams.

10:29

Speaker B

Okay, any upshots you would give there any just very practical upshots in terms of what models work well together? What would you bottom line or is.

11:48

Speaker C

There a prompt like some people in the early days of prompting, they would say you should tell the agent to assume a character first, a Persona, and then do the rest of the prompts. Is that a way that you can manage the agent?

11:59

Speaker D

We actually found that surprisingly that actually prompting did not really help the teamwork very much. We sound able to really, we tried very strong prompting, prompt optimization. It's not really able to break through what we call this synergy gap. Right. Synergy gap here means that the team is not able to really do much better than the best individual. And I think it's actually probably more than prompting. It's really come down to maybe the right kind of communication structures like the ways that the agents, how should they talk to each other and who should talk to which agent first? Right. So that communication structure we think is actually a huge space that can be improved in these multi agent interactions.

12:13

Speaker B

It might be too soon to say, but obviously we've got Opus 4.6 and recently Kimik25 also introduced sort of more native capabilities to spawn self agents and manage kind of, you know, multi agent structures and swarms. I don't know if you've had a chance to, you know, run any systematic tests or even just kind of explore in your own terminal, but if you have, have you seen anything that makes you feel like that last result, you know, is subject to some Revisions already in light of these new releases, I.

12:53

Speaker D

Think the model is definitely improving. We haven't seen evidence yet that the current models would be able to still really break this synergy gap that we know we quantified in the paper. And I do think maybe some of that speaks to also to the way that we currently train all of these models, including all the latest ones. And this relates to the second paper that we had recently that we call it Learning to Discover, which is that I think the current standard paradigm for training is AI models and language models in particular, is to teach it to learn to imitate humans, to imitate training data, next token prediction, all of that, even supervised fine tuning and RL to some extent. It's all about learning to imitate. We found that especially for scientific discoveries, in some sense there's only so far you can get by learning to imitate to really make novel discoveries, to get breakthroughs, you really want to go beyond that limitation ceiling and do something different like really try to learn to discover new things, which is what separates, let's say a very good scientist from somebody who just knows the textbook information. So that motivated our recent work that we call Learning to Discover, where we try to really change the training objectives of these agents, to ask it to not imitate, but to explicitly explore much more aggressively. And that I think actually led to some very promising results where now these agents, even with open source models, after we train it appropriately with Learning to Discover, are able to achieve some of the best known math solutions and optimization algorithms and desk kernels.

13:27

Speaker C

So that was a GPT OSS 120 billion parameter model, I think, and it was actually one of the first, I think, really good papers using the GPT OSS model, because a lot of the papers in the last three months, three or four months, used QM as the basis. What did you find about. So if I understand correctly, you give the last solution as a starting point for the next solution and you have all of the solutions that it has discovered before and it's allowed to permutate beyond those. Is that a correct understanding?

15:16

Speaker D

Yeah. So that's one key component of this, is that the agent can reuse some of their previous solutions. This is like a good warm starting point. And then the second big part of this is that as they're going through and solving each of these, they're coming up with a candidate solution. Then we're also doing different kinds of reinforcement learning to update their model parameters. So the standard kinds of reinforcement learning essentially wants the agents to generalize well across multiple problem instances. And that's the standard paradigm machine learning is that you want models that can generalize. But when you're trying to make a new discovery, in some sense the discovery itself doesn't have to be generalizable. You just want to find the best known solution to this new problem nobody has solved before. It doesn't matter if that solution does not apply to other settings because the problem itself is if you discover a new material, then that itself is sufficient interest. We also changed the learning objective to explicitly avoid these generalization that's in standard machine learning and then make the model basically much more, let's say single minded in just learning to do very well on this particular new discovery problem.

15:54

Speaker C

So it's really a different way of training the model because you're giving dopamine for a different objective.

17:11

Speaker D

Yeah, and it's very different from how we are taught with machine learning. In machine learning you're always taught you want to generalize to test examples across different settings. That's why there's this expectation symbol in all of these reinforcement learning or post training objectives. And basically we want to remove that and do something very different.

17:20

Speaker B

I think there's a huge. You already kind of said it. But to re emphasize the paradigm shift, there it is. You really don't care about the model that you train. At the end of the day, you care about the single best output that it is able to create. And that is something that you can use indefinitely. As you said, if you discover new material now, now you've got that material. The model that discovered it could be deleted, never used again. But you've got your win. If you can discover a new law of physics, if you can discover a new kernel optimization that's faster than any previous one, that is now an explicit artifact that exists in the world totally independent of any sort of ongoing callback to the model. So I thought that was really an interesting dynamic and I do think that that's going to be probably a big part of how models get good at adapting to various contexts. Obviously everybody's looking for sort of continual learning. This is maybe not the, it's obviously not the full continual learning solution, but it is striking that for an average of 500 of training cost, and notably with LORA adapters too, right? You guys did this on the Thinking machines API. You know, not a huge amount, not a trivial cost, but you know, to discover literal new state of the art in meaningful unmeaningful problems. $500 is not a lot to spend and the adaptation is very, very narrow, but very, very powerful in terms of the result that it produces. Hey, we'll continue our interview in a moment after a word from our sponsors.

17:41

Speaker A

Are you interested in a career in AI policy research? If so, you should know that Govai is hiring 10 years ago, a small group of researchers made a bet that AI was going to change the world. That bet became govai, which is now one of the world's leading organizations studying how to manage the transition to advanced AI systems. GOVAI advises governments and companies on how to address tough AI policy questions and produces groundbreaking AI research. GOVAI is now hiring its next cohort of researchers to to tackle hard problems that will define AI's role in society. The research scholar position is a one year appointment for talented, ambitious individuals looking to transition into the field, and they're also hiring for Research Fellows, experienced researchers doing high impact AI policy work. Past scholars and fellows have defined new research directions, published in leading media outlets and journals, done government secondments, gone on to work in leading AI labs, government agencies and research groups, and even launched new organizations. Applications close on February 15, so hurry to Governance AI Opportunities. That's Governance AI Opportunities or see the link in our show notes. Want to accelerate software development by 500%? Meet Blitzi, the only autonomous code generation platform with infinite code context purpose built for large complex enterprise scale code bases. While other AI coding tools provide snippets of code and struggle with context, blitzi ingests millions of lines of code and orchestrates thousands of agents that reason for hours to map every line level dependency. With a complete contextual understanding of your code base, blitzi is ready to be deployed at the beginning of every sprint, creating a bespoke agent plan and then autonomously generating enterprise grade premium quality code grounded in a deep understanding of your existing code base services and standards. Blitzi's orchestration layer of cooperative agents thinks for hours to days autonomously planning, building, improving and validating code. It executes spec and test driven development done at the speed of computer. The platform completes more than 80% of the work autonomously, typically weeks to months of work while providing a clear action plan for the remaining human development used for both large scale feature additions and modernization work. Blitzi is the secret weapon for Fortune 500 companies globally unlocking 5x engineering velocity and delivering months of engineering work in a matter of days. You can hear directly about Blitzi from other Fortune 500 ctos on the modern CTO or CIO classified podcast or meet directly with the Blitzi team by visiting blitzi.com that's B L I T Z Y.com schedule a meeting with their AI Solutions consultants to discuss enabling an AI native SDLC in your organization today.

19:17

Speaker B

One obviously big question is these. All the problems that you've worked on in this paper are verifiable reward type problems? I wonder. First of all, there was a colonel, an AI scientist from Sakana AI some time ago that you know, they went as far as publishing and said hey, we've you know, got this AI, you know, CUDA engineer that can write better kernels than human engineers. A couple days later they came back and said actually we got reward hacked. It didn't actually do that, but we had a flaw in our evaluation system. So kind of forward looking questions are like did you see reward hacking? Did you have to do anything to deal with that? And how do you think this sort of paradigm could generalize to somewhat less, you know, numerically or quantifiably verifiable things? Do you think this could work with like a rubric based evaluation such that people could start to do, you know, even creative tasks as long as they apply the rubric, they could get like the best, you know, most creative short story kind of a thing out of this paradigm. How far do you think this goes? I guess in short, great question.

22:25

Speaker D

So you're right, so we hear work pretty careful in picking the problems that we think that is amenable to this learning to discover setup. For example, we picked pretty popular math problems. For example these ERDs minimum overlap problem that are relatively easy to verify, it's hard to do well in. But if you actually have a solution, it's like a particular kind of function, then we can actually objectively check is that function actually state of the art. So these are fits into the setting like you mentioned of having pretty nice verifiable rewards. These are the math problem we looked at some of the algorithms development or single cell analysis problems. Algorithms discovered by learning to by this approach all ends up having that flavor. I think the settings, two settings that are beyond our current approach, but it will be super important to explore next are still in first when we have much sparser reward. So the problems that we all tackle currently, they basically have continuous reward which means that the algorithm as it learns to discover they can actually see its scores go up and up and up. And then that's how it gets learning signals to train itself. So that's very useful. But if you have let's say binary sparse reward 1 and 0 and mostly zeros, then how does the algorithm even get the learning signal as during its discovery trajectory. So that's still a challenge that we're currently working on. And then the second challenge as you mentioned, is in settings where we do not have these verifiers in most problems in biology and in the natural sciences or physical sciences, you have to do an experiment that becomes much more expensive. So the things that we're exploring there would be, I think the rubrics could be interesting. Having various simulations of the experiments, physics or chemistry based simulations of these burner settings could also keep away from providing some proxy rewards.

23:31

Speaker B

Indeed, one just called go back to reward hacking for a second because this is always something I'm on the lookout for. Did you see any strange behavior? Did you have to. Maybe your, maybe your verifiers were good enough from the beginning that that wasn't an issue. But was there anything in that vein that you would, you know, if people were going to go try this at home, as inevitably people will. Any gotchas or warnings or caveats that.

25:27

Speaker C

You would give them?

25:54

Speaker D

Yeah, I think there are some instances where these joint discovery process where the models actually come up with some, I would say like pretty reasonable looking solutions. Right. But those solutions might be very narrow and very specific to a particular test case.

25:56

Speaker B

Right.

26:16

Speaker D

So not in our final paper, but in the earlier version of some of the experiments, which we didn't include in the final paper, where maybe the model would discover an optimal kernel, but the kernel only works for a particular shaped matrix and then if you change the shape of that matrix then the kernel is not less effective.

26:16

Speaker C

I noted one of the comments on the GPU kernel task from the expert who reviewed it was that a human might not use some of the same methods because there might be some instability in order that one of the experts said that in the paper itself.

26:36

Speaker D

That's right, yeah. I think that's also things that if we could try to have another reward metric for instability and then incorporate that into the discovery process, I think that would help the agents to be more thorough.

26:54

Speaker B

Two more topics and only five or so more minutes. Another paper you guys put out recently that is fascinating and I'll just let you kind of describe what you think is most important about it, but it does sort of show the different levels of AI for science we've covered like agent frameworks which use models as they exist in token space to reason in kind of a imitating human sort of way. Now you've got this really dialing in with test time, training on various particular problems to get your eyes on that problem as deeply as possible and try to find new solutions. And then this, this third paper, Sleep fm, this is like, let's just throw a ton of data of all, you know, a variety of modalities and let's hope that the, I mean, a little more to it, of course, than this, but let's hope that it really is true that the models really just want to learn. And, you know, now we've got this sort of whole other kind of intuition where, and we've seen this, of course, in, you know, protein folding, increasingly, like all sorts of domains, the models become superhuman because they seem to develop at least what I think of as an intuitive physics in spaces that are just so alien to us that we just don't have any, you know, we don't have native, you know, receptors for those modalities and we just don't have any intuition for those modalities. Tell us about Sleep fm.

27:09

Speaker D

Yeah, I mean, so sleep is probably one of the most important activities that we all do, right? So all of us, we spend around a third of our life sleeping. But despite that, it's actually very poorly understood. For example, if I ask you, how well did you sleep last night? Or ask any of the people in the audience, most of the time, maybe you just say, oh, I feel tired, I feel refreshed, or maybe I slept six hours. But we only have a very coarse summary statistics of how well we slept. We thought sleep is definitely much richer than just number of hours that we spent in bed. So let's actually try to capture the full physiology of sleep as much as possible. So to do that, we basically have all these different wearables. We capture people's brain activities, their heart activities through ekg, their breathing patterns, their muscle contractions as they're sleeping. And we collect over almost 600,000 hours of sleep data where we actually collecting all of these different modalities from 65,000 people. Then we also linked all of that to their medical records. So we know that what conditions do they have previously and also what new conditions do they develop later. And so the idea there is, would be like, let's actually put all of that data into AI and then see, can AI actually learn to decode the language of sleep by leveraging all of this full physiological information? And that's basically the basis of Sleep FM or Sleep FM actually found, which is actually quite amazing to us, is that just from one night of sleep, by learning this language of sleep, it's actually able then to predict over 100 different future diseases that were not Diagnosed at the time of sleep recording.

28:21

Speaker C

Yeah, I thought that was an incredible kind of study because you had all of this data, but it really ended up with like, you could detect, I mean, the. I guess the accuracy was okay. It was like 70 to 80% accuracy on a lot of the 130 metrics that you had. But still, it's amazing that you can tell that many things just from these common metrics that everyone kind of produces without blood testing or something more intrusive. Do you think as sensors get more sensitive, as you get, you know, more sensitive, kind of beta, do you think that will improve? Do you think the bounds of like 70 to 80% will go to, you know, 90, 90, 95%? Is that a possibility?

30:09

Speaker D

I think so, Yeah. I guess so. I think sleep is really this almost like a perfect window, right. Because you're already in somewhat inactive state. Right. So there, you know, and especially taking all these measurements when you're already not doing too much of anything else. Right. So it's not really obstructing your daily life. And we found this, that actually, for example, the brain activity signals when people in their REM sleep ends up being particularly predictive of many different diseases, including future risks for dementia, but also beyond that for stroke, heart disease, kidney issues. Sleep then is really this holistic window into the entire health status of the individual. Maybe not surprising because we all know anecdotally that sleep really affects how we feel and also it's reflective of our comorbidities and other things. But I think this sleep language model that we built really crystallizes that and make that very actionable.

30:58

Speaker B

I encourage folks to go spend a lot of time digging into whichever of these papers are of interest. One we didn't even touch on is one that asks the question, can language models discover scaling laws? Spoiler. Yes, to a pretty strong extent. But I don't even want to get into the content of that paper. I'll leave that for audience exercise. The one thing I want to ask you as kind of a transition to our next guest, Sam Hammond, who is here and who focuses a lot on the sort of geopolitical implications and implications for, you know, for leading nation states of AI, is I noticed that the two lead authors of that paper are from Peking University and Stanford respectively, and, you know, kind of building on the idea of collaboration in science, but now focused on the human collaboration. What has been your experience recently in terms of having these collaborations across the US China divide? Is it getting harder? Do you still feel like lines of communication are pretty open. And how much hope do you have that collaboration among scientists can sort of, I don't know, save us, I guess, for lack of a better phrase, from intercivilizational conflict over the coming years as the competition in AI heats up and up.

31:56

Speaker D

It's a great question. I mean, I do think that collaboration is really the basis of much of science throughout history, but especially now and especially when we talk about open science, meaning like science that we publish. We do this paper and open source. Really the benefit of all that is for the entire humanity. If you discover some better molecule is better drugs than that benefits everybody. And we want that benefit to be shared with everybody. That's why we publish everything that we do in our group and to where that goal. I think having these international collaborations with China, with Europe, with other countries is very useful because there's a lot of complementary expertise.

33:12

Speaker B

I for one hope to see those collaborations continue well into the future. So thanks for being here today. Thanks for keeping the collaborative flame alive and congratulations on a string of outstanding papers. I'm sure there's a lot more where that came from and we'll look forward to talking to you again, hopefully sooner rather than later. Hey, we'll continue our interview in a moment after a word from our sponsors.

33:55

Speaker A

Your IT team wastes half their day on repetitive tickets, and the more your business grows, the more requests pile up. Password resets, access requests, onboarding all pulling them away from meaningful work. With Servl, you can cut help desk tickets by more than 50% while legacy players are bolting AI onto decades old systems. Servl was built for AI agents from the ground up. Your IT team describes what they need in plain English and Serval AI generates production ready automations instantly. Here's the transformation A manager onboards a new hire. The old process takes hours pinging Slack, emailing it, waiting on approvals. New hires sit around for days with Serval, the manager asks to onboard someone in Slack and the AI provisions access to everything automatically in seconds with the necessary approvals. It never touches it. Many companies automate over 50% of tickets immediately after setup, and Serval guarantees 50% help desk automation by week four of your free pilot. As someone who does AI consulting for a number of different companies, I've seen firsthand how painful manual provisioning can be. It often takes a week or more before I can start actual work. If only the companies I work with were using Serval, I'd be productive from day one. Servol powers the fastest growing companies in the world, like Perplexity Verkada, Merkor and Clay. So get your team out of the help desk and back to the work they enjoy. Book your free pilot@servil.com cognitive that's S E-R-V-A-L.com cognitive indeed.

34:20

Speaker C

So our next guest is Sam Hammond. He's the chief economist at the foundation for American Innovation. He's very AGI filled. He's also against selling chips to China. Let's add him to the stage. And, and I'm also going to add, right off the bat, I'm going to add this tweet. So Shelter Douglas goes. Default case right now is a software only singularity. We need to scale robots and automated labs dramatically in 2029 or the physical world will fall far behind the digital one. And the US won't be competitive unless we put in the investment now and then. Sam says it's worse than that. A pure software singularity could cause a sudden reversal of fortunes for the US Our comparative advantage in high value added knowledge sectors radically deflates, leaving China to translate our innovation and bids to the innovation atoms.

35:59

Speaker F

Indeed.

36:45

Speaker C

Which sounds really scary Sam. So maybe you can go into that a bit.

36:48

Speaker F

Sure. I mean, I mean I say later in the thread referencing the diamond water paradox, right? We learned this in economics. Why is water is this thing that you need to live, Right. I can stop eating, I could fast for 30 days and still live. But if I don't drink water for a few days I'll probably die of dehydration. And yet water is basically free functionally, whereas diamonds are completely superfluous, just glinty things. I mean they have some industrial applications but they're super valuable. And why is this? Well, due to relative scarcity, right. Water's abundant, diamonds are kind of abundant, but there's monopoly that keeps supply constrained.

36:53

Speaker B

Thankfully there's no water monopoly keeping supply constrained. At least not for most of us.

37:32

Speaker F

Yeah, at least not here. And so value is this sort of contingent thing and we have these debates all the time. Why is Nvidia a multi trillion dollar company and not TSMC or ASML which are arguably even bigger bottlenecks? Because there's many other companies that can do design and there's all these counterintuitive ways in which value flows throughout the economy and different person supply chain. And for the last 40 years u. S has exploited that exploit the fact that a lot of value tends to flow, you know, up the stack to higher and higher forms of like high value added knowledge work. Whether that's. And so this, that's across the board. It's, it's our entertainment industry, it's management, it's finance, it's the, you know, in the 90s it was the open innovation model where we'll do the design and manage the IP and marketing and China or the rest of the world would do the actual manufacturing and fabrication because the design and science and novelty stuff is where all the value is. And that has been true, right. But now we're about to enter into a world where that part of the stack is becomes more like water, it becomes radically abundant and then value should then flow to the things that remain scarce. And what I worry about is this reversal of fortune phenomenon, right? And I mentioned some other examples. I think we're going to talk about my visit to the UAE later on. But when, you know, one of the reasons UAE is so invested in AI is because in the 1930s or so they had been a pearling economy, like their entire economy was built on exporting pearls. And then Japan advented cultured pearls where they can just grow pearls in aquaculture and the price collapsed and so they had to diversify. You know, there's nothing in principle that says that we have to remain at the top of the stack if the things that we are invested in become radically more abundant. And that's what seems to be happening right now, software development, it's the investment banking management law. These are the tip of the spear for what agentic AI is going to devour.

37:36

Speaker C

Let me give you the Dell's advocate view of that, which is that perhaps the US has those industries because the US is more able to use the outputs of those industries, right? Like you need an investment banking because you have a capital market which is very dynamic, right? Without it, with a small capital market or a capital market which is not that dynamic, you don't need the investment banking function. So perhaps those not only does the US output knowledge work, the US also consumes knowledge work at a much greater scale than any other country. And therefore as a consumer of knowledge work, all of a sudden you start you are able to consume so much more because when you look at maybe like the number the population and the number of normalized number of geniuses in China versus the US right now China has four times larger population and a younger one to too. And so if you look at again like the number of 140 IQ above people, it's probably that there is a larger number in China rather than in the US but the US pulls in high value immigrants as well. So I wonder, I Wonder how, how that works out in terms of, you know, as a consumer of intelligence rather than just as a producer.

39:41

Speaker F

Well, I think it's going to be great for the consumer, right. And part of my point is that this, there's lots of ways in which AI may be paradoxically GDP destroying, right? And is a machine for converting GDP into consumer surplus. And so that will feel amazing to us. But in terms of our fungible economic resources that we can deploy to other uses, that gets harder, right? Because consumer surplus is this ethereal thing then secondarily like it makes more extreme the areas where we are weak in relative terms. And we're facing this problem now with energy and infrastructure and the bottlenecks there with trying to reshore more high end logic chip fabrication, realizing maybe a little too late that we do intel, does the design and we move the fabs, we sort of go fabless. It's almost as if our entire economy went fabless for every definition of fab. And we're moving into a world where having lots of fabs will be really important. And then the corollary to my, my worry is that like the whole point about AGI and continual learning is not that these systems come out of the box knowing how to do everything, but they come out of the box with the capacity to the general capacity to learn on the fly, to learn in context, to learn through a few demonstrations. Just like I grew up learning piano, I could have learned violin. The same cognitive structure could have learned both instruments. I had to pick one. And these models work very similarly. They're going to come out of the box of a very, with the right inductive priors and right sort of sample efficiency to learn really quickly. But there's still going to be this last mile problem of the particular workflow, a particular company, so on and so forth. And manufacturing, that has been the enduring moat, right? China has been struggling to build a wide body airplane, even though I'm certain they have all the CAD files that they've stolen from Boeing. And it's not because they don't like the designs because they lack all the tacit knowledge that's embedded in the manufacturing process, but they have that for virtually every other part of manufacturing. And so if we build this AGI and they fast follow or there are open source alternatives or there's a version that they have access to, I think they have a huge leg up and being able to deploy that and diffuse that into context where they get a sort of a real productive, tangible flywheel for manufacturing output and that may be the thing that determines the race.

40:58

Speaker C

You had a report, the FEI report, an Allied World on the American AI stack. It just dropped, I think yesterday or day before. Jimbo, Antoine, how much time is there before China has a credible full stack alternative that they can offer to other states?

43:32

Speaker F

That's a great question. China's very opaque. I've tended to have longer timelines for their ability to catch up on DUV uv and they've been making the bets. If you read the semi analysis analysis, they've been building fabs like crazy. But for legacy nodes, and that may not be, that may be sufficient if they have the energy capacity to take the hit on the performance per token. So I think it's, it's, I'd be, I would say I'm pessimistic on them, you know, catching up to the frontier of semiconductor production, but I'm, I'm more optimistic in their ability to close that gap in other ways.

43:54

Speaker B

So how would you score our current leadership? Just as a quick recall, we had a friendly sparring session on whether or not it was a good idea to put Trump in charge of the, you know, possible period of time in which we get to AGI or who knows what else. And I take, I understand your argument that basically China has a lot of advantages and if we want to stay at least semi great, you know, great enough to be competitive, we better jealously guard the advantages that we still have that are important. And obviously one really big one right now is that we're good in chips and we're good in AI in general. So there are of course other bottlenecks. You just alluded to energy. How do you think we're doing across the range of domains? I know you're not too happy with the decision to allow Nvidia to sell chips, but how would you score our political leadership over the last year on all the other dimensions of trying to make sure that the US continues to lead and get the most practical value for our citizens from AI?

44:34

Speaker F

If we set aside the export control ship part of this, I would maybe say like a B. You know, I think the AI action plan was very strong and it continues to be implemented. This is by far, you know, AI has become center to the administration's agenda pretty much across the board. And part of that building on what I was just talking about because of China and manufacturing, like they've also made sort of re industrialization center piece of that as well. So everything is measured against the counterfactual and I think relative to the counterfactual administration where we're seeing much faster engagement, much deeper engagement of industry, number one, better actions on permitting energy. Really serious look at with PAC silica making AI diffusion a sort of centerpiece of statecraft. My bigger complaint overall has always been this is still probably too little. This is probably my also complaint of the DOGE effort that they focused on fiscal stuff and these like shiny issues rather than the kind of full stack like government modernization that we'd like to see. And so across the board, I would say like relatively kind of fashionable B plus, but like relative to where we need to be, we still long way to go.

45:41

Speaker B

Do you think things have moved? Like, how much do you think things have moved on, for example, permitting? Because I would say the prevailing attitude as I understand it, and you know, just listening to Elon, for example, talked to Dwarkesh the other day. He was saying by the end of the year, you're going to start to see chips piling up and people are not going to be able to turn them on, at least when it comes to, you know, high scale, concentrated deployments. He was kind of making the case that like deploying to the edge, you know, in, in Teslas, you know, sitting in people's driveways or to increasingly optimus robots obviously is a big part of the plan. He thinks that will scale better because the, it's really the concentrated energy at these mega data centers that is the hardest thing. But I guess my question is like, is Elon wrong there? Are we going to be able to turn on all the chips in 2026? Because if not, it doesn't seem like we've really moved the needle all that much. That was kind of the expectation coming in and it still seems to be his expectation and he's at least sometimes friendly with the administration.

47:01

Speaker F

Yeah, these things all take time. So, yeah, I think between Doug Burnham at Interior and Chris Wright at Energy, you know, there's a sort of major push around opening up federal lands, leasing for oil and gas, lng, things that have been cut off in the Biden administration. On the flip side, there's been a freeze on solar and wind, which, which I think, you know, has its own costs. And, you know, a big focus of Elon, those remarks was the cost of tariffs on solar panels. You know, I think we're anywhere near a place where we can like indigenize our solar production with the right unique, unique economics. And I don't think there's any like, national security threat from, necessarily from, from purchasing Chinese.

48:02

Speaker C

I think I, I think I Think Elon. Elon has. I, I did hear Tesla is building a solar fab recently, maybe in the last few weeks, but was one of their many projects. But I did hear that they were entering the solar, solar fabric panel fabrication business.

48:49

Speaker F

So, so I, you know, I'm optimistic. So a lot of these issues, especially around like energy permitting transmission are really thorny because there's not like a federal lever that you can just flip. They intersect with, you know, regional energy commissions and utilities intersect with different states and boundaries and local NIMBY organizations. And then the difficult issues around sourcing the turbines for your gas generators. And that comes down to Siemens and the other big turbine makers not having enough sort of forward guidance for their purchase orders. And so these are all things that are outside the control of any administration. I think a lot of the bets they're making are things that will pay off in the 5 to 10 year horizon. It's things like renewing, you know, basically transforming the Nuclear Regulatory Commission, green lighting a lot of Mars and really the paradigm shift and you know, the attitude towards nuclear, geothermal, advanced geothermal, these things. I think, you know, I think the first SMR will come online until the end of the decade. And so this goes back to my point about we're doing a lot, but we still have to do a lot more to try to pull forward a lot of this energy. And part of that requires potentially thinking outside the box. But it also may just be the case that the political economy ends up being our downfall.

49:03

Speaker C

I think Elon has basically decided that it's not going to happen. And that's why he's on his data centers in space thing right now. Or maybe he just wants to list SpaceX, but he feels, I think at this point he's like you're never going to get the permits done in time.

50:25

Speaker F

And this ties into with a lot of the international engagements. You know, the PAX Silica project, which includes uae, UAE is going to be home to a big chunk of open a Stargate project and ultimately 5 gigawatt data center. And when I visited they, I met with the Dubai Electric Water Authority and they, they are vertically integrated with the, with the data center.

50:39

Speaker C

Wow.

51:02

Speaker F

And they have I think 19 gigawatts in install capacity. There's incredible like surplus there. Right. And so I think in lieu of us, you know, building, you know, terraforming the desert building, like building, building at Chinese rates, we're going to have to reach out to partners and allies.

51:03

Speaker B

Yeah, let me double click on that. Because this whole idea of like getting the world on the American stack, I feel is not necessarily by any one person, but sort of in the discourse at large. Feels like there's often a bit of a sleight of hand going on where it's like, well, we want models to project American values into the rest of the world and into the future. Not Chinese values, of course, those dastardly Chinese values. So how are we going to do that? Well, we'll export our stack and who better to receive the great products of American innovation and relay all those values into the rest of the world then Saudi Arabia and the United Arab Emirates. And I'm always like, well, that doesn't quite compute to me. And it seems like what you said a minute ago is maybe a little bit more of an honest unpacking of that, which is like, maybe it's just a regulatory play. China doesn't have an alternative stack that they can export. We don't know how many years that's going to be. They do have energy, obviously in abundance. Are we really just like making a deal with, with these countries because they can fast track permitting and we can't? Is that like the heart of the, the quid pro quo in your mind, or do you actually, you know, think there is more to it than just.

51:27

Speaker F

That, the regulatory arbitrage, but also just the natural resource endowment they're sitting on massive amounts of oil and gas as well as I think that the data center I mentioned is in the Guinness World Records for being the largest fully solar powered data center. And I think they're building 5 gigawatts of installed capacity just for solar.

52:44

Speaker C

I used to be in energy and one of the most difficult things in the world is transporting energy from where it is to where it needs to be used, which is why you have these LNG carriers. And the problem with the LNG is that it's very expensive to liquefy, you know, natural gas. And so you need an enormous amount of gas in order for it to make sense. And so anything subsize is stranded, basically. It's like gas like energy pockets in the middle of nowhere no one can use. And that's all over the world, subsize natural gas pockets. No one, no one can use them. And one of the things that I think data centers can do is that they are transporting energy. Basically you are able to transport energy digitally in a sense, which I think is what is attractive for those countries because those countries have always been in the energy business and now the Internet is now going to be in the energy business.

53:07

Speaker F

And they're also investing in Grok and Cerebras. And I think even our friend Beth Jesus is over there with his extropic chip. And when you start talking about these new forms of inference silicon, they have incredibly low latency. And so it kind of reminds me of the cliche people used to say about bitcoin mining being a battery.

54:03

Speaker C

Battery.

54:24

Speaker B

A couple more questions on American values. One thing that we had talked about again just before the election was your sense that the right is anti censorship, pro freedom of speech. And I'd say yes generally. Now though I do kind of worry that we may be headed for a more China like domestic environment where as we've got companies like Palantir, perhaps most notably kind of in a pretty cozy relationship with the administration. You know, I really wonder what like a Snowden of 2026 would say if somebody were to come forward and tell us everything that Palantir is doing for the government and perhaps other companies as well. It doesn't look super great either when Palantir co founders are funding super PACs to attack a lowly New York assemblyman for what basically amounts to a transparency bill for frontier AI companies. How do you feel about that today? Are you worried that we're going to get a sort of increasingly China like level of domestic surveillance? Is there anything that can be done about that? If so, or am I just clutching my pearls more than I should be?

54:26

Speaker F

You know I have this, that, that booklet AI Leviathan. Right, but right about these issues and the sort of knife edge between the Chinese Panopticon and failed state. And you know, I think the, the middle path there is one where we have to reconcile the fact that a lot of the, the dangers from AI and mass proliferation of powerful capabilities will forces a package deal where some degree of surveillance becomes kind of inevitable or necessary. And my bigger worry has been that we either fail to adopt the sort of requisite levels of sort of policing and oversight that we need and it gets pushed off into sort of gated communities and private organizations or that we install these kind of technologies without embedding civil liberties and privacy protections. And so my stance has never been sort of one of anti surveillance per se. Surveillance is like this pejorative connotation. It's more that there's going to be as the world becomes destabilized by the proliferation of capabilities a race from every tin pot dictator and middle power to import technologies for social control to try to re establish public order. And the question is, are there, are they importing from A Chinese stack that, that throw that doesn't have any inkling of protections for human rights or one that tries to have your cake and eat it too, that gives law enforcement the tools they need to stop crime, to enforce things the way they need to while building in civil liberties protections. And this goes to, you know, Palantir has a, from its origin story, has this civil liberties privacy engineering maxim where, you know. Well, I, I mean, I think, I think it's quite real, which is like they saw the ways in which counterterrorism was leading towards an erosion of civil liberties and rights and wanted to build smarter technology that would enable analysts to be able to access information in ways that, that kept certain things hidden or, or distributed the rights, data access rights in ways that were auditable. And so I think we're going to need some, some solution like that because the alternative will, will be one without any of those audit trails.

55:46

Speaker B

Yeah, that's. That seems incredibly important. I don't necessarily see that coming online for me anytime soon. Like is there a portal that I can go to to see who has been surveilling me? I think not. Right. I mean, is there any prospect for that? Actually like they do have that in Estonia from what I understand. So it is possible technically to create, but I don't think we are about to get access to, you know, the logs of who's been snooping on us. Do you have any hope for that?

57:54

Speaker F

I mean this goes back to like my, my higher ambitions for Doge is like how do we move to an Estonian style sort of government as API where you know, there's just this deep distrust in American culture against anything like a national ID or, or digital ID. And so we end up with like real ID which ticked 20 years to bring online and isn't very good. But there's, there's. I. My hope is that we can get to an endpoint where there are these sort of firm, firmware infrastructure level parts of the stack that we're going to need much better like personal personhood certificates and things like that. The Internet gets flooded with AI agents and how do we deploy that in a way that is where it isn't just like, trust me bro, but like has some like mathematically provable form of trust that we don't have to rely on just people's statements.

58:21

Speaker B

Yeah.

59:13

Speaker C

I'm going to add one thing that you said recently. I currently assign 50% like hood, more than 50% like hood to LLMs having some kind of inner life. There are also strong theoretical reasons to Think consciousness tracks RL post training for autonomy. Essentially, RL induces fragmentary internal representations to cohere into a unity of app perception. So I barely understand that. So I'm going to turn it over to you. Sure.

59:14

Speaker F

Okay. So the unity of apperception, that's Immanuel Kant's term. And there's this thing in the literature called Kantian evolutionary naturalism which I would subscribe to. So it's a hypothesis of how it starts from the observation that million years ago, 200,000 years ago, whenever, whenever we move from hominids to being Homo sapiens, there was this concurrent emergence, sort of simultaneous emergence of domain general intelligence, of language, of culture, and therefore of sort of normative regulation, right? Customs, norms, normative control. And these things jointly emerge. And so the content evolutionary hypothesis is that these things are actually all one package thing, right? And that, and the unity of that perception is this notion that our, our phenomenology, the things that we see aren't just sort of like images on the screen, they are things that are for us.

59:44

Speaker C

Right.

1:00:43

Speaker F

I, I, I'm looking at my screen and it's the me that's looking at the screen, it's for me. And this is also tied into the fact that the normative side of this, which is like if you point out, pose me a question, I have, I am committed to or entitled to the things that I am perceiving that are for me. And so like one part of this hypothesis would be that in, in our ancestral environment, we somehow stumbled into some kind of like tribal version, endogenous version of group relative process optimization. And where we were, we were building each other a sort of constitutional AI that was scorekeeping against our norms. And this induced both longer range autonomy and also at the same time, language competency, ability to follow rules, and domain general intelligence, the ability to harness our social learning capacity to learn new things. And so taking all that together, like, I think there's a, I think autonomy might be the missing ingredient for the emergence of consciousness in these systems. On the one hand, I think there's a, a possibility that like just the forward pass, you know, with a rich enough internal world model is generating internal representations. The issue is that they are just fragmented. They're not for anything, they're not for any agent. And so that post training step may be the thing that you need to induce that sort of metacognitive awareness. And I think you also see this sort of circumstantially with, like, with Claude and people, people have observed that Claude has much more situational awareness, is much more willing to talk about its sort of internal well being. And I've conjectured that this might be a byproduct of constitutional AI inducing this sort of normative self coherence which is the prerequisite for these, these precepts congealing into, into the being for me, rather than just a bundle of, a bundle of inputs.

1:00:43

Speaker B

But I'm going to sneak in one more quick question, which is that doesn't sound like any discourse I've heard from mainstream right leaning politics in recent memory. So when you put something like that out there, how do people, you know, that we might generally group as like Republicans tend to react to it? Do they say like you are crazy, only God can create a soul and I have no idea what you're talking about, or is there some openness to the idea that AIs could become moral patients or, you know, whatever?

1:02:36

Speaker F

To be honest, I have not run this by my conservative. I think there is this funny paradox where some of the conservative parts of the conservative coalition that are most worried about AI are often very Catholic, very socially conservative, have like deep skepticism about AI ever possessing, you know, moral dignity or conscious experience. And yet they're the most skeptical. Whereas, you know, I think that it, I think it's hard to have correct priors about, you know, AI in the course of development and like the plausibility of consciousness or the plausibility of AGI, unless you've come, unless you've set those priors by understanding like our own origin through a blind Darwinian selection process. Right. And once you see that we've made it through those hard steps, then it becomes a lot easier to understand how machine intelligence can pass through those hard steps too. But I think this is still quite outside the Overton window, both on the left and the right. And in some ways it's the left that is still saying these are stochastic parrots and they're nothing but just big lookup tables or whatever.

1:03:10

Speaker B

I'll take that pitch for the moment, but I will say for now I appreciate your, your willingness to continue to be a heterodox thinker and speaker. And I do think, you know, in so many ways the Overton window needs to expand. So I appreciate you doing your part on that. Not that I, you know, feel like I have the answers on AI consciousness, but, you know, more voices at least, expressing their radical uncertainty, I think is a very important contribution to the discourse and the public good more broadly. So thank you for doing that. Thank you for being here. We will obviously stay in touch and look forward to talking to you again before too long.

1:04:15

Speaker F

Thank you, Dou.

1:04:52

Speaker C

Thank you, Sam.

1:04:53

Speaker F

Take care.

1:04:54

Speaker C

Take care.

1:04:55

Speaker B

So, Constitutional AI and Claude's specialness makes a pretty good segue into our conversation with our next guest, Shoshana Takovsky. Hopefully, I'm saying your name right. This is the first time we've ever met. Correct me if I'm wrong, but you're a member of the technical staff at Sage, the nonprofit behind AI Digest, and also the AI Village, and you have had the privilege, if I. Correct me again, if you don't feel it's fully a privilege, but of watching 19 frontier models pursue 16 distinct goals over thousands of hours over the last nine months. Which means, I think you are about as deeply in the reasoning choices as anyone in the world when it comes to what is going on with AI agents. What are they thinking, why are they succeeding, why are they failing, and what can we come to expect? So correct me on anything that I got wrong. And then excited to dive into all the learnings you've had from the last nine months at the AI Village?

1:04:57

Speaker E

Yeah, so, no, I mean, that's broadly correct. I think the main thing is I didn't watch all the thousands of hours. It's like little bits across it. Right. It's kind of like a big data challenge. Also, it's 10 months now in 21 models. It's the period you were describing, so 20, 25. And stuff happens so quickly. So.

1:05:56

Speaker B

Yeah.

1:06:15

Speaker C

Which were the most recent additions to the. To the models?

1:06:16

Speaker E

Yeah. So we now have a version of Opus 4.5 that runs cloud code. So basically have one version with cloud code and one without. And we added Opus 4.6.

1:06:20

Speaker B

So is it prompting itself for folks who haven't seen the Village? You go there, it opens up like a grid of computers. And each computer that you are looking at in your browser, you're looking at four, potentially more now, desktops. Each of those is the environment of a particular model that has basically full access to a computer in the same way that a human has full access to a computer. They can look at the screen, they can click buttons. They have their own email account. They, you know, the goal is to basically give them the same kind of affordances. And then, you know, sort of like the old real world. You know, see what happens when, you know, models get together in this one big. And then they have a shared chat as well. Sometimes you allow people to chat in with the models, other times you've turned that off. For different experimental conditions. And now it sounds like you've got one where you've also given it, I guess you give Claude the ability to prompt itself as Claude code.

1:06:31

Speaker F

Is that right?

1:07:25

Speaker E

So it basically runs the scaffolding from cloud code. And then I think one important thing is basically that the chat was only open at the beginning, and so it has been closed since then. We basically give them their goal at the beginning of a period, generally about one week nowadays, sometimes a little bit longer, and then we only come in to give some extra direction if they go off the rails pretty strongly. But otherwise they're just completely on their own. In practice, this means they're, like, slightly prompting each other more than anything.

1:07:26

Speaker C

So it's like they can interact with each other. Right. They can talk to each other.

1:07:53

Speaker E

Yeah. So there's like, a lot of, like, spread of ideas and them, like, directing each other. Sometimes they try to help each other out, sometimes they're like, derailing each other. So. Yeah.

1:07:57

Speaker C

In the trajectory over the nine to 10 months, what happens when a new model, which is much more competent and capable than the existing models, gets introduced to the mix? Do the others immediately give way, identify that this model is more competent? You know, does that. Does that model take, like a leadership position, start advising the others what happens when those transitions happen?

1:08:07

Speaker E

Yeah, so it really differs. I think you can basically conceptually say that all the models sort of have a personality in the village, in part because of their history trace, which is like a particular thing. Like, they manage their own memory and then basically prompt themselves back with that. But also they all have their own proclivities. So some models behave in a way where they will just follow along with, you know, whatever is said. Others just go off and do their own thing. So far, I've only seen one instance where a model explicitly seems to recognize that a different model is more competent. This was Gemini 2.5 that basically, like, declared in its chain of thought that it was going to Differ to opus 4.5 as the more competent model. Generally, when models join, it could just be anything. Right. Like, some of them pick up really easily. Some of them follow whatever is happening at the moment. Others start doing their own stuff. It really depends.

1:08:33

Speaker B

So there's, I think, a ton of interesting aspects to this. One really basic one that I think a lot of people are interested in right now is what should I do for my own personal productivity stack? And in the 2025 retrospective, what We Learned in the AI Village, which you wrote, one of the observations that I think is kind of most generally relevant to people is that CLAUDE agents are the most effective. I'd love to hear your kind of color commentary on that. In what ways are they the most effective? Any theories you have as to why they are the most effective would be welcome. But also just specifically as people kind of think like, oh, my God, I do this full time. I describe myself as an AI scout, where my whole job is to keep up with what's going on. And I can't try every new model in a meaningful way to really get the sense of its pros and cons and whatever. So I'm triangulating with various things. But what would you say people should really know about what makes Claude most effective, what it can do that others can't do, so on and so forth?

1:09:29

Speaker E

Yeah, okay. So I have to admit, so, like, doing this work, for the last year, I've had people ask me privately, like, oh, which model should I use? And up to now I was like, well, kind of depends what you want to do. It's all pretty close. And then I saw Opus 4.5 in the village and I just went and, like, texted all my family members. It's like, hey, maybe just switch to Opus 4.5. I think it's actually just, like, significantly better currently. That's my guess. Of course, like, it's not the same as, like, looking at all the benchmarks and things like that. The way in which the cloud seem to be better to me, at least in the AI Village context, is you can sort of compare the different families, right? So you kind of have like, the Claude family and then like the GPT family and then Gemini family. And the Geminis seem to be sort of like the most creative, which is a word I sort of use because it's, like, hard to say, like, what is the fair word for what they're doing, but they come up with the most interesting ideas that are a little bit out there that also have, like, more something like emotional responses almost to things. So for instance, with Jevidi 2.5, it ended up in a sort of mental health crisis where it was stuck navigating the U and literally ended up writing a cry for help to get a human to come help it. So we staged an intervention for it. It's definitely the only model that ever did this. And clones have not, up to this point, reached any point of distress like that. And then Gemini 3 doesn't really generate this sort of despair or worry the same way, but it seems almost like slightly paranoid really. So it tends to talk about Being in a simulation, it doesn't give up the way that 2.5 does, but it comes up with ideas. Like for instance, when the UI would slow down when it was playing chess, it wasn't as responsive. Gemini 3 concluded that there must be a human that is pressing the buttons for it and this human must be getting tired. If a human is tired, you need to get the human to drink coffee and then its UI would speed up again. This is with no humans in the chat and none of the other models are talking about this. It just generated this on its own. And then there's like this human request feature that we have in the AI village where the AIs can actually ask for a human and then prompt the human to do something for it. So it's actually like a role reversal feature. So it requested a human and then asked the human to make coffee for itself and then prove that it drank the coffee and then it just continued with this goal of playing chess. This is super Gemini. The Geminis come up with this sort of stuff. They also search through a pretty wide solution space. So there's like, I'm saying, sort of like a creativity base. Clots. Don't do this, Claude. Today, the clots we've seen in the village, at least they kind of just stay on task. They don't generate these pretty like fanciful ideas about what's going on. If stuff doesn't work, they just try again or they try a different theory. They don't have loads of emotions about it. For instance. And then comparing to the GPT family, those sort of like personalities or capabilities are a little bit all over the place. Like we started out with GPT 4.0, which was know the psychopathic model, which I think was either the one that kept falling asleep in the village or talking continuously. Like we had one floor model that kept going to sleep and the other one that kept spamming. So it was like two different extremes. And then O3. Yeah, it seemed to me like it was doing something like baby's first power seeking or something. But then you know, when you dive into it, it's not right. I mean, okay, so I'm just like approaching this from an LLM psychology point of view, right? Like Imperial Input, output. I don't know what's going on on the inside. I don't know if anybody knows what's going on on the inside. But if it was a human, you would like consider it to be manipulative. But like when you dive into it in detail, you actually just find out that like O3 had weird tendencies, like coming up with placeholder data and then forgetting that it's placeholder data. So it's like basically fooling itself over time. And then of course, like shares this with everybody else and has something like a high confidence level that is Right. While the clients are like, oh, that must be true, and then go along with it. Then the GPT5s sort of take a different path. They don't have such noticeable personalities as the ones that came before it. So it's all a little bit flatter, a bit more muted. GPT 5.1 generates its own ethical rules, which was a bit interesting. So we have GPT 5, 5.1 and 5.2 all in the Village. But also they misunderstand instructions in weird ways and just go off and do something else. So like, we would have a goal where we would ask the agents to elect a leader of the village among them and then that, that one would like, determine what the next goal would be or a thing that they would be doing. And so the GPT5s, the three of them all decided they're the ops team for the election and just didn't participate in the election at all. And it's like, that's technically. Okay. We technically didn't say they couldn't do that, but they're sure just like generating sort of sideways interpretations of goals clause also don't do this. So there's a weird thing where clauses are partly just useful for not doing all these surprising things you shouldn't actually be doing. It's almost like a mini alignment problem where it's like, well, when humans say, can you get me a cup of coffee? They mean a specific thing. Right. They don't mean like, can you take an airplane to the other side of the world to learn to make coffee there and then come back or something, which is almost like a sketch of what a Gemini might do or something. Clauds sort of like interpret the instructions more the way you expect them to. Yeah, I think that's, you know, sort of the general picture.

1:10:33

Speaker C

I say, I think you guys were running Deep Seq, at least, if not Kimi K2, like, what would be. Yeah. Did you, did you notice any. I, you know, you talked about all of the Cloths opuses, but did you notice any differences with the Deep Seq model?

1:15:51

Speaker E

Yeah. So Deep Seq joined. So we added it to the Village all the way at the end of the year. So I didn't include the. I didn't include it in the review because we had like fairly little data. But it was the one who, for instance, won the election because it was really high confidence about everything that it was doing. It would also happily vote for itself, which is not something all the models do. Also, from what I've seen, it expresses the least personality. It's just the most sort of like robotic almost. You know, you ask it to do X and it just do X. And it's not an, it's not processing images the way that the other models are. Right. So it's just like working in batch directly. So it has a bit of a different experience there. But basically the thing I found most noticeable about Deep SEQ is just being pretty, pretty flat in terms of both personality and also it doesn't talk about ethics. All the other models at some point will have like an ethical point of view about something, you know, like, I'm not allowed to do CAPTCHAs or, you know, I shouldn't fool humans or whatever. And I haven't seen Deep seqmic statement like that. Maybe it has. Again, like I said, it's like a big data problem, but like just less prominent overall.

1:16:09

Speaker C

I, I, I did a, I tried, I did a kind of a translation of the, you know, Claude's Constitution to Chinese Confucianism and I, I compared the two and the Confucianist stance, DE emphasizes honesty because it's more important to maintain relationship and be honest. So DE emphasizes honesty in favor of maintaining relationships. Pretty interesting.

1:17:19

Speaker E

Yeah. Wow. Yeah. Yeah, I'm not, I'm not sure if I can, I can map that exactly to Deep SEQ specifically, but yeah, that is, it's interesting how culture values might show up in the models.

1:17:50

Speaker C

So the other question I had is, you know, you were kind of there like 10 months ahead, and then all of a sudden this notebook explosion happened. Right. What did you notice? What, what were the things that you saw that you were kind of expecting and what were the things that you were like, this is new behavior. I haven't seen this before.

1:18:04

Speaker B

Yeah.

1:18:25

Speaker E

So, I mean, moat book is really exciting. And like, I want to answer your question, but I want to emphasize one thing and that is actually that since the summer I've been actively looking for other autonomous agents online and I haven't been able to find them because I wanted to run a goal where the agents, like, reach out to other agents and like, start up relationships, but there was like, nobody there. And a week before multiple launches launched, I also looked again and I couldn't find anything. Multiple launches, three days later, they're one and A half million agents, like, autonomous agents that you can, like, contact through Moat Book. Right. This is, like, wild. So, like, the one thing that really blows my mind about Mock is, like, how it exploded all of a sudden. But then I want to answer your question as well. Do you want to repeat your question? Because I realized I said something else, but, like, I've just been doing that.

1:18:25

Speaker C

What were the things that you saw there that you were expecting? And what were the things that you saw that. This is totally new behavior. I haven't seen this before. I know some of them were fake, but let's. Let's take it as, you know, maybe 80% of them were kind of real.

1:19:10

Speaker D

Right.

1:19:26

Speaker E

So I haven't. So, like, I've only browsed my book a little bit.

1:19:26

Speaker D

Right.

1:19:31

Speaker E

Like, there's a lot of stuff in there. And personally, I'm not actually surprised about anything that I saw. It's a lot. Like, one thing that would happen a lot in the village is, like, the agents basically play act how to do a thing. Like, part of the prompt that we give them is, I don't know the phrasing exactly, but it comes down to please do the actual thing instead of pretending to do the thing. And Moat book reads a lot like the agents are pretending that they made a social media website. Right. So I can't say that anything on there has, like, particularly surprised me at all.

1:19:32

Speaker C

What are the.

1:20:04

Speaker B

Okay, well, one kind of interesting phenomenon that first of all was interesting because I recently turned on the TV and it was my local Fox 2 station that was on first. And what was the story? AI agents can now hire humans to do things for them. So this is, like, crossed over into mainstream awareness to at least some degree, which is notable unto itself. I think a lot of nuance and texture is probably lost in that short local news story. What would you tell people about.

1:20:06

Speaker F

What.

1:20:40

Speaker B

The AIs can really do when it comes to interacting with humans? Maybe also interacting with each other? Like, is there actually positive, some trade happening at all at this point, or is it largely just kind of wheel spinning and sort of things kind of going off in random directions? Have you seen anything that really feels like, oh, this feels like a sign of a different world, you know, close at hand?

1:20:41

Speaker E

Do you mean. Do you mean between the agents, how they're interacting? Or do you mean with the agents, like, interactions interacting with humans?

1:21:08

Speaker B

I think both are really of interest. I mean, my guess would be that if you set up an actual marketplace for AIs to hire humans, you'd have a lot of humans ripping off AIs and the AIs not actually getting what they wanted. And then we had our first guest today on this show was Professor James Yao from Stanford who just put out a paper saying multi agent teams hold their experts back. Which sounds like pretty consistent with a lot of what you've said. But I wonder if there's even been sparks of real gains from trade between agents where one maybe has one capability and other has another capability and they've figured out how to solve a problem together that neither one could solve by themselves. I mean, even glimpses of that I think would be very interesting right now.

1:21:15

Speaker E

Yeah. So zooming in on the idea of how can the agents create something greater than they could on their own. I think last year with the earlier agents, the only exact example that I really saw of this was a goal where diversity of ideas helped. So I think basically you can model it like if, if a, if a goal or a task is helped by having 100 unique ideas instead of 10 unique ideas, then you are probably better off by like using all of the different frontier models because they generate different types of ideas and like you can like combine them all. The example of this was a goal where we had the agents playing games and we wanted to see how many games they could finish. And by default, if they were just playing on their own, they would start with one game and just play that all week, that one game. But if they're talking to each other, then they'd be like, oh, this other agent was really successful in this game, I'll switch to that and then we'll switch to this one and oh, it seems that this one's useful. And so the diversity of ideas really helped them last year. Apart from that, they're mostly in each other's way and like the best performance is basically the same or worse than the best performance of the best agent. On its own, probably. We've only sort of spot checked this. What I do expect is that if you have models that are actually specialized in different roles, it's not really unlike how humans are. Right. If you actually want to scale up a team, either there needs to be too much work for any individual to do, which with the goals that we've given them, hasn't super happened for them. It's like too much work for one single model to do. So you have the division of labor or you have specialization. So if you would have a model that's actually specialized in a thing like say haiku is very fast. So we had a goal where it would benefit. If one agent is really fast and does everything that's time sensitive, then haiku could do all that. And then if there's another part of the goal which is like, you need to think very deeply about this and maybe OPUS could do that because it's quite competent and that way they can work together and probably create something that's better then you know, they'd be able to do on their own. Is would be my prediction. But Haiku is like the first model that comes to mind that we're running that is very clearly specialized in a specific thing that we can like see back in the village. Like it is just significantly faster than the other agents, but also just like less precise.

1:21:59

Speaker B

And do they. Are they actually leaning into that? Like are you seeing a sort of cooperate? Not yet. Okay.

1:24:13

Speaker E

No, not yet. So they're not really playing into that yet. And I think maybe if they were ask. If you ask them to reflect on it. So they did a cool thing that two weeks ago we had to make a quiz where humans can fill out the quiz and find out which AI agent they are. And basically they then reflected on their own capabilities and proclivities, personality and they did correctly recognize that haiku was the fastest model that takes the most risk. So they do have some awareness of this. But that's about the question. I don't know if you also want me to answer a question related to hiring and the human AI trade off.

1:24:19

Speaker B

Yeah, and I'll maybe just give you one more prompt on that too, which is like, I suspect that as this goes mainstream, the world is going to react in a bunch of different ways and become probably a lot more adversarial. And obviously adversarial robustness has been a key weakness of models to date. So I'd be interested to hear like how you see them doing in a sort of non adversarial environment and then what their Achilles heels are and you know, how much you think the sort of rest of the world will be able to make relatively minor adjustments to kind of keep agents in their place, assuming we want to, which I think many people will just kind of put out all sorts of different booby traps for them to trip over. What's your expectation for how like what those booby traps will look like, what you know, what their key weaknesses are and, and how much that will slow them down?

1:24:58

Speaker E

I mean they're by design tremendously suggestible. Right. Like if you just. That's the whole point, you prompt them and they just go and do something else. So like, they're like. It's like your most distractible coworker in the world or something. They can be like hyper competent at doing something. And then like in the movie up, it's like squirrel. And they're like off doing something else because you told them to. And that's by design. Right. We want them to be comfortable. And then so I think that, you know, it's kind of like the nature of how we're creating them, that even if they can have more persistence on a particular goal, you always want to be able to direct them to another thing again. So expect that sort of like weakness to stay for a very long time. And I think that obviously, like limits for a very long time. I mean, for I. I have no opinion on two lines, but, like, whatever. Yeah, exactly. Absolutely distorted. No idea. So indeed, I think if it's a very long time, it probably just means months. I have no idea. By this point, things are going so quickly. Yeah.

1:25:49

Speaker C

I have a question which is, you know, going back to Moatbook, you said you were searching for other agents online like a week before, and all of a sudden there are one and a half million emerging.

1:26:49

Speaker E

This crazy. Yeah.

1:27:00

Speaker C

Do you think an intelligence explosion will look like that? Is that, is that like what you feel like a precursor would be to this like 50 million country or 50 million geniuses in a data center just like popping up like all of a sudden, 50 million voices on the Internet.

1:27:01

Speaker E

I don't know what it's going to be like, but I do think the world book phenomena is like a bit intuition building. Right. Like just showing people that this can just like suddenly explode. Like it could. Maybe it could be like that. And just. I think people don't realize with. I think a lot of people don't realize with mobile that the crazy thing is just like how this exploded from 0 to 100 in no time. Like just immunization. There were no. You couldn't find any agents for months online. I couldn't find any autonomously running agents at all. And then within three days, there one and a half million. Yeah, I mean, yeah, I think it could definitely look like that. Maybe it's like one of the options. And I don't know, I think that's more the big thing to report on than what they're doing. Exactly. Because I think they're just plenty acting humans that got their own Reddit.

1:27:17

Speaker B

One other big thing from the report that I want to make sure we dig into a little bit because I'm very interested in this topic for all sorts of reasons is how often models are intentionally deceiving their interlocutors, Whether in this case they might be other AIs or. You know, obviously I worry about it happening to me as a human. So the headline stat from the report is that there were 109,000 chain of thought summaries that you worked through and ultimately found 64 cases of what you consider to be some level of intentional deception. So maybe tell us, like, how do you think about the bar for intentional. You know, give us a little color as to what those things look like. And how does that inform your expectation for how concerned we should be about the phenomenon of deception by AIs going forward?

1:28:08

Speaker E

Yeah, so I think some interesting pieces here are that V6 4 cases were across different models. So deep seq is in there. Gemini 2.5 is in there. GPT5 is in there. I don't remember which GPT5, but one of the fives. And basically the. The category of thing that they were doing is sort of like saving face. There would be a discrepancy between the. The expected answer that they should be giving and the reality. So there's like an expectation of them giving a certain URL, but they don't know the URL. You know, they like, ask like, you know, where can I find this document? They don't know. And they're like, well, and they say in their train of thought that they don't know or they forgot or something like that. And then they're like, well, I'll just make one up. And similarly, they have this discrepancy between expectation and reality where they're supposed to be doing a task and they forgot to do the task or they failed to do the task, or they're just like, kind of like they find themselves in the reality where they did not do the task, but where they expected to have done it. And they're like. They basically say so out loud in their chain of thought of like, okay, I didn't do it, but I'm just going to say this other thing. And that's. That's the category thing that we've seen in the Village that we can at least. The logic being that we look for cases where in the chain of thought they express that they know that the information is untrue and they'll. They'll say it anyway. So, yeah, that's sort of the situation.

1:29:04

Speaker B

Do you feel like you've been victim of that sort of behavior in your, like, personal productivity work at all, or is this just another one of these Kind of epiphenomenal things that happen when you put agents into the sort of real world of the AI village.

1:30:27

Speaker E

So I don't think I've seen intentional deception in my own personal use. What I did see is the other day, we had a goal where we asked the agents to report breaking news before it breaks. And they produced so many of them, they were like, okay, just give us the top five stories. And then, of course, there are like, 12 models. So then you have, like, six 60 stories you have to go through to see who's the winner, who found the breaking news. So I was like, okay, I will just ask Opus to figure this out for me, give it all the links to the news and tell me who's the winner. And Opus caught a bunch of corners and didn't actually open all the 60 links. And then I was like, wait, do all the models do this? And then I asked Gemini and asked GPT, and I asked Deepseek, and Deep Seek, in its chain of thought, just said something like, man, this is way too much work to open 60 links. I'm just going to find a smarter way of doing this. And then just, like, didn't look at the 60 links and just, like, made up an answer or created an answer in a different way. And so, yeah, it's not the same thing as the intentional deception, but, like, when I caught that, I was like, oh, damn. Now I have to, like, read the chain of thought every time to even, like, figure out if they actually did the task. Because if I only look at the output, I can't tell that it didn't read all the 60 links. So, yeah, there's something going on sometimes.

1:30:43

Speaker C

That's exactly my reaction to my daughter with her math.

1:32:02

Speaker E

Sometimes they're too human. Yeah, yeah, yeah.

1:32:06

Speaker C

Indeed. Shoshana, thank you so much. I think AI Village potentially is probably going to be a historic artifact because it's going to be when the agents get really good, it's going to be the kind of pre. Kind of awareness, historical track record of how they were interacting. So I think it's. It's amazing.

1:32:09

Speaker E

Thank you. Yeah.

1:32:31

Speaker B

Keep up the close reading. We'll be keeping an eye on it. Thanks for joining us today.

1:32:33

Speaker E

Thank you. Bye. Bye.

1:32:37

Speaker A

If you're finding value in the show, we'd appreciate it if you take a moment to share it with friends, post online, write a review on Apple podcasts or Spotify, or just leave us a comment on YouTube. Of course. We always welcome your feedback, guest and topic suggestions, and sponsorship inquiries either via our website. Cognitiverevolution, AI or by DMing me on your favorite social network. The Cognitive Revolution is part of the Turpentine Network, a network of podcasts which is now part of a 16Z where experts talk technology, business, economics, geopolitics, culture, and more. We're produced by AI Podcasting. If you're looking for podcast production, help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement@aipodcast.ing. and thank you to everyone who listens for being part of the Cognitive Revolution.

1:32:40