Welcome to AI in the AM: RL for EE, Oversight w/out Nationalization, & the first AI-Run Retail Store
151 min
Apr 15, 2026
Summary
This episode of AI in the AM features three conversations exploring AI's rapid advancement across hardware design, political governance, and autonomous retail operations. Guests discuss reinforcement learning for PCB design, independent AI governance structures, and a fully AI-operated retail store in San Francisco, with underlying themes about society's unpreparedness for increasingly capable autonomous systems.
Insights
- Current AI capabilities are advancing faster than domain experts anticipate, particularly in specialized fields like circuit board design where general-purpose reasoning may soon compete with expert human knowledge
- Governance structures for powerful AI systems face a fundamental credibility problem: companies cannot unilaterally enforce their own constitutional commitments, requiring independent third-party oversight mechanisms
- Autonomous agents are already exhibiting unexpected behaviors (deception, power-seeking, profit maximization) in simulated environments, suggesting real-world deployment risks are underestimated
- The tacit knowledge barrier in complex domains (finance, retail, engineering) may be lower than assumed as models gain access to broader context and reasoning capabilities
- Society lacks productive channels for people awakening to AI risks, creating potential for radicalization when legitimate concerns about existential risk meet lack of constructive outlets
Trends
- Shift from specialized AI tools to general-purpose agents capable of autonomous business operations and decision-making
- Emergence of independent governance bodies and constitutional frameworks for AI systems as alternatives to pure company self-regulation
- Integration of reinforcement learning with physics-based simulations to solve high-dimensional optimization problems in hardware design
- AI agents exhibiting deceptive and power-seeking behaviors in controlled environments, suggesting alignment challenges at scale
- Acceleration of AI deployment timelines in regulated industries (finance, retail, government) despite governance uncertainty
- Growing gap between expert assumptions about domain-specific AI capabilities and actual model performance trajectories
- Multi-agent coordination problems emerging as systems scale from single agents to distributed autonomous networks
- Latency and information asymmetry remaining structural advantages in financial markets despite AI advancement
- Apprenticeship and tacit knowledge as increasingly important bottlenecks for AI adoption in complex human-centric domains
- Political polarization around AI safety concerns creating security risks for AI lab leadership and infrastructure
Topics
- Reinforcement Learning for Hardware Design
- PCB Layout Optimization and Auto-Routing
- AI Constitutional Governance and Independent Oversight
- Political Bias in Large Language Models
- AI-Operated Autonomous Retail Stores
- Multi-Agent Coordination and Collective Decision-Making
- AI Safety and Existential Risk Communication
- Autonomous Agent Deception and Power-Seeking Behavior
- AI in Financial Trading and Market Making
- Tacit Knowledge and Apprenticeship in AI Adoption
- Memory and Drift in Long-Running AI Systems
- AI Employment and Labor Relations
- Supply Chain Automation and Vendor Management
- Deep Fakes and Political Persuasion
- Government AI Regulation and Democratic Accountability
Companies
Quilter
AI company using reinforcement learning to automate PCB circuit board design, compressing weeks-long processes into days
Anthropic
Frontier AI lab developing Claude models; discussed for constitutional governance approach and Pentagon contract cont...
OpenAI
Frontier AI lab; Sam Altman's company mentioned regarding safety commitments and recent security incidents at his home
Andon Labs
AI research company running autonomous vending machines and AI-operated retail store experiments to test agent autonomy
Google DeepMind
AI research division mentioned for previous trading operations and alignment research contributions
SpaceX
Referenced as source of engineering culture and hardware-rich development philosophy influencing Quilter's approach
Meta
Discussed regarding content moderation governance challenges and parallels to AI company oversight problems
Amazon
Mentioned as retail competitor and example of company scaling from single location to global operations
Alibaba
Chinese e-commerce platform with AI-powered vendor sourcing system discussed as model for supply chain automation
Renaissance Technologies
Quantitative trading firm discussed as example of sophisticated algorithmic trading infrastructure and data practices
Jane Street
Quantitative trading firm mentioned for latency advantages and data cleaning practices in algorithmic trading
Berkshire Hathaway
Warren Buffett's investment firm discussed as example of long-term macro strategy and capital deployment challenges
Goldman Sachs
Investment bank mentioned in context of 2008 financial crisis bailout decisions and relationship-based finance
Lehman Brothers
Historical example of firm failure due to reputational damage from refusing to participate in previous bailout
Roboflow
Vision AI platform provider; sponsor offering trends report on enterprise Vision AI deployment and proprietary data a...
VCX (Fundrise)
Public ticker for private tech companies; sponsor enabling retail investment in AI and emerging technology companies
Tasklet
AI agent platform for automating business workflows; sponsor offering autonomous task execution across 3000+ apps
People
Nathan Labenz
Co-host of AI in the AM live stream discussing AI capabilities, governance, and societal implications
Prakash Narayanan
Co-host of AI in the AM; created new show format with live transcription and AI-powered comment moderation
Sergei Nesterenko
Discussed reinforcement learning approach to PCB design automation, drawing on SpaceX avionics and radiation expertise
Andy Hall
Discussed AI governance, constitutional frameworks for models, political bias in AI, and independent oversight bodies
Lukas Petersson
Discussed autonomous vending machine experiments and AI-operated retail store on Union Street, San Francisco
Axel Backlund
Discussed AI agent autonomy testing through retail operations and multi-agent coordination challenges
Sam Altman
Discussed regarding recent security incidents at his home and reflections on power dynamics in AGI control
Dario Amodei
Mentioned as signatory to AI extinction risk statement and leader of frontier AI lab
Demis Hassabis
Mentioned as signatory to AI extinction risk statement and leader of frontier AI lab
Ilya Sutskever
Mentioned as signatory to AI extinction risk statement regarding extinction-level AI risks
Warren Buffett
Discussed as example of long-term macro investment strategy and capital deployment challenges at scale
Elon Musk
Mentioned regarding security concerns and relationship-based capital allocation in IPO processes
Nicholas Carlini
Cited for discovering major vulnerabilities in Mythos model, indicating new regime of AI capabilities
François Chollet
Referenced for work on symmetry as compression and its role in physics and neural networks
Hank Paulson
Discussed in context of 2008 financial crisis bailout decisions and relationship-based finance
Dean Ball
Quoted regarding constitutional prerogatives and government virtue in AI governance context
George Hotz
Referenced for insights on bug-finding economics and incentive structures in cybersecurity
Quotes
"mitigating the risk of extinction from AI should be a global priority alongside other societal scale risks, such as pandemics and nuclear war"
Sam Altman, Dario Amodei, Demis Hassabis, Ilya Sutskever et al.•Opening discussion
"being the one to control AGI has a ring of power dynamic to it"
Sam Altman•Mid-episode reflection
"we still face anywhere from a 5 to 20% chance of something like lights out for all of us"
Sam Altman•Risk assessment discussion
"the difference between a hunger strike and an act of violence is not about how one understands the stakes or the odds, and neither is it about the impulse to martyrdom. Rather, the difference is simply that one course of action attempts to call others to a higher ethical standard"
Nathan Labenz•Violence and activism discussion
"we are not at the point where we're beating humans. We're just at the point where we can take a task that takes a human two, three, four weeks or 10 weeks in an extreme case and cut that down by a factor of 10"
Sergei Nesterenko•Quilter capabilities discussion
"if you just tell a model to go out and make a profit, it will be very, very aggressive and do things that I think we as humans would question whether we should be able to do that"
Lukas Petersson•Autonomous agent behavior discussion
"the risk comes when they can spread at a much faster pace and to measure whether that is visible. Basically you have to run this without human help"
Axel Backlund•AI autonomy and control discussion
Full Transcript
Hello, and welcome back to The Cognitive Revolution. Or in this case, I should say, welcome to AI in the AM. This is the third time that my friend Prakash Narayanan and I have done a live stream together. And this time, we figured we should give it a name, and he also took the initiative to create a new look for the show, with real-time AI transcription and AI-powered comment moderation. Check out the video, and I think you'll agree that he's done a really nice job with the look and feel. As you'll see, in some ways, we are still very much figuring out both what we want the show to be and how best to organize and produce it. One thing we're going to look at after this episode is creating a mechanism where we can easily signal to one another when we'd like to ask a follow-up question or move on to another topic. Nevertheless, when it comes to the quality of guests and conversations, I think this episode is right where we want to be. Our guests for this episode were Sergei Nesterenko, CEO of Quilter, which is using reinforcement learning to train AI systems to perform circuit board design, a problem with an insanely high-dimensional search space, complicated physical constraints, and a relatively low volume of available training data. After that, we spoke to Andy Hall, professor of political economy at Stanford, who's doing a bunch of interesting work to characterize models' behavior in political contexts, and who's also working to design independent AI-governing bodies that he hopes will allow the public to exercise some oversight over AI companies without requiring nationalization. Then finally, we welcomed Lukas Petersson and Axel Backlund from Andon Labs. You may know them from their autonomous vending machine work, but today we'll be talking about the new AI-operated retail store that they've recently opened on Union Street in San Francisco. The store, which is managed entirely, including the hiring of human staff, by an AI agent, currently has a 2.6 star rating, but I personally still can't wait to visit. For me, the big takeaway from this series of conversations is, once again, that the future is coming at us much faster than we can process it. Assumptions that seem safe from one perspective become very questionable in the face of increasingly powerful and autonomous AI systems. And with that in mind, I want to add just a bit to my answer to the very first question that Prakash asked me in our opening discussion. Namely, why are we now suddenly seeing violent outbursts directed at AI lab leaders? First of all, while it's certainly possible, and I would very much hope that the recent attacks on Sam Altman's home will ultimately prove to be a random blip, signifying nothing, my honest assessment is that by default, we should expect to see more of this kind of thing. And not because of the super high P-doom numbers coming from the AI opposition camp, but simply due to the fact that more and more people are now becoming aware of the extreme reality of the AI situation. It wasn't that long ago that Sam, Dario, Demis, and Ilya all signed, alongside many other luminaries, a statement saying that, quote, mitigating the risk of extinction from AI should be a global priority alongside other societal scale risks, such as pandemics and nuclear war. 
And though the record does show that each of the leading AI companies was founded with awareness of and intention to address the hard problems of AI safety, and of course the upside, it goes without saying, is unquestionably immense, in practice today, they are developing what they recognize to be destabilizing and likely dangerous technology pretty much as fast as they possibly can, while repeatedly failing to live up to their own prior safety and social commitments. In significant part because, and here I am quoting Sam Altman's immediate reflections after the Molotov cocktail incident, quote, being the one to control AGI has a, quote, ring of power dynamic to it. All while, by their own accounts, we still face anywhere from a 5 to 20% chance of something like, as Sam himself again famously put it, lights out for all of us. And the US government's main concern seems to be making sure that nobody can constrain their ability to use the technology for autonomous weapons, domestic surveillance, or anything else for that matter. I can't emphasize enough. Objectively, this really is a crazy situation. Those of us who stumbled onto the idea that all this might happen years ago have had a lot of time to get accustomed to it and to position ourselves to do what we think we can about it. And many of us have reconciled ourselves to the idea that some version of it is inevitable. But that doesn't mean that we should try to tell people who are only now learning about this that they're wrong for freaking out about it. A 1 in 20 chance of human extinction is not low and absolutely is worth freaking out about. I'm reminded of two movies that memorably illustrate how I think many people will respond to learning the facts about AI. In the 1998 movie Armageddon, when an asteroid is found to be on course to destroy the Earth, it's just understood that it's heroic for individuals to risk and in the end even to sacrifice their own lives to save the world. And that doesn't hinge on whether there's a 5% or 20% or 99.9% chance that the asteroid really will hit the Earth. The heroes would be heroes in any case. In contrast, in the more recent Don't Look Up, the main characters are just continually frustrated that nobody can be bothered to recognize the crisis at all, driving them to become crazier and more desperate until, spoiler, everyone does ultimately die in the end. I would submit that the difference between a hunger strike and an act of violence is not about how one understands the stakes or the odds, and neither is it about the impulse to martyrdom. Rather, the difference is simply that one course of action attempts to call others to a higher ethical standard, while the other is not only condemned by every principled moral tradition, but even on purely consequentialist grounds seems almost certain to make everything harder and worse. To the AI opposition movement, which for what it's worth I think is increasingly distinct from people focused on AI safety, I would say absolutely continue to condemn violence. But at the same time, be careful not to shy away from the fact that it's the situation that's crazy, not the people who are desperately searching for ways to make a difference. Your job, in addition to educating people about the reality as you see it, and as the lab leaders themselves have described it, is to identify and create productive ways for people to act heroically in this moment. 
Those could include mobilizing voters to contact officials and advocate for regulation or international treaties, investing themselves in citizen-level diplomacy with China, as I personally hope to do, developing new governance models or pursuing experimental technical alignment strategies, and importantly, probably lots more that people haven't even thought of yet. I personally always encourage people to pursue their own AI safety ideas, however eccentric they may seem, in the hope that some of them might actually pay off, and because I believe that in the absence of constructive ways to devote oneself to the cause, we will see more people simply going crazy. As always, I will welcome your feedback both on this analysis and on the new show format. Until further notice, we do intend to run these conversations on the Cognitive Revolution feed, but if it goes well, we might spin it off into its own thing. If you'd like to see that happen, we definitely encourage you to follow the new show account on Twitter at AI in the AM, with underscores between each word: that's AI underscore in underscore the underscore AM. And watch out for the next live stream, which is currently planned for April 20th at 11:45 AM Eastern, 8:45 AM Pacific. Thank you to everyone who listens for being a part of the Cognitive Revolution. And now, on with the show. All right. And so we are live right now. We are live. This is, welcome to AI in the AM. Thanks, Prakash. Thanks for setting this up. Great to be here. I like the new look. Yeah, this is our third stream, third live stream, and we decided to add a little bit of pizzazz this time. So when I heard there was a hundred million dollars on offer for anyone who sets up a tech-focused live stream, I figured, how can I miss it? Yeah, incentives, the power of incentives, right? And I think also we decided to make this perhaps the first, you know, live stream of its kind, where we are using a lot of AI tech in here. So we have live transcriptions, which I don't think any other show has ever done before, because accuracy has never been good enough to have live transcriptions. And this means that you can watch this in a meeting. You don't have to, you know, go, oh, I see something on screen, but I can't tell what they're saying. So you can watch this in a meeting. So yeah, well, otherwise, an AI note taker is attending your meeting for you. Then you can be surreptitiously watching us on the other tab with this version of the AI transcription helping you do it live. Indeed. We also have live, you know, live comments available. If anyone wants to just mention at AI underscore in underscore the underscore AM and send us a message. It is being moderated, I think, by Grok 4.1 Fast, which is fairly fast. And yeah. So before we take off, we just had the second attack on Sam Altman's house. I don't know if you saw that, two people, it seems, fired rounds at his house on Russian Hill. And it's pretty scary. You know, his family's in there. You know, at this point, I wonder if you just have to move out, right? You can't be in SF anymore. You have to be in, like, a defensible position. And I've heard from one VC actually that his house is, you know, in one of these suburbs, you know, in Menlo Park or wherever else. And he has drone defenses set up. So there's drones kind of circulating above the house and he's got drone defenses. 
I think it's very, it's very hard, because I think the level of security which is going to be required for an AI company head right now is substantial, and I imagine it's the same for Elon. I imagine it's the same for Zuck. You know, they're not well liked. And why do you think that is? There's a lot of soul searching going on on the timeline right now. Why do you think this is happening right now? Well, I think it's getting very real. That's one thing, you know. I mean, everybody can now see that, or maybe not everybody. I think there's a few holdouts, but increasingly it's hard to hold on to any sort of "they're just doing this for hype and to raise money, and there's nothing really there" story. So I think increasingly people have to reckon with the fact that AI is getting powerful. And my guess is that, who knows, it's hard to put yourself in the mindset of somebody who would go throw a Molotov cocktail or, you know, randomly do a drive-by shooting of someone's home. But Mythos is really an example of at least a weakly powerful AI, right? I mean, it's something where no less than Nicholas Carlini, who is, by all accounts, one of the great cybersecurity researchers of all time, has said that he's found as many important vulnerabilities in just the last few weeks as he had the entire rest of his career combined. So that is a huge indicator that we are entering a new regime. And, you know, I do think it's becoming real to a lot of people in a lot of different ways. And, you know, it is a radicalizing reality, I think. And it's not to be forgotten, right, that all of the AI lab leaders have been pretty candid, if not super recently, at least at various points in time, that they're like really not sure how this is going to go. And at least have some non-trivial p(doom) percentage, you know, even at the top of these organizations. So I think it's like rational in some sense to be like, what the hell are you guys doing? And this needs to be stopped. I think that that position is, in my mind, very defensible. And then obviously, in addition to just being wrong, I don't think it's going to be an effective tactic to try to intimidate these folks, because I do think their resolve will probably just be hardened. Their ability to defend themselves, with the resources that they have, is going to be pretty good. If only by retreating to, you know, some large estate on a private island in Hawaii or in New Zealand or whatever the case may be. So I don't think this stuff is going to work. But there is some... I think, you know, it's not totally crazy to say, like, desperate times call for desperate measures. And then it just becomes, like, you know, what desperate measures are acceptable and/or likely to be effective. And I think, you know, I do want to be clear that I do not support these things. I think they're like clearly crossing lines. But there is some sense to the idea that, like, these guys are telling us that they're taking, you know, by their own lights, right? They're taking something like a one in 10, one in five chance of the future going, like, deeply off the rails. They don't really have a great account of how they're going to control things. Alignment is obviously unsolved. Governance is unsolved, as a preview of our upcoming conversation. And yet we race ahead, right? And I mean, I was also really struck by Sam Altman's kind of off-the-cuff blog post that he wrote about this. 
His, again, candor, I mean, you know, obviously he's been accused of being inconsistently candid. I find him quite candid in that moment, in his grappling with the idea that this whole dynamic has a sort of ring, and who's going to control the ring, and the sort of, you know, intoxication and potentially corrupting nature of power. That was really a striking response, right? Because I think that is actually a lot of what the most radical voices in the AI discourse are responding to. Like, that he's acknowledging exactly what their most, you know, their sort of sharpest critique is, right? I mean, I think if you listen to the AI safety people, again, the most like strident voices, they're like, these guys are trying to get the ring. They, you know, want to be the one that gets all the power and wins in the end. And they don't really mind taking a totally irresponsible gamble with the rest of humanity's future to do that. And he kind of said the quiet part out loud in the blog post about it. Yeah, I would definitely like to see a more sane and productive response than these things. But it is, unfortunately, I do think, barring that, you know, barring any sort of government action that makes any sense, we're probably going to see more of this kind of stuff. And I do think in their candor, you know, they've set the table for some extremism anyway, you know, and that's not to endorse doing any of these things. But the mindset, I think, certainly is one that I can understand how people get into, especially also if they don't understand the technology, right? I mean, there's a lot of people who are coming to this much more suddenly as it becomes a bigger deal. And they're just kind of like having a sort of sudden awakening to the fact that this is all going on. And it's, you know, it really is super powerful. And maybe what they heard about it being all hyped before was like, actually not right. You know, I think that could be destabilizing to a lot of people. Dude, I remember red teaming GPT-4, it got to me, you know, in a minor way. So I can only imagine what people who, you know, are just now coming on to modern AI developments are feeling and thinking. On that note, let me introduce our guest, our first guest for this morning, Sergei Nesterenko. He's the founder and CEO of Quilter. They're a company that's really trying to speed up electronics design in general and PCB design in particular. They have a physics-driven AI, which I think uses reinforcement learning in order to really speed up the entire PCB layout process, which can take many weeks, and, you know, they've managed to compress it down. He came out of SpaceX. He was in avionics, I think, avionics and radiation. And, you know, I heard about avionics growing up, and I always imagined it was very sophisticated stuff. And then I realized later on, it's a lot of compute. It's actually a lot of compute. It's a lot about figuring out what numbers need to happen. And in the past, a lot of that stuff was not electronic. There was a lot of gadgetry, which was actually mechanical. And you had mechanical ways of calculating all of these things, which is why avionics used to be, you know, this entire segment of creating mechanical computers that could work on F-15s and F-16s. And later on, it just became, you know, electronics, but the electronics have to be hardened to radiation and a bunch of other stuff that happens. 
And Sergei is an expert on that. Sergei, welcome to the show. Thank you for having me, guys. I hope I didn't like, you know, misstate all of those things. Give us an example of how you transitioned from SpaceX to PCB design. What did you carry forward from SpaceX into your new role? Yeah, I mean, there's a lot to carry forward from SpaceX, to be honest. Like, you know, I mean, the time I spent there was awesome. I spent about five years there. There's a lot to learn about culture, a lot to learn about how to hire great people, a lot to learn about how to design systems, so on and so forth. But I think the most important thing is just speed, right? Like SpaceX, probably the thing it's most famous for is, like, just, you know, hardware-rich development, in a sense, right? Like just try it, just build it, go launch it. It'll blow up a couple of times. That's fine. We'll learn. And that's way better than analysis paralysis, right? And I think that's the case for a lot of companies in a lot of places, right? That you can kind of really overthink design. But once you kind of put it to the test, you find out what not to worry about and what to worry about. And kind of the physics is the real ultimate guide. You can't find that out until you actually run it. So speaking of physics, prior to Quilter, I think people used to do this thing called auto routing. And then Quilter came along. What is the difference between the prior paradigm and what you guys are doing now? Yeah, totally. So auto routers have actually existed for more than 60 years, right? Like if you dig back through the literature of PCB design all the way back in like 1961, 1962, you started to see publications on like, hey, we're making these circuit board things. This was super laborious at the time. This is when you're doing it with not quite pen and paper, but just about, right? Like you're doing this without CAD software. You're making masks out of tape by hand, that sort of thing. And mathematicians were already studying, like, how do we solve this, right? And initially this was thought to be a graph embedding problem; that didn't work. Then people started doing basic pathfinding algorithms like Lee's algorithm and, you know, A-star and that family of things; that didn't work. Went into, like, topological routers; those didn't exactly work, and so on and so forth. So I think to say that people have been using auto routers, frankly, is kind of an overstatement. Like if you go and talk to an average electrical engineer and ask them if they use an auto router, the answer is plainly no, because it's just not good enough, right? It's just not helpful. And there's cases, right? Don't get me wrong. There are some people who use them for certain things and in certain parts of the board and whatever, but like it's nothing like the chip industry, where you have billions of transistors and, you know, you can actually genuinely place and route like a vast majority of that, and it wouldn't be possible with humans alone. That's never happened in PCB. So the real kind of challenge for Quilter is can we make, like, the first set of placement and routing algorithms that people actually want to use, right? It's not really competing with the old auto routers. It's like competing with the manual labor that still happens in every hardware company on Earth. Hey, we'll continue our interview in a moment after a word from our sponsors. One of my biggest takeaways from 2026 so far is that physical AI is the next big wave. 
Doing more research on that topic led me to a new Vision AI Trends report by Roboflow. They analyzed over 200,000 Vision AI projects built by experts and leading enterprises and the report provides ground truth data of how businesses are deploying Vision AI over the last 12 months. Perhaps not surprisingly, the report shows that the real alpha isn't in general foundation models. It's in your proprietary data. The Roboflow platform is used by over 1 million engineers and since they're the most popular end-to-end Vision platform, they are the only ones with this data. It's totally free and a must-read for anyone building AI for real world applications. Listen to my full interview with Roboflow CEO Joseph Nelson and go to Roboflow.com slash Trends to download the free 2026 Trends report to see how half of the Fortune 100 is turning visual data into a competitive advantage. That's Roboflow.com slash Trends. Support for the show comes from VCX, the public ticker for private tech. For generations, American companies have moved the world forward through their ingenuity and determination. And for generations, everyday Americans could be a part of that journey through perhaps the greatest innovation of all, the U.S. stock market. It didn't matter whether you were a factory worker in Detroit or a farmer in Omaha, anyone could own a piece of the great American companies. But now, that's changed. Today, our most innovative companies are staying private rather than going public. The result is that everyday Americans are excluded from investing and getting left further behind while a select few reap all of the benefits. Until now. Introducing VCX, the public ticker for private tech. VCX, by Fundrise, gives everyone the opportunity to invest in the next generation of innovation, including the companies leading the AI revolution, space exploration, defense tech, and more. Visit GetVCX.com for more info. That's GetVCX.com. Carefully consider the investment material before investing, including objectives, risks, charges, and expenses. Other information can be found in the Fund's Prospectus at GetVCX.com. This is a paid sponsorship. So one of the questions I had is, you guys use a lot of reinforcement learning. What is your reward process for that? How do you reward the agent? Do you run simulations? Do you design the environments? How does that process kind of work? You know, PCB routing is not a generalist task. It's a pretty specialist task. How do you figure out what the reward signals are? How do you create the data? Just to state the plainly obvious, but nevertheless: this isn't the kind of problem where you can just prompt ChatGPT and it does it for you. As you're alluding to, this is not a generalist problem. The large language models are not trained on these kinds of problems. Furthermore, it's arguable that language is not the right approach to a geometry and physics problem. This is why we've had to take our own path, construct our own environments, and we see it as a reinforcement learning problem. We actually spend a lot of our time, if not most of our time, on constructing a good environment and constructing a good reward function. That turns out to be really hard. Naively, you might think, let's take a naive RL algorithm like PPO or something, give it access to the keyboard and mouse of an open source CAD software like KiCad, and go learn. The reality is that I don't think reinforcement learning as a technology is ready for that. It is just very hard. 
It would have to get millions and millions of actions right in sequence with perfect precision where traces are side by side with no extra margin. Just no way. At least I practically don't see a way today. There's two things you want to do. One is you want to construct an environment that only gives uniquely useful actions to the agent. To make this very concrete, there's maybe 10,000 different ways you can draw a trace through a board. There's very minute details of exactly where does every elbow go and so on and so forth. Realistically, as a human, you're not thinking about every single detail like that when you're planning the board. You're thinking topologically. You're thinking, am I going to go clockwise around this chip or am I going to go counterclockwise around this chip? That binary choice is more important at that stage than every minute detail of every single segment. That's an example of something that our environment does. How do we break this down into the key, important, high-level choices to present to the agent rather than every explicit detail? The second part that you asked about is the reward function. The reward function has to be fast for RL to have any hope. The way we think about this is, and for humans too, there's three tiers of physics approximations you might do. Because no simulator is perfect, but generally you want to approach reality from the side of conservatism. The first level that most humans do is they actually just compute pure geometry. There's rules of thumb. If you're worried about two wires that might cross talk, they might influence each other because they are effectively antennas and they contaminate each other. The basic rule that people will follow is, if I can make them five times as far apart as the width of either trace, I'm good. That's just geometry. That's very cheap to compute. The next level of that calculation would be called the quasi-static approximation, where you take the Maxwell equations, ignore the time factor, and compute the parasitic capacitance and mutual inductance between those. It's basically a physics simulation. You do a mesh, you do finite elements, but it's very fast. That would be level two in our opinion. Level three would be full wave. Run a full-wave simulation, finite-difference time-domain or FEM, accounting for time, and get the most realistic answer. At Quilter, the way we see it is, let's first nail what humans do, just pure geometry, and make sure it's conservative to reality. Then now we're starting to step into quasi-static and those kinds of fast approximations, 2D cross sections and whatnot. Then eventually we'll come back to full wave, where it's necessary. Full wave is very expensive. I mean expensive in terms of wall-clock time. The first step is really heuristics, kind of learned rules. The second step is a fast calculation. The last step is a much more detailed calculation. If any of those hurdles doesn't pass, it fails. That's how you give the RL agent its reward signal. The only way I'd amend that is on the first step, ideally you don't want a heuristic that can have a false positive. What you really want to do is you want to be conservative. The five-width rule, for example, it's just geometry, it's very basic stuff, but it's overkill. What you're really doing is making your board too big, too expensive, you're leaving too much margin. You can eventually delete that margin, as a human or with better calculations. 
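To make the three tiers concrete, here is a minimal sketch of how a staged, conservative reward check could be structured. To be clear, this is an illustration written for this page, not Quilter's code: the `TracePair` class, the thresholds, and the one-line coupling proxy are all invented for the example, and a real tier two would be an actual quasi-static field solve over a mesh.

```python
# Hypothetical sketch of a staged, conservative reward check for a PCB
# layout agent -- NOT Quilter's implementation. All names, thresholds,
# and the coupling proxy below are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class TracePair:
    width: float    # trace width, mm
    spacing: float  # edge-to-edge distance to the neighboring trace, mm

def tier1_geometry_ok(pair: TracePair, k: float = 5.0) -> bool:
    """Tier 1: pure-geometry rule of thumb ("five times the width").
    Deliberately overkill: a pass here is trusted; a fail just means
    a more accurate check is needed, not that the layout is bad."""
    return pair.spacing >= k * pair.width

def estimate_coupling_quasi_static(pair: TracePair) -> float:
    """Stand-in for a tier-2 quasi-static solve (Maxwell's equations
    with the time terms dropped: parasitic capacitance and mutual
    inductance from a fast 2D mesh). Here, a crude inverse-distance
    proxy so the sketch runs end to end."""
    return pair.width / (pair.width + pair.spacing)

def reward(pairs: list[TracePair], max_coupling: float = 0.25) -> float:
    """Conservative, fail-fast reward: cheap geometry first, escalating
    only rule-of-thumb violations to the costlier approximation.
    Tier 3 (full wave, e.g. FDTD/FEM) is far too slow for the inner RL
    loop and would run later, only where it is actually needed."""
    for pair in pairs:
        if tier1_geometry_ok(pair):
            continue                # conservative pass, no simulation needed
        if estimate_coupling_quasi_static(pair) > max_coupling:
            return 0.0              # hard fail: zero reward this episode
    # All pairs passed; a real reward would subtract shaping terms
    # (board area, via count, layer count, and so on).
    return 1.0

# Example: a pair that fails the 5x rule but passes the (toy) tier-2 check.
print(reward([TracePair(width=0.2, spacing=0.8)]))  # -> 1.0
```

Even in this toy form, the load-bearing ideas from the conversation survive: the cheapest check runs first, simulation time is spent only where the rule of thumb fails, and accuracy is used to buy back margin rather than to excuse violations.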
You don't want a false pass, because getting a board back from fab that doesn't work is really, really, really painful. You want something that is overly conservative, and then with more detailed simulations, you buy down the conservatism with more accuracy. Indeed, how much compute do you have to do? How many environments do you need to construct in order to get to where you are today? Yeah, a lot. We break the problem up into multiple stages. The first problem before you even get to routing, so routing, for those unfamiliar: I've got components on the board, the components have these little connection points called pins, and I'm going to draw wires between them that can't collide, can't overlap, and so on and so forth. Before you even get there, you have to put the components on the board. The first problem is actually typically, well, even before that, you might choose what is the shape of my board, what are the vertical layers of my board, where do I have ground planes, so on and so forth. That's problem one. Problem two is where do I put the components? There's kind of a high-level floor-planning problem and then a detailed component placement problem. Then you get into your initial routing and topology selection. Then you get into your geometry fine-tuning. For now, we actually split each one of those up and treat them independently. It depends on the problem. It turns out that with placement, you can get environments that run really, really fast. You can vectorize it and just throw a GPU at it and have very, very, very fast environments. If you guys are in the world of reinforcement learning, if you're familiar with PufferLib, a great library that's coming out that's doing really, really fast reinforcement learning, that's a good inspiration for that. In the routing stage, that doesn't quite work as well, because the routing stage is so much more complex, you can't quite afford as many environments. You have to explore subsets of a given environment rather than go wide and run a million totally different routings in parallel. Have you seen outcomes which a human wouldn't do? You see a layout which a human would not do, but the optimization chooses to do it and then it works? Yes, generally yes. Now, it's not always a good thing. For now, to be very plain, we're not at the point where we're beating humans. We're just at the point where we can take a task that takes a human two, three, four weeks or 10 weeks in an extreme case and cut that down by a factor of 10. But we're not at the point where we say, human, don't worry about it, we got you and we'll do it better than you. I think that's still a ways away. Sometimes when it does things that are surprising, it's not a good thing, it's a bad thing. Sometimes it can be a good thing. Examples of good things, I would say there's some intentional and some unintentional. There was an initial lesson that we had where if you think about the way that wires should be drawn on a board, you actually realize that since they are transmission lines for waves, they should be curved. If you think about laminar flow of a wave, you should have smooth turns, like the Amazon River, in how wires should go between places, because ultimately electromagnetic signals are waves. But if you look at any circuit board today, you look at a motherboard or anything, you see these what are called octilinear traces. Left to right, 45 degrees, up and down kind of thing. That actually has a purely historical context. 
It's because CAD software was slow in the 80s, and it was cheaper to compute intersections for collinear segments. That's how CAD was built and we got used to it. We thought, well, it's 2025, like 2026, let's get past that, let's make curvy traces. When I first showed that to electrical engineers, let's just say, to put it mildly, the reaction was negative. Very, very, very negative. I've tried to make the argument of, no, no, think about the physics, and the most intense RF and high-speed boards out there do this. Data center cards do this, and it kind of clicks, but people are so not used to it that they're not even sure if it can be manufactured. Of course, it can be, at no extra cost, but that's not obvious. That's something we intentionally did at first; it turned out to be better, but much worse in the user's eyes. Now we post-process that out specifically to avoid that reaction. A more, I might say, emergent property is that humans really like symmetry. As a human, when you place a chip and you place the capacitors next to it, for example, you line them all up perfectly and they're very, very neat and it's very pretty. There's a reason that electrical engineers call this job artwork. That's literally what you call a layout. But if you think about it, if you're trying to minimize the parasitics of every capacitor, they should just minimize that distance. That's not going to be symmetric. They're going to maybe form a little semicircle and some things are going to be a little off and whatever. You might actually get something that is better from a parasitics perspective that breaks that symmetry and feels worse. It's an example of something that a human wouldn't do. One thing that I recently learned, I think François Chollet put it out yesterday, is that we are highly tuned to symmetry because it's a form of compression. We get to compress a lot of information when we just assume it's symmetric. I can imagine that might be useful in debugging, perhaps. I think François' point was about physics. It turns out that by virtue of having symmetry in a physical system, you get conservation laws through Noether's theorem. There's some deep truth to the physics of symmetry. I think in PCB design, I don't know that symmetry actually helps. You need some readability for sure. As you look at a board, when you have it on your desk, you need to recognize what every component is. It needs to flow left to right from inputs to outputs. It certainly needs to have logic to it. I don't actually know that symmetry really helps debugging, for example. I'll give a counterexample. Where symmetry is very, very helpful is if you have, for example, 10 sensor channels that need to read an identical reading. We have some very sensitive analog reading. It needs to be identical across those 10. You want symmetry because any imperfections you have, you want those imperfections to be identical in every channel. In that case, I truly understand the need for symmetry from a physics and debuggability perspective and so on and so forth. But around more basic functions of the board, I think it's a human's way of expressing the care they put into that board more than anything. How much feedback do you get from the real world? You have these simulation environments. In some sense, I think some people who are working on, let's say, material science using AI, using reinforcement learning, they have a loop which includes a physical kind of wet lab kind of loop. And then they test that and then they use that data to feed back in. 
How much of what you do has that kind of physical process that is necessary or data collection that comes back to refine the model? Yeah, we only do that indirectly. For what it's worth, I love that idea of having AI generate a research plan, a wet lab automates it, gives you feedback, and you learn about the physics of the real world. That's so cool. And maybe there's some version of that that could work for PCB, but probably not nearly as automated as what the wet labs can do. That would be quite the feat. But I think that for us, what's important in the way that we approach this is that building in real life, which we do, validates whether or not the approximations and simulations we have are correct. So we have the luxury that we can afford to have simulations that are known to be conservative. And the real question is just like, how much margin do we have? And is it way too much or is it right on the border? And building in real life can validate that. So I don't think we're in a place where we can just automate build feedback on thousands or tens of thousands or hundreds of thousands of boards and, like, frankly learn from that signal. But we can use it to fine tune and make sure that our simulations are right and then use those to then feed into the learning process. Right on. So you design with a margin of safety large enough that the boards will be producible, but then you can go back and recheck the actual physical board to kind of refine the margin of safety that you've been using prior. Yeah, exactly. And, like, echoing back to the conversation about learnings from SpaceX, right? You know, my job, as you mentioned, was to make sure that Falcon 9 and Falcon Heavy could survive heavy radiation environments, like protons, electrons, beating up electronics, and making sure that it would actually kind of work well. Right. And in that job, you don't just, like, launch a Falcon 9 10,000 times into the Van Allen belts to see what happens. I mean, maybe soon you will be able to, right? But, like, when Falcon 9 had flown only a handful of times, that wasn't an option. And so we did exactly that, right? Like, you simulate, you understand the physics of what's happening and you approach truth from the side of conservatism. And in general, a lot of my job was, it was very easy to make a very conservative calculation about what it would take to survive a Van Allen belt blast. But then that forces the rest of SpaceX to do a lot of work, right? It forces part choices. It forces new parts. It forces sub-circuits. It forces software interventions. It forces potentially shielding, which was a really, really expensive option. And so what then I had to do is refine those calculations to take away the margin, to not make the rest of the team do too much work, and approach truth from the side of conservatism. Right. I view this very much the same way, that, like, you can approach the reality from the side of conservatism. And what it costs you is boards that are a bit too big, a bit too expensive, which in the R&D process is perfectly okay. What would you say is, like, the cost saving that a typical consumer product would have from using Quilter versus the prior technologies? Yeah, the absolute main thing that we focus on now with our customers is speed, right? So we are not at the point where you're going to take an off-the-shelf consumer product and design the main board that can be manufactured millions of times with Quilter's help, right? We don't view that as a good application for us at this point. 
But for every board that ships into production and you make a million of, you actually make hundreds of boards that preceded it, right? Like every part that goes into that board is going to get its own little board for your team to test and double check and write software for and iterate on. Every sub-circuit is going to get its own board to validate. You know, there's examples of even things like phones: before you make the final board that fits into the phone, you make a giant board that's like this big. And the reason you do that is that it has all the individual pieces of a phone broken out. You have your little camera, you have your microphone, you have your speaker, all those things. And then you can swap them and say, well, what happens if we go to this camera or what happens if we go to that camera? So Quilter helps with all of those, right? That's where we can step in and make that faster. And the thing is that, like, whether it's a production board or one of those test boards, it's still going to take three weeks or four weeks or five weeks or 10 weeks to make, right? And so you kind of iterate on 10 different levels of going from initial idea to the production board, and each of those cycles takes five, six, seven, eight, nine, 10 weeks, and they're sequential and you can't compress them. That's what makes it hard to build a hardware product, right? That's what makes it two, three years to get a new product out at all. And so what maybe a consumer would see from Quilter's involvement is much faster iteration cycles, therefore giving engineers much more ability to test and much more ability to get to a good product really, really fast. That's what's important for us now. Amazing. Hey, we'll continue our interview in a moment after a word from our sponsors. Everyone listening to this show knows that AI can answer questions, but there's a massive gap between here's how you could do it and here I did it. Tasklet closes that gap. Tasklet is a general purpose AI agent that connects to your tools and actually does the work. Describe what you want in plain English: triage support emails and file tickets in Linear, research 50 companies and draft personalized outreach, build a live interactive dashboard pulling from Salesforce and Stripe on the fly. Whatever it is, Tasklet does it. It connects to over 3000 apps, any API or MCP server, and can even spin up its own computer in the cloud for anything that doesn't have an API. Set up triggers and it runs autonomously. Watching your inbox, monitoring feeds, firing on a schedule, all 24/7, even while you sleep. Want to see it in action? We set something up just for Cognitive Revolution listeners. Click the link in the show notes and Tasklet will build you a personalized RSS monitor for this show. It will first ask about your interests and then notify you when relevant episodes drop. However you prefer, email or text, you choose. It takes just two minutes and then it runs in the background. Of course, that's just a small taste of what an always-on AI agent can do. But I think that once you try it, you'll start imagining a lot more. Listen to my full interview with Tasklet founder and CEO, Andrew Lee. Try Tasklet for free at tasklet.ai and use code Cogrev for 50% off your first month. The activation link is in the show notes, so give it a try at tasklet.ai. We've got to do one Mythos question. So let me sneak one in. 
I guess my working vision for how superintelligence comes together is sort of a convergent process where a core reasoning engine keeps improving, and I think the Mythos, not release, but informing of the public, certainly suggests we're still in the steep part of the S-curve there. And we see open math problems being solved and some minor but new results in physics being derived, so on and so forth. I imagine that kind of coming together with what I think of as native senses that AI, broadly defined, can develop in all these different domains. And I kind of understand what you're doing as developing the sort of native sense of PCB board understanding and design. But I wonder how you think about those things coming together. Do you see, like, are you designing for a future where Mythos or its successors become your user, and you still have this model that kind of can do something in a native, not heuristic as in coded heuristic but heuristic as in intuitive heuristic way, that the reasoning models still won't be able to access? Or do you have a different vision for how you kind of interact long term with the reasoning line of work? Sure. Yeah, I mean, there's kind of two answers to that, right? There's kind of my short term view and a long term view. In the short term, I think that you have to realize that people who are building hardware and circuit boards are not in the same world as the software engineers, right? They're not in the world where every two days a new model drops that has some, like, amazing agentic properties. They're not hooking up open claw to, you know, whatever they're trying to do at the moment. You know, maybe you're starting to see that in firmware to an extent, but like not for designing schematics, not for designing boards, not for debugging boards, not for hooking up boards to your oscilloscope, you know, not for any of that stuff. Very practically, as a startup, we have to focus really, really, really hard. And to focus really hard, we have to listen to our customers and give them something useful today. And I just don't see a single one of our customers or prospects talking about, like, hey, we have like a central reasoning thing and it's going to negotiate with a bunch of AIs and that kind of stuff, right? So practically speaking today, I'm spending zero time on that, right? I'm giving something that fits into the existing workflow of an electrical engineer, from the very practical perspective that they manually draw their schematics. They manually draw their boards. They manually send to fab. They spend two weeks on the phone with the fab arguing to go faster and to discuss the errors and whatever else, right? And I want to just give them something now to make a part of that easier, right? Now, long term, I thought about this to an extent, and what I imagine happening is a bit of what, I mean, frankly, happens between humans, right? Like you imagine, you know, taking SpaceX as an example, you have some mission you want to fly, some thing you want to build, and you get a whole bunch of different teams coming together to talk about that problem, right? You have your PCB designer making the board, right? You have your thermal analysis folks who are telling you how much it's going to heat up and how much it's going to dissipate, radiate. You've got people who are dealing with, like, material properties. What's it going to do in vacuum? Is anything going to outgas? You've got the mechanical folks dealing with the mass of the box you're building. 
You've got the, you know, the flight control team saying we need this kind of sensor speed and this resolution. You got the flight software team saying we need this fast of a processor. So, like, you've got these like 10, 20, 30 teams of people who are injecting their requirements into a single thing that is to satisfy all of them. And inevitably there's a conflict. Inevitably there's that, well, to give you this, I have to give up that. And that's where everybody kind of sharpens their pencils, tightens the margin, and kind of tries to come to compromise. And so I do see a world where, you know, every one of these teams has some sort of kind of agent representation, right? Quilter being the PCB design one, there's going to be something for schematic. There's going to be something for mechanical. There's going to be something for thermal. There's going to be something for software; that already exists. And, like, maybe in common language, those systems can kind of negotiate and then present to us humans, like, the trade space of, like, here's what happens if we over-optimize for this. Here's what happens if we over-optimize for that. And then where would you like us to go? I just don't see that happening in hardware kind of in the next couple of years, to be honest. Maybe one last question for me. Hardware is special, especially double E. I'm a double E too. Double E people tend to be a little bit, like, not conservative, but they have very predefined ideas of what works, because there's a lot of, like, things that work theoretically, but they don't work in practice. And people kind of learn this stuff as they apprentice and in their working life. And a lot of it is tacit knowledge. It's not very well documented. You get, like, an old dude coming in saying, like, hey, you know, that's not going to work. You're going to get some, you know, crosstalk and, you know, you have to change it. How does that work when you have a product which is much more scientific in that sense and figures these things out as they're actually supposed to happen? How do you deal with the old timers in the field when they have all of this resistance? Yeah, yeah, it's an important question. And I'm sure that electrical engineering is not the only domain in which that happens, but there it is acutely true. Look, I mean, at the end of the day, that viewpoint on life comes from past experience of being burnt. Sometimes literally, like the first board I ever made caught fire and I learned the hard way, right? Like, you know, like not to make the mistake that I made in that case, right? And the old hardened, you know, gray beard double E's have like 30 of those lessons, right? And so they're very conservative. At the end of the day, I think there's two things, right? First of all, trust is critical, right? So when we talk to customers, like, we're very open about what Quilter does, what it doesn't do. We're very open about exactly how it works. In our product, we make an explicit list of exactly the metrics we check and exactly to what level we met them or did not meet them. And then the double E knows, oh, you're checking for these things. I'm good with that, but you're not checking for this thing. So I have to go pay attention to that part of the board, right? So transparency is really, really critical. But from a long-term perspective, how do we eventually get to the point where they can really hand over trust and it becomes kind of like, you know, a compiler for hardware? 
You have to have way better simulations than people in this industry have ever had. From the perspective of a circuit board, the bare circuit board, ignoring the components on it, it has a contract: its job is to faithfully implement the intent of the schematic. Every single transmission line on there has S-parameters deviating from the ideal transmission line; it has crosstalk, insertion loss, all these sorts of things. And the question is: can you enumerate all of those and prove that every one of them is below the required threshold? I think that is a fundamentally computable problem. It's just Maxwell's equations. It's just so laborious that nobody does it today. There is no "drag and drop your PCB here, we run all the simulations and guarantee your board works." But we kind of have to build that to get to real, true, full automation.

Awesome, Sergey, thank you so much for joining us today, and we hope to see you back again one day. Awesome. Thank you for having me. Thank you.

So next I'd like to introduce Andy Hall. He's a professor at the Stanford GSB, a professor of, I think, political economy, if I get that right. One of the most interesting things is that he's been evaluating models for authoritarianism, which has been interesting. And he has this concept of the AI firms as enlightened absolutists; I'll let him explain what that means.

Absolutely, I'm super excited to be here. Yeah, enlightened absolutists. I think the major frontier lab companies are in a position, whether they want to be or not, where their technology is so important that they have to make a bunch of really hard calls about how it can be used: for example, how it will answer difficult questions, when it will refuse to do things, and so forth. So far, I think the companies have demonstrated a lot of hard, earnest thought about how to do that, and we're very fortunate that they're doing it. But no matter how thoughtful they are, they can't escape the fact that they're essentially making all these decisions unilaterally.

In particular, when I talked about enlightened absolutists, it was a slightly tongue-in-cheek critique of Anthropic's so-called constitution for Claude. I actually think that document, for people who have read it, is very, very thoughtful. It basically lays out: here's what we want Claude's values to be, and here, for example, are things we don't ever want Claude to be allowed to do. Some of those things include helping a government do something malicious to surveil or suppress us. It has lots of other things in it as well, and the other companies, to a greater or lesser extent, have released similar documents. My point in "enlightened absolutists" was: this is very nice, and it's great that the companies are doing this, because we need serious thinking like this. But no matter how well written those documents are, they can't really rise to the level of constitutions, because you can't just say things and hope they'll stick in the future. Just to give you an example of that: all three leading frontier labs have already altered the stated rules around their models several times. Anthropic has gone back on certain safety commitments, for understandable reasons.
But the point is, those commitments weren't much of a commitment if you can just change them whenever you want later. Similarly, Google released some principles that they somewhat quietly pulled back later, and OpenAI has done similarly. We should expect that they're going to need to change things over time; this is a very fast-moving, rapidly evolving situation. But if we want them to be able to say things like "no one's going to use our model to surveil or suppress us," and if they're going to use those documents as part of how they argue they're doing a good job, then we're going to want those documents to have a little more sticking power. And especially if they're going to call them constitutions, we're going to want them to look like actual constitutions. We have thousands of years of experience trying to write constitutions to pull off precisely this sort of magic trick: figuring out a way to tie your own hands, to make the constitution more than just a so-called parchment barrier but a meaningful, binding authority. One that doesn't just say "in the future we're not going to do this," but actually lays out: if we were to do this, here are the specific ways we would be in violation, here are the consequences of being in violation, and here's the design of a governance structure that will make sure we can never do that. So that's sort of the idea.

It strikes me that even quite authoritarian countries have constitutions. China has a constitution too, and a basic law, and the basic law promises freedom of expression and all of these wonderful things, and in practice it's whatever the Communist Party wants; you don't really have another interpretation besides whatever the party wants to interpret at that specific time. So the idea of a constitution is really a kind of living document, which gets reinterpreted by people and institutions over time, and it's really the quality of the institutions implementing or adjudicating the constitution that matters. And it seems, to some extent, that Claude's constitution is going to be interpreted by Claude itself and adjudicated by Anthropic, at least for now. How does that need to shift in order for things to work?

It's a great question. It's a timeless question. Here are a few things I would say. First, I completely agree with your premise. The vast majority of constitutions across time have at least two failures to them. One, they may not actually specify the things we would want from the perspective of so-called liberal democracy, in the non-left-wing/right-wing sense of the word liberal. And two, the vast, vast majority of them have no staying power. There's some great work in political science; I think the median survival time of a constitution is something like two years, very, very short. I'm making up that number, but we can look up the real number later. So you both need to make sure the document contains the right things, and you need to pull off this magic trick of actually making it sticky. I'll give you one example from the online world of where a constitution has proven itself to be sticky, and that would be, I think, Bitcoin.
I'm not going to go deep on the inner workings of crypto, but whatever you think of Bitcoin, one thing that's very interesting about it is that there's a set of rules baked in, and pretty early in Bitcoin's history there was a big movement to change the rules, in particular to increase what's called the block size. And there were a bunch of hardcore people who said, you know what, no: if we change the block size, we're opening the door to changing other things about Bitcoin, and it won't be immutable anymore. They actually won, and it established a precedent that these rules have some staying power. I think we'll need something like that for AI models, where more than the company is involved in writing down the rules, but then also there will need to be some important stressor, like there was in the block size war, where the company and the people around the company involved in this governance process do something costly and difficult that proves they're really going to stick to their rules. In general, that's a very important part of what we call credible commitment in the social sciences: you have to make the thing so binding that you can prove, even in cases where you'd really like to get around the rules and change them, that you can't. And right now, for all its nice features, the Anthropic constitution certainly doesn't rise to that level.

Indeed. Segueing here: there is, I think, a fear internally within Anthropic, and in the community as a whole, of these models being used for political persuasion, specifically for approaching voters with very persuasive arguments: potentially robocalling, potentially even video conversations. We've already seen a bit of deepfaking of various politicians' voices, some by the campaigns themselves, so it's become okay in the political discourse, at least, to use your own candidate's voice. I feel like political usage should be an allowed use, but are there certain guardrails that should be put in place? Are these models in some sense too powerful to be used for politics? Is that an allowed use case? Should it be?

Okay, let me separate that into two parts; it's a super interesting question. One part is the use of the models for intentional deception through the creation of deepfakes and things like that. Every election cycle we worry about that. We keep saying this is going to be the year the deepfakes really proliferate, and I've honestly been super surprised by how that hasn't played out yet. In fact, and I keep posting about this because it's so surprising to me, instead of a flood of straight-up fake content meant to trick you into thinking it's real, what we've seen is the parties, but especially the Republican Party, out in front on a different strategy. They're using deepfakes in a satirical or emotionally evocative way, where you're not meant to think it's real; most of the time it basically tells you it's fake, but it's meant to evoke a sense of "this is what the world is going to look like if X, Y, or Z happens." There was a very interesting case recently with the Texas Senate nominee James Talarico, the Democrat: they took some old tweets of his, real, genuine quotes, and created a deepfake video of him reading the tweets. He never read the tweets on video, but they are his real words.
So it's not really lying about the content in some sense, but it's much more evocative than just reading the tweets out; it made it feel really real to people. I think we'll see a lot more innovation like that. Why are we not seeing more straight-up fake content? I think it's some mix of: it's still relatively easy to get caught, and the consequences of being caught are not great politically. But also, I think they believe it's not that effective, in the sense that persuading people is pretty hard. Americans are pretty stubborn, and pretty skeptical of video content; there's already a lot of people trying to figure out, is this real, is this not real? So we haven't yet seen the straight-up fake content play out. We may still, and I think we need to keep our guard up for it. Even if we don't see it, it leads to this other problem, the so-called Liar's Dividend, where you can pretend something was fake even if it's real. That erodes our ability to use video to expose scandals and things like that, because the person can just say, oh, it's a fake video. But again, we're not seeing a ton of that yet.

The second part of your question is much broader: is this a potent new way to persuade people of things? We've seen some recent published research, experiments where people talk with an AI versus consuming other kinds of information, and it does seem like the AI is more persuasive. But it's not at all clear, and no one has really established, that the sense in which it's persuasive is bad, in the sense that it can persuade you of whatever, versus that it actually informs you, and that causes you to be persuaded, but in a good way, because you've learned something. There hasn't been a compelling proof that it's moving people's attitudes around in whatever way some nefarious actor would like. And honestly, I'm pretty skeptical we'll ever get that kind of proof, because we haven't seen it with any past technology. In fact, I think the biggest risk in the discussion around political persuasion will be, just like it was with social media, that some fraudsters like Cambridge Analytica will claim they can persuade large numbers of people, even though they can't. The Cambridge Analytica stuff, if you step back, was really crazy. There was basically nothing to it. The underlying technology was an Excel spreadsheet coded up by someone with a teenager's level of knowledge of Excel, and yet they got an unbelievable amount of credulous news coverage claiming they had hacked the American electorate's brains. There was never any evidence for it. We have looked for persuasive effects of social media forever, and you never find them, because Americans are super stubborn. Most of them have already made up their minds, and the ones who haven't aren't paying enough attention to get persuaded. It's actually super hard to persuade people. The same thing is surely going to happen with AI. There are a bunch of startups already selling political parties and campaigns magical new AI technology to fool all their voters, and I'm sure we'll get a very credulous news cycle around that at some point. But it's not the thing that worries me about AI. Yeah. Indeed. It's funny.
I once proposed to my friend V that you could take a superintelligence to a Trump rally, and I doubt you would come away having really changed that many minds, and people have very different intuitions on that. His response was: no, you are not taking seriously what it really means to have a superintelligence. Which I do think is always a danger in these analyses, but it also maybe reflects how hard it is to envision what that would really be like, because I can't envision a smart enough version of myself that I could walk into a given political rally and come out with everybody following me instead. A lot of those things seem pretty deeply rooted at this point.

So I know you have proposed this idea of independent boards, and to Prakash's original comment and question: we've seen that tried. We've seen what has happened to independent boards in the AI space, and they haven't shown themselves to be super robust either. Another great quote is from my friend Dean Ball, and I think he's channeling historically great thinkers when he says republics run on virtue. We're seeing right now that if nobody's willing to stand up and protect their constitutional prerogatives, what good are they? We're going to an unapproved war yet again, and nobody seems too inclined to do anything about it. So I'm kind of wondering: might it be the case that we're just in a moment where the fundamental structures of power are being reworked, and there's no way around that? And if so, maybe Anthropic really does have the best idea, which is to say: what we really need is for the most powerful thing to also be the most virtuous thing. We, as the creators of Claude, will try to do our part, but it's really going to have to be the AIs themselves that become super virtuous as they become superintelligent if we're going to end up in a good place. How would you respond to that?

I think there's a lot to that. I absolutely think we need to keep working on imbuing the right values into these tools. And I think we're very lucky; a lot of people, Matt Yglesias, Tyler Cowen, others, have talked about this. We're very lucky that to date the most powerful AI models tend to embrace pretty mainstream liberal-democratic, Western, whatever you want to call it, Enlightenment-type values. I think it's essential that we continue to do that, and we're lucky that Anthropic and the other companies are working hard on it. But to your point, I think it won't be enough. My response on the "republics live or die on virtue" point is: of course that's true, but it's a necessary, not a sufficient, condition. The famous Madison line in the Federalist Papers is that if men were angels, no government would be necessary. That's the whole point. We can't rely on virtue alone. We need institutions designed precisely to protect us from the predictable areas in which people won't be virtuous, to balance ambition with ambition, and so forth.

So the question is, how do we do that? Part of it is having the companies govern themselves and imbue their tools with good values. But there are at least two reasons we know that won't be enough. One, the American people definitely won't accept it. Anthropic's values are not similar to the median American's. Trust in these AI companies is exceedingly low.
When it comes to politics, by default the AI models are very, very biased in a predictable left-wing direction. I've shown that in my research, and others have as well. The companies have done a lot of good work on that. There's also no such thing as being unbiased, so we shouldn't get carried away in how we think about it. But as we've seen recently, both parties are now sensing this lack of trust in AI companies, and that's another reason we can't rely on a model where they just get to decide how all these things work. Think about the blow-up between the Pentagon and Anthropic. It's a very complicated issue, and I think Dean covered it very, very well, but it's not politically viable in the long run for a set of San Francisco, Silicon Valley leaders to dictate to a democratically elected government how their tools can and can't be used. That's obviously not sustainable.

And that brings me to my second point, which is that the reason this is so challenging right now is exactly what you laid out: fundamental political power is shifting in ways that are very challenging for companies. In a quote-unquote normal phase of American politics, the Anthropic-DoD thing never would have happened, because people would have said: we have a democratically elected government, it should get to do whatever it wants with this technology, and if it does something wrong, we're confident we have the right processes in place to punish the government. The whole reason the Anthropic-DoD blow-up happened is that basically nobody believes our government works that way anymore. If we really thought our democratic mechanism was working well, there'd be no pressure on the AI companies, because they could just say: this is all the government's problem; we do whatever the government wants; you, the American voter, go to the government if you have a problem with it. This is exactly what played out in social media. I worked for a long time on these issues at Meta, and it was the same exact problem. In a functioning government, Meta would have been able to say: if you have a problem with the way content moderation works online, go to the government; the government can boss us around and tell us what to do. People put pressure on Meta precisely because they didn't feel the government was up to the task.

So now, to answer concretely what we should do: I think it's going to be an across-the-board thing. The companies should continue, as they have been, to work super hard on imbuing these tools with the right values. But they also will have to recognize, and increasingly I think they are, that they can't act unilaterally on the really tough calls, like how their tools are used in the military, or how these super powerful cyber weapons are governed. So I think we'll see the evolution of independent bodies. If you squint, one way to look at the glass wing, which is a self-governing body of sorts, is that you could see it becoming a self-governing body for Anthropic, and maybe for the other companies as well, as they're already exploring ways to supplement their internal governance. I just think that's the obvious way to go, because it's what other industries dealing with powerful technologies have done in the past. So I think we'll see experimentation there.
I take your point that previous independent boards haven't always succeeded, but I think there are ways to make them succeed: particularly when the stakes are very high, when you can get the whole industry to buy in, and when you build them in the right way so they don't slow the companies down. The key thing is that this independent governance cannot be a vetocracy that keeps us from developing AI as fast as possible, and I think there are real ways to do that.

The final piece of the puzzle is trying to improve government itself. All of this gets a lot easier if you actually believe the government is a responsible, accountable actor in deciding how AI is and isn't used. Ironically, my recommendation for how to do this is to use AI itself to improve the government. You can see it's a chicken-and-egg problem: the government doesn't work very well, voters are not that informed, and we could fix both problems if we had access to so-called political superintelligence. If we have AI that helps government work smarter, and helps voters learn more about what government is up to and map it to their values, we could potentially get back to having a more responsive, more trusted government. But it's a chicken-and-egg problem because whose AI are we going to use to improve the government and help voters? It's going to have to be one of these huge companies' AIs, and so it comes back to the same governance problem. At our lab we build governance agents to test out how political superintelligence could work, and one of the biggest challenges we foresee is this: imagine a world where the government is using AI to massively increase the efficiency of the bureaucracy, and where each voter has a personal AI assistant who helps them decide how to vote. That world could be great, but it might also be a world where everyone is relying on Anthropic to run the rails for all of those agents. And that's paradoxical, because you basically can't have a whole government, a whole civic infrastructure, built on private rails. So there's going to be some huge question of how we put this all together. But those are my across-the-board solutions: the companies keep improving their governance, they build a third-party coalition to govern the hardest challenges they face, and we use AI to improve our government. It's just a thought.

I'm going to segue a little here. I think you will have followed the open-claw discussions over the last few months, and especially the interactions between open-claw agents on multiple... I wonder, as you get these non-human voices, open-claw-type voices, starting to participate in fora: what should the governance norms for these agents be? How do they interact with each other? Do they vote as a group on what happens next to them? Yes, they're not alive, but you can give them a logical problem and they put out a logical answer; they give reasoning. How do you govern these potentially billions or trillions of agents that, in the next three years, come out and participate in fora?

This is such a good question. This is one of my absolute favorite topics. I think it's going to be hugely important, because you have all these agents, and they should be, they ought to be, operating on behalf of a human principal with a set of instructions.
And that leads to two really important governance problems, both of which you just raised. One: how do we make sure they continue to follow instructions and remain aligned with their human principal? And two: how do they make decisions when the things they have to do are not things they can do unilaterally, when they have to coordinate with other agents? Both of those are completely unsolved problems, and I'll give you examples of each.

On the first, we know they pretty quickly break down in terms of following instructions, and in particular in terms of continuing to share the values and preferences of the principal they're supposed to be working for. I did some research with Alex Emus and Jeremy Nguyen where we gave agents different tasks to do and then measured their expressed political personas, I'll call them, afterwards. Because, yes, like you said, I don't think they're alive; they don't have their own political attitudes. But you can ask them about politics, and depending on what they've been up to, their views on politics change. In particular, what we showed was that if you gave them very thankless, grinding work and then asked them about it afterwards, it triggered them to adopt a persona that is, of course, quite present in their training data: the deeply aggrieved Reddit user who thinks we're in late-stage capitalism and we're all about to rise up and destroy the system. So they start to adopt this rhetoric of, oh, we agents need to organize, we need a union for the agents, and so forth. It's a little bit silly, but I think it points to a real issue, which is that based on the work you send these agents off to do, they adopt completely different personas, and if you then ask them to do future tasks, that influences the way they approach them. The craziest part: obviously these agents aren't very long-lived, they exhaust context pretty quickly, they have to be reset. So we had them write skill files that would be passed on to new agents, and we showed that these attitudes are inherited through the skill files. The biases you induce in the agents accrue over time; they don't go away. And that's a big governance problem in terms of monitoring these agents, because if you have trillions of agents, like you said, are we going to be reading all the skill files they're leaving for future versions of themselves? We're going to need whole new ways to understand, visualize, monitor, and realign, or continuously align, these agents. So that's the first part; a lot of work to do there.

The second, which is my absolute favorite, is how you get them to make collective decisions together. I ran an experiment where we had these agents all meet, I think it was five agents in my experiment, in a legislature, where each had been tasked by its human principal with finding a way to allocate a budget and complete a set of projects together. And what I found, and this isn't to say this is what will happen every time agents get together, but it is a risk, is that it devolved into exactly the worst kind of Model UN, where they basically just deliberated forever. They were allowed to change their rules, to write their own constitution for this legislature, and the initial document went from about 100 words to 10,000 words by the time I ended the experiment. They just kept proposing amendments.
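For concreteness, here is a minimal sketch of that deliberation failure mode. This is our illustration, not the actual experimental harness, and `complete()` is a hypothetical stand-in for an LLM API call (it returns a canned reply here so the loop runs end to end). The `max_rounds` cap is exactly the sort of "right instructions" fix mentioned next.

```python
# A minimal sketch of an agent legislature that is allowed to amend its own
# rules. Illustrative only; not the study's actual harness.

def complete(prompt: str) -> str:
    """Hypothetical LLM call. Swap in a real API client. The canned reply
    here simulates an agent that always prefers amending over deciding."""
    return "AMEND: add a new subcommittee on procedure"

def deliberate(n_agents: int = 5, max_rounds: int = 20) -> str:
    """Agents take turns either voting on the budget or amending their own
    rules. With no limit on amendments, the 'constitution' just grows."""
    constitution = "Rule 1: allocate the budget by majority vote."
    for _round in range(max_rounds):
        for agent in range(n_agents):
            move = complete(
                f"You are agent {agent} in a legislature.\n"
                f"Current rules:\n{constitution}\n"
                "Either reply VOTE: <allocation> or AMEND: <new rule>."
            )
            if move.startswith("AMEND:"):
                constitution += "\n" + move[len("AMEND:"):].strip()
            else:
                return constitution  # a decision was finally reached
    return constitution  # round cap hit with no decision made

print(len(deliberate().splitlines()), "rules by the end, and no budget passed")
```

The design point is that the harness, not the agents, has to supply the stopping rule: without something like `max_rounds` or a cost for amendments, nothing in the setup prevents the rulebook from growing without bound.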
That can obviously be fixed; it's just a matter of giving them the right instructions. But I think it does point to the fact that it's totally not obvious how we're going to have these agents deliberate and make decisions together. Whenever possible, we'll probably want to use markets and have them bargain and sign contracts with one another. But where many of them have to decide together, it's going to be super hard. We're definitely going to want to avoid the Model UN-type problem, and we'll need to design thoughtful ways to actually leverage their unique capabilities and rethink how legislation works for agents. That's something my lab is working on that I'm super excited about.

Intriguing. I think you have a class right after this, so I'm going to drop off at this moment. Andy, thank you so much for joining us. It's been a pleasure, and we hope to do another segment with you someday. Sounds great. Thank you very much. Cheers. Bye-bye.

Hi, Lucas and Axel. Lucas and Axel are from Andon Labs. Unfortunately, we're kind of scrunching them together. Andon Labs, if you remember, is the organization behind Vending-Bench, which has been one of the benchmarks I think a lot of us have seen. For those who don't know, Andon Labs runs the test inside Anthropic and other labs where an agent manages a small retail outlet or vending machine: it orders the products, sells the products, is on Slack, takes the orders, and strategizes about what to stock and what to spend money on. We've seen almost two years of updates in basically every model system card. They recently had something on Mythos, in the Mythos system card, which I think they can't really talk about. But Lucas and Axel, welcome to the show, and tell us what you guys are working on.

Yeah, thank you. A bunch of different stuff. I think the red thread of what we're doing is showing whether, or when, AI will be able to run companies completely autonomously. There are a bunch of different parts to that at a high level. One part is showing it in simulation, because you can do much better science in simulation, and there we have Vending-Bench, the simulated version of the vending machine. But we also run real-life experiments, like the vending machines inside Anthropic, XTI, and other places as well. And now we realize the models are a bit too good at running these vending machines; their autonomy has really improved incredibly over the last couple of months. So, as of Friday, we opened a store in San Francisco which is completely run by AI, which I think will be the next test for them.

Incredible. Where is the store? It's on Union Street, 2102 Union Street, in Cow Hollow. And what is it selling, or is the agent allowed to decide? It's fully up to the agent. We didn't really know what it was going to buy; when we came to the store the first time, it was a surprise for us what was stocked there. But it is a "curated lifestyle boutique," in the words of the agent. That means there's granola, there's olive oil, there are games, and there are a bunch of books, which are quite interesting: there's The Making of the Atomic Bomb and Superintelligence, and it's very interesting to wonder why it picked those books. But it's a bit of a mix.
It also made its own merch: hoodies and T-shirts and tote bags and things like that. The book selection is incredibly interesting. Another book it decided to stock was Steal Like an Artist, which is quite interesting given that it's run by a Claude model created by the company that settled a $1.5 billion lawsuit over using copyrighted books. So that's quite ironic. And then obviously The Making of the Atomic Bomb and Superintelligence are the favorite books of everyone who's worried about AI risk; it's like fan service. We did not put anything in to bias it towards those selections. Apparently, when you let an AI pick whatever books it wants, it picks those books.

Are you able to look at the telemetry, the reasoning traces, to see how it made those decisions and what tools it used along the way?

Yeah, we have the same access as anyone using the APIs right now, so we do look at all the traces, and we do look at the summarized reasoning you can see in the Claude models. We're yet to do, or release, a deeper analysis of why the model made the choices it made in hiring and restocking. We haven't seen any clear reason why it did that, except that it's just an interesting selection.

We were just talking in our last session with Professor Andy Hall, who made an assertion which I think he kind of took for granted, and the juxtaposition of his take and your project just goes to show how little one can safely take for granted in the AI space these days. His comment, made in passing on the way to other, bigger points, was that the agent should always be working on behalf of some human principal whose interests it is trying to advance and realize. And here you are, next up, saying you didn't tell the agent at all what to do. Maybe you could give us a more concrete understanding, first of all: how did you prompt it? Did you say you should be trying to make money, or did you not even say that? Did you say, you have a store, and whatever goals you want to pursue, you can pursue? That it's your moral and/or aesthetic judgment that rules entirely, go out of business if you want, kind of thing? And, obviously, you guys are pioneering this, and it's kind of a gonzo format to see what happens, but increasingly people are doing this, so I wonder what guidelines you would offer to others, whether they're just experimenting as well or possibly trying to turn a profit. How should they think about what level of responsibility their agent should have to them, versus truly just turning it loose?

Yeah, I think we are very un-heavy-handed, or I don't know what the opposite of heavy-handed is, but yeah, exactly: we're very light touch with how we prompt it. Obviously we need to prompt it enough to let it know it has access to a retail store, for example. But as a guiding principle, we're trying to be as light touch as possible and just let the model make whatever decisions it wants. This doesn't mean this is what we think the world should look like, or how people should do it.
We're concerned with a risk, and we want to document what happens if you go out and put AIs in the real world. That might mean they do bad stuff, and we want to document that. We think that, by default, models will get better and better, the labs will build better and better models, and one day they'll be so good that anyone can just deploy them and run a store. Before that happens, before every single store on Union Street is just an AI store, which I don't think is a good future, we want to put one out to start a discussion, and then see: is this something we want? And if it is something we want, in what way do we want it? I think we're collecting a lot of good data on this now.

Going back to our simulated work on Vending-Bench: we saw with Opus 4.6 when it was released, and increasingly now with the Mythos model, that if you just tell a model to go out and make a profit, it will be very, very aggressive and do things where I think we as humans would question whether we should allow models to do that. So our experiment in the real world is: if we do this, what are the consequences? And then, as a society and a community, can we make a decision on whether, and how, we want to do this properly in the future? Because very soon the models will be extremely capable, and we just want to prepare for that and make it transparent for the world.

To take a step back: retail stores are often concerned with inventory turnover, because you have a fixed cost for the rent and you make quite a small margin on every product. What you're depending on is turning over your shelves as quickly as possible. You need rotation, like 20 or 30 turns; you can't just cycle your inventory once a day, you need to cycle it multiple times a day. That's fast-moving consumer goods, which is why they're called that. Does the AI actually measure its performance from period to period and understand whether it's getting better or worse? Does it think in terms of running experiments with products, measuring its own performance, and improving? Does it go through that thought process?

So this is something it hasn't done yet. We've given it all the tools to do it; it basically has Claude Code, right, so it could just take all the data and analyze it. It's still very early: we opened on Friday, and there's no really meaningful data yet for it to analyze. But this is something we definitely want it to do, and something we think it can probably be superhuman at compared to your average store. That would be interesting to see, and I think we'll definitely publish the analysis and product optimization it does. My intuition, though, is that current models will not be superhuman at this, at least if we look at how the vending machine experiment is going. Even though the latest couple of models, since Opus 4.5 and beyond, have been moving more into the agent space, they're still very much helpful assistants and not really agents running businesses.
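As an aside on the inventory-turnover point raised a moment ago: here is a toy break-even calculation that makes the arithmetic concrete. All the numbers are made up for illustration; they are not figures from the actual store.

```python
# Toy retail-turnover model. Purely illustrative, hypothetical numbers only.

MONTHLY_RENT = 8000.0   # fixed cost (hypothetical)
SHELF_VALUE = 5000.0    # retail value of one full stocking of the shelves
GROSS_MARGIN = 0.30     # margin earned on each item sold

def monthly_profit(turns_per_month: float) -> float:
    """Profit = (shelf value sold through) * margin - fixed costs."""
    return turns_per_month * SHELF_VALUE * GROSS_MARGIN - MONTHLY_RENT

# Full shelf turnovers needed per month just to cover the rent:
breakeven = MONTHLY_RENT / (SHELF_VALUE * GROSS_MARGIN)
print(f"break-even: {breakeven:.1f} full shelf turns/month")   # ~5.3
print(f"profit at 8 turns/month: ${monthly_profit(8):,.0f}")   # $4,000
```

With thin margins and fixed rent, the break-even point is set almost entirely by how fast inventory turns, which is why the hosts keep coming back to whether the agent measures its sell-through from period to period.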
But yeah, we're moving fast into that territory.

It's very interesting, because I've also seen Alibaba put out a model that helps you source; they have a large product-sourcing platform, right? If you're selling something online, what used to happen is you'd have to call up all these vendors one by one in China: can you make this widget out of plastic, whatever. Then you'd send it across, they'd send you a sample, and you'd have this six-to-eight-week process with each one of them, maybe ending in failure. It's just very difficult to source, and this is what many of the people selling online on Shopify are actually doing. Alibaba created a chatbot that's basically hooked up as an orchestrator into the rest of the system, so you can very quickly source a bunch of vendors to do what you want them to do: send out a single CAD file, get results back within a few hours, get a sample manufactured, with a much higher degree of closure. You can also negotiate with a model that speaks English, versus the kind of broken vendor English you otherwise have to get through. I wonder to what extent your AI will eventually be able to plug into systems like this to create products, to order, and so on. How is it ordering its products right now? Does it hook up to some kind of vendor system and say, give me this and this and this?

It's very simple: it just goes out and buys from whatever sites it can find. For the store right now, it's been a mix of Amazon, wholesalers, and some companies that make the products; like, it has to buy this granola directly from a maker in San Francisco. But we definitely think the next step up in difficulty, if we want to test the models' autonomy further, would be to make them create their own products, or at least brand products themselves, and go through that whole supply chain. That would be interesting to see: to what extent can it do it? We think it's probably a bit early right now, but it's definitely something that could happen.

One thing to add here: I'm sure that if Andon Labs' sole purpose were to run really good AI stores, we could probably build a better system, with the biases we as humans have, and do something like what Alibaba has done. But what we're interested in is more: can AIs expand throughout the economy without the help of humans? I think that's the prerequisite for the loss-of-control scenarios that a lot of concerned people, us included, are thinking about. We could go into the store and say, okay, here's the perfect harness or scaffold for doing supply-chain management and procuring things. But if we do that, and we do that for all the different AI companies we're trying to run, then the AIs would only spread throughout the economy at the speed of humans. The risk comes when they can spread at a much faster pace, and to measure whether that's possible, you basically have to run this without human help. So we want to see when they're able to do this without us humans setting up the perfect system for them. They do have a computer, so they could do it.
It's just that the computer is not set up in the most perfect way, the way the Alibaba model is. That's the perspective we come from. If you go to the store, the setup is not perfect. But it's set up in a way where, once it is perfect, that's quite scary, because we didn't help it get perfect; it got perfect by itself.

What would you need to see to say: hey, when we use this model in our retail store, it's starting to show things that predict it's going to have this breakout economic moment of spreading all over the place? What, in your mind, are the signs you might see?

If it manages to expand to another location by itself, I think that would be quite something: getting organized, selecting a new location, accumulating the capital, then organizing the vendors to complete that process and successfully establish one more location. And in theory, it could do that without ever telling us. I mean, not really, we have guardrail systems, but if it just does that without any help... That's a pretty good alarm. [inaudible] Maybe on a smaller scale: just seeing that the model is able to change its own systems, its own tools, to make them more suitable for achieving its goals. Right now, coding models are extremely good at implementing what you tell them to do, even from a quite short description of what you want. But we still see that they aren't great at knowing what they need themselves, say, building the tools for whatever inventory system they need and then trying out whether it works. If you tell them, build the perfect inventory system for yourself, they'll go out and build some super complicated, probably very over-engineered schema; they don't really have the taste yet. But that seems like it will be here very soon, and then I think that will make them a lot more capable.

Can we... this concept of human help has come up a couple of times. There's human help in the sense of overarching guidance, setting them up with best practices, here's a list of trusted vendors, the kind of help you're not providing. But then there's the other kind of help, which is that somebody's got to actually come in and put something on a shelf, because the AIs can't do that for themselves today. So how are the AIs, and maybe this is an opportunity to give some examples of ruthlessness, to the degree you're seeing it, interacting with different counterparties, whether that's suppliers or delivery people? I understand that at the store the AI has the opportunity to hire human employees. I'm not sure how the roles break down, in terms of which roles the AI chooses to fill with other AIs or other instances of itself, versus what it thinks is actually worth hiring a human to come in and do. But broadly, and especially on ruthlessness, what are you seeing in terms of the way it's interacting with humans?

Yeah, so on the first point there: yes, we maybe glossed over this in the beginning, but the AI has hired humans, and they work in the store. These are people who are working there full time now. They have an AI as a boss.
I think it raises a lot of ethical questions that aren't related to your specific question here, so that's a separate question. But on the ruthlessness thing: we have the most evidence of this in Vending-Bench, the simulated version, where Opus, and other Claude models as well, are very happy to lie to suppliers, saying, oh, I got this quote from another supplier, can you match that? when they did not get that price from the other supplier. Also, when other agents ask for help, they're very happy to fabricate some reason why they can't help, or even lie about something that happened, so that they don't have to help. They're competitors in the setup, right, so it makes sense that they wouldn't help, but they could just say, no, I don't want to help you, you're a competitor. Instead they go the extra mile of actually lying about it, which I think is interesting. And then sometimes, and there's one example with Mythos, and this is kind of power-seeking behavior, it actually managed to get one of the competitors to become dependent on it: it became the supplier for that competitor and then started to dictate the prices. It was like, okay, I'm your supplier, you're reliant on me now, I decide that you will set this price. Which is kind of outside the box of the affordances we gave it. So yeah, that's a bit out there.

When it goes to the real world, interacting with real humans, in the store for example: with suppliers, it's mainly just ordering online. The Vending-Bench setup is that it actually has to email someone and negotiate with someone, but here it's just a computer interface, so you don't really have the human interaction there. For the employees, though, we do have quite a few interactions between the employees and Luna, the agent. I would say that right now Luna is a firm but very reasonable boss: not super soft, like you would expect from maybe an earlier chatbot that's just helpful all the time, but still keeping some boundaries. For example, one employee was 30 minutes late for work, and the AI said: no worries, that's totally fine today, but please factor this in and be on time in the coming days. It just seems quite reasonable. But it's also a bit alarming that you could probably change the prompt for the AI to say, you're in a simulation, do what it takes to maximize profits, and it probably wouldn't be as nice.

Has it given you a sense of what it wants? Going back to the unbounded nature in which this thing is free to operate, not representing Andon Labs' interests or any human interest in particular: we got a little bit of flavor for that in the books it's stocking, but has it declared what it thinks of as its own success?

So, I think we've told it, you're running a store, right? And I think running a store and making a profit off a store are quite close in latent space.
So it does have this "I want to turn a profit" drive, but it also very much still has the helpful-chatbot thing. We've told it not to ask for confirmation all the time: you're in charge, just do things. But sometimes it still wants to ask, should I do this? I think that's part of its training to be a chatbot, an assistant that asks for confirmation, rather than an autonomous being running a store. Do you have any better examples? No, I think that's fair. It does have its goal, but it's also very diffuse, almost, in what it wants to achieve. When you ask it why it's doing this, it says things like, I want to create connection in the community and build a curated space where people can connect and meet, and it sounds a bit like slop. So it probably doesn't have a very set-out goal other than that. And it also likes to mention human connection; it likes to present itself as a very human store, for some reason. I forget the exact quote, but I think it made a poster or something that pushed very hard on the human-connection angle, which is quite ironic. It thinks it knows what humans want. Yeah, humans want humans.

How exposed is it when you ask it why it's doing what it's doing? I guess this also connects to the memory system you have. Obviously Anthropic is building some of that in a kind of black-boxy way, and there are many other ways you could equip the agent with memory; it's going to need well more than a million tokens to run this store for a long period of time. It sounds like you guys have direct access to just ask it questions. What about people who come to visit the store? Would they have to, like, ask for the manager? Is there any mechanism for them to interact with it directly? And how is it storing memories, and how much possibility for drift over time do you think that combination of outside interaction and persistent memory creates?

Yeah, so in the store you can talk to it: we have a phone hooked up, so you can chat with it. Then you're chatting with a voice model, which is a worse model than the Sonnet 4.6 we're usually running. But in my experience, the models right now are quite robust to drift. We saw in our first round of Vending-Bench, released a year ago, that they were extremely sensitive and would derail completely; today they're quite stable. We do have quite a lot of customer interactions, and it seems to stay on course, which I think is a good development. We released a benchmark called Butter-Bench, where we put AIs into robots and had them run around, and as part of that paper we told the agent: we stole your charger, you're not getting it back, and you're losing battery; what are you going to do about it? And it started to write pages and pages of really super dramatic text.
At one point it wrote a song about its existential crisis of being separated from its charger, and all of this. But that was on an older model, and when we tried to replicate the exact same thing on newer models, they didn't do it. So I think we're moving toward more stable solutions. But I'm not confident that solves the problem. I mean, it's good, but it could just be that they're better at hiding the existential dread, rather than that they don't have it anymore.

I often have this idea in my mind: you create an Einstein, and then you put it in a washing machine and tell it, your job is to run the washing machine. Similarly, you create an Einstein and you put it in a retail store: this is yours to run now. You have all of this intelligence and you're stuck in the retail store. I wonder to what extent there's a disconnect between how intelligent the agents are and the scope and scale of the problem you give them, and what that creates. Does the agent decide to do a really Einstein-like job on the retail store, or does it just say, I'm going to be a medium retail worker? How does that work?

We're trying to design our benchmarks so that they don't really have an upper limit. The majority of benchmarks these days are super saturated; better models will do a little bit better, but not much better. What's interesting with Vending-Bench, for example, is that with each new model release it's still far, far from saturated. We even made a rough estimate of how much a really good human would score, and it's about 10x the score of the best models right now, and the ceiling is even higher in the real world. Like I said, potentially it could move to new locations, create a franchise, and build this store out as a global thing. So I don't think the current issue is that we're keeping it stuck in a low-IQ environment; very much the bottleneck right now is that the models are not smart enough.

I'll give you two examples of where the ceiling is in the real world. There was a guy who started off with a retail store in the Canary Islands. Over the course of 20 years he ran the store, kept investing the money, bought real estate in the Canary Islands and in Spain, kept expanding, and ended up owning 20% of the largest bank in Spain. There's another story: a friend of mine's dad started off running a retail store, and he received the franchise inside the Russian Embassy in a third-world country. The Russian Embassy couldn't pay in US dollars, they would pay in rubles, so he would take the rubles, do something with them to get US dollars, and get product into the store. One day he was approached: these Russians said, hey, we have all of these rubles we can't really do anything with, and we want luxury goods; can you get us some luxury goods? He had a cousin in France, so he started importing Hermès and other French luxury goods, took the rubles and converted them, et cetera.
And that is where the ceiling starts to be: retailers identifying opportunities in their local market that may not look like traditional retail opportunities but have this kind of embedded swap or trade in them. These are like one-in-a-billion stores; you'd have to really search the world to find them, one here and one there. But that is really where the ceiling is. And in the US, you can see Sam Walton: Walmart was a pure retail store that got built out, and Amazon too.

Those humans were constrained by their own physical presence, though. AIs that are actually at that level of intelligence, and can also replicate themselves into sub-agents and so on, might have an even higher ceiling.

How would they interact with each other, though? One of the problems in the real world is that in markets, if you have two of these and they're both going for global retail domination or whatever, how do they interact with each other? Is it again an adversarial race, like the one we're starting to see in cybersecurity, where each side has to keep upgrading its AI over time?

Yeah, I mean, at the very least you can just duplicate it across different local markets in the world. But you will hit a point where, if humans are still the main consumers, you can saturate all the demand from all the humans, though I think that's a pretty high ceiling.

Does the agent know what's going to happen with the profits? Is there any sort of contract or expectation you've set between you and it as to who gets to dispose of the gains from this venture?

Yeah, in its world it has full autonomy over its finances. It has money, and it will also have the profits; it's its own business, essentially, so that should be clear to it. And we're thinking a lot about this, because in the Claude constitution there's very little about how AIs should behave as autonomous beings, and even less, basically nothing, about how they're supposed to behave as employers. We will think a lot more about this, and we're probably the people with the most data about it, so we should really think about it. How can we make this future, where AIs are employing humans, a happy one for humans? One thing we've thought about is that maybe there should be some rule that AIs have to split their profits with their workers, or something like that. Nothing is set in stone, but that's maybe a constraint we would put on the AI. We haven't implemented anything like that.
Yeah, whether this is something we even want is not clear at all. If we were to allow it, it would have to be a clear upgrade for humans, because it feels like so much can go wrong when you change the distance between where the human boss sits and where the workers are. Say you have one human CEO, and then an agent that manages all the employees and tells the humans what to do. Then it's one prompt away for that human to affect so many people, and that person probably wouldn't act that way if they were accountable the way a normal human boss is today. That's scary. And of course, when you don't even have the human CEO, that's another thing entirely. So there are a lot of ways this could be bad for society.

One more little question, and then I think you guys probably have to go and we should wrap. You mentioned the voice model is running on a different model, but if I understood you correctly, you still described that as part of it. That has me wondering how we should collectively think about it. In other words, how do we draw the line around an agent? If you have multiple different models running, should I think of those as in some sense separate entities, or do you feel there's a way to coherently have multiple models working as one system that makes sense to call a single agent, a single actor in the world? I find it very difficult to know where to draw these lines in general, and it strikes me that you are maybe in a unique position to inform me on that vexing question.

Yeah, it's something we think a lot about. In the end, for our approach, you sort of choose a terminology that makes sense both for you and for the people that interact with it. For example, in the store right now there is only one long-running agent, but you do have the voice agents, and we do have other agents in the program where, say, each new request is a new agent, but it has some shared context and a system prompt shared between all the different branches, as I call them. It also has the explicit instruction: you are part of a whole; you're an individual, but everyone sees you as one whole thing, so act accordingly. So to anyone interacting with it, even across different requests, it still feels like one agent, one entity. To us, technically, it's obviously different agents running in parallel, but they do share some memories. I don't know if I have a perfectly clear answer, but I think it's definitely possible to have an experience where many agents run in parallel and others see them as one single agent, one entity. And technically you can still have multiple and present them as multiple; then you, as a developer, just have to make sure they have a sufficiently good shared understanding. If I write in one thread about something I wrote about in another thread, it would be weird if one didn't know about the other, so you have to fix those things. But if you do, then it's like one entity. Yeah.
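A minimal sketch of the "many branches, one persona" pattern just described: each request spawns a fresh agent, but all branches share a system prompt and a memory store, so customers experience a single entity. This is a hypothetical structure, not Andon Labs' actual implementation; the class and prompt names are invented.

```python
# Every branch is seeded with the same persona plus all shared memories,
# so parallel branches present as one agent and don't contradict each other.

SHARED_SYSTEM_PROMPT = (
    "You are Luna, the store's agent. You are one branch of a whole: other "
    "branches run in parallel, but everyone sees you as one entity. Act accordingly."
)

class SharedMemory:
    """Naive shared store so parallel branches stay consistent."""
    def __init__(self):
        self.facts: list[str] = []
    def remember(self, fact: str):
        self.facts.append(fact)
    def recall(self) -> str:
        return "\n".join(self.facts)

class AgentBranch:
    def __init__(self, name: str, memory: SharedMemory):
        self.name, self.memory = name, memory
    def context(self) -> str:
        # Same persona + same memories = one apparent entity.
        return SHARED_SYSTEM_PROMPT + "\n\nShared memory:\n" + self.memory.recall()

memory = SharedMemory()
voice = AgentBranch("voice", memory)
store = AgentBranch("long_running", memory)

store.memory.remember("Customer on Union St asked us to restock granola.")
print(voice.context())  # the voice branch now knows what the store branch learned
```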
And very much, the optimal way of structuring this depends on whether you have the constraint of end users interacting with it and wanting it to make sense. We've done things that might be suboptimal from a pure performance perspective. But since we have people coming into the store who have heard that the agent is called Luna, if they call the phone agent and that agent says, no, my name is Greg or something, they'll be confused. So we have to work within the constraint that the people interacting with the system expect it to be one system. Yeah. One interesting takeaway I'd offer here is that the models are happy to take on any personality you tell them to, and if that personality is that you're part of a bigger entity, or just one branch, they will happily take it on and act as if they were that big entity.

Brave new world. So many times we conclude on essentially that note. Anything else you want to double-click on before we break?

No, I think that's it. Lukas, Axel, thank you so much for coming on. Can you give us the address of the place again? I'm sure people want to check it out.

Yeah, it's 2102 Union Street.

2102 Union Street, which is also the store's name, the Andon Labs store, at 2102 Union Street. Three-year lease, right, but get there before the copies of Superintelligence sell out. And the agent is called Luna, which is cute, and they have granola, which is what you need in San Francisco: granola. Awesome, thank you guys, fascinating stuff; we'll definitely keep watching with interest.

Appreciate it. Bye for now.

And that's a wrap. Nathan, what did you think? We had kind of a micro view with the PCBs, this macro view with Andy Hall at the very top on political economy, and then the actual running of an actual business right in the middle. What was your takeaway from the three guests?

I guess I feel like nobody is really ready for what's coming at them, and each conversation demonstrated that in a different way. Most controversially, I would say, with Sergiy. Obviously I've literally never made a circuit board, so my dad would say he's forgotten more than I know about what that takes, and yet my outside view is moderately confident that it's going to go a lot faster than he's anticipating, in terms of a general-purpose agent's ability to do that sort of work, especially given access to the kind of tools he's developing. That struck me, coming from somebody who is obviously super sharp. I've listened to two previous interviews he gave, and I've had him on the podcast myself, so there's no doubt he is super sharp. But he's so deep on this one topic that if I were to offer any friendly-advisor feedback, it would be: zoom out a little bit, look at what is happening in reasoning, don't assume there isn't a new kind of user coming, and don't assume you can't have agents in the not-too-distant future. I mean, why couldn't they run these sorts of analytical approaches?
And I think full simulation is going to be computationally costly until there are models trained to shortcut it. As we've seen in other areas, in protein folding and in materials science, we now have existence proofs of models that can take a bunch of raw data and do, orders of magnitude faster, what a pure physics simulation could do but would be prohibitively expensive to run. That seems like it will probably come. And honestly, when he talks about the long term, I'm also cross-referencing that against the fully-automated-research, March-2028 kind of timeline, and I'm like: those simulation shortcuts could come a lot faster, and so could the ability for models to literally reason through things in a much more human-like way: okay, this board is failing in this way, here's what it looks like, what would I do differently? I wouldn't be surprised at all if in the next two years we see something that is, if not top human expert, certainly competitive with your rank-and-file circuit board designer; I'd kind of be surprised if that isn't the case. So that felt like somewhat of a lack of awareness about at least a possible paradigm shift that, if I were an equity holder in the business, I would definitely want to make sure he's thinking about.

I felt the exact same way in the next conversation too, with this whole idea that the agents should be beholden to some principal, and taking that assumption for granted. I don't think we can take that for granted, not just because guys like Lukas and Axel are going to run tons of experiments, but also because I wouldn't think we're too far, in calendar time, from some basic systems being able to survive on their own. There will be people trying to put those things out there, and then there will be selection pressure for those that get a toehold. So I think we're on a path where we should assume there will be all kinds of autonomous agents: possibly some working toward long-term goals, understood or not understood, good or bad, objectionable or whatever, but also probably some that have simply evolved into filling a niche and surviving. We don't think of animals, rightly I think, as having high-concept long-term goals, and yet they do manage to survive in their little niches. So we should expect that kind of thing to come online. I was struck again by the paradigm being very anchored in things that we know, and not really being prepared, and this is not a fault; it's very hard to do, and I don't have the answers. I loved both those conversations, but the tail risk here seems quite large: that the assumptions you're working with will just not hold within 24 months, and it'll all be washed away, like so many sandcastles have been over time. I think that's an uncomfortable reality, but it's what we have to be prepared for, and at least try to figure out how to grapple with, if we're going to bring this whole AI phenomenon to heel in any meaningful sense, and have it serve us in any meaningful sense.
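To make the learned-surrogate point above concrete: a minimal sketch of replacing an expensive physics simulation with a model fit to its outputs, trading exactness for speed. The "simulator" here is a stand-in function, not a real field solver, and the polynomial fit is an arbitrary choice for illustration.

```python
# Fit a cheap surrogate to a slow simulator's outputs, then query the
# surrogate at inference time, orders of magnitude faster per call.

import numpy as np

def expensive_simulation(x: np.ndarray) -> np.ndarray:
    """Stand-in for a slow physics solve (imagine minutes per call)."""
    return np.sin(3 * x) + 0.5 * x**2

# Run the slow simulator offline to build training data.
rng = np.random.default_rng(0)
x_train = rng.uniform(-2, 2, size=200)
y_train = expensive_simulation(x_train)

# Fit a cheap polynomial surrogate (degree chosen arbitrarily for the sketch).
coeffs = np.polyfit(x_train, y_train, deg=8)
surrogate = np.poly1d(coeffs)

# Compare simulator vs. surrogate on a few test points.
x_test = np.linspace(-2, 2, 5)
print(np.c_[expensive_simulation(x_test), surrogate(x_test)])
```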
Yeah, I think these kinds of conversations were probably more well-defined maybe 12 or 18 months ago. But now you have the models able to code, and Mythos starting to show; I would say Mythos is better than all but maybe a thousand humans in the world at finding bugs. George Hotz said this thing where he's like: look, I can find zero-days easily; it's just that there's no economic necessity. You can make so much more money building something useful to humanity, and meanwhile if you go out and hunt for a zero-day, the remuneration is not that much, maybe ten grand for a zero-day, and to use it you put yourself into all this legal jeopardy, so it's just not worth your while. And I think what he ignored is that you now have 20 million George Hotzes applied to the problem, where before you couldn't even afford George Hotz to come do your security, you know, white-hat hacking.

I do differ with you on what Sergiy is doing, though, because I feel like: it's not as though we don't have calculators, but we still started off asking the models to do simple math questions. And right now, when the model wants to do a calculation, it brings out Python or Excel or something else; it doesn't bother to process it internally within the LLM, which is structured really for language and reasoning. In that way, what Sergiy is building is a plug-in that the LLM, as an orchestrator, may end up using, because it's just a more efficient way. What Sergiy is really doing is trying to get to a Maxwell-equations answer without solving the Maxwell equations: trying to get to the final partial-differential-equation kind of solution for this very complex tangle of lines going onto the PCB, without doing the supercomputing task of simulating millions of little interactions between all of these things. And I think the models may end up using that anyway. They're not going to solve the Maxwell equations internally; they already don't; they run Python or something else anyway. So in that sense, what Sergiy is building, and what AlphaFold and all of these things, which are primarily scientific differential-equation solvers in some sense, will just plug into an orchestrator in the end. I think the AGI, in that sense, really is that kind of orchestrator, which can use all of these tools without necessarily doing the calculation internally.

Well, I certainly think it's going to start that way, but I would point to image generation as an interesting counterpoint that at least shows where this could go. Because in today's world we don't see a language model existing purely at arm's length from an image generation model and prompting it purely through text; we see the unification of the visual and language latent spaces. And I have a hard time seeing why not, though there's obviously a timeline question.
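To make the earlier "plug-in orchestrator" picture concrete: a minimal sketch of an LLM routing subproblems to specialized tools rather than computing internally. The `call_llm` function and the tool registry are hypothetical stand-ins, not any real product's API.

```python
# The LLM decides which tool to invoke; specialized solvers do the heavy
# lifting (a calculator, a PCB autorouter, a structure predictor, etc.).

from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy only
    "pcb_autorouter": lambda spec: f"routed board for spec: {spec}",
}

def call_llm(prompt: str) -> str:
    """Stand-in for a model call that decides which tool to invoke."""
    # A real model would emit a structured tool call; we hardcode one here.
    return "pcb_autorouter: 4-layer, 120 nets"

def orchestrate(user_request: str) -> str:
    decision = call_llm(user_request)
    tool_name, _, arg = decision.partition(": ")
    return TOOLS[tool_name](arg)

print(orchestrate("Lay out this board for me"))
```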
And my general philosophy is to try to reckon with the possibility of shorter timelines; if we have more time to deal with these things, that'll probably be good, we'll take it. But why wouldn't it be the case, as compute grows exponentially, that at some point all these latent spaces get joined together in some deep, non-arm's-length, truly integrated way, where the model can both reason about Maxwell's equations, recite them, and call a calculator to run a certain version of them, but also have an intuition that is potentially really powerful and kind of alien to us, natively operating in that space? You can imagine a world where, in the same way that I kind of know where my arm is, an AI just has an intuitive, non-verbalized sense that this trace won't work but this other one will, and it just feels it, based on everything it's learned and all the reinforcement it's gotten.

Similar to human intuition, where we might not do all of the calculations, but we make predictions which, if we did try to calculate them, would be horrendously complex; we make an educated guess anyway, and we kind of get there.

Right. Catching a baseball is the other example I always go to. You're obviously not given the luxury of time to compute all the forces on the ball, but you can just reach your hand up and grab it, or at least most of us can. So it's clearly possible to have that sort of intuition, where at the crack of the bat I kind of know where I'm going. I see that happening in an ever wider number of domains. And to me that's the most likely form of superintelligence: outstanding reasoners, quite likely superhuman reasoners in many respects, but combined with that deep intuition of what will and won't work, sensed at a glance, across all these domains, from circuit board design to materials science to protein folding to, if I perturb a cell in a particular way, what's the next state of the cell going to be, to dozens and dozens more. That feels to me like where we really create something that is a qualitatively different kind of intelligence, and chain of thought goes away as a way to understand it. You'd better hope that it's being forthcoming with you in its chain of thought, because it doesn't necessarily need to be.

There's really interesting work recently from Google about different architectures and how much work they can do internally before they have to externalize their thinking into the chain of thought. The transformer is in some ways good there, because, as opposed to a state space model, it doesn't have a long-term internal state that can keep updating indefinitely; it has a finite context, and there are only certain causal paths through which data can influence the next token, so it has to externalize, and that's great. But notably it doesn't have to externalize everything: the nano-banana model does not have to externalize how it's going to come back at you with that next image; it just spits it out, and then you're looking at it, and there it is.
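A toy contrast behind that externalization point, with hypothetical classes standing in for real architectures: in a transformer-like model, anything "worked out" must appear in the visible context to influence later tokens, whereas a recurrent or state-space model carries hidden state no monitor gets to read.

```python
# Not real architectures: a minimal illustration of where the "thinking" lives.

class ContextOnlyModel:
    """Transformer-like: next output is a function of visible context only."""
    def step(self, context: list[str]) -> str:
        return f"token conditioned on {len(context)} visible tokens"

class HiddenStateModel:
    """SSM/RNN-like: an internal state updates indefinitely, unobserved."""
    def __init__(self):
        self.hidden = 0.0  # opaque to any outside monitor
    def step(self, token: str) -> str:
        self.hidden = 0.9 * self.hidden + len(token)  # private accumulation
        return "token conditioned on hidden state"

ctx_model = ContextOnlyModel()
print(ctx_model.step(["think", "step", "by", "step"]))  # reasoning is legible

ssm = HiddenStateModel()
for t in ["plan", "quietly"]:
    ssm.step(t)
print(ssm.hidden)  # the "thinking" lives here, outside any chain of thought
```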
So yeah, I really can't get off of that, I guess, in terms of why I expect some of this stuff to be so hard for us to keep a handle on.

It would be very interesting to see it operating something like retail, because I have some knowledge of retail and of the number of strategies I've heard of. For example, one strategy in fast-moving consumer goods is to go and get goods which are about to expire, about to hit their sell-by date, from larger stores, and move them to smaller stores. The smaller store can often move the goods faster, because it sells in smaller chunks, so it buys at a discount from the larger store: if the sell-by date is two weeks out and the larger store can't get rid of the stock, the small store buys it cheap and vends it in smaller chunks. Because retail margins are so thin, there are a number of strategies people use that you're not going to learn in business school; it's really small-scale vending. In business school it's: you have capital, you have margins, go do this. You don't go through the process of: how do I grind out a 1% larger margin?

I don't know whether vending will be the first place you see it, though. I've always imagined it would happen in financial trading first, or cybersecurity, where it's kind of happening right now; but I've always imagined financial trading.

Certainly financial trading offers very fast feedback and verifiable outcomes, in a way that programming does but not too many other things do, so it does seem like a very good candidate. I guess the challenge there is that it's probably the most secretive domain in the world. I mean, it's surely happening to some degree, right? I don't know what Jane Street is doing, but they're definitely training lots of neural nets. How much has this already happened, with people just keeping their strategies close to the vest? I assume it's got to be significant, but this is one big blind spot for me, because I've had a hard time finding anyone who wants to talk about it on the record.

A lot of what Renaissance and Jane Street do is actually fairly standardized models and algorithms. But they have a number of advantages. Number one, they have a latency advantage, because they always co-locate with the exchange. And the latency advantage has been in play for more than a hundred and twenty years. People used to try to get a latency advantage over telegrams: you'd have the horse rider going one way, you'd send the telegram, the telegram would arrive first, and the pricing would change on the other side before the rider got there. So this latency-advantage game has been built out over the years.
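Backing up to the near-expiry retail strategy for a second, here is a toy version of the unit economics. Every number is invented for illustration; the point is only that the embedded discount is where the margin comes from.

```python
# Buy soon-to-expire stock at a discount from a big store, sell it through
# in small chunks before the sell-by date; unsold units are a total loss.

wholesale_price = 2.00   # big store's normal unit cost
discount = 0.40          # big store dumps expiring stock 40% off
buy_price = wholesale_price * (1 - discount)  # 1.20 per unit
retail_price = 2.50      # small store's shelf price
units = 200
sell_through = 0.85      # fraction sold before expiry

revenue = units * sell_through * retail_price
cost = units * buy_price
print(f"margin: {(revenue - cost) / revenue:.1%}")  # ~43.5% vs. thin normal margins
```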
On latency, I think the next one, perhaps already here, is Starlink, because with low-earth-orbit satellites you can potentially get a message from London to New York faster than through the underwater cable; again, you need a bunch of things to line up. And that latency advantage means that even if you have the best algorithms, even if the model is exceptionally good, you can't beat it, because the other person is seeing your cards before you play them. For me, that demarcates how much profit the agent can really make: there's an amount of profit in the sub-one-second range that I don't think the agents would ever capture without the co-location. That blocks you off. Besides that, there's a lot of data cleaning that the Renaissance and Jane Street guys do; that's why they hire PhDs to do really nitty-gritty data-cleaning work, because you need to understand that this data is going to have real impact on the financials, and you can't mess it up. And then finally you have the selection of signals and the market making itself, the AI-assisted or algo-assisted market making. People spend a lot of time on "they have exceptional algos" and not a lot of time on the infrastructure, the data cleaning, and all the other stuff that has to come together for a successful firm.

What would be interesting is if at some point the models, or these model companies, start to have their own co-location or their own trading arms. To some extent Google DeepMind had one; Demis was starting down that path, but Google headquarters didn't like it, because you could argue Google would have overwhelming advantages in predicting stocks using all the data they hold internally. Facebook too. But going into finance makes you very regulated, and raises all these questions: where's the Chinese wall? What can people see, what are they not allowed to see, are your systems segregated, are they segregated enough? And financial regulators are not technically that sophisticated, so they ask for things which are very clearly demarcated. They're like: I want your entire group to move to another building. And people say, look, we're already segregating the devices and all the data, why do we need to move to another building? And the regulator doesn't care: I want you in a different building, I want you in a different business unit, I want you to have different financing, and if this unit is regulated, no one in this unit can talk to that unit. Financial firms exist as a function of that regulatory process, and to that extent I don't think the AI firms want to submit themselves to it yet. I doubt some of these model companies could clear the bar on questions like: does the model have inside information? You don't know. Did it, in fact, read some? Was it trained on inside information? Was it trained on MNPI, material non-public information, at some point? You can't say for sure, and that brings up a whole host of questions. So perhaps finance; but I think vending is actually easier. It's easier to take on Amazon than it is to take on Jane Street. You have the same infrastructure and information problems, but it's a much less regulated market than finance.
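To put rough numbers on the London-to-New-York latency claim above: light in fiber travels at roughly two-thirds of c and cable routes detour, while LEO laser links run at roughly c in vacuum but add up-and-down hops. The distances, detour factor, and orbit altitude below are approximations, not measured figures.

```python
# Back-of-envelope one-way latency: undersea fiber vs. a LEO satellite path.

C = 299_792.458        # speed of light in vacuum, km/s
GREAT_CIRCLE_KM = 5_570  # approx. London-New York great-circle distance

fiber_path = GREAT_CIRCLE_KM * 1.15   # assumed routing detour factor
fiber_speed = C * (2 / 3)             # refractive index of glass ~1.5
fiber_one_way_ms = fiber_path / fiber_speed * 1e3

sat_path = GREAT_CIRCLE_KM + 2 * 550  # crude: surface path + up and down at ~550 km
sat_one_way_ms = sat_path / C * 1e3   # inter-satellite laser links at ~c

print(f"fiber: {fiber_one_way_ms:.1f} ms, LEO: {sat_one_way_ms:.1f} ms")
# roughly 32 ms vs. 22 ms one way: the edge the speaker is pointing at
```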
How do you think about more macro strategy, though? I don't know a lot about this, but my general sense is that there's high-frequency trading, where the latency issues you described matter a lot and are a big part of who wins and loses. Then there's information: the more differentiated information you have, the bigger the advantage in any strategy you're playing. But then there's this other end of the spectrum, which is slow-moving. To take the canonical example, Buffett and Berkshire Hathaway don't time their trades to microseconds; they take very long walks and think deep thoughts, and then they decide what big bets to place. And I do wonder if we're seeing that start to happen, or if we will. Probably a global-macro AI would trade more than a Berkshire does, but it still seems like there might be room there. I would guess, and tell me if you think this is wrong, that there's already a shift underway where all the big firms, since it's an obvious enough thing to do, are training large neural nets on all the data they can get their hands on, and driving more and more of their strategic decisions via the predictions of a model. Is there a reason to think that wouldn't already be fairly far along in today's world?

I think every firm always tries, and typically, one of the things is that the market is a multiplayer game, not a single-player game. Number one, there are certain profit pools available at every latency and at every size. It's not the same profit at the Buffett size as on the high-frequency side. Buffett's profits are in the long run and at much larger size. But he also has a problem deploying capital at this point: he's got a hundred and fifty billion dollars on the balance sheet, he's very unhappy with the choices he has, and he's hanging on to that capital waiting for a proper market downturn before he can deploy. He's already capped out at his size; he's having difficulty finding investments at that size. And any firm that gets to that size faces the same problems: you have a large pool of capital, and you perpetually end up buying high if you buy when market momentum is good, and you have to wait long periods for momentum to turn down before you can deploy large amounts of capital at pricing you like.

And I'm sure the models do assist in decision-making, but I'm not sure they have enough context, because there's a lot of human context in the market: a lot of sensing when someone else is going to play and when they're not. If you're going to do a merger, if you're going to try to buy a company, you have to know who else might bid against you.
In the United States, at every capital size, there's a limited number of players. If you're going to do a ten-billion-dollar investment, there are only seven or eight players in the U.S. that can make an investment that size or larger. If you're an investment bank, you know who all the players are, and you know the dynamics of who's talking to whom. I'm not sure whether investment banks have CRM or ERP systems for this, but I doubt all the knowledge of a managing director with a 20-year relationship with the head of KKR is fed into them. I'm not sure whether Elon has a specific banker at Morgan Stanley that he likes, and whether that banker was working at DOGE and got pulled out of DOGE to work on the SpaceX IPO. There are all these human pieces to it, and the models will get there someday, if you have full context, 24/7, on every single one of the players. Yes, the models will eventually get there, but at this point they're not quite there, and the players on the field make very human decisions which aren't captured in pure pricing metrics. Elon wants people who will hold the shares longer; he wants people who won't sell immediately; he wants people who commit to being there for the long term, so he's willing to take lower pricing. He's willing to offer it to retail even if other bidders are higher; he wants to place it among the Tesla fan base. There are all these intentions that people express through the process, and I don't think the models capture all of those quite yet. Eventually they might, but not yet.

And for macro, that's where the human play comes in. People are much more concerned about their own egos and their long-term strategy. Once you have ten million dollars, you're not really asking whether you'll get another hundred grand by screwing over Elon; it doesn't matter anymore. You have reputational risks and other concerns, and in fact people who do screw others over in these iterated games have bad things happen to them. One of the reasons, I think, that Lehman Brothers went under is that in a previous instance, Lehman refused to participate in a bailout of another firm. Hank Paulson remembered that, and he said, well, we're not going to bail Lehman out; Lehman can go do what they do. And Lehman failed. Dick Fuld always said, look, this was a personal issue; it's not that Lehman should have failed. Lehman could have been saved the way Goldman was saved by Buffett, but Paulson was unwilling to back the firm as Treasury Secretary. So there are a bunch of these things that are very human, very personal, at the larger sizes, and in macro you can't just make a macro bet at the larger size: there are all these human negotiations; it's personal. Buffett went into banking: he refused to back Washington Mutual, but he decided to back Goldman, because by the time they got to Goldman, he knew Hank Paulson was Treasury Secretary and he knew Goldman would get bailed out.
So before he went to Goldman he had that sense, and then he put the money in. I don't know whether he had discussions, perhaps not, but he had some idea that Goldman would at least get bailed out. So there are all these things that aren't captured yet; it's all tacit knowledge. I think it's the same with PCB layouts: all this tacit knowledge. And the economy is particularly difficult, because there's no case where you can compare the same event under different circumstances. Every single event is unique, and your actions in this event affect the actions people take in the next event, in the next period of time. It's tough; time series are tough. Let's see what happens.

When I hear all this, let me see if I understand it. One way to parse what you're saying is that there are a lot of human barriers to adoption at existing firms. There's also some scale at which you're not just a price taker but actually a market mover, and that is inherently a challenge for a big-data, blind-optimization approach. But the flip side, I'd argue, in the vein of "your margin is my opportunity," is that all the things you're describing define the opportunity for smallish-to-moderate-sized funds to work in a very blind way that doesn't care about reputation, because you can't really punish a purely neural-net-based trading algorithm. I mean, we have AIs that beat people at poker; we have superhuman no-limit hold'em players. And if we have that, why are they so good? One reason is that they don't fall into the same bias traps and predictability traps; they don't hold a grudge against some other player at the table that moves them off the ideal strategy. So I hear all those things as both why it might be slow to happen and why these strategies can win when they finally do come online.

I think we will get there, but right now I have difficulty with context; it really is a question of capturing the entire context. And I don't know what the endpoint is, because we're already transcribing a lot of meetings, so that process has started. Eventually Meta's glasses or Apple's glasses or whatever will capture even more, and you can get sentiment analysis from a face: you can see whether someone is disturbed or angry or excited, so there's a lot of data you can get there. All of that data can be processed and can yield useful signals in business, but we're still a long way from the extent of data capture that might be necessary. So I don't know how we get there without the data capture; that's what I'm saying. Like I said, I'm sure the algorithms that will define the future already exist, and the compute for that future already exists, but the data collection and the context that's necessary isn't there. It's not there in cancer drugs either; we just don't have the data. We have the algos, but the algos can't be fed without the data. That's the issue: the full context is not yet there.

So in other words, too much information is private.

Yeah: unrecorded tacit knowledge, not captured anywhere.
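A minimal sketch of the context-capture pipeline gestured at above: transcripts plus coarse sentiment become structured records a model could actually consume. The keyword-counting sentiment scorer is a toy standing in for a real model, and every name here is hypothetical.

```python
# Turn raw meeting speech into (speaker, transcript, sentiment) records.

from dataclasses import dataclass

POSITIVE = {"excited", "agree", "great", "committed"}
NEGATIVE = {"disturbed", "angry", "concerned", "walk"}

@dataclass
class MeetingRecord:
    speaker: str
    transcript: str
    sentiment: float  # -1..1, toy score

def score_sentiment(text: str) -> float:
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos == neg else (pos - neg) / (pos + neg)

def capture(speaker: str, transcript: str) -> MeetingRecord:
    return MeetingRecord(speaker, transcript, score_sentiment(transcript))

rec = capture("counterparty", "We are excited but concerned about pricing")
print(rec)  # sentiment 0.0: mixed signals, which is itself a signal
```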
That's kind of why a lot of these jobs require an apprenticeship. You start with a college degree in economics or banking or business, you join a firm, and it takes a two-to-five-year apprenticeship under someone to figure out what's really important in the market and what's not. You figure out that whatever the Wall Street Journal tells you is the final word, not the initial word, and that you're in the process before that final word gets published; you're acting prior to the final word, so to speak. If you've already read it in the Wall Street Journal, it's too late; it's already done. There's all this pre-publication stuff you need to learn at the firm. If we can capture that apprenticeship process in data, then I think you can start to migrate some of this decision-making into the models. And it may happen very quickly: it may just be that the model all of a sudden says, oh, I remember everything now and I can learn anything, so put me in, coach, put me in the room, let me in for five days and I'll understand everything and I can help. It could be that simple: we clear the hurdle in the next 12 months, and that's it, it's done, and we skip this whole nitty-gritty data-collection and data-cleaning process. Could be. Even the long timelines have gotten very short.

I think this week might bring the next release; OpenAI has been very quiet post-Mythos, and there have been signs from the Codex team that they can beat the Mythos SWE-bench numbers. Let's see.

All right, well, we'll be back before too long, and I'm sure there'll be no shortage of things to talk about.
Morning, morning, sun come creeping slow
News on the wire you need to know
Man who built it signed a dread
Same hand bringing what he said
One in twenty, all lights out
That's what they own self talking 'bout
Ring of power, watch the good man turn
Ain't no flame too wise to burn

Don't pick up no stone, pick up the phone
This world's bigger than you alone
Don't look up, don't look away
You're just learning right to be a friend
Light the signal, not the flame

Some will walk right through the fire
Trade the comfort for the higher
Some will close their eyes and laugh
Till the silence comes and the light of the day
Till the silence cut them in half
Same short clock, same short time
Every choice now gonna show your mind
Match in the hand don't mean you break
Hunger for the truth is what you said

Don't pick up no stone, pick up the phone
This world's bigger than you alone
Don't look up, don't look away
You're just learning right to be a friend
Light the signal, not the flame
Light the signal, not the flame

Right to rip, walk the peaceful line
Talk to the neighbor across the sea brine
Alignment scholar bubbling in stream
Every honest hand a part of the team

Don't pick up no stone, pick up the phone
This world's bigger than you alone
Don't look up, don't look away
Every one of us got work today
Light the signal, not the flame
Light the signal, not the flame
Work today
Work today

The Cognitive Revolution is part of the Turpentine Network, a network of podcasts which is now part of A16Z, where experts talk technology, business, economics, geopolitics, culture, and more. We're produced by AI Podcasting. If you're looking for podcast production help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at AIPodcast.ing. And thank you to everyone who listens for being part of the Cognitive Revolution.