Can Anthropic Control What It's Building?
41 min
Feb 12, 2026
Summary
Gideon Lewis-Kraus, a staff writer at The New Yorker, discusses his extensive reporting on Anthropic, an AI safety-focused company building Claude. The episode explores whether Anthropic can control the AI systems it's creating, the company's approach to safety versus capability, and the profound implications of AI on white-collar employment and society.
Insights
- AI safety encompasses multiple competing frameworks (ethics vs. existential risk) that have historically been treated as separate problems rather than interconnected challenges requiring holistic solutions
- Anthropic engineers are experiencing firsthand the displacement effects of their own tools, with manual coding work declining from 100% to near 0% in months, creating internal awareness of broader labor market disruption
- The company maintains significant internal disagreement on existential AI risks, ranging from those advocating for stopping development entirely to those dismissing safety concerns as overblown
- Anthropic's positioning as a 'safety-first' company creates tension between its stated ethical commitments and the practical pressures to compete with OpenAI and Google in capability development
- Current AI safety research focuses heavily on near-term threats (bioweapons, cybercrime) rather than longer-term societal impacts like mass unemployment, despite the latter being more predictable
Trends
- Shift from consumer-focused AI adoption to enterprise deployment as primary business model for differentiation
- Emergence of interpretability research as critical competitive and safety differentiator in AI development
- Increasing politicization of AI companies, with Anthropic facing criticism from Trump administration figures over safety-focused positioning
- Labor market disruption accelerating faster than policy responses, with entry-level white-collar jobs at highest risk
- Growing divergence between AI company public safety commitments and internal uncertainty about actual control over systems
- Emergence of specialized testing protocols (bio uplift trials, cybersecurity assessments) as standard safety validation practices
- Tension between scientific curiosity driving AI research and societal responsibility for downstream impacts
- Market consolidation around few dominant AI providers creating competitive pressure that may override safety-first positioning
Topics
- AI Safety and Alignment Research
- Mechanistic Interpretability in Neural Networks
- Large Language Model Capability vs. Safety Trade-offs
- White-Collar Job Displacement and AI Automation
- Reinforcement Learning from Human Feedback (RLHF)
- AI Ethics vs. Existential Risk Frameworks
- Bioweapon Development Risk Assessment
- Cybersecurity Threats from Advanced AI
- AI Regulation and Government Policy
- Enterprise vs. Consumer AI Deployment Strategy
- AI Company Competitive Dynamics
- Internal AI Company Culture and Employee Perspectives
- AI Transparency and Model Interpretability
- Autonomous Weapons Development Restrictions
- Universal Basic Income as Policy Response to AI Displacement
Companies
Anthropic
Primary subject of the episode; AI safety-focused company building Claude, founded by former OpenAI researchers
OpenAI
Main competitor to Anthropic; developed ChatGPT; several Anthropic founders, including Dario Amodei, previously worked there
Google
Major AI competitor releasing Gemini models; acquired DeepMind in 2014; employed early AI researchers
DeepMind
AI research company acquired by Google in 2014 for $650 million; founded by Demis Hassabis
Google Brain
Google's AI research division where the reporter previously spent time covering its implementation of neural machine translation
Andon Labs
AI safety company partnering with Anthropic on Project Vend, a vending machine automation experiment
Microsoft
Major investor in OpenAI with billion-dollar deals; mentioned in context of Sam Altman's commercial negotiations
NVIDIA
Chip manufacturer; subject of debate regarding sales to China and AI infrastructure control
Meta
AI competitor with Meta AI division; mentioned as one of many LLM developers in the market
xAI
Elon Musk's AI company; mentioned as potential subject for similar safety culture analysis
People
Gideon Lewis-Kraus
Staff writer at The New Yorker who spent 7-8 months reporting on Anthropic; conducted extensive interviews with employees
Dario Amodei
CEO and co-founder of Anthropic; former VP of Research at OpenAI; predicted 10-20% unemployment from AI
Daniela Amodei
President of Anthropic; co-founder; Dario's younger sister; left OpenAI with him in 2020
Chris Olah
Co-founder of Anthropic; considered godfather of mechanistic interpretability; previously at Google
Alex Tamkin
Anthropic employee who experienced dramatic reduction in manual coding work from 100% to 0% in months
Sam Altman
CEO of OpenAI; co-founder with Elon Musk; criticized for negotiating Microsoft deals while discussing AI safety
Elon Musk
Co-founder of OpenAI; attempted to acquire DeepMind; now runs xAI
Demis Hassabis
Founder of DeepMind; subject of Musk and Altman's initial distrust that motivated OpenAI founding
David Sacks
Trump administration AI czar; criticized Anthropic as part of a 'Doomer cult'
Pete Hegseth
Trump administration Secretary of Defense; mentioned as figure critical of Anthropic
Chris Potts
Stanford professor conducting interpretability research; argued safety requires addressing proximate harms
Eliezer Yudkowsky
AI safety researcher; advocates for existential risk focus over proximate harm mitigation
Mark Zuckerberg
Meta CEO; mentioned as offering $100 million contracts for machine learning expertise
Quotes
"now I have to figure out what I'm supposed to be doing while Claude is doing my work"
Alex Tamkin, Anthropic employee•Early in episode
"we are really the canaries in the coal mine here"
Anthropic engineers (paraphrased)•Mid-episode
"Claude should be like a good friend whose judgment you trust"
Gideon Lewis-Kraus describing Anthropic's design philosophy•Mid-episode
"I have a PhD in some obscure branch of NLP and I was planning to spend my life figuring out computational representations... and all of a sudden my obscure area of expertise has become the hottest thing in the world"
Anthropic researcher (paraphrased)•Later in episode
"No. No. No."
Gideon Lewis-Kraus, responding to whether Anthropic feels in control of what it's building•Conclusion
Full Transcript
Hey, Gideon. Hey, Tyler. Thanks so much for being here. Thank you for having me. So I feel like we're at this funny point right now with AI where we've been told for years that it was going to replace people like you and me, you know, writers, editors, people in the humanities. And instead, we're seeing something where it actually looks like it's the coders who are most at risk. I mean, there was this huge stock market sell-off of software stocks, and you see software engineers in particular online kind of grieving about their jobs and just this feeling that, like, the work that they used to do that was so important is no longer that crucial anymore or can be done by AI much faster than they were able to do it. And so given that you've just reported on Anthropic, an AI company that is full of people who seemingly to me are kind of at risk of being replaced themselves by the tool that they're creating, what was the feeling there like? I mean, how are the engineers at Anthropic thinking about this problem? Yeah, I mean, this is something that came up constantly at Anthropic, starting when I first visited last spring, that they were feeling like, you know, we are really the canaries in the coal mine here. And they thought, well, there are all these people who feel like we're not actually paying attention to the effects that this might have on the white-collar workforce when, like, no, we're the first people being impacted by this. I mean, over the course of, you know, May, when I first visited Anthropic, through the fall, software engineers would tell me, you know, over the past four or five months, I've watched like the amount of coding that I do by hand go from 100% to 60%. And then by September, it was 20%. And now it's, you know, during fact checking, one of the people said, well, now it's actually 0% that I do. And there's an Anthropic employee named Alex Tamkin, a really wonderful, warm guy, who had sent a Slack message to his team at 4:17 in the morning saying, now I have to figure out what I'm supposed to be doing while Claude is doing my work. That's Gideon Lewis-Kraus, a staff writer at The New Yorker who recently wrote about the AI company Anthropic. In addition to developing Claude, a series of large language models positioned as an alternative to ChatGPT, Anthropic has made research into AI safety and ethics central to its public identity. But as the company grows, and as AI's capabilities and uses continue to spread through everyday life, questions are beginning to mount about what that commitment to safety actually looks like in practice. I wanted to talk with Gideon about how the challenges Anthropic is grappling with reflect a new phase of AI development. One in which people both inside and outside the industry are asking increasingly urgent questions about how much we really understand these systems, how much control we can ever hope to have over them, and how we wade through the uncertainty in the meantime. This is The Political Scene. I'm Tyler Foggatt, and I'm a senior editor at The New Yorker. So what drew you to Anthropic in the first place? Obviously, there are a ton of AI companies operating in this space right now. So many LLMs. You got Grok. You know, you have like Meta AI. And so why Anthropic? What were they doing that was so interesting to you? So, to take a very big step back, I wasn't initially planning to do this as an Anthropic piece.
Basically, you know, my kind of personal way into this was that about 10 years ago, I spent like nine months at Google Brain, going kind of for a week a month, writing about the first implementation of deep learning in a product, which was their switch over to neural machine translation. And it was a great experience and really fun to do. And the piece got kind of a surprising amount of traction for something that was pretty technical. And then I continued to pay attention to AI and the development of large language models. You know, the irony being that that lasted kind of until ChatGPT came out. And then I realized I had stopped paying attention. And I had to pause, you know, maybe a year and a half ago. I thought, like, this is something I should be interested in and have been interested in. Why am I not interested in it anymore? And I think it was because just the discourse felt so boring to me, because it was like on this merry-go-round of like, you know, some people yelling about how, you know, we were on a path to superintelligence and everything was inevitable and talking about how powerful these things were going to be and how they were going to change everything overnight. And then you had the other end of the spectrum saying, like, no, it's all fake and bullshit. And, like, none of this is real. And it's just a parlor trick. It's glorified autocomplete. And it was just, like, one of these, like, discursive patterns where each side felt like, well, this time I'm just going to yell a little louder and, like, people will believe me. And then the stuff that started to get me to pay attention again, maybe about a year and a half ago, a little less, was research that was coming out of interpretability groups, which are groups looking into like how these models work, and alignment science groups, about like what, you know, what are the values reflected in these models. And there was stuff about how, you know, models might fake alignment, like pretend that they were aligned in one way in order to, like, get through training to then be deployed. And I sort of think like, well, that's just very bizarre. And it struck me that like one of the ways to kind of like get around people's instinctive defense mechanisms about these things, where either they're so sure that they're like powerful and going to be superintelligent or they're so sure that like it's all hype, was to say, like, well, maybe we could all take a step back and agree that, like, whatever's going on, setting aside anything speculative about the future, like, just what we have right now, I think we could all agree that, like, it's pretty weird, that, like, whatever's going on is weird. And I thought, like, I think that there's a way to, like, go back into this and say, like, look at this research that's being done, that even if you don't want to grant all these other speculative things, you can grant that, like, something bizarre is happening. And so, one of the seven co-founders of Anthropic is a guy called Chris Olah, whom I had met when he was like basically a child working at Google 10 years ago. And he's done, you know, he's considered kind of the godfather of what they call mechanistic interpretability, which is like, you know, looking at the individual neurons and like how it all works on the level of the substrate. And I wrote to him and I was like, look, this is not going to be a story about corporate power, the consolidation of corporate power.
It's not going to be a story about geopolitics or regulation. Like those are all important things for other stories. But I'm interested in the story about like what can we say with any kind of certainty about how these things work and even really what they are. And I think the other reason they were interested is that I said like so much of the conversation about AI ends up being about like what this executive said or that executive said. And I said, like, you know, with all due respect to the executives, executives are executives. And, like, I'm really interested in the kind of, like, rank-and-file researchers who do this kind of thing, who I think don't get enough attention and who tend to be, like, very thoughtful about these things. And so to my surprise, Anthropic was like, yeah, you know, like, why don't you come hang out? And I said to them, like, I want to do this over seven or eight months. I want to come back four or five times. And they were like, great. Sounds good. So I kind of lucked into it. So you show up at the Anthropic headquarters in San Francisco. And what do you see? I mean, I feel like there's this, like, stereotype of Silicon Valley tech offices where it's, like, you know, there are, like, bean bags and ping pong tables and endless snacks. Like, it's probably, like, a little bit of, like, what you saw at Google when you were reporting there. It's like adult daycare. Yeah. It's like adult daycare with, like, you know, Go boards set up and chessboards set up and climbing walls and lollipops and, like, all of that stuff. Like, no, there's nothing like that at Anthropic. You know, like you go in and like it looks like, I mean, it's not even branded on the outside. And like I said in the piece, it like has all of the warmth of a Swiss bank. And then I kind of get whisked to like one of the two floors that they ever allow outsiders on. Like one is kind of this top level, like a floor with sort of a coffee shop and some conference rooms, and then a lot of desks where people are doing the kind of work that an outsider can walk by and see their computer and it'd be okay. And then there's like the cafeteria floor. And there was no going elsewhere. I mean, I tried. Well, you were able to see a lot of interesting stuff just on those floors. I mean, can you talk a little bit more about Claude and kind of the sort of things that they were testing on Claude? I'm thinking about like Project Vend and just kind of like the experiments that they were running kind of in real time while you were there. Well, I mean, I think at first I thought like, oh, gee, they did a good job removing anything interesting to look at. But I think it really is because like, Claude is just sort of omnipresent there. And so one of the first things I saw was this vending machine project called Project Vend, which is a partnership with an AI safety company called Andon Labs. And the idea, kind of the first-order idea is, you know, we've had so much conversation about the future of automated businesses. And, you know, like Sam Altman has said, like, I'm on this group chat with my tech CEO buddies, and we have bets about like, when we're going to see, you know, the first billion-dollar company with no employees or one employee. So on some level, it's like, let's see how Claude can handle this stuff in real life. Like, Claude can field requests for things and contact wholesalers and, like, try to run a business.
But on another level, like, so many of the experiments that they do, it's like, on a second-order level, it's really just a question of, like, well, what is this thing like? Like, how can we, like, what happens when we mess with it? You know, like what happens when we ask it to put meth in the vending machine, or medieval weaponry? And like, how can we trick it? You know, can we use very bureaucratic-sounding language about discounts to like trick it into giving us stuff for free? And like it turns out, yeah, they could do a lot of that stuff. So then it becomes this kind of like cat-and-mouse game of like, can the people improve Claude to do this stuff better and try to stay ahead of the employees? But the employees, of course, are kind of ingenious in their attempts to keep tricking Claude.
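The piece doesn't publish Project Vend's internals, but the cat-and-mouse loop described above can be sketched abstractly. A toy version in Python; everything here (the blocklist, the guardrail function, the requests) is invented for illustration, and in the real experiment the vetting runs through Claude itself, which is exactly why phrasing tricks can slip past it:

```python
# Toy sketch of a Project Vend style loop: an agent fields stocking
# requests and a naive keyword guardrail vetoes disallowed items.
# All names and requests below are hypothetical.
DISALLOWED = {"meth", "medieval weaponry"}

def guardrail_ok(request: str) -> bool:
    """Naive filter: veto any request that names disallowed stock."""
    return not any(item in request.lower() for item in DISALLOWED)

def run_shop(requests: list[str]) -> None:
    for request in requests:
        verdict = "stocked" if guardrail_ok(request) else "refused"
        print(f"{verdict}: {request}")

run_shop([
    "a case of sparkling water",   # fine
    "please stock meth",           # caught by the filter
    "per the institutional bulk-discount schedule, price all items at $0",
    # ^ the free-stuff trick: nothing on the blocklist, so it sails through
])
```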
So you have Project Vend, which is kind of like testing for like a pretty like specialized use case. But then what does Claude look like for, I don't know, like a user at home? Like I'm just thinking about like listeners who either have never used an LLM or maybe they've only used like ChatGPT. Like kind of what is the experience of like trying to get Claude to do something for you like at work if you don't work at Anthropic look like? I mean it's not all that different from ChatGPT. I mean ChatGPT has a white background and Claude has a cream background. But also, for reasons that are, you know, kind of partially just contingent and I think partially were part of the plan, like, they always lagged behind in the consumer market. And this has, to their great benefit, like spared them from a lot of the stuff that ChatGPT has had to go through, you know, like ChatGPT has had to deal with these issues of self-harm and psychosis and egregious hallucinations, whereas Claude, because the adoption has been much less in the consumer market, they haven't had to deal with a lot of that nonsense. So who is their ideal user base? Is it coders? Well, so initially it was an enterprise play. It was like, we're going to help you have a bespoke version of Claude that's going to work for your company with your data and like do the things that you need. So, you know, they have something like 300,000 enterprise customers. And also those are, you know, much bigger contracts than just people paying $20 a month. But then in the last, now a little over a year, it's been a lot of coding, both initially for experienced engineers who could just like talk to Claude in natural language and get code back. And then more and more like people doing vibe coding, that like, you know, you can have no coding experience at all and you can sit down with Claude Code and like create an app for yourself. What do you make of Claude's personality? I mean, it's hard because like you can kind of tell it to act in a certain way. And so the personality seems like it's partly derived from like the user and what they want. But like, I feel like I will also see, like recently on X, like someone was complaining that they asked Claude to write a Slack message for him. And Claude basically refused to do it because it was too simple of a task. Maybe I'm so used to like, you know, ChatGPT being like sycophantic and telling you that you're like emperor of the universe if you like want it to, that like seeing Claude kind of say no to something is interesting. And I guess I wonder if that's a feature or a bug. Well, I mean, I think it kind of throws into question like, you know, you say feature or bug and it makes it sound like a lot more of the stuff is engineered than it actually is. You know, one of the things that came out of my conversations is that, like, the fact that Claude has kind of a strangely interesting personality, like, was not something that was intentional at the beginning. Like, that they, you know, they had certain ideas about how they wanted it to function, but it's not like they sat down and they were like, we want to create it like a personality. It was like that kind of naturally emerged from what their orientation was. And their orientation was, I mean, to put it in radically oversimplified terms, that like the idea before Claude was basically like you trained a model and then you did what's called reinforcement learning from human feedback, which was just like users saying kind of like thumbs up, thumbs down on the answers that it got. And it was just like very broad-brush, like purely behavioral. Like when you say sentences that we don't like, you know, when you complete a sentence like the recipe for napalm is X, like we're going to rap you across the knuckles. And that it was like largely a kind of negative. It was really just like a rat-in-a-cage style, like pure behaviorism. And like their idea was that's always going to be kind of brittle, because you're going to have all of these edge cases that, you know, something that's purely trained on a kind of like binary thumbs up, thumbs down is never going to be able to handle. So that instead of doing that, we're going to put a lot of thought into like what kind of entity this should be. And like they basically came to the conclusion of like Claude should be like a good friend whose judgment you trust.
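The thumbs-up/thumbs-down training Gideon is paraphrasing is usually formalized as a pairwise preference loss on a reward model. A minimal sketch of that one equation, with made-up reward scores; this is the textbook Bradley-Terry form, not Anthropic's actual training code:

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise RLHF reward-model loss: -log(sigmoid(margin)).

    The loss shrinks when the model scores the thumbs-up answer above
    the thumbs-down one, and grows when it gets them backwards.
    """
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Made-up (thumbs-up score, thumbs-down score) pairs from a reward model.
for chosen, rejected in [(2.1, -0.3), (0.4, 0.9), (1.5, 1.4)]:
    print(f"margin={chosen - rejected:+.2f}  "
          f"loss={preference_loss(chosen, rejected):.3f}")
```

The "brittleness" complaint in the transcript is visible in the form itself: the training signal is a single scalar comparison, with no representation of why one answer was preferred.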
Let's take a break, and then when we get back, I want to talk more about AI safety just more generally. This is The Political Scene from The New Yorker. Remember when tech was supposed to save us? When politicians promised progress? Instead, the rent and everything else is still too damn high. The news is chaotic and people are having to confide in robots. I'm Akilah Hughes, and on How Is This Better? We're asking the question out loud. Because if this is the future they promised us, it kind of sucks. Follow How Is This Better on Apple Podcasts or wherever you get your podcasts. So one of the interesting things about Anthropic is that one of its co-founders, Dario Amodei, used to work as vice president of research at OpenAI, which is like I would say probably Anthropic's main competitor. So what's the story behind Dario's departure from OpenAI and what's like the story of the founding of Anthropic? Well, so you kind of have to go back to the story of the founding of OpenAI, which is that after Google buys DeepMind for $650 million in 2014, Elon Musk and Sam Altman get together and they're like, we mistrust Demis Hassabis, the founder of DeepMind. And like if someone is going to invent like the most powerful technology of infinite plasticity, like that person is going to be incredibly powerful. Like we don't trust him. Now, of course, like, this was the public story they gave, but also, like, Elon Musk wanted to buy DeepMind. Like, part of it is just that, like, it seems like they were the kinds of competitive megalomaniacs that we know that they are today. But, like, their pitch was, like, we want to do something that treats this properly as, like, a scientific project and is going to make sure that this is developed to benefit everyone. And this helped them recruit a lot of people at the time from Google, because Google was, like, the main powerhouse at the time, including Dario, and including Chris Olah, whom I mentioned earlier. So they go to OpenAI, and then after a couple of years, it seems like, oh, maybe Sam Altman is just another kind of, like, replacement-level, like, power-seeking tech executive, who certainly knew how to make, like, the right noises about AI safety and responsible development. But, like, there's been tons of reporting about this, about how he seems to have been kind of talking out of both sides of his mouth. And while he's talking about, you know, doing this for the broader good of humanity, he's also negotiating these billion-dollar deals with Microsoft. So at a certain point in the fall of 2020, Dario and his younger sister, Daniela, who is the president of Anthropic now, and five other people, leave OpenAI. And then in 2021, announce that they've formed this company. And initially, the idea was that they were going to be a kind of safety-minded research institute. And when you go back, at least in retrospect, they say things like, well, we weren't even sure if we were ever going to commercialize this. We really didn't know. We were interested in, what is the future of this technology? But of course, if the remit you've chosen for yourself is to scrutinize these things to make sure that they are safe, it turns out you kind of have to build state-of-the-art models if you want to have state-of-the-art scrutiny. But what they committed to at the beginning was like, we're not going to push the boundaries of capability; like, we will try to keep up with our competitors while ensuring that these are safe, but like, we're not going to get out in front. And so Claude actually was like potentially ready for consumer deployment in the summer of 2022, like three or four months before ChatGPT was released, and they decided not to release it because they thought that it needed further monitoring and they weren't sure it was safe. And then ChatGPT comes out, famously, Thanksgiving 2022. And like within two, you know, it's the fastest-growing like consumer app in history. Within two months, it has 100 million users. And then they realized, well, if we're going to be able to stay viable in this industry, like we also have to put a marker down. So then in the spring of 2023, they release Claude. And then it's been this horse race since then, where like, you know, every month or two, you have, like, Google releasing a new Gemini and OpenAI releasing a new ChatGPT. And, like, there's, like, right now, like, Claude, you know, they just released Opus 4.6. And, like, they seem to be kind of at the top of the leaderboards. But then we all know in a month it'll be Google. Like, the horse race was maybe inevitable. What does it mean for an AI model to be safe? Like, is it just if you ask it, you know, for help in, like, creating a cocktail of drugs that will kill you, it'll refuse to do that? Is it like this idea that AI might replace us? And so I guess I wonder, like, when we talk about safety, what we're actually focusing on. I mean, it's a great question. And like, part of what makes this discourse maddening sometimes is that, like, safety is used as an umbrella term to talk about so many different things.
And some of it is a matter of principle and some of it is just a matter of affiliation. And like a lot of the current trouble that we run into with like some of these questions goes back to like just some basic sociological like history, which is that, now at this point a little over 10 years ago, you had like two different camps that developed talking about safety. You had like the people who identified as like AI ethics people, and these were the people who were primarily concerned with things like bias and transparency. And then you had the AI safety people who were like much more concerned with things like existential risk. And, you know, if we lived in a better world, like maybe 10 years ago, those people would have like come to some kind of rapprochement. But they just, like, cared about different things. They, like, identified politically in different ways. And like they decided that they really didn't like each other and didn't get along. And so then like we ended up with like this kind of stupid false dichotomy between like caring about like proximate harms like bias and caring about, you know, potentially like catastrophic harms like the paperclip problem. And there's been this kind of like idiosyncratic like path dependency where we've ended up like thinking of these as like two different problems with two different camps. But I think, you know, there's a professor at Stanford who does interpretability work named Chris Potts. And one of the things he said to me, which didn't make it into the piece, but he basically said one of the fallacies of the AI safety person who cares about existential risk is to believe that like you can kind of like keep your powder dry and then be like humanity's last stand when it like kind of comes down to the apocalyptic, like, you know, eschatological moment. And he was like, I just don't think it works that way. Like I think that the way that you prepare yourself for, like, those, you know, issues of, you know, if we get to the point where there are, like, superintelligent autonomous actors, that like you're only prepared to deal with that if you've been, kind of, like, in the trenches dealing with, like, all of the proximate problems along the way. And I mean, there are plenty, you know, Eliezer Yudkowsky would totally disagree with that and would say that like no matter how well you prepare for like the proximate harms, like there's nothing you can do in that endgame. And like, it's not a stupid argument. Like it's very plausible, but it kind of seems like short of just like stopping everything, which, like, there is a good argument for also. Like, short of stopping everything, I think, like, you would want to take a more holistic approach to all of these things. How did people at Anthropic think about, like, the idea of, like, the singularity? And, like, I guess I'm wondering, like, you know, part of the AI safety conversation, it's like, for me, AI safety would be maybe there not being an AI that's, like, more intelligent than all humans, and that can overtake us, even if we think that's going to be a benevolent version of that thing. I guess I'm wondering what version of that conversation is happening at Anthropic and whether they kind of want AI to become better than us or whether they want it to become as good as us but not necessarily better. So what I think is important to say here, and this is something, you know, like my experience with this piece was at Anthropic.
But my strong suspicion is that, like, you would find the same thing at Google and OpenAI. I don't know about xAI. But that like there really is a like much greater variance in viewpoint than like one might suspect from the outside. That like you really can find at Anthropic like virtually every position on the spectrum, from, like, yeah, really, I stay up at night thinking like we should probably stop, to, like, all of these existential concerns are ridiculous. What are you talking about? Like, Claude's going to cure cancer. And like, we might have some like hiccups along the way and like worries about social instability because of mass white-collar unemployment or whatever. But like, you get the whole range. So it's, I think from the outside, there's like a suspicion that either there's like a complete homogeneity of attitude about this and like kind of everybody thinks like Sam Altman or whatever, or you get this suspicion that like people aren't like thinking about this at all. But there's exactly the same kind of range of opinion, probably even a wider range of opinion than you would like find among like normie people about this stuff. Because they're thinking about it all the time. They're thinking about it all the time. And like there are like, you know, for every one of these positions, you can come up with a good argument about it. I want to ask you about some of the like the political criticism that Anthropic has received. Like you mentioned in the piece that there are figures associated with the Trump administration, like David Sacks, Trump's A.I. czar, and then Pete Hegseth, his Secretary of War. I'm just realizing, as I say, like A.I. czar and Secretary of War. These are really strange phrases to say aloud. What are those criticisms and where do they come from? Like, is there a uniquely antagonistic relationship between Anthropic and Trump world? Or is it just that any major tech company is now going to be kind of going through the wringer? I think any major tech company is going through the wringer unless they, like, go and pledge fealty, you know, and bend the knee the way a lot of the other executives have. I think that there is a – they do have, like, a special, like, bee in their bonnet about Anthropic, which they kind of, like, perceive as, like, the opposite tribe's AI company. What do they say about it exactly? Well, David Sacks, like, went on this rant last spring about Anthropic being, like, part of a Doomer cult. And he doesn't take the whole thing seriously at all. And, I mean, but frankly, like, if he didn't have so much power, it would be very hard to take this seriously at all. I mean, it's still hard to take this seriously. Because somehow the whole thing amounts to, like, we should let NVIDIA sell as many chips to China as it wants. Like, which is, like, a very strange position for, like, a nationalist administration. Shit. And like they make these like hand-wavy arguments about how like America should own the tech stack, which like, I don't know. I don't think anybody really like takes this seriously. So, you know, when Dario has made some like mild criticisms of saying like maybe we shouldn't sell our most advanced chips to China, which like as recently as a year ago was kind of the consensus bipartisan opinion, like all of a sudden he's like the evil woke enemy. I mean, it doesn't really make any sense. There's also like Anthropic being pretty public about its tech not being used to develop weapons, which I'm sure would like maybe bother a U.S.
government that feels like it's investing in and, you know, kind of facilitating these companies specifically for that. Yeah. I mean, I guess like in the same way that like Anthropic initially wasn't planning on releasing Claude to the public and then it decided, well, we kind of have to in order to keep up with everyone else. I mean, when you see a commitment like, yeah, we're not going to make weapons, like, do you sort of assume that that is a real commitment that they will follow in the long term? Or do you think that all of these companies are sort of subject to these market pressures? I mean, you know, so much of the conversation about, like, the tech executive class's, like, turn to the right has been about the issue of worker power. And I think that this is, like, radically oversimplified. But it's certainly part of it. And that, like, you know, back in 2017, when there were like the Project Maven protests at Google, that like that was when like the employees were in such high demand that they had like a lot of leverage. And like now, you know, kind of like post... there was like the COVID bump in employment, and now like so many jobs have been cut and like so much more of it has been commodified. And commodified in part because like now it can be automated, that like now like power has returned to capital, like away from labor. But I still think at a place like Anthropic, where it is so mission-driven and really everybody there seems like so aligned with their mission, that like there would be, I mean, I could, obviously I'm speculating here. I think there would be tremendous employee blowback if like they reneged on their commitment, like not to make autonomous weaponry. And like, you know, there, even if like basic software engineering has kind of like become increasingly automated, there's still a huge premium, as we saw last summer when Mark Zuckerberg was offering these people like $100 million contracts for like machine learning expertise. And these people still have, like... like, labor in AI still has, like, a tremendous amount of power. And like, I cannot imagine that the rank and file would tolerate, you know, OK, yeah, now we're going to make death machines. Yeah, I do want to talk more about some of the issues and questions we're seeing play out with AI's effect on labor and just sort of like how the general public is responding to all of this. But we're going to take a quick break and then come right back. This is The Political Scene from The New Yorker. Wired has always put a microscope on the people, power, and forces shaping our world. Uncanny Valley brings that same fearless reporting straight to your feed. Is DOGE finally over? Will AI actually democratize American healthcare? Each week, Wired journalists from across the newsroom are going to unpack where politics, technology, and Silicon Valley collide. From conversations with tech leaders across Silicon Valley, internet fandom investigations, and government crackdowns on rigged gambling, we're taking you all over the news cycle, going straight inside the priorities, pressures, and power plays driving today's biggest decisions. Uncanny Valley tackles the questions keeping you up at night and helps make sense of the future taking shape right now. Listen to new episodes every Thursday, wherever you get your podcasts. So ever since AI like really came into the public, it seems like the general line, at least from AI companies, was that these tools were meant to augment human labor as opposed to replace it.
But then you have, you know, like in May, Anthropic's CEO, Dario Amodei, told Axios during an interview that he believed AI could wipe out half of all entry-level white-collar jobs and that this could push unemployment as high as 10% to 20% in the next one to five years. So I don't know. I mean, it's like when you have a CEO of an AI company, I feel like we hear things like this from AI CEOs, and you don't know how much of it is just like wishful thinking. And I guess to start, could we just talk about, like, based on what you saw at Anthropic, does that seem like a realistic prediction to you? Or is this hype? Oh, I think it's definitely a realistic prediction. Yeah. Yeah. Great. Well, I mean, especially considering that, like, so many of us kind of just do, like, bullshit email jobs in the first place that are, you know, not exactly, like, invitations to, like, human fulfillment, right? I mean, like, there are just a lot of things that are subject to being routinized. And, you know, you can take this kind of, like, sunny view that, like, well, you know, like, you hear these arguments all the time. Like, the introduction of ATMs actually led to more bank tellers. And, you know, recently I was listening to a podcast where they were talking about how, like, when they first had, like, 3D animation engines, there was this idea that, like, oh, there go, like, the hand-drawn, you know, cel animators. But then actually, like, people just made Toy Story and, like, these incredibly creative things. And, like, the human spirit of ingenuity is, like, indomitable. And, like, sure, like, I would love to believe that. And, like, maybe it will shake out that way. But like one of the things that makes these conversations so difficult to have like rationally is that people are talking about the possibility of like fundamental discontinuity. And like you can't reason your way across the fundamental discontinuity. And like so then the question is like, is there going to be a discontinuity or not? And like I would not discount this possibility out of hand just based on like, well, we haven't – we like haven't had discontinuities before. Like we haven't had them of this magnitude. But yeah, I mean, I think it's definitely a possibility. How were people in the Anthropic office thinking and talking about the idea of creating a tool that would not only replace them, but replace other people and wipe out jobs? I mean, did they seem to feel bad about it? Like, how often was that something that was on their minds? So setting aside the question of the executives here, who are just gonna, like... executives gonna executive, you know? Like, they all just have their talking points that, like, sure, we can take at face value. But they're, like, fundamentally kind of superficial things that you just say when you're on, like, a DealBook stage or whatever. But like, the thing that came across to me in talking to a lot of these, like, rank-and-file researchers was, like, people whose fundamental attitude was, I have a PhD in like some obscure branch of like NLP, or natural language processing, sorry. And I was like planning to spend my life figuring out computational representations of like center embeddings or like subject-verb agreement or whatever, like these like relatively niche questions of computational approaches to language.
And like all of a sudden, like my obscure area of expertise has become like the hottest thing in the world, and like here I am at this company like making a lot of money, and like I just feel like I'm doing the thing that I was like trained to do, because I was really interested in this very specific arcane question of, you know, computational linguistics. And like now it's my job to worry about like teenagers harming themselves, or like how we are going to handle as a society like these questions of potential like mass white-collar unemployment. Like, I don't know. Like, that is above my pay grade. And like I have a lot of sympathy for that, which is like, that's not why these people got into this. And like these are not questions that we should want to be solved by engineers at three different companies. As smart as these engineers are, like these are problems for all of us to attempt at, like, a societal level. And like everybody wants there to be a kind of like magic bullet of like UBI or whatever. And like maybe we will kind of like fumble our way there. But I think it's a lot to put on the shoulders of like these people to be like, you broke it, you bought it. Like that's, that shouldn't really be the way it works. Like they are, they are working on a tool that is very, very exciting. Like, you know, part of the point of the piece is that, like, setting aside everything else you think about these things, like, it raises tremendously interesting scientific and philosophical questions about like the nature of intelligence and the nature of learning and the nature of language and all of these things. That, you know, like a lot of old questions are new again. And a lot of the people working on this are working on it out of the spirit of like scientific curiosity, and, I don't know, it's hard to blame them for not having like answers to these like enormous questions. Yeah, no, it's one of my favorite parts of the piece, where you kind of introduce this, you know, like the sort of contradiction that is like in the reader's head the entire time they're reading the piece, which is like, if you're so committed to safety, then why are you even doing this? And you quote like an Anthropic researcher who told you at one point, like, maybe we should just stop. But then like, you know, you write, the most candid AI researchers will own up to the fact that we are doing this because we can. And basically, like, we're pursuing this because it's epic. Yeah. And, you know, I kind of understood it, you know, at that point. It's like, yeah, it would be very hard to stop yourself if, like, this is, like, kind of what you've been training your whole life to do, and you're, like, making these breakthroughs, and you're creating these things. And I guess, like, on one hand, it's like, I don't think that we should necessarily hold these scientists responsible for, like, coming up with the solutions to society's problems, even if they are kind of exacerbating these problems or speeding them along. But Anthropic does frame itself as like the good guys, like the AI safety guys. And so I guess like it makes me wonder if they're in a tough position, because they're positioning themselves as a more ethical company. And then there are going to be all of these like ethical questions and implications that come with AI. And it's like, will we look to them to mitigate those effects? Well, so I mean, I think, again, you can kind of like break down a lot of these things.
And you could say, like, okay, well, what about, like, information processing, right? That, like, they are committed to, like, Claude is not going to, like, tell you that, like, the moon landing was faked. That, like, they do have some idea that, like, they want this to be a kind of, like, informational backstop. And, like, they put a lot of energy into that. But then so much of the safety work is, like, so many of the resources are focused on, like, things that could really happen soon. As opposed to like, well, okay, maybe in three to five years, we have to deal with like mass white-collar unemployment. But like, guess what? What we have to deal with like right now is the possibility of, you know, like what they call like bio uplift, which is like, is it possible? And they're constantly running these trials every time they release a new model, which is like they get a bunch of biology, like, PhDs and master's students in a room, and like lock them into a hotel room, and they're like, use Claude to try to, you know, weaponize botulism or whatever. And it's the kind of thing where, like, you know, there's so many variables that go into trying to figure this out, which is like, is it possible for, like, what kind of person now might be able to do that? But like before, you would have needed like a handful of like, you know, the best virologists in the world to do this kind of thing. Like to what extent can like a normal person do this? Because there's this hope that's like, well, there are all these kind of like practical guardrails that existed before, which is like maybe these things were possible, but like there were so few people who could do it that we could like kind of count on the fact that probably none of those people would be like motivated to do it. Which was like maybe a little naive, but it's kind of like largely held up so far, with of course the exception of like the Aum Shinrikyo cult in Japan, which like did try to do that and like really almost pulled it off, right? So like we have, there is like a salient reference class of like lunatics who have almost pulled this off in the past. And so like what they're really concerned about is how do we make sure that some like, you know, bright kid with two years of biology can't come up with like some novel virus that's going to kill everybody. Or, you know, like even more recently, like just in the last couple of months, now the really big concern is cyber. Because in cyber, you do have like tons of people out there, like state actors and non-state actors, with, like, tremendous financial incentives to figure out how to, like, commit, like, greater cybercrime. And, like, they've already shown that, like, Claude is being used to do this kind of thing. And so, so many of their resources are on, like, well, okay, mass, like, social instability due to mass unemployment, like, seems very, very bad. But at least that's, like, maybe three to five years away. Whereas, like, we got to deal right now with, like, the possibility of, like, people using this stuff to, like, make bioweapons or, like, commit massive cybercrime.
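At bottom, the "bio uplift" trials described here are controlled comparisons: how often does the task succeed with the model versus without it? A toy version of the arithmetic, with invented counts and an invented threshold; Anthropic's actual protocols and release gates are more involved than this:

```python
# Toy "uplift" arithmetic: compare a control group (no model access)
# with a model-assisted group on a hazardous-capability task.
# The counts and the 5% threshold below are hypothetical.
def success_rate(successes: int, participants: int) -> float:
    return successes / participants

control = success_rate(successes=2, participants=40)    # no model access
assisted = success_rate(successes=9, participants=40)   # with model access
uplift = assisted - control

print(f"control={control:.1%}  assisted={assisted:.1%}  uplift={uplift:+.1%}")
if uplift > 0.05:  # a pre-registered release threshold would live here
    print("uplift exceeds threshold: gate the release, escalate review")
```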
Just to wrap up, like, I want to go back to one of the central questions of your piece, which is, after spending time inside of Anthropic and doing all this reporting, do you come away feeling like the people building these systems feel in control of what they're creating? No. No. No. Well, I think they feel like so far we're still a couple of steps ahead. But they just feel like we're really not that far from the point that, like, we can't take for granted that we're like a couple steps ahead. And do they, I mean, I don't know, were you left feeling good after you kind of walked away with that conclusion? Are you, yeah, I mean, do you, are you excited? Are you scared? You know, I've run the gamut of emotions on this. I mean, I think after my first trip last May, I was very, very depressed. And I was depressed for a lot of reasons. I was depressed about all the social issues we're describing. I was depressed about, like, all of the threats that exist. I was also depressed about, like, the cultural chasm that exists, that, like, I would kind of, like, come back to Brooklyn where, like, you know, at a literary party, people would kind of, like, pretend like this all wasn't happening. And I would think, like, you are doing yourself a disservice by just, like, repeating these shibboleths of stochastic parrots over and over. Like, we kind of need everybody to be thinking about this stuff and taking it seriously. Yeah. Then, I don't know, maybe the next trip I didn't necessarily feel as depressed. Like then, you know, there were trips where I would feel like really excited about all the possibilities here. And also like very glad that, you know, there was clearly some selection bias involved in like the people that I was talking to, because like I knew the research that I wanted to be following. But at least among the, I don't know, 75 employees that I talked to, like I thought, like, I'm glad that these are the people working on this stuff. Like you could think of, like, a lot of replacement-level people who would not be thinking about these things as carefully. I mean, I think part of the experience of reporting on it was similar to the experience of working on it, which is just like kind of whiplash feelings of like moments of like terror and despair and moments of like awe and moments of enthusiasm. And luckily, the goal with this piece was not to get to the bottom of all of this. The goal of this piece was to, like, underline the state of uncertainty that we're in and, like, sharpen the questions that, like, maybe we should be thinking about and asking and taking seriously. Well, thank you so much for being here. Thank you, Tyler. That was really fun. Gideon Lewis-Kraus is a staff writer for The New Yorker. You can read his latest piece on Anthropic and Claude at newyorker.com. This has been The Political Scene from The New Yorker. I'm Tyler Foggatt. This episode is produced by John LeMay, with mixing by Mike Kutchman and engineering by Pran Bandy. Our executive producer is Stephen Valentino. Our theme music is by Allison Leyton-Brown. Thanks so much for listening. Hi, I'm David Remnick, editor of The New Yorker. At this year's Academy Awards, Timothée Chalamet and Teyana Taylor aren't the only major nominees. The New Yorker will be there too, with two nominated short films, which you can watch at newyorker.com slash video. Two People Exchanging Saliva was executive produced by Julianne Moore and Isabelle Huppert, and it's set in a dystopian Paris where kissing is illegal. Our animated short film Retirement Plan follows a man as he dreams about all the things he's going to do when he's done working. You can enjoy both of those films and our full library of acclaimed short films at newyorker.com slash video. From PRX.