AI just got scarier

26 min

Apr 16, 2026

Summary

This episode examines Sam Altman's trustworthiness as OpenAI's CEO through a New Yorker investigation, revealing allegations that he contradicts his own safety-first messaging depending on his audience. It also covers Anthropic's release of Claude Mythos, a powerful AI tool for cybersecurity that's restricted to select organizations due to risks of misuse by bad actors.

Insights

Sam Altman's stated commitment to AI safety and regulation appears inconsistent with his actions, shifting positions based on political and business contexts, raising questions about the credibility of AI safety narratives
The concentration of power in individual AI leaders contradicts the stated governance principles of safety-focused organizations, creating structural vulnerabilities regardless of personal trustworthiness
Powerful AI capabilities designed for defensive purposes (cybersecurity) are inherently dual-use and difficult to control, forcing companies to make difficult decisions about restricted access versus eventual public release
The AI industry faces an 'arms race' dynamic where safety commitments are undermined by competitive pressures, with companies recruiting talent on safety promises they later abandon
Restricting access to powerful AI tools to 'trusted' organizations is a temporary measure; similar capabilities will likely be available publicly within 12 months, making current gatekeeping efforts primarily about gaining a head start

Trends

AI safety rhetoric being used as recruitment and regulatory strategy while actual practices diverge from stated commitments
Dual-use AI capabilities forcing difficult decisions between innovation, security, and democratic access
Competitive dynamics in AI development undermining industry-wide safety standards and governance frameworks
Government-AI company relationships becoming transactional and conditional on favorable regulatory treatment
Cybersecurity becoming a primary use case and competitive battleground for large language models
Restricted access models for powerful AI tools as temporary gatekeeping before inevitable public availability
Increased scrutiny of individual AI leaders' character and decision-making as proxy for organizational trustworthiness
AI-powered cyber attacks intensifying, driving demand for AI-powered defensive tools in an escalating arms race

Topics

Sam Altman trustworthiness and credibility
AI safety governance and regulation
OpenAI leadership and strategy
Claude Mythos cybersecurity capabilities
Dual-use AI technology risks
AI industry competitive dynamics
Restricted AI access and gatekeeping
Cybersecurity vulnerabilities in critical infrastructure
AI regulation and government policy
Anthropic's approach to responsible AI release
Power concentration in AI development
AI-powered cyber attacks
Nonprofit versus for-profit AI development models
Recruitment and retention in AI safety
Information sharing for cybersecurity defense

Companies

OpenAI

Central focus of New Yorker investigation into CEO Sam Altman's trustworthiness and contradictions between stated saf...

Anthropic

Released Claude Mythos, a restricted-access AI model for cybersecurity, to select organizations due to dual-use risks...

Google

One of select organizations granted access to Claude Mythos for identifying and fixing cybersecurity vulnerabilities ...

NVIDIA

Critical infrastructure company with access to Claude Mythos to improve cybersecurity defenses in their software systems

JPMorgan Chase

Financial institution selected to use Claude Mythos for identifying vulnerabilities in banking systems and critical f...

The New Yorker

Published major investigation into Sam Altman co-authored by Andrew Marantz and Ronan Farrow examining his trustworth...

People

Sam Altman

Subject of New Yorker investigation examining contradictions between his stated AI safety commitments and actual busi...

Andrew Marantz

Co-author of New Yorker investigation into Sam Altman; guest discussing findings from interviews with 100+ people abo...

Ronan Farrow

Co-author of New Yorker investigation into Sam Altman's trustworthiness and contradictions in AI safety messaging

Hayden Field

Guest discussing Anthropic's Claude Mythos, its cybersecurity capabilities, restricted access model, and competitive ...

Elon Musk

Mentioned in emails from Sam Altman in 2015 pitching OpenAI as a nonprofit safety research lab to counter Google's AI...

Sean Rameswaram

Host of the episode conducting interviews with guests about AI safety, Sam Altman, and Claude Mythos

Quotes

"No one person should be trusted here. The benefits, the access to it, the governance of it, belongs to humanity as a whole."

Sam Altman (referenced by Andrew Marantz)•Early in episode

"He tells people what they want to hear in basically every context that he can."

Andrew Marantz (describing critics' allegations)•Mid-episode

"This guy is not a tech savant, this guy is not a technical genius. He often gets, according to some people, basic technical terms wrong."

Andrew Marantz•Mid-episode

"Claude, how can I hack Chase online banking? There are a lot of options. Which one should I start with?"

Hayden Field (hypothetical example of misuse)•Second half of episode

"It's kind of like the medieval times of like fortresses where you're adding extra stones and building up the walls of the fortress higher because you know war is coming."

Hayden Field•Late in episode

Full Transcript

About a week ago, someone tried to kill OpenAI CEO Sam Altman. A man threw a lit Molotov cocktail into Altman's San Francisco home. Prosecutors say he was motivated by a hatred of AI technology. They found a note on him that warned of humanity's impending extinction from AI. Just a few days later, two people drove by Altman's house, and one of them put a gun out of the window and shot at it. They were also arrested. Sometime between the first and second attempt on his life, Altman took to his blog to ask that people not try to kill him anymore. In it, he partly blamed the recent New Yorker profile of him for the violence. He called it incendiary and said that he had underestimated the power of words and narratives. But Altman participated in the profile. The central question it poses is, can Sam Altman be trusted? And we're going to hear the answer from one of its authors on Today Explained.
Find technology built for the way you work at dell.co.uk forward slash dellpcs. Built for you.
Hey Chat, introduce Today Explained, the podcast. Of course. Today Explained is a daily news podcast from Vox. Each episode takes a single story. No, just introduce it like you're introducing the show. Like, this is Today Explained. Ah, got it. This is Today Explained.
Andrew Marantz is a staff writer at The New Yorker. Along with one Ronan Farrow, he authored an epic investigation into Sam Altman titled "Sam Altman May Control Our Future? Can He Be Trusted?" We asked him for the abbreviated version of the answer.
The short answer is, definitely we talked to more than 100 people, and most of them have their doubts about whether he can be trusted and also, more importantly, whether we're in a stable structural environment that we even have to put so much stock into that question. Like, even the fact that we need to trust an individual with this much power is itself, according to a lot of the people we spoke to, extremely problematic.
I think one of the most grabby quotes in there is that someone says this is a man who is unconstrained by truth, which is I guess like another way of saying like he's a pathological liar. Is that going too far? Well, a lot of people said very, very pointed things like that, that and not only, you know, competitors, but people who still work with him actively, people who, you know, have all kinds of interrelations with him in terms of business. But we heard that kind of stuff again and again. And look, you know, there are also people defending him in the piece, but there were many, many people who used words like sociopathic, who used words like, you know, unconstrained by truth. And we tried to give the lay of the land, but this was something that kept coming up again and again. And again, it's relevant because of what the stakes are by their lights, by the lights of the people who are building this stuff. Their pitch right from the beginning, and Sam Altman's pitch specifically, was like, no one person should be trusted here. The benefits, the access to it, the governance of it, belongs to humanity as a whole. And it needs to be a safety first, nonprofit research lab that can only be run by people of the highest integrity. People who are uncorruptible, people who are not power seeking, right? So that was the standard that Altman and other co-founders laid out for themselves. And we were just sort of asking around to see if people around them thought they met that standard. The story that emerges from these critics, which again include competitors, but also include non-competitors, is that what they say is that he tells people what they want to hear in basically every context that he can. So, you know, there are people who are really, really into AI safety. People who basically think that AI is the most powerful and dangerous technology since nuclear weapons, and that it has to be handled with the utmost care. 
Again, this rhetoric was coming from people prominently, including Sam Altman right from the beginning. My worst fears are that we cause significant, we, the field, the technology, the industry, cause significant harm to the world. I think that could happen in a lot of different ways. It's why we started the company. Those people will say that, you know, they were given a recruiting pitch often to work for OpenAI, often for a big pay cut, and told, you have to work here because we are the safety focused lab. We are the good guys. We are the ones who will do this cautiously in a slow, circumspect way without playing into any race dynamics within the industry. And then, according to many, many people we spoke to, over time that was reversed. The race dynamics were exacerbated, accelerated, often disproportionately by Sam Altman and OpenAI. And according to these people, they felt just completely betrayed by that promise. That's also something you hear from not AI safety-pilled people, but from, you know, more traditional business investor types. I mean, this would be like, let's say you funded an organization to save the Amazon Rainforest, and instead they became a lumber company and chopped down the forest and sold it for money. You also get this from people in government, you know, people who were told, for example, under the Biden administration, Altman's posture was, please regulate us, you're not doing enough to regulate us, these executive orders don't go far enough. I think if this technology goes wrong, it can go quite wrong, and we want to be vocal about that. We want to work with the government to prevent that from happening. Longer term, as these systems become really, really powerful, I do think we will need some sort of international authority. And then, apparently, you know, Trump comes in and says, we're done with AI regulation, and according to these people, Altman turns around and says, great, what took you so long? 
Like, finally, we can get rid of all this regulation, and he has a quote that he says to Trump at a televised dinner. Thank you for being such a pro-business, pro-innovation president. It's a very refreshing change. So again and again, you hear these allegations of this pattern, like you say what you think will go over well in one room, and then you say something else to another room, and then you kind of just hope that you don't get caught in the contradiction. Another thing that occurred to me reading the piece was that Sam Altman, who people may think is some sort of like tech savant, is actually maybe more of just, you know, a business person. Yeah, we were told again and again, this guy is not a tech savant, this guy is not a technical genius. He often gets, according to some people, basic technical terms wrong. And now again, we didn't want to be sort of pearl clutching or totally shocked by this in the piece, right? Because there are many people who have been extremely successful in business without doing the technical innovations themselves. So there is this other category of executive that is not the technical founder CEO, but the business CEO. And so he saw AI as a big opportunity. He saw a business opportunity. Now that's controversial even just to say that he saw it as a business opportunity. Because again, one of the things we were told was that this was one of the big things that people felt misled by. Because remember, this was founded as a nonprofit safety research lab. And right from the beginning, we have emails, you know, Sam Altman writing to Elon Musk in May 2015. And he's not saying to Elon and to other potential investors, hey, this is a great investment opportunity. This is going to be a big business. What he says is the world will be unsafe if Google controls this all-powerful weapon. And he says, I want to start a Manhattan project for AI. I mean, if he's not motivated by money, the most obvious second choice would be power. 
And he's pretty powerful, right? Yes. And again, I mean, people have a wide range of views on how powerful this technology actually is. There are many people who still think that all this talk about AI being as powerful as nukes and all that is just sort of a hype, you know, a sort of standard business hype project to make the thing sound more exciting, make it sound more powerful, maybe try to engage in a bit of regulatory capture. I think that the view that it's all hype gets harder to defend as we see AI entering our lives in all kinds of ways. And it's not just in the sci-fi, sky-net kind of hypothetical ways, but in very tangible, measurable ways, you know, AI slop taking over all kinds of information channels. Banana, I need to go out for a while. Please watch my daughters. AI being involved, very interwoven into military technology. The future of American warfare is here. And it's spelled AI. AI increasingly being interwoven into all kinds of infrastructure, you know, energy, create infrastructure, all kinds of stuff. So, look, I think that this stuff is so almost impossible to conceive of. I mean, these hypotheticals involve space colonization. They involve nanorobots. They involve, you know, it sounds like sci-fi because it is based on sci-fi. But I would also just say just the fact that something sounds like sci-fi doesn't mean that it's not possible, right? We don't know yet what, how powerful this stuff could get or how dangerous that could be. All we know is that the people who are building it claim that it's so powerful that they're terrified to summon it into existence. And I think that's at least worth taking seriously. It's so interesting, you know, you set out to answer a question in this piece, should we trust Sam Altman? But ultimately, the argument you seem to be making is that Sam Altman from several years ago would have said you should just trust Sam Altman with this technology. Well, he did say that. 
In fact, one of the questions we asked him that was supposed to be a sort of gimme, easy question was, do you have an elevated moral responsibility as the leader of one of these big companies? And that was supposed to be an easy question because he, throughout his career, had said leading one of these AI companies comes with an extreme duty of care. I mean, I'm paraphrasing, but he said many versions of this, and precisely to your point, he often said, no one person should be trusted with this much power, including myself. And we should have more regulations, we should have more guardrails, we should have more democratic input. I mean, there were many, many ideas that he floated over the years precisely to this point, to try to ostensibly diminish the power of himself and a handful of other people who were in this race for the, you know, Ring of Sauron or however they put it. And yeah, those restrictions have not come into existence, but it's hard to tell where the good faith arguments stop or start if you can't trust that the arguments are being made in one way. There's no good faith in the first place.
You can read Andrew's Altman profile at NewYorker.com. The scary AI that threatens to upend our society isn't science fiction anymore. Anthropic, which was founded by a bunch of former OpenAI-ers, says that it's already here, it's called Claude Mythos, and we'll talk about it when we're back on Today Explained.
Support for the show today comes from Quince. Spring cleaning, it takes many forms. I might do some this weekend. Deep cleaning the kitchen? Done it already. Fixing broken appliances? Done it already. And the classic, cleaning out the closet? That's what I'm going to do this weekend. So if this season you're ready to reset your wardrobe with some quality, long-lasting pieces, you might want to check out Quince. Quince says they make high-quality wardrobe staples using premium fabrics like 100% European linen, organic cotton, and super soft denim.
Super soft denim, with styles starting around $50. They say their spring pieces are lightweight, breathable, and effortless. The kind of thing you can throw on and instantly look put together. That's the dream. Our colleague, Nisha Chittal, has tried Quince.
I love wearing sweater and jeans. I've seen similar styles of sweaters from higher-end brands for double, triple, or even more the price. So they do feel like really, really good value for your money. And the style and the quality feel similar to much more expensive comparable brands.
You can refresh your spring wardrobe with Quince. Go to quince.com.com for free shipping and 365-day returns, now available in Canada. Hi! Go to quince.com.com for free shipping and 365-day returns. Quince.com.com.
Support for Today Explained comes from HomeServe. Owning a home can be full of surprises. You could have one repair on your to-do list for months and then all of a sudden a pipe bursts and now, well, you have another repair, don't you? Repairs don't care about timing and they definitely don't care about your budget. HomeServe says that's where they can help. HomeServe says regular homeowners insurance doesn't cover a lot of the day-to-day wear and tear. Plumbing failures, HVAC breakdowns, electrical issues. Those you're often on your own for, but HomeServe lets you choose a plan for your needs and budget. When something on your plan goes wrong, you can just call their 24/7 hotline to start the repair process. For as little as $4.99 a month, you can help protect your home systems and your wallet with HomeServe against covered repairs. Plans start at just $4.99 a month. You can go to homeserve.com to find the plan that's right for you. That's homeserve.com. Not available everywhere. Most plans range between $4.99 and $11.99 a month your first year. Terms apply on covered repairs.
Support for the show comes from Dell. Remember Dell? Dell PCs with Intel inside are built for the moments you plan. Still, and the ones you don't.
Still. They're there for those late-night study sessions when you get to the cafe and there's no outlets. All that stuff. Dell is built to adapt to you. It's built with long-lasting batteries, so you're not scrambling for an outlet, and built-in intelligence that makes updates around your schedule, not in the middle of it. Find technology built for the way you work at dell.co.uk forward slash dellpcs. Built for you.
Hey, I'm Hayden Field. I'm senior AI reporter at The Verge.
Hayden, we've asked you here to talk about something you haven't tried probably, but it's Anthropic's new AI in the Claude family. It's called Mythos. Have you tried Mythos?
No. And the only people allowed to try it are a very select few organizations that they've greenlit because they deemed it kind of too powerful to release to the public due to cybersecurity risks.
And when we talk powerful, what kind of power are we talking? What's the mythology here?
Great word choice. Basically, Mythos is their newest AI model that they designed to just be a general-purpose AI model like any other. But what they realized when they were working on it was that it had these special skills that they didn't really anticipate and it was really, really good at cybersecurity.
Mythos excels at identifying weaknesses and security flaws in software which hackers could use in cyber attacks. It may actually be too good at its job, and Anthropic is worried about it falling into the wrong hands. It found high-stakes vulnerabilities in virtually every operating system. From banking to technology to companies.
So, some software that's been around for decades, Mythos found bugs in it, vulnerabilities that were critical, in just a few hours of digging through it. So, you know, that's pretty bad.
If you are using that as a hacker and just have a blueprint for like a list of every big gap in security and vulnerability on all these really, really high-profile systems, you're just going to be, you know, having a list of everything you could do to take those systems down or exploit data, all types of bad stuff. So, yeah, I mean, they realized that they better not release this to the general public because it could fall into the wrong hands, and they instead handpicked a select few organizations that are responsible for critical infrastructure to release it to so they could plug those gaps in their systems instead.
You've heard of many of the companies that currently have and are using Claude Mythos: NVIDIA, JPMorgan Chase, Google, apparently a few dozen more that build or maintain critical software infrastructure. We asked Hayden if those of us on the outside have any idea how it works.
Yes. So, since they built it as a general-purpose model, it kind of probably works like any other model in that, you know, you're using it and prompting it to flag all the vulnerabilities in your system. Maybe you're like Google Chrome and you're looking for, you know, specific niche parts of the browser that you think may have some vulnerabilities and you're asking specifically about that.
Hey, Claude, let's make sure the browser is safe from phishing and malware. Good idea. I'll shore up your defenses. Hey, Claude, keep our Chrome plugins safe. All plugins clear. Good to go.
You're basically prompting the model to flag all these really high-profile gaps to you in your security and then you're taking that and plugging it up on your own. So a hacker would actually use it in the same way if it fell into the wrong hands. They'd be like, yeah, tell me all the vulnerabilities here.
Hey, Claude, is there a back door into Amazon user accounts? Doing some shopping? Sure. Here you are.
And then they're going to take it off the platform and use that for something nefarious.
So, you know, it's basically about like who is prompting the system and what their motives are, but both, you know, would use it the same way, essentially.
This is what's kind of scary about the technology, I guess, is it's as easy as saying, hey, Claude, tell me how this banking system might be vulnerable. And then Claude thinks about it for a minute and it spits out a bunch of answers.
Essentially, yes.
Claude, how can I hack Chase online banking? There are a lot of options. Which one should I start with?
And do we know that the Googles and NVIDIAs of the world are actually using this technology?
Yes. So part of the reason that Anthropic released this is they wanted these organizations that they released it to, to report back on exactly how Mythos worked and what it did to plug up the vulnerabilities and the gaps in their system. So it's kind of like an information sharing thing.
It is essential that we come together and work together across industry to help build better defensive capabilities.
They're letting these companies use it to test out how well it does to plug up all these huge high-profile gaps. And then they have to report back to Anthropic about how it worked and then even maybe publicly release, you know, some of the like higher-profile fixes that it made.
No single organization sees the whole picture and can tackle this on their own. This is not going to be done as part of a few-week program. This is going to be the work of certainly months, perhaps years.
They can obviously do that in a high-level way so they don't give away any company data. But yeah, it looks like down the line we'll eventually get some sort of blog posts and things like that to kind of see how it affected their systems and how well it did in terms of, you know, plugging up these scary gaps.
How is Anthropic choosing who to share this technology with? Do we know?
Yeah, I actually asked them that and they said they're essentially, you know, just really looking for cyber defenders or companies that a lot of people depend on and that downstream it would be a huge issue if they got hacked in like any way, shape or form. So, you know, JPMorgan Chase is a great example. Anthropic has also offered this technology to the government. We don't know if they're going to take them up on it, but, you know, basically cyber defenders is how they're putting it. Like anyone, any company that has a ton of people relying on it, especially if it's a critical infrastructure company.
Funny you should mention the government, because the last time we talked about Anthropic on this program, I believe, was to discuss the fact that they had a rather ugly and public breakup with the Department of Defense. And now they're saying what, want to be friends again?
It's funny you mentioned that because a breakup is the best way to put it. It was super ugly, public, drawn out. It was just like a crazy rollercoaster soap opera situation. But yeah, I mean, Anthropic hasn't really changed its tune here. You know, during the DoD drama, they were saying, hey, we'd love for you to use this technology. We just have these two red lines that we're hoping you don't cross. We're willing to work with you on it. You know, we're willing to, you know, keep a dialogue going. But the DoD had a really hard line against that. They said, no, we're not letting you slide on those two things. We want to use it for any lawful use, anything we deem necessary. And that's it. We're not going to agree to these two red lines you had in your contract before. So this is them, I think, kind of trying to get back in good with the government. They're saying, hey, we have this powerful technology that can help defend agencies from cybersecurity attacks. Please use it.
And then I think, you know, they're probably hoping that they can use that to kind of get back in the good graces of the government and then, you know, fix some of the other drama that's been going on.
Do Anthropic's competitors have similar tools? Are they presumably working on similar tools?
Yeah, OpenAI is apparently working on a similar tool.
OpenAI has their own Mythos model. And just like Anthropic, they're not releasing it publicly. The company says the new model can help defend against cyber attacks, but warns AI is being used by attackers looking to cause harm.
Anthropic itself has said, you know, this isn't something that they deem they'll be in the lead on for too long. They think labs anywhere in the world may release this technology in the next three months, six months, 12 months. Like everyone seems to agree, it seems like, that sometime in the next 12 months, this is going to be out there. And so that's why they wanted to release Mythos now, it seems like, so that really high-profile companies and, you know, banks could plug up their systems, you know, get ahead of all the hacks that may be coming down the line when the similar types of technology are released, you know, to everyone in the general public, maybe months from now.
If this is so dangerous and there's so many potential risks, is anyone having a conversation of maybe just not releasing tools like this and just sort of shutting it down, keeping it internal?
That is a really great question. I'm so glad you asked, because not enough people ask whether an AI system should actually be released or used for certain things. It's kind of like right now we're seeing a lot of one-size-fits-all, like throw-it-at-everything type of integration. And a lot of times AI is not the answer for things. With this, though, I would say I haven't seen much dialogue around that because people tend to agree that it is something that's needed right now.
Since AI is already out there helping cyber attackers really step up their attacks. And we've been seeing that intensify over the past year. People seem to kind of agree that, you know, you need AI to fight AI cyber attacks, essentially. So basically, it's kind of like, you know, the medieval times of like fortresses where you're like really like, you know, adding extra stones and like building up the walls of the fortress higher because you know war is coming. That's kind of the sense I get when I talk to these experts about this. Like they know it's coming. It's just, try to shore up your defenses now so that you're best prepared.
You can read Hayden Field at theverge.com. Dustin DeSoto produced the program today. Jolie Myers edited, Gabriel Dunnetov fact-checked, and David Tatasciore mixed. I'm Sean Rameswaram and this is Today Explained.
Support for the show comes from Dell. Remember Dell? Dell PCs with Intel inside are built for the moments you plan. Still and the ones you don't still. They're there for those late-night study sessions when you get to the cafe and there's no outlets, all that stuff. Dell is built to adapt to you. It's built with long-lasting batteries, so you're not scrambling for an outlet, and built-in intelligence that makes updates around your schedule, not in the middle of it. Find technology built for the way you work at dell.co.uk forward slash dellpcs. Built for you.
Support for the show comes from Odoo. Running a business is hard enough. So why make it harder with a dozen different apps that don't talk to each other? Introducing Odoo. It's the only business software you'll ever need. It's an all-in-one, fully integrated platform that makes your work easier. CRM, accounting, inventory, e-commerce and more. And the best part? Odoo replaces multiple expensive platforms for a fraction of the cost. That's why over thousands of businesses have made the switch. So why not you? Try Odoo for free at odoo.com. That's odoo.com.