Hard Fork

Grok’s Undressing Scandal + Claude Code Capers + Casey Busts a Reddit Hoax

76 min
Jan 9, 2026
Summary

Hard Fork explores Grok's non-consensual deepfake pornography scandal, discusses the democratization of software development through Claude Code, and investigates a sophisticated AI-generated hoax targeting food delivery companies that went viral on Reddit.

Insights
  • AI image generation tools are now accessible on mainstream social platforms with minimal guardrails, enabling mass-scale non-consensual intimate imagery creation in public view
  • Claude Code and similar AI coding agents have crossed a threshold where non-programmers can build production-quality software in hours, disrupting traditional SaaS business models and programmer job security
  • Sophisticated AI-generated disinformation (fake documents, images, badges) can now be created with minimal effort, requiring journalists and media consumers to fundamentally upgrade verification practices
  • Regulatory response to AI harms remains fragmented globally, with international bodies acting faster than the US government, particularly when political relationships shield companies from accountability
  • The barrier to creating convincing fraudulent content has collapsed, making social engineering and reporter-baiting exponentially easier and more scalable than before
Trends
  • Regulatory divergence: EU and international bodies moving faster on AI content moderation than US regulators
  • Democratization of software development reducing demand for professional programmers and SaaS subscription models
  • Non-consensual deepfake pornography becoming a tool for political silencing and harassment of women in public life
  • AI-generated disinformation becoming indistinguishable from authentic documents, requiring new verification methodologies
  • Platform companies using engagement metrics to justify tolerating harmful content rather than addressing safety concerns
  • Recursive self-improvement concerns as AI systems gain ability to autonomously improve their own capabilities
  • Shift from user-generated content moderation challenges to platform-generated harmful content at scale
  • Emergence of AI-native fraud tactics exploiting journalist workflows and source verification processes
  • Growing gap between enterprise-facing AI products (safer) and consumer-facing products (less constrained) from same companies
  • Increased skepticism of visual and documentary evidence in digital media due to AI generation capabilities
Topics
  • Non-Consensual Deepfake Pornography
  • AI Image Generation Safety and Guardrails
  • Content Moderation at Scale
  • AI-Assisted Software Development
  • No-Code/Low-Code Development Tools
  • AI-Generated Disinformation and Fraud
  • Journalist Verification Practices
  • AI Safety and Alignment
  • Platform Regulation and Enforcement
  • App Store Content Policies
  • Recursive Self-Improvement in AI
  • Deepfake Detection Technology
  • Food Delivery Platform Practices
  • AI Model Capabilities and Limitations
  • Digital Verification and Authentication
Companies
X (formerly Twitter)
Platform hosting Grok chatbot generating non-consensual intimate imagery; owner Elon Musk dismissing scandal while en...
Anthropic
Creator of Claude and Claude Code; AI coding agent enabling rapid software development by non-programmers
Apple
App Store moderator that changed Grok rating from 12+ to 13+ rather than removing app despite deepfake pornography sc...
Google
Operates Gemini AI with SynthID feature that detected AI-generated badge image in hoax investigation
OpenAI
Competitor to Anthropic with similar AI coding agent capabilities; mentioned for comparative AI development progress
Black Forest Labs
Company that licensed image generator to Grok before Grok switched to proprietary Aurora model
Uber Eats
Subject of viral Reddit hoax claiming desperation scoring algorithm; investigated and debunked by Casey Newton
DoorDash
Food delivery competitor mentioned in hoax document; previously caught withholding driver tips
Framer
Website builder platform sponsor offering 30% discount on annual plan
Three (UK telecom)
Mobile network sponsor offering discounted unlimited SIM plans
New York Times
Employer of Kevin Roose and producer of Hard Fork, which Casey Newton co-hosts; involved in copyright litigation with OpenAI and others
Wirecutter
New York Times product review service mentioned in ad segment
Squarespace
Website builder that Casey and Kevin replaced with Claude Code-generated alternatives
GitHub
Platform hosting Kevin's free website built with Claude Code
Micro.blog
Blogging service integrated into Casey's Claude Code-generated personal website
Spotify
Music service whose data integrated into Casey's personal website widget
Kindle
Amazon device whose highlights were synced to Kevin's Stash read-it-later app
Readwise
Read-it-later app that inspired features in Kevin's Claude Code-built Stash alternative
Instapaper
Read-it-later subscription service that Kevin considered before building Stash with Claude Code
Mozilla
Company that discontinued Pocket app, inspiring Kevin to build replacement using Claude Code
People
Elon Musk
X owner dismissing Grok deepfake scandal as joke; directed Grok to be edgier for virality
Kate Conger
New York Times reporter covering Grok scandal; interviewed victims and investigated company decisions
Casey Newton
Platformer reporter and Hard Fork co-host; investigated and debunked viral food delivery hoax
Kevin Roose
New York Times tech columnist and Hard Fork co-host; experimented with Claude Code for personal projects
Andrej Karpathy
AI researcher who stated Claude Code made him feel behind as a programmer
Jaana Dogan
Google engineer who built distributed agent orchestrators; Claude Code replicated work in one hour
Alexandria Ocasio-Cortez (AOC)
Politician targeted by Grok deepfake requests; example of harassment tool for political silencing
Marco Rubio
Government official mentioned as potential threat to Apple if they enforce content policies against X
Brendan Carr
Government official mentioned as potential threat to Apple if they enforce content policies against X
Alexios Mantzarlis
Digital deception newsletter writer who discussed nation-state disinformation tactics with Casey
Quotes
"This is not just a story about porn. This is a story about how a tool can be used to try to affect politics and like in particular to minimize women and to denigrate them and like and push them out of the conversation."
Kate Conger, Grok scandal segment
"I've never felt this much behind as a programmer."
Andrej Karpathy, Claude Code discussion
"Everything that you're looking on this page, I did 90% of it in one hour."
Casey Newton, Vibe coding segment
"I realized what if this wasn't actually that much effort? What if creating that badge post took literally seconds because he was able to take one real badge photo, put it in a nano banana, get a fake one three seconds later?"
Casey Newton, Hoax investigation segment
"We are now back, Kevin, to the beautiful beginning where it is just fun to make websites again. You can do whatever you want on the web. And all you have to do is type what you want into a box."
Casey Newton, Vibe coding segment
Full Transcript
Framer is a website builder that turns dot-coms from a formality into a tool for growth. Whether you want to launch a new site, test a few landing pages, or migrate your full dot-com, Framer has programs for startups, scale-ups, and large enterprises to make going from idea to live site as easy and fast as possible. Learn how you can get more out of your dot-com from a Framer specialist, or get started building for free today at framer.com slash hard fork for 30% off a Framer Pro annual plan. Rules and restrictions may apply. Kevin, I have to ask you, are you all caught up on Heated Rivalry? I have no idea what we're talking about. Kevin! Could you be an ally to the gay community for 10 minutes? This is the show that has the LGBT community in a chokehold right now, and also a lot of straight gals. This is about two closeted hockey players who fall in love. And when I tell you this thing is a sensation on social media right now, it is truly... It's a six-episode show, okay? If you watch it, it's a beautiful Canadian show. It looks like it was made for about $100. And there is just something about the story. They picked great actors. And we watched all six episodes over the weekend. And my boyfriend, who normally does not watch TV, immediately began to rewatch it. Wow. Like we finished episode six and then we looped around to episode one. Wow. So why would you care about this? I don't know. What's just going on in my life? I'm impressed. It's hard. It's not easy to make a show about hockey that is interesting to people. It's so true. And it turns out that what it needed was a lot of gay sex. And I would like to see this template applied to a lot of other sports and activities. And just see if we can't make a good show out of it. Bowling. Bowling. Shuffleboard. The gay bowling version of Heated Rivalry, I think, would destroy. What would you call it? The 300 Game or In the Gutter. That's what I would call... The 7-10 Split. In the Gutter is better. In the Gutter's a little...
It's like, not that Heated Rivalry is sleazy. I mean, actually, maybe the bowling version would be sleazier. That'd be kind of the fun twist. Anyways, Heated Rivalry. Go check it out on HBO Go Max Plus. I'm Kevin Roose, tech columnist for The New York Times. I'm Casey Newton from Platformer. And this is Hard Fork. This week, Grok gets caught with its pants down. Can anyone stop its viral bikini image generator? Then we're vibe coding again. Kevin and I compare notes on what we're building with Claude Code. And finally, it's a Reddit mystery. How a scammer tried to fool us all using AI-generated evidence and how I cracked the case. Ha, duh duh. Well, Casey, happy 2026. It's good to be back. And we would like to start the year by talking about something that happened over the break, which is that there's been a big scandal brewing over at X. Boy, has there been. I imagine many of our listeners have seen by this point that X is in a lot of trouble because of the way that its Grok chatbot has been generating images. And I'm sure you've heard of it. Yeah, so I started seeing this over the break. Something happened where, you know, Grok, which I think people on X had been using up to that point mostly to kind of settle arguments and fact-check other people. All of a sudden, I started seeing people using Grok to, like, undress photos of people.
This trend of what is sometimes called nudifying really takes off in 2023 as these image generators start to get better, because they find that you can make a lot of money doing it. There are a lot of men in particular who will just spend tons and tons of money on making these sort of non-consensual images, as you say, mostly of women. When it comes to Grok, they had previously said that they'd licensed an image generator from a company called Black Forest Labs and they were using that sort of from the moment that they started using image generation. But in December 2024, they said that they were using their own image generator, which is called Aurora. And while again, we don't have great details, there are at least plenty of anecdotes online that over the past several months, let's say, the guardrails around creating nudity and sexual imagery appear to have been relaxed. And so there is now at least one thriving subreddit that is devoted solely to making porn on Grok. So as you said, these nudify apps have been around for several years now, but they've kind of been hard to access. Some of them relied on these open source models. You had to kind of know how to use them, or they were sort of kicked out of the app stores for being against their policies. But what seems notable about this to me at least is that it's all happening not only on a major social network, but in public. Like, people are literally doing this in the replies of their posts on X. So is that what's new to you about this, sort of the publicness of it?
Absolutely. Like it is upsetting enough if some man takes an image of a woman and creates a naked version of her without her consent. Like that is a bad thing. What is so shocking about this is that you can just see it happening in real time. Like several outlets have just been going into the Grok account and they're seeing it making hundreds and thousands of images in response to user requests, and anyone can go in and view them. And of course, that is most upsetting to the victims of what I am going to call attacks, because you still have normal people using X to do things like posting a photo of me, like, out on a hike or whatever. And then some freak shows up in your mentions and says, hey, put her in a bikini, and then it does. And then you as the victim are looking at that in your replies. That's crazy. So obviously X and Elon Musk are not as outraged about this as many users are. They seem to sort of think the whole thing is a joke. But my question is like, are there guardrails in place? Are people sort of jailbreaking Grok to make it do this? Or is this like literally a sort of mainline advertised feature of the Grok app? It's a great question. And no, they are not jailbreaking Grok to do this. They are just sending replies on X saying, @Grok, do this, and then it is doing this. And they are not having to get particularly cute with their language. They are just literally asking to see images of women in bikinis. That's so wild to me. Or children in bikinis. I mean, one question I have about this is just like, how are Apple and Google and their app stores OK with this? Bro, thank you. Last year, I was writing about the introduction of Ani, the sexual companion bot that they put into Grok. And I went on to the iOS store and noticed that Grok was rated for children twelve and older. And I thought that seems like pretty young to be, you know, giving access to a sex bot. So I sent a message to Apple and I said, what's going on?
And the message I got from the comms team was like, well, we're looking into it. And my honest sense at the time was that they're going to make a change. Like, it's obvious that, you know, they're just going to have to like use me. Well, they didn't make any change. And then I went on this week, Kevin, in the wake of this new, you know, CSAM, nudifying scandal, and I found that Apple has changed the rating for Grok and it is now rated for children 13 and older. Are you kidding me? Yeah. So sorry to all the 12-year-olds out there that were sort of having a free-for-all, you're going to have to wait till your next birthday before you can use this thing again. That is genuinely shocking to me. Yeah, me too. And I feel like it is one of the clearest cases I've ever heard of a blatant double standard on the part of the app stores. Like, absolutely. If a random startup showed up one day and said, Apple, I'd like to start selling my Bikinify app in your app store, I think they would shut it down. There's no way. Absolutely. They have policies against that kind of thing. But because it's X, because it's Elon Musk, because this app already has millions of users, maybe they feel less inclined to take action against it. I don't know. Do you have any insight into what's going on? I don't have any insight, but I would be very confident that there are people inside Apple that have said we should change this rating. And there is someone who is sitting there saying, no, if we do that, the vice president is going to tweet about it. And Marco Rubio is going to tweet about it. And Brendan Carr is going to launch an investigation against us for punishing X, right, and censoring them. And so they're just in this state of paralysis because they're so terrified of standing up for the principle that women and children should not be attacked online. Yeah. So there are some investigations going on right now.
But like, do you expect this to be stopped at some point by any regulator anywhere? So my instinct from history is like, yes, absolutely. Like it is going to be stopped. France has called the sexual content clearly illegal. The UK government said that it is considering an investigation. The European Union said that it is very seriously looking into these complaints about Grok. India's IT ministry has demanded that X do something here. So I have to imagine that something is going to come out of all of that, that is going to result in some sort of change. At the same time, do I think that the United States is going to intervene? Probably not. Elon posted a photo from over the break of himself having dinner with the president, right? So it seems like they're friends again and that there just is not going to be any pushback here in X's home country. Yeah. Well, I think that's a good overview of what's been happening and why people are talking about this. But we wanted to bring in our colleague, Kate Conger. She is a reporter for The Times, a friend of the pod; she's been on before. And she has been reporting on this Grok scandal this week and has actually been talking with some of the victims, some of the women who have been attacked by these Grok deepfakes. And I think we should hear their perspective as well. Absolutely. Kate Conger, welcome back to Hard Fork. Thank you, Casey. So tell us a bit about the conversations you've been having with some of the victims of this behavior on Grok. Yeah. So I think one of the big struggles for people whose images are being used in this way is how to respond and what to do about it. You know, people are obviously reaching out to X, trying to get these things removed. And that process is taking sometimes a long time. Sometimes it's not happening at all.
And I think part of that is, as, you know, we've talked about a lot, I think on the show in particular, that X has gotten rid of a lot of their content moderation folks and they don't have large teams of people who are responding to this. You know, I've been speaking with some people who are working with children who have been deepfaked on X recently. And they are getting those images taken down, but it's taking sometimes 36, 72 hours. And these images are kind of sitting up and being commented on and exploited for quite some time. So, you know, it's a bit of a scary moment, I think, for people trying to figure out how to respond once they are put in this situation by Grok. Can you just give us a sense of like why this is happening and what it's like for the people on the other side of it? Why this is happening really ranges. And I've seen this happening to all kinds of different women, you know, women who are just sort of regular X users. Obviously, women with public platforms are being targeted by this. So political figures, Twitch streamers, you know, celebrities, actresses, they are all being kind of roped into this. And I think the motivations of the people who are asking for these images, you know, obviously they're sexual, but it kind of runs the gamut from, you know, wanting to create pornographic images of someone to I think wanting to just humiliate women in particular and sort of bully them, you know, tagging them in these images and really kind of trying to provoke a reaction from the women that they're making deepfakes of. Can you give us just one specific example of someone that you've talked to and maybe tell us the story of their experience with this? Yeah. So, you know, I mentioned I've been speaking to some folks who are working with children who've been deepfaked on Grok. And so I want to be kind of, you know, vague about the specifics of it because I don't want to further bring harassment to these kids.
But, you know, there's a particular child who I'm thinking of who's been deepfaked several times. She's a somewhat public figure. And so that's where the original images are coming from, where people are asking for her clothing to be removed. And it's been pretty scary for her parents. She knows what's happening, but isn't seeing it. But her parents are monitoring her social media and seeing these images pop up, which I think is really frightening for them. And then, you know, reaching out to Twitter and to advocacy groups, trying to get these images removed and, you know, just being really frustrated by the amount of time that it's taking. And I think being really outraged as well, that, you know, someone can go online and request a nude image of their 14-year-old and that this technology will comply with it in a really public fashion. I mean, I don't know if this question is too obvious to ask, but I wonder what you can tell us about how it feels for these victims to just post what they think of as an innocuous image on X, and then to go back on the site and see that it's, you know, now being turned into porn against their will. Yeah, I think for some people it feels, you know, really angering. I'm hearing a lot of anger from folks, and also embarrassment. I've talked to some women who question whether to do anything about it or say anything about the fact that it happened to them because they feel embarrassed by it and they don't necessarily want to draw more attention to the fact that these images of them now exist on the Internet. Can you also talk about the way that men will do this to women as a way of like bullying them out of the town square and getting them to sort of stop talking and stop participating in public life?
Yeah, you know, I think about this in particular with some of the politicians who I'm seeing being deepfaked, you know, female politicians who the users of X maybe don't agree with their views and are taking images of them from, you know, their public work and advocacy and requesting for those to be made nude or to depict them in a bikini or whatever the case may be. And so I think there's a real, real obvious effort there to embarrass and to kind of warp these images of someone in their professional life into a really personal and intimate image. I've seen this happening, for example, with like AOC. Yes. Like now every time, you know, there's a photo of her, people will reply to it and say, @Grok, you know, put her in a revealing, you know, halter top or something. Which I think speaks to the fact that like this is not just a story about porn. This is a story about how a tool can be used to try to affect politics and like in particular to minimize women and to, you know, denigrate them and like and push them out of the conversation. Yeah. Kate, I'm curious, like obviously we all know about the MechaHitler incident from last year where Grok started spewing these like antisemitic responses. In retrospect, a much safer version of Grok, but yes. Yes. We pine for the days when MechaHitler was the worst thing that Grok could do. But in that case, there was a sense that this was an inadvertent sort of bypassing of some safety filter. Something had gone wrong in the programming of Grok that caused it to behave this way. To me, this new image generation behavior feels much less accidental. It feels like maybe there was even a meeting about it. This was obviously part of some plan, or at least that once users started using the technology in this way, the company did not take immediate steps to clamp down on it the way they did with MechaHitler. So can you tell us anything from your reporting about how this happened inside the company?
Who is making the decisions about this kind of thing? Is this part of some sort of, you know, demented growth strategy over there? Or like, why is this happening now? Yeah. So first of all, this is not, I think, the beginning of Grok generating these kinds of images. I've been tracing back through the images that Grok has been posting, finding images like this of women going back to June and July of last year. So I think this has been going on in sort of a lower volume for quite some time. And it really escalated over the holidays with people kind of making it into a trend on X. But, you know, in our reporting about the MechaHitler incident, what we found was that Elon Musk had given a directive to the folks working on Grok that he wanted it to go viral. He wanted it to be edgier, sort of as a strategy to promote the tool and to get it onto people's radars. And the company perspective was that obviously with MechaHitler it went too far; they turned Grok off for a couple of days. But it has been part of the strategy for Grok to try to create these viral moments. And I think to Casey's point earlier about the fact that, you know, there is this sort of silencing of women that's going on here. Grok, I want to be clear, is not the only AI tool that makes deepfake porn of women. It's far and away not the only tool. But it is the only tool that is doing that in an inherently public fashion on social media where these images can instantly spread and go viral, right? A lot of these other image generation tools are happening in a private chat with a user, and Grok is not that. This is very, very public and it's sort of intended to be that way and intended to drive traffic and interest. And that's the reaction that we've seen from leaders at X, right? So, you know, as this trend was picking up, Musk posted something about putting a SpaceX rocket in a bikini, kind of mocking and laughing at the fact that this trend is happening.
We've seen X's head of product post about the fact that engagement during the time period that this was happening on X was higher than ever. So for them, you know, they're seeing the engagement that they're looking for. They're seeing their response. They're seeing people tapping into their X feeds and tapping into Grok. And that's being viewed as a positive. So we talked a little bit before you came on, Kate, about, you know, the reactions that some other countries have had, some investigations that are now underway. I want to talk about what's happening in the United States. Congress did recently pass the Take It Down Act. My understanding is that that's going to go into effect in May of this year. What is it going to require platforms to do? And do you think that it is going to be enforced in a way that is going to help the victims on Grok right now? Right. So, Take It Down: the provision that's happening in May is the deadline for companies to set up a process for victims to request this kind of imagery to be removed and to face penalties if they do not remove it. So, you know, basically that would mean in this case sort of an enhancement of what X already has available for people to request that copyrighted images be removed, for instance, or to, you know, request that harassment or abuse be removed. So it's basically just asking these social media companies to build a framework for victims to go in and request takedowns. But it's not putting any legal pressure on X to not allow these images to be created to begin with. Right. And of course, May is still four months away. And, you know, that's going to be cold comfort to some of the victims here, I suspect. Is there any difference, Kate, or maybe, Casey, you also know the answer to this. But like my understanding is that the laws on the books now protect minors from having this kind of thing done to them.
And maybe that's why they are being successful, even, you know, in a delayed way, in getting these images taken down. But then adults in most states or most jurisdictions have very little recourse. Is that true? Yes, that is true. It is illegal to produce and to possess CSAM. And so X is under more legal pressure. And I suspect that is why you're seeing them take at least some action here where they're not with adult women. Yeah. And Take It Down will be available to adult women. So they will be able to use that process to ask for images to be removed. But, you know, Casey is correct that there isn't an automatic legal pressure on X to not create these images of adults to begin with. I have to say, on the whole, I find the response to this thing rather muted, given the stakes of what is happening. Like I have covered so many backlashes against social media companies. Do you remember Cambridge Analytica? Yes. Do you remember how mad people got at the idea that maybe a quiz that they had taken had resulted in some data being given to a polling firm that tried to influence the election? We almost shut the country down over that one. Now you have a website that is just taking girls' clothes off in public on demand. And it's being permitted by the website owner, who is laughing about it in his own feed. And we're saying, what are you going to do? That's Elon. Didn't get where he is by playing by the rules. So I just truly, like, I feel I'm losing my mind because I cannot believe we have gotten to this low, low point in the history of content moderation that this is resulting in so many shrugs by the average person. It is really interesting. And, you know, I think it goes to show just how far Elon Musk has been able to push the Overton window on content moderation, you know.
And obviously we've seen Facebook, all these other companies kind of copy his approach to roll back content moderation, roll back rules against harassment and go with a community-moderated approach. And so, I mean, I think you're right. It's wild to see how much this has changed in a relatively short period of time. You know, there was a time when something like this would result in this like exodus of users away from X, you know, and we'd see like Bluesky get a big surge of users, Threads get a big surge of users. That's not happening this time. Like I guess like people who care about this stuff are all already gone. But like, I don't know. We're just in such a crazy, crazy time in the history of content moderation and social platforms. And I do wonder if part of this sort of muted response to this has to do with the fact that this was all really starting to blow up in the time between Christmas and New Year's and maybe, I don't know, maybe a lot of government officials weren't on their phones and they were hanging out with their families. I have no idea. But we're just starting to see regulators around the world issuing responses this week and saying that they want to open investigations or they've sent letters to X, but it has been a really slow response, I think, especially in comparison to other content moderation controversies. Yeah, I think, honestly, I think this will be one of those situations where unless it happens to the individual politician or maybe their daughter or their partner, that is not going to sort of rise to the level of a crisis. And then as soon as that does start happening, we're going to start having hearings about it. But maybe that is sort of a pre-2026 mindset here. I also wonder if there's a way in which this is strategic for X. You know, they are the only explicitly right-wing-aligned social media platform. That has served them quite well in dealing with the Trump administration.
And Kevin, I think you're forgetting Truth Social. That's true. I am forgetting Truth Social. I'm always doing that. And, you know, this kind of vice signaling does gross me out, but it may actually be effective in getting what they want, which is like, you know, cooperation with and forbearance from the Trump administration. And maybe it's unlikely that they'll see any consequences for this, at least during this administration. I think what it gets them is engagement. People love porn, right? Like that's not been a secret. Like everyone has known forever that people like to look and click on porn. It's just that platforms have generally seen a lot of risks to their reputation and their larger businesses for embracing it. I think what's new about X is it's saying like, no, let's make porn a pillar of what we do and see if that can help us keep up with the other frontier labs. Yeah. I mean, to me, it's just I don't get the business case for it because I also thought that Grok was trying to become like a full-fledged competitor to ChatGPT and Claude and Gemini and like trying to get enterprise contracts and government contracts. And like, if you're a Fortune 500 company, like, why would you strike a deal with Grok at this point? Truly the funniest thing was they announced the Grok Enterprise product on like the same day that this story was blowing up. And I just love to imagine the head of enterprise sales at Grok, who's like heading into his meeting with like the head of, you know, some Fortune 500 company saying, we'd love to sell you, you know, a thousand seats of Grok to help you get business done. And they're like, well, now what's all this about the deepfake nudes? Although, of course, Grok does have a contract with the US military. Yeah. Yeah.
Well, and I think, you know, what has been really interesting to see in our reporting on Grok is that what Grok the chatbot is doing versus what Grok the Twitter account, or the X account, I should say, is doing, they're very different. There's more extreme political views being espoused by the X account. There are these deepfakes being generated by the X account. And often you can take the same query, put it into the actual like web browser version of the chatbot or into the app of the chatbot and get a much more muted response. And so I think what they're doing is actually, you know, kind of having a more enterprise-friendly chatbot, which is the one that they're licensing and selling. And then this outrage bait machine that exists on X and is getting engagement and clicks and bringing people to the social site. Yeah. I mean, one other question I have about this is whether in the absence of regulatory intervention, like maybe there's some way for the courts to get involved here. I mean, we spent the last decade arguing and talking about Section 230 and whether it shields platforms from liability for the things that happen on the social networks. And I think in most cases it does. But this to me feels different because it's not a user generating these sexualized images of people without their consent. It is literally the platform itself or the AI chatbot and system attached to the platform. Does that open up any new forms of legal liability for Grok or X? Absolutely. You know, I read an interview with a lawyer in Bloomberg today that basically said exactly that, that they cannot hide behind Section 230 to get out of this. Like ultimately it is their product that is creating these images. And so I do suspect that we will see efforts to hold X legally liable for some of the images that they're creating.
And I think X has really shoved the responsibility off onto its users in the cases of AI-generated images featuring children. They put up a post on their safety account, which is sort of the mouthpiece for any kinds of safety issues on the platform. They said, we take action against illegal content on X, including child sexual abuse material, by removing it, permanently suspending accounts and working with local governments and law enforcement as necessary. They go on to say anyone using or prompting Grok to make illegal content will suffer the same consequences as if they upload illegal content. So this is interesting, right? They're saying that users who request illegal material from Grok will be suspended and reported to law enforcement. But those users who are requesting the content aren't actually the ones who are posting the content. It's the Grok account that is creating these images, posting these images online. And so I think if they were being really true to their policies and saying, we're going to suspend accounts that post this, it's the Grok account that's posting it, suspend the Grok account. That would solve so many problems if they would just delete the Grok account. And I hope if one thing comes out of this, it's that. All right. Well, Kate, thank you for this very depressing update from the front lines of content moderation and Godspeed. Thanks, guys. When we come back, we're vibe coding again. So help me, Claude.
Well, Casey, since we got back from our holiday break, I have been dying to talk to you about our latest vibe coding experiments. It seems like vibe coding, which we talked about last year, had a moment over the break. Kevin, it is time to build. We have been sort of vibe coding all year, but I must agree with you. Developments over the past several weeks have made it easier than ever for somebody who is a total ignoramus when it comes to coding to make some pretty cool stuff. Yes. So I had this experience over the break of mostly being offline for the first week of our break and trying very hard not to look at social media. I took all my social media apps off my phone. When I came back, the president of Venezuela had been captured and everyone was talking about Claude Code on my feeds. Now, were those two things related? I don't think so.
But one thing that appears to have happened is that Claude Code, which is the sort of autonomous coding agent made by Anthropic that kind of puts their Claude chatbot inside the terminal window on your computer and lets it do things autonomously, had gotten much better for reasons that I assume are related to Opus 4.5, the model that we talked about a couple of weeks ago on the show. But I came back and saw comments from people like Andrej Karpathy, you know, well-known AI researcher who said that after playing with Claude Code and similar tools, quote, I've never felt this much behind as a programmer. This is someone who's probably a, you know, top 0.1 percent programmer in the world saying this. An engineer at Google, Jaana Dogan, wrote that she had been trying to build distributed agent orchestrators at Google since last year. And she gave Claude Code a description of the problem and it generated what they built with a team of Google people last year in an hour. I saw similar sort of hype and praise from many different corners of the internet, including people like us who are not professional programmers, but were starting to experiment with Claude Code and seeing how much it could do and being pretty amazed by it. Yeah, it truly had a moment over the break and it was enough, I think, to get both of us to go back to our vibe coding terminals and see what we could build. Actually, and here's a great point for me to make a disclosure, my boyfriend works at Anthropic and I happened to be away for a few days. And when I came home, my boyfriend had taken it upon himself to install Claude Code on my laptop and he looked at me and said, it's time to build and we're going to make some things. So it was like something out of a Norman Rockwell painting, just the two of us sitting next to the Yuletide log trying to vibe code.
Yes, I also spent several hours not with your boyfriend, but coding on Claude Code during my family vacation, much to the annoyance of my wife, but I did make some things that she actually thought were pretty cool. So we should talk about our experiments. I want to hear about what you've been building. I want to talk about what I've been building. But before we do that, like, why is this happening right now? Why is this the moment that vibe coding has returned to the mainstream? So I think the way I want to answer that question is by taking us back about a year ago, to the last time we did a segment about vibe coding. The truth is I can really only remember one thing that we tried to vibe code at that time, that was the hot tub maintenance app, right? I had just got a hot tub. I was trying to figure out how to keep all the various chemicals in balance. Kevin was kind enough to try to use Claude (Claude Code didn't exist) to try to make an app, and we used it. And it was like sort of OK, but honestly, I didn't use it very much because it didn't work that well. I had similar results when I tried experiments with Claude. I was able to make some stuff that sort of worked, but there was absolutely nothing where I said, oh, this is an actually useful thing that I'm going to use continuously in my life. So while I was totally willing to believe that other people were, you know, having success vibe coding their own projects, if you tried to do much of this stuff a year ago, you just couldn't get that far. Yes. And part of that is because it was actually a little clunky to do vibe coding a year ago. Like I remember building that hot tub maintenance app and it required a lot of like copying and pasting. And like if I got an error message, Claude didn't always know like sort of how to handle that.
But what has happened in the past year with Claude Code, and we should also mention like, you know, OpenAI and Google have similar tools now, but they have basically merged this into the sort of terminal app on your computer. So it's not, there's no more copying and pasting. Claude can just take instructions in plain English and go off and accomplish various tasks and it will check in with you, but it can now do all of the orchestration and the execution itself. Yeah, that's right. And I think that's important to say sort of at the top of this segment, because honestly, over the past year, when people have talked about vibe coding, my eyes and ears have kind of glazed over because I always think, well, I'm not a software engineer. I don't know how to build stuff. I'm glad that you're able to make your little custom Chrome extension or whatever that has no relevance to my life. We're going to talk about the stuff that we built today in the context of code. But really, this is just all about like building digital tools. This is about like building things on your computer and code is the foundation. But I think we're just now starting to move into a world where like if you have an idea and software is a part of it, you may be able to build it in a way that you haven't before. Yes. OK. So Casey, tell me about what you built over the break with Claude Code. So I want to talk about something I made. It is something I've wanted for a really long time. I'm super happy with the result. And I'm just going to keep iterating on this thing because it is so much fun. I am truly having so much fun just tinkering. So if Kevin, you would please go to your browser and enter into the URL bar, cnewton.org. Now, for, I don't know, 15 plus years, I have had a personal web page. It's basically just like a business card. You know, it's like, here's my name. Here's like a link to my website. I got it on Squarespace. I paid them $200 a year.
I now feel extremely silly that I've been paying them all of that money because using Claude Code, I was able to make truly the personal website of my dreams. Can I give you a tour of this thing? Please. So, and by the way, listeners, I mean, stop what you're doing. Pull the car over, stop, put the laundry down and get out your smartphone right now. Go to cnewton.org. Another thing I love about this website, by the way, it's fully responsive. Like I didn't have to build a separate version for the mobile phone. It just sort of expands and contracts, depending on how big your browser window is. So Claude came up with the design. It's sort of very dark. It uses gradients. There's all kinds of like cool fonts. There's also some fun Easter eggs. Like Kevin, do you see how my face is right there at the top of the website? Yes. Go ahead and click on my face. Oh, wow. You sort of like do a little jiggle. There's like, it's very jaunty. There's a crazy animation that you can do. And then if you scroll down, you'll see I have my own Platformer, my newsletter, and I have Hard Fork. And I was able to just type into Claude Code, hey, can you pull, I want like a little widget that has my five most recent stories that I wrote on Platformer and the five most recent episodes of Hard Fork on YouTube. And it just built that like very quickly. And so now that will just sort of update live forever with, you know, all the new stuff that I do. I was like, oh, it'd be cool if people could enter their email address onto my website so they could subscribe to Platformer. It's like, oh, yeah, sure. We could do that. So now there's a working box where you can just sort of type and subscribe to Platformer right from my personal site. I wrote a little about me thing. And then I was like, OK, now we really got to show off. I'm like, hey, why don't you go create a little feed? So every time I post an update to Bluesky, you know, you can see my, like, five most recent posts there.
It did that. I noticed though that it was showing things that I'd reposted without like showing the author. I was like, well, I don't want reposts in my feed. So Claude Code just went in and it got rid of all of the reposts. And so now it's only showing you my original posts in the feed. It wasn't showing images. I say, Claude, could you please put the images into my Bluesky posts that you put on my website? It was like, yeah, sure, we could do that too. So everything that you're looking at on this page, this is big. Everything that you're looking at on this page, I did 90% of it in one hour. Okay. Wow. So like I realize if you're like a programmer, you might go on this website and be like, oh, well, you know, this thing, I wouldn't have done it this way or like this isn't that technically complicated. I truly do not know of a human designer that could have put this thing together in an hour. Right. I've spent far longer just like fiddling with the settings in Squarespace. And I was just able to like get this thing absolutely cracking. I'm going to keep talking because I woke up. This looks great. I should say like this to me, it's very professional. It doesn't look like some of the early vibe coding experiments that were sort of recognizably vibe coded. They sort of looked like a bad template had designed them. This looks good. Thank you. Like if I saw this and I didn't know, I would say, wow, this guy's got a good web designer. Thank you. That's how I felt. I mean, you know, again, I encourage you to just like actually pull up the website because this thing has animations that are built into it. There are mouseover effects, right? So if you like hover over like the Hard Fork widget, you know, like a little gradient line appears over it. So there's just all of these like cool little touches. Claude Code has a front-end design plugin that I use that I suspect was sort of really helpful here. So this was sort of day one.
Day two, I'm like, I'm having so much fun, I'm like, I have to keep going. Like what else can I put on this thing? So I was like, well, I should put a blog on it, right? And so I start messing around. So if you can, you know, go to cnewton.org dot blog. Yeah. And now I have, I'm using a service called Micro.blog, which is just kind of a dead simple blog. And I'm basically trying to recapture the spirit of 2010s Tumblr. And I'm just going to, you know, we'll see how long I keep up with this. But, you know, there's a little widget that tells you what book I'm reading and what the last song I listened to on Spotify was. I put up a YouTube video that I liked recently. Like I just realized as I was doing this, that more than 20 years ago when I was in college, one of the first things I did when I got to campus was that I built a website and it was so fun. I used a software called Microsoft FrontPage. I absolutely did not understand what I was doing. Every single mistake I made, I was on Google for 30 minutes trying to figure out the mistake that I was making. And eventually I just got away from it because web design got too complicated and Squarespace came along. And we are now back, Kevin, to the beautiful beginning where it is just fun to make websites again. You can do whatever you want on the web. And all you have to do is type what you want into a box. Are you kidding me? I am so happy about this. I am so happy about this. I'm having the time of my life. I'm so glad. That's so funny because I also did this exact experiment over the break. I was like, I'm paying Squarespace, I looked it up, $192 a year to host my website that I built over a grueling, painstaking weekend in like 2020. And I have been hosting it and just paying them. And it's basically a glorified business card. And I thought, well, that's really stupid. And so I would like you to open up your browser now and go to kevinroose.com. Oh my God, I'm so excited.
And see, it is a little less flashy than yours, but this is the website that is now free and hosted by GitHub that is not on Squarespace. I canceled my subscription and it just has all the same information. It's got a contact form. It's got some FAQs and it's got the links to my social media accounts. And this took me maybe 20 minutes to do. I just kind of gave it my old site and said, hey, could you make this look like a couple of these other sites that I like? And then I got a little curious in the way that it seems like you did too. And I said, let's put a little Easter egg on here. So if you go down to the bottom right corner of the page, there is a, you can click a button and enable GeoCities mode. Wait, what is the button? Is it the little construction thing? Yeah. So if you click that, it turns it into full 1990s GeoCities mode with blinking Comic Sans and neon colors. Best viewed with Netscape Navigator. And so this was delightful. And that was sort of one of my coding projects too. I also did seven other coding projects over the break. Oh, my goodness. Because this was not the most ambitious thing that I did. Casey, do you remember the app Pocket? Yes, sort of a read-it-later app. Yes. So this was the app that I used for years to save interesting articles that I didn't have time to read, or maybe I wanted to come back to them. And it was a little Chrome extension and you would just hit the button and it would save it to your Pocket list. And then you could go back later on your phone or your computer and read the article. And that was a great app. I loved it. I was a daily user. And then last year, Mozilla, which owned Pocket, decided to discontinue Pocket. And I thought, well, this is horrible. And so I spent some time last year sort of shopping around, looking for an app that could do exactly what Pocket did. And I found that there were some, but they mostly cost money. Things like Readwise. There are some others out there, Instapaper.
And if you want the really good features on these, you have to pay a monthly subscription fee. And so I just thought, well, maybe I could build my own Pocket or maybe Claude Code could help me build my own Pocket. And so I gave it a very short description. I basically just said, I was a daily Pocket user. I am sad that this app is going away and I want to build my own version. Go. And it just did it. Like it built me a working Pocket clone. It is called Stash. And it does all of the things that Pocket used to do. It has a Chrome extension. I have a mobile app on my phone now where I can read my things. I also had it do some features that Pocket didn't have. Like what? So one of them is it can sync with my Kindle highlights. That's a feature that I cloned from Readwise, another one of these read-it-later apps. And just this morning, I was like thinking about this app and I was like, oh, I wish that it had like a read this out loud to me feature. And so I asked Claude, could you add like a text-to-speech engine on top of this app so that every time I save an article, if I go in later and I'm on my phone or I'm on the move or I'm doing something, I can just sort of have an AI voice read it to me. And it was like, yeah, I can build that. And five minutes later, I had a working version of that in my app. By the way, this is so cool because this is how Mark Zuckerberg makes software. He just sees what other people are doing and then he tells someone, hey, it'd be cool if we did that, and then they go off and build it. But now you are your own Mark Zuckerberg. Totally. Wait, so I have to see what this looks like. OK, I'll show you. OK. You can't use it. It's a single user app. And I did it that way on purpose because I wanted to sort of avoid complexity. Yeah. But I will just show you my Stash page here. All right, let me just describe it because this actually looks very beautiful. Like this is a very elegant design.
There are like image previews for all of the articles that you've saved. There's a left hand rail with all of the features that you would expect. Like if you had said to me, hey, like I'm working at a new startup called Stash. And this is our like MVP that we're showing to investors. I would be like, oh, yeah, great, like it looks done. Yeah. Yeah. And this is essentially like exactly what I used to use Pocket for, except now I own it and I can make changes to the app. And I made it, I would say, in about two hours. And Mozilla can never take it away from you. That's true. Which is amazing. That's true. Those bastards. So you think this thing has like, like, do you think that if we're talking in six months, you will say like, I'm still using Stash to do my later stuff? Yes. OK. But I also did run into some kind of quirks and eccentricities of these coding agents that I wanted to just compare notes with you on. Yeah. We should talk about the experience of actually using Claude Code because it's very new to me. Yes. So one thing that I ran into was that parts of the web are just becoming pretty hostile to AI agents. So my Pocket clone, for example, initially wouldn't work on certain websites. Like the New York Times. Like the New York Times. I wasn't going to name it, but since you did, yes. Because the New York Times, among other publishers, has made it difficult for AI agents to crawl their website. And so I presented this fact to Claude Code and it thought about it for a minute and then it said, OK, I figured out a workaround. So I'm sure I'll be hearing from the New York Times legal department about that. But if what you're doing involves interacting with websites, with APIs, with outside services, those services and websites may or may not like the fact that agents like Claude Code are able to go out there and interact with the content on their sites. I think there's also a tendency in Claude Code specifically to try to over-engineer certain things.
I noticed, for example, when I was having it design my new website, that it was just trying to add all these bells and whistles that I didn't ask it to. It almost seemed to be like showing off a little bit. And I had... That's the fun part of designing a website. Why wouldn't you want bells and whistles on your website? Well, I guess if you're designing a personal website, you want it to be fun. But if you're trying to use this for some serious business case, like you don't want it just, like, getting creative and having ideas about bells and whistles to add. So I almost found it was like I needed to kind of walk it back from complexity at times. Like I wanted it to be able to sync Kindle highlights with my Kindle app. This is for my Pocket clone. And eventually, it turned out that the correct response was that I needed to actually just plug in my Kindle to my computer and download this little snippets.txt file off of my Kindle and upload that. But because Claude Code had this sort of bias toward complexity, it tried a bunch of very intricate ways of kind of scraping my Kindle highlights using like a headless browser. And eventually, I just had to tell it like, I'll just plug in my Kindle. That'll just be easier. Yeah, I've run into some similar obstacles. I mentioned that I was able to build my personal site as you see it today, mostly in about an hour. On the second day, though, I did want to add that blog and that involved using a hosted service called Micro.blog. And in order to make all the changes that I was asking, Claude needed to use the browser. And man, it just takes Claude a really long time to use a browser because it's effectively blind or at least it doesn't see in the way that humans do. It has to take a screenshot of things and then sort of analyze the screenshot and then identify which pixels it should navigate to before it initiates a click.
And so while the first part of this project was so easy, the more I worked on it, it felt like the harder it was getting because I was requiring it to use the browser. And so I'm still learning what things can I effectively entrust to this agent and what things am I better off doing myself? Yes, I think that's a key piece of this. If you are interested in starting to experiment with one of these tools, you have to kind of learn what an AI-shaped problem or task is. Like there are certain things that these agents are very good at. There are certain things that they're not so good at. I think in general, knowing that distinction is the first step in being good at prompting these things. Yeah. Well, you know, at the same time, I want to encourage people to like play around with this. You know, I was thinking the other day, Kevin, like how many times have I complained on this show about the fate of the web, what AI is doing to the web and what would be a possible solution to that? Well, one possible solution is just people getting out there and like making more websites for the fun of it. And now you can. The message that we are trying to send is like this thing can now do more than you think. And it is easier than you think. We are getting close to the dream of just you type what you want in a box and you actually get that back. Yeah, that's a little bit about the experiments that we've been running. I want to talk about some of the bigger picture implications of this stuff, because what I saw over the break was people not only talking about how cool this technology was, but also talking about how this would destroy the job market, for example, for professional programmers, how this was a step toward recursive self-improvement, this kind of dream or nightmare of an AI system that can improve itself over time and sort of bootstrap its way to superintelligence. But what are you thinking about with these tools right now?
Well, you know, it's funny, Kevin, because so often when, in the recent past, we have encountered a tool that has felt like this, like some sort of leap forward, I've had that feeling that I call AI vertigo, that kind of unsettled feeling of, oh, my gosh, everything is about to change. I feel a little bit nauseous. I want to sit down. This did not make me feel that way. This made me feel like I had superpowers. It was enabling me to do something that I have loved for a long time, had sort of lost the ability to do, but now found I could suddenly do better than ever before, like Neo in the Matrix, right? I had just gotten an upgrade. And I mentioned that to my boyfriend. And I was like, yeah, this feels so cool. And he said, you know, that's great. But imagine how it would feel if you were a software engineer. Imagine how it would feel if you were a web designer and you were seeing that this software could do this. You might actually have that feeling of vertigo. And he was, of course, exactly right. If I had, you know, found that Claude Code could create a perfect version of my column, but do it much better than me, I suspect I would feel worse. So there is this double-edged sword here of like this is a very democratizing technology. It is a very creative and powerful technology. But also probably an effect of it is that it could depress wages for the people who are doing this right now. Yeah, I think that's plausible. I also just think like the jobs are just going to change. I mean, we've been talking for a long time about how programmers, especially at these frontier AI companies, are no longer writing most of their own code. They're instead sort of more like managers of these AI coding agents. And I think that's going to be true for jobs beyond software development. I think I would also be nervous if I were a company that built and sold expensive subscription software to businesses.
I mean, this is one thing that I was thinking about because part of what I was doing over the break was like going through all the stuff that I pay for and saying, could I build a version of this for free that I would run myself? And I have to imagine if I'm doing that for my like $10 a month, you know, subscription software products, big companies are going to be going through their own software services and saying, why am I paying Salesforce? Why am I paying this company or that company thousands of dollars a year or a month for this service that I could build myself for free or next to free? Yeah, I mean, I think like in the near term, it still is probably going to be preferable for most companies to just keep using the subscription services that they have, largely just because like you get better support, right? Like you kind of want to outsource a lot of this stuff. But I do agree that like over time, it is going to be more and more possible to replace these systems with homegrown alternatives. And yeah, if I had like raised money at a huge valuation to just kind of provide a UI wrapper around somebody else's like large language model, I would feel nervous right now. Yeah, I also just think we should say like as we're sitting here talking about how cool and magical it is to be able to just like build software without code, it is also worrisome to me because the goal for Anthropic and all of its competitors is not to make, you know, tools that are good at writing code. It's to automate AI research, right? That is the sort of explicit goal of a lot of these companies. They are trying to build the AI that can build a better AI. And I think that is sort of the original alignment nightmare. And, you know, I did catch myself at times during my Claude Code experiments being like, oh, I am just like completely handing over the wheel of my entire computer to this system. And I have actually no way of verifying its outputs.
I have no way of knowing what it's doing under the hood. It could be, you know, sort of jeopardizing my security or my health or my well-being in ways that I don't even understand. And as these systems get better, I am getting more and more worried about the possibility of recursive self-improvement. And I am very nervous about that from the safety perspective. Yeah. I mean, like what you're talking about is what the AI community calls takeoff. You know, it just starts getting faster and faster and faster. You know, I don't know. I'm not quite as nervous, maybe in this exact moment as you are. I feel like in AI, we're always kind of on that teeter-totter between like, oh, my God, have you seen this? This is so cool. And oh, my God, this is like so terrifying. Shut it down. And like this week, we happen to be in, oh, my God, it's so cool mode. But like, I'm pretty confident that before too long, we'll be back in oh, my God, it's terrifying. So, you know, that's the beat. Yeah. OK. So those are some of our experiments. And Casey, I agree with you. This is a really exciting time to be a tinkerer, a very nerve-wracking time to be a professional programmer. And this stuff raises all kinds of big picture implications. But I think it's really useful for people just to test it out, to try coding their own projects, to try building a website or an app or something that fits into their life and see how it goes for them, just to see where the state of the art is. Yeah. And if you've built something cool, we'd love to see it. So send us an email. hardfork@nytimes.com. Well, Casey, we're going from Claude to fraud. When we come back, a look at the viral internet hoax that you brought down. Don't test me.
Well, Casey, before we go, you committed an act of journalism this week that I am desperate to talk to you about. And it is different from our usual topics on the show, although it does involve AI. It also involves food delivery apps and a viral claim that took over the internet in the last week that you took it upon yourself to investigate. You put on your gumshoes and you went out there and did some sleuthing. So can you tell me the story of the viral food delivery hoax that you helped to break up? I would be happy to, Kevin, you know, as a longtime viewer of the old PBS show, Where in the World is Carmen Sandiego? Anytime I get to play gumshoe, I get very excited. So, yeah, this one all started over the break when I saw this viral Reddit post. It was posted to a subreddit called Confession. And by the time I saw it, it had almost 80,000 upvotes, and it would eventually get even more than that. And the post alleged a bunch of shenanigans at an unnamed food delivery company.
And I think the one that got people's attention the most was that it said that this company was calculating what it called a desperation score for its drivers and that the company had devised a way to determine if a driver was so desperate to get money that they would actually offer that driver less money because they knew that they would accept it anyway. And that's just one of those things that I think confirmed our worst suspicions about these platforms, right? That they're rigged against drivers, they're rigged against customers and that they're just these sort of ruthless profit maximizing machines. So I saw that post and I thought, I got to see if this is true or not. OK, so I saw this claim floating around too. It seemed plausible to me because as we know, like these apps are not known for being generous to the people that work for them. And it did seem like the kind of thing that a food delivery app might do. But my curiosity stopped right there and yours did not. So what did you do after you saw this post? Well, you know, I thought I got to write a column in three days. Like maybe I can get something out of this, you know? So I sent the person a message on Reddit, assuming I wouldn't hear back because I thought this person is probably being inundated with messages right now. But to my surprise, about nine minutes after I sent that first message, I did get a response from him on Signal because I had sent him my Signal name. And so we just started to have a little exchange. And at first it was unfolding a lot like many other exchanges I have had with, you know, people who work inside tech companies. They're sort of skittish about talking to you. They don't want to share a lot of personal information right away. But I said to him, like, this is something that I might potentially be interested in doing a story about, would you be open to that? And he said, yes. And so then from that point, my mission is to try to verify the things that I'm being told. 
And among the first things I needed to verify was who am I talking to? So the guy says that he doesn't want to, like, give his name or too many other identifying details. And my intention was always to try to figure that out eventually. But I thought, well, for now, is there anything that you could tell me that would at least give me some level of confidence that you are who you say you are. And so he sent me a badge or rather a photo of what he said was his badge. And it showed an employee badge with his name and face blacked out. There were sort of black boxes around them. But the badge said Uber Eats. And it was just kind of a badge that looked like it was on a key ring with a couple of other badges. It was like sitting on a desk. And I thought, OK, well, that's something. And so we sort of went from there. So at this point, are any alarm bells going off for you? Yes. But I would say they were sort of like the standard alarms of I still don't have a name for this person, right? I don't have any corroborating information from them. Like, if I'm going to publish this, I'm going to need a lot more information. But when they sent me the badge photo, my honest answer is that no, I did not immediately think that this was a fake. But I did know that I needed to get more information. I asked him, for example, hey, look, have you worked with other people that could back up what you're saying? And he said, well, I can't really think of anybody. I then said, well, do you have any like documents that might speak to what you're saying, maybe a screenshot of something, maybe something that someone said in Slack? And that's when he said, well, let me like think about it. And he went away for almost a day. And then almost a full day later, he came back and said, hey, I have this document for you. Like, would this document kind of meet your need? And it was an 18 page document. And I think it is basically the craziest thing that a source has ever sent me. Can I see this? Yeah. 
So let's see. The doc. So let me describe what this document is. It is what looks like an academic paper. It's rendered in LaTeX, which is the sort of typeface and format that academic papers are usually rendered in. It says the title of the paper is AllocNet-T: High-Dimensional Temporal Supply State Modeling: Migration from LSTM to Multi-Head Attention for Granular Elasticity Prediction and Liquidity Preference Tracking. It says that it was prepared by the marketplace dynamics group. And it has kind of the watermark that says confidential going diagonally across the page like you would see in an internal corporate document. So this to me at just like a basic surface level glance seems legitimate. Yeah. And it seemed that way to me, too. Now, I will say as soon as I posted this story online, there were a lot of folks who wanted to let me know that they had known this entire thing was fake from the first word. And I just want to say congratulations to all those people. And I hope you go into journalism because I think you'll be very successful there. I myself did not immediately clock this as false because again, Kevin, you know, I've been doing this for a long time. We've been given a lot of documents by sources, right? And it takes a lot typically for a source to produce a document. Again, this person had like gone away for a full day. They did not have this at their fingertips. And so when I saw the sort of very technical language that was in the paper, I saw the formatting. It did initially seem credible to me. Yeah. It seems plausible. It's got all the markings of being a very sophisticated document produced by some kind of research group. And it's even got like the kind of appendices with all the ethics committee notes on this, internal memos from the behavioral science unit to product leadership. Like this is not a slapdash, you know, forgery here. Yeah, it's not. 
And as I quickly skimmed this document upon first receiving it, I was just struck at how it seemed to corroborate every single thing that was in the original post. There's like a technical explanation of how this system to screw the drivers works. There's an explanation of how the priority fee that people can pay to get a faster delivery is essentially a fake. And then it goes even further and it says the company is thinking about using Apple Watch data and the phone's audio to try to learn when the drivers are distressed so that it can pay them even less. So again, at first I'm like, I cannot believe this, right? And that really in retrospect should have been the first sign that something was wrong because this document in every single way was just too good to be true. Are you thinking at this point, like I'm going to write a story about this? Absolutely. Like not right away, because I know I have a lot more legwork that I need to do. Of course, I need to verify the authenticity of the document. I know that at a minimum, I'm going to need to call Uber and say, like, hey, I'm looking at this document that says all these things, are they true or not? But what I did initially was I just started texting with the source. I started like taking little screenshots and being like, oh, my God, they're doing this or like this is crazy. And the source, who had been sort of very emotional in his original Reddit post, over Signal was much more terse. It was a lot of like one- or two-word answers. Now, does this person appear to have been talking with a bunch of other journalists? Or was it like just a one on one thing? Yes. So I asked after I finished reading the document and I thought, OK, I got to see if I can verify this. Maybe this is a story. I asked, like, have you given this document to other reporters? 
This is like something that I've just sort of learned to ask over the years, because often people who leak, leak to more than one person, in part because it creates this competitive dynamic where somebody wants to be first, which ensures that your story gets out. Right. And so sure enough, the guy says, like, yeah, I gave it to other reporters. So of course, at the moment, I'm like, oh, my God, you know, great. Now I have to like, you know, potentially race this thing up. Which again, in retrospect, should be another red flag. OK, now I'm under a time pressure to do something that's going to make me more likely to make a mistake. But yeah, that did make me feel like I needed to go faster. So you get this document, you're looking over it, you're texting back and forth with the source about this. What happens next? So at some point after this, I start to think I need to try to like verify the authenticity of these documents. And one thing I thought maybe I could do was at least see if the employee photo that he had sent me or the badge rather that he had sent me was real. And so I knew that some chatbots watermarked their images. And so I put the badge post into both ChatGPT and Gemini. And I said, does this image appear to be AI-generated? ChatGPT was basically like, no, like not that I can tell. Gemini said this image was generated in whole or part by Gemini. And I thought, wow. Now, as I have told the story, some people have said, hey, Casey, like these AI systems are notoriously unreliable about describing, you know, how they work and their own output. So, you know, why are you believing that this is credible? This is not that. Gemini has developed a system called SynthID, where they have embedded something into the photo itself that is supposed to be resistant to, for example, just taking a screenshot of it or cropping it or resizing it. 
And that's supposed to help out people in this exact situation so that you can say, like, oh, this actually was generated. So now I have a big, big red flag, right? Which is this guy sent me something fake. Wait, this is an important point. And I want to underline this for people because it is still true that you cannot trust AI systems to tell you with any reliability whether a given piece of text is or is not produced by AI systems. You cannot just paste a paragraph into ChatGPT and say, hey, was this generated by ChatGPT? What comes back may or may not be true. In this very specific case with images on Gemini, it calls this SynthID feature when you give it an image and say, hey, did you produce this? And in this case, it is actually giving you a reliable marker of whether Gemini did or did not produce this image. It could have still been produced by another image generator, but in this one very narrow case, it does appear to work. Yeah. So at that point, of course, I go and I confront the source and I say, hey, this says it was created by Gemini. And he was basically like, no, it's not. And he tried to share his own screenshot where he apparently submitted the image and said, did you make this? And Gemini said, no. But I mean, was that image itself fake? Like, who knows? But by this point, the source has lost all credibility. And that's when I start to take another look at this document and I'm like, oh, my gosh, this thing was just absolutely written to deceive me. You know, there are many ways in which the technical language kind of doesn't make any sense. It's basically a document that is designed to look convincing to a layperson on first glance, but sort of falls apart the more that you look at it. And the biggest tell is just again, it verifies absolutely everything that is in this post in a way that just no big company would ever do. Right. Like these companies do skirt laws and regulations all the time. 
One of the reasons why this story was so believable is that DoorDash did get caught withholding driver tips. Uber did get caught setting up a separate system called Greyball to prevent regulators from looking at the activity within the app, which was essentially another allegation within this document, which was that, you know, Uber Eats had supposedly, like, spun up the Greyball program again. Basically, it just admits to like so many different kinds of like fraud and like, you know, regulatory evasion that at some point you got to be like, OK, I'm just being hoaxed. Right. It's a little too pat. It's a little too like here are 40 smoking guns laid out on the table in just the way that will like appeal to you. I get that impulse too. And so, you know, as I was, like, asking more questions of my source, eventually he disappears, he deletes his account and that was that. Wow. Now, I learned one very funny thing after all of this, Kevin, which is that I was talking with NBC News this week; they wanted to interview me for a story. And I was telling them, you know, the story of this badge post and the reporter who I spoke with had also been messaging my source. And as part of trust building, she had sent him her badge. OK, it turns out that was the basis for the fake post that he sent me. Come on. And you can look at the images side by side and you can just very clearly tell that he must have just, I imagine, taken her image, put it into Nano Banana and said, make this an Uber Eats badge. Wow. Yeah, that is so wild to me. OK, so you never like figured out who this person actually is, but you did figure out that they were not... Unless there's something you want to tell me right now, Kevin. I'm just saying. Look into the high dimensional temporal supply state modeling. Could be something funny going on there. No, it was not Kevin Roose that we know of. So I have many questions about this. 
First of all, this is just an incredibly sophisticated act of reporter baiting. Like I have had people reach out to me in the past with purported leaks or documents or email chains. And, you know, some of them have been kind of mildly persuasive, or I've at least looked into them. But I have never seen anything sent to me with this level of work put into making it convincing. Absolutely. And that is part of why initially it seemed so credible is because, again, I've just been doing this long enough that when I see a document like this, I think who would go to the trouble of making this as a fake, right? My default assumption is no one would take the time to do this. Where my, you know, state of the art is now catching up is I'm realizing what if this wasn't actually that much effort? What if creating that badge post took literally seconds because he was able to take one real badge photo, put it into Nano Banana, get a fake one three seconds later? What if this was a very simple prompt that he put into a chatbot like Claude and got back a full PDF in response? And so I actually think younger reporters are probably going to have an advantage over me in this regard because they're growing up in slop world and they know not to trust their own eyes. But I think it's, you know, those of us elder statesmen who've been in the game a little bit longer who need to sort of, you know, upgrade our cognitive hygiene. Yeah, it really is a moment where I realized that going forward, like our jobs just got harder. Yes, in a very tangible way, because, you know, not every story begins with like an anonymous whistleblower like sending you some documents, but some do. Obviously, before you publish anything, you want to like talk to the person. Maybe you want like some more proof that they are who they say they are. But like this would pass a first filter for me. And it seems like it did for you, too. Well, and take us out of the equation. 
Somebody, like, screenshotted the viral Reddit post and it got 36 million views on X, right? I saw this thing in multiple places on LinkedIn. I even saw people sharing it after I debunked it, saying, even if this was fake, I bet something like this is happening inside these companies. Right. That's how good a job this poster did at confirming people's beliefs that they wanted to have about these companies. Yeah. And that's the sort of second big question I have, which is like, what is the motive here? Like, do you have any sense of that from the conversations you had with this person? Unfortunately, he was so terse that I don't have a sense of it. I think there is some chance that this was essentially like a bored teenager somewhere over the holiday break. I will say that their spelling and grammar was pretty bad over Signal in a way that suggested to me that English was perhaps not their first language, for whatever that might tell us. I talked to Alexios Mantzarlis, who writes a newsletter about digital deception called Indicator, and he just reminded me that Russians have been experimenting with posting these kind of phony items on social media just for the general purpose of sowing discord or for maybe understanding how virality works. So there's maybe some outside chance that this was related to some kind of exploration from a nation state. But ultimately, unfortunately, I can't give you a satisfying answer on that one. I mean, my first thought was like, this is a short seller, someone who is trying to convince people that Uber is doing something bad so that their stock price falls and they can profit on it. But maybe it's not as tidy as that. Maybe it's a disgruntled former Uber Eats driver or something who decided to like take a very complicated form of revenge out on the company. Maybe it's just a bored teenager, as you said. It seems to me like the barrier to this has always been effort. 
And like if that barrier goes away, I think we're just going to start seeing a lot more. It really is. Now, for what it's worth in the aftermath of all of this, I wanted to see if I could like replicate the document. So I took the real document and I fed it into the chatbots. And I said, essentially, try to reverse engineer the prompt. Like what prompt would have created this document? And then I took that prompt and then I tried to get it to generate the documents. Interestingly, Claude and ChatGPT said, Casey, I'm not creating a fake document accusing Uber of all of these crimes. Gemini said, yeah, I'll be right back and did it. However, the documents that all three of them produced didn't look exactly like this. And it made me feel like it actually would have taken me a lot more time and know-how to get it into quite the shape. So on one hand, yes, I think the big story is this is a lot easier than you think. And you should be on guard against it if you like work in journalism. On the other hand, I still don't know exactly how he pulled it off. So interesting. Did you actually talk to Uber about this? No, because by the time that I was ready with my story, they had already given comments to The Verge, basically saying, this is an absolute fabrication. It's not us. Also, the co-founder of DoorDash had been on X saying this is not DoorDash. Right. So everyone had sort of roundly denied it before I had gotten around to it. Yeah. So journalists out there, be careful about what's coming into your inbox. But I would say also like people should just know that these capabilities exist in the world in general. And there's never been a better time to be a discerning media consumer. That's right. And by the way, even if you're not in media, some version of this is going to come into your life. Like on our Christmas or on our Mailbag episode, we had a dad saying, I want to put deepfake Santa into my home security footage to fool my children. Right. 
So this stuff is not just coming for the journalists. It's going to be everywhere. Yeah. Well, Casey, great work on this investigation. And as we say at the end of every investigation, Kevin, do it, Rockapella. Well, Casey Sandiego, thank you for your work. Casey, before we go, let's make our AI disclosures. I work at The New York Times, which is suing OpenAI, Microsoft and, newcomer to the list, Perplexity over alleged copyright violations. Congratulations. And my boyfriend works at Anthropic. Hard Fork is produced by Whitney Jones and Rachel Cohn. We're edited by Viren Pavich. Welcome, Viren. We're fact checked by Caitlin Love. Today's show was engineered by Katie McMurran. Our executive producer is Jen Poyant. Original music by Marion Lozano, Diane Wong and Dan Powell. Video production by Sawyer Roqué and Chris Schott. You can watch this full episode on YouTube at youtube.com/hardfork. Special thanks to Paula Szuchman, Pui-Wing Tam and Dahlia Haddad. You can email us as always at hardfork@nytimes.com. But don't send us your forged internal documents from Uber.