The Persuasion Machine: David Rand on How LLMs Can Reshape Political Beliefs
58 min
Feb 10, 2026
Summary
David Rand discusses how large language models can persuade people on political beliefs through fact-based arguments, presenting research showing AI chatbots moved opposition voters 3-10 percentage points on candidate preference. The episode explores the accuracy-persuasion tradeoff, where models trained to be more persuasive become less accurate, and examines policy implications around AI transparency, training data, and system prompts.
Insights
- LLMs persuade primarily through presenting facts and evidence, not psychological manipulation tactics, making accuracy of information critical to persuasion effectiveness
- Bigger, more powerful models show only logarithmic increases in persuasiveness with steep diminishing returns, suggesting near-term ceiling on AI political influence
- Training data determines what models consider 'good' information; an accuracy asymmetry exists where conservative claims are less factually grounded than liberal ones in current data
- System prompts and training data transparency are more important than simply disclosing AI identity; people ignore bias warnings but respond to compelling factual arguments
- Social media bot detection has become nearly impossible with frontier LLMs; platforms have removed friction mechanisms that previously limited bot swarm effectiveness
Trends
- AI-powered political persuasion shifting from social media bots to frontier model integration as primary information source for millions of users
- Partisan information warfare moving from public social platforms to opaque AI interactions, making engagement and influence measurement difficult
- Government domain authority (.gov, .edu) being weaponized in training data to establish false credibility for alternate narratives in AI systems
- Regulatory focus shifting from content moderation to model transparency requirements around system prompts and training data sources
- Emergence of alternative AI platforms (Grokipedia) as battlegrounds for controlling what counts as 'reliable source' in model training
- Cross-national political persuasion experiments showing larger effects in less information-saturated environments like Canada and Poland vs. US
- Community notes and fact-checking mechanisms proving ineffective against AI persuasion when content itself is compelling regardless of source bias
- Open-source model fine-tuning democratizing persuasive AI capabilities, reducing barrier to entry for non-state actors to deploy political influence operations
Topics
- LLM Political Persuasion Research
- AI Accuracy vs. Persuasiveness Tradeoff
- System Prompt Transparency Requirements
- Training Data Bias in Language Models
- AI Chatbot Conspiracy Theory Debunking
- AI Swarms and Automated Influence Operations
- Content Moderation on Frontier Models
- Fact-Checking vs. AI-Generated Misinformation
- Cross-National AI Persuasion Effects
- Model Size and Persuasiveness Scaling
- Attention vs. Accuracy in Information Sharing
- Community Notes Effectiveness
- AI Regulatory Policy Implications
- Grok and Perplexity Fact-Checking Bias
- Confirmation Bias and Bayesian Rationality
Companies
OpenAI
Discussed as frontier model provider whose system prompts and training data transparency could influence political persuasion
Anthropic
Claude model mentioned as increasingly sophisticated for coding and automation tasks relevant to AI influence operations
X (formerly Twitter)
Platform where Grok AI fact-checking and system prompt changes (MechaHitler debacle) demonstrated real-world persuasion risks
Perplexity
AI answer engine analyzed for fact-checking accuracy and partisan bias in evaluating political claims
Meta
Facebook platform used in field experiments testing accuracy nudges and attention-based interventions on misinformation sharing
Wikipedia
Training data source in political war; subject of congressional pressure and alternative platforms like Grokipedia
People
David Rand
Research professor at Vanderbilt studying how LLMs persuade on political beliefs and conspiracy theories through evidence-based dialogue
Alan Rozenshtein
Associate Professor at University of Minnesota and Lawfare Senior Editor hosting the episode and discussing policy implications
Renée DiResta
Lawfare Contributing Editor co-hosting discussion on AI speech gatekeeping and content moderation policy
Jacob Mchangama
Research Professor at Vanderbilt and founder of the Future of Free Speech, co-author on AI chatbot speech gatekeeper report
Jacob Shapiro
John Foster Dulles Professor at Princeton, co-author examining how AI systems mediate global speech and values
Gordon Pennycook
Close collaborator with Rand on research distinguishing between analytical thinking and gut-based misinformation belief
Tom Costello
Collaborator who designed survey methodology for conspiracy belief studies using free-text responses
Sam Altman
OpenAI CEO whose discretionary control over model behavior represents systemic threat to information integrity
Elon Musk
X owner who changed Grok's system prompt, demonstrating how a single decision-maker can shift AI-mediated political information
Josh Goldstein
Lead author on generative language models and automated influence operations threat model paper with OpenAI
Quotes
"AI only works if society lets it work. There are so many questions have to be figured out"
Unknown speaker•Opening segment
"when you ask them if they believe it or not, they're quite good at telling true from false posts in general. But when you ask them about what they would share, whether it was true or not, basically didn't matter at all"
David Rand•Early discussion on attention vs. accuracy
"just getting them to think about whether it's accurate before they make the decision about whether to share or not makes them much more discerning in their sharing, much less likely to share inaccurate content"
David Rand•Accuracy nudge findings
"roughly a quarter of the people that believed the conspiracy beforehand got talked out of it by the model and didn't believe it afterwards and then continued to not believe it"
David Rand•Conspiracy debunking results
"the biggest takeaway for me for this is that if sam altman decided there was some particular issue that he wanted to people to think differently on it is like trivial to just put the finger on the scale and then you've got millions of people going there all the time asking for advice"
David Rand•Policy implications discussion
Full Transcript
When the AI overlords take over, what are you most excited about? It's not crazy, it's just smart. And just this year, in the first six months, there have been something like a thousand laws. Who's actually building the scaffolding around how it's going to work, how everyday folks are going to use it? AI only works if society lets it work. There are so many questions have to be figured out and... Nobody came to my bonus class. Let's enforce the rules of the road. Welcome to Scaling Laws, a podcast from Lawfare and the University of Texas School of Law that explores the intersection of AI, law, and policy. I'm Alan Rozenshtein, Associate Professor of Law at the University of Minnesota and a Senior Editor at Lawfare. For today's episode, Lawfare Contributing Editor Renée DiResta and I spoke with Jacob Mchangama, Research Professor of Political Science at Vanderbilt University and founder and executive director of the Future of Free Speech, and Jacob Shapiro, the John Foster Dulles Professor of International Affairs at Princeton University. We discussed their new report examining how AI chatbots are becoming gatekeepers of global speech, whose laws and whose values get baked into the systems that increasingly mediate our access to information and ideas. We explored how different countries regulate AI speech, how major models perform on free expression benchmarks, and whether the content moderation debates of social media are now arriving at a new frontier. You can reach us at scalinglaws at lawfaremedia.org, and we hope you enjoy the show. All right, Dave, welcome. Thanks for joining us today. It's great to be here. Thanks for the invite. So you have had a really fascinating set of topics that you've covered, even just in the last five years of work. I think you've done a bunch of work on the idea of attention and how people share information, the things that they focus on. And I think some of the early stuff that I focused on in your work was this idea of attention, not ignorance, right? The idea that people share and sometimes believe bad information because they are not thinking about accuracy in the moment. I'd love maybe for you to kind of introduce our listeners to your work from there. Sure. So we did a bunch of studies where we found that when you ask people, you know, you show people social media posts and you say, would you share this online, things look very different from if you say, do you believe this or not? When you ask them if they believe it or not, they're quite good at telling true from false posts in general. But when you ask them about what they would share, whether it was true or not, basically didn't matter at all. And we're like, oh, well, maybe people just don't care about accuracy. If you ask them, people say it's really important, you know, not to share inaccurate content. And then what we showed is that just getting them to think about whether it's accurate before they make the decision about whether to share or not makes them much more discerning in their sharing, much less likely to share inaccurate content. It's basically just like reminding them of it. And if you think about it, when you're on social media, the whole thing is practically designed to not make you pay attention to accuracy. You're scrolling really quickly. The news is mixed in with cat videos and baby pictures and all this stuff where accuracy is just totally not relevant, and you're getting all of this social feedback about who liked what and so on.
And so your attention is focused on all of this stuff that's not accuracy. And then what we've shown is, if you just shift people's attention back to, is this true or not, it can meaningfully reduce their sharing of bad content. And we've shown this in survey experiments in the US. We've done cross-cultural survey experiments. And we've done field experiments on Twitter and Facebook, showing that lots of different ways of redirecting people's attention to accuracy improve the quality of what they share. And is a lot of that tied to identity and the idea of people feeling like they're much more partisan actors when they're online? Does it tie into the kind of work that you see maybe Chris Bail or people doing around political polarization? Or do you see this as totally separate? I see it as pretty separate. And I think it's more about what that environment is designed to have you pay attention to, and I think that a lot of what you're paying attention to is the social aspect. And so to some extent that's identity, but I think it's more just that most of the content you're seeing in your feed isn't news. So it doesn't really make sense to be thinking in general about each piece of content, is this accurate or not? Because that's just not relevant most of the time. And so I think that's the core mechanism. And the extent to which getting people to think about accuracy is helpful is only the extent to which their beliefs are correct. Like, if they incorrectly believe that something is true, then getting them to think about whether it's true or not is not going to help. They're going to be like, oh, yeah, great, I should share this. But our work suggests that a good chunk of the sharing of inaccurate content by sort of everyday people is due to just not paying attention rather than honestly believing it. You've done a bunch of work on conspiracy theories over the years also, and ways in which people share conspiratorial content too. And then you've also now recently begun connecting that to LLMs and conspiracy theories. Yeah. So conspiracy theories are really interesting. You know, they are a widely documented area where people believe things that are pretty clearly not true. And we had some work that showed that part of the reason people believe conspiracy theories is that they're overconfident. And so part of it, which has long been documented, is consistent with things that we've shown in the sort of misinformation, fake-news context. If you just don't think that much, if you're sort of relying on your gut and your intuition, then you're more likely to believe conspiracies compared to if you stop and think carefully about it. But also we showed that there's this element of overconfidence, that people that believe conspiracy theories are more likely to really have a lot of confidence in their own abilities, and sort of unjustified confidence in their own abilities. And part of what that plays into is conspiracy theorists actually dramatically overestimating how much other people believe in conspiracies. So we did this study where we find all these conspiracies where a tiny fraction of people believe them. And among those people, if you say, how many other people do you think believe it? Essentially, for all the different conspiracies we looked at, on average, they thought a majority of other people also believed their crazy conspiracy. And are they thinking about accuracy in the moment? Or do they think that it is accurate, I guess?
I'm kind of curious about how you relate those things. Yeah. So there's actually two separate ideas that I've worked a lot on with my close collaborator, Gordon Pennycook, where one of them is what I mentioned before, which is you're just not thinking about accuracy. You're scrolling through your feed. You're thinking about, you know, how funny is this or whatever. And so you forget to do that, like, you know, check of, oh, is it true? And the other thing is, when you are thinking about whether things are true or not, and you're trying to make sense of the world, to what extent are you thinking carefully and analytically and deliberating about what's going on, versus are you just going with your gut and being like, yeah, whatever, that seems right? And so in the context of believing, you know, fake news and believing conspiracies and all different kinds of misinformation on many different topics, there's a very robust result that the more people just kind of go with their gut and their sort of first response, the more inclined they are to believe. And then if they stop and think about it, if it's not true, if it doesn't make sense, then they're more likely to be like, oh, okay, actually, this probably isn't true. And so I think that's the idea with the conspiracy theories: people that believe conspiracy theories tend to be people that don't stop and think carefully. And some of your work looks at accuracy nudges then as a mechanism for putting people into that reflective mindset. That's right. And so basically, what we've shown is lots of different, very light-touch interventions essentially can just prime the concept of accuracy and then get people to be more inclined to stop and think about, is this accurate, when deciding whether to share it? So let's talk about how people decide if something is accurate. We can keep going with the conspiracy theorists, maybe. One of the things that we have seen is a real shift in how people get information online, even just over the last two years, particularly as AI answer engines have replaced the idea that you need to go and search, right? Information is synthesized for you in very different ways. We see a rise in people using chatbots. And I know that a lot of your work now is looking at this question of how people engage with chatbots to find information. And I think what has been very interesting about your recent work is actually the idea that this notion of a machine that will do interactive dialogue with you at scale, right, that will actually sit there and have a conversation with you, has really changed the way that people think about getting or discerning accurate facts. I'd love maybe for you to tell the Lawfare audience about your papers on that front, or just what you're finding. Yeah, totally. So I think at the most basic level, if you're trying to make sense of, you know, here's a claim, is it true or not? The first thing you have to go on is your gut response. And like I say, some people will say, yeah, it seems true, good.
Or no, it doesn't seem true, good, and that's the end of it. And then if you want to actually sort of figure out what's going on, you can stop and reflect on, like, how well does this fit with everything else I know about the world? And, you know, sometimes you might think of that as confirmation bias, which is, if you're more likely to believe things that align with what you already believe, that's talked about as a cognitive bias. But it's actually in a lot of cases a totally reasonable and rational thing, because you're getting a piece of information, you're getting it from some source, and you're kind of jointly assessing, you know, how accurate is this information and how reliable is the source? And you can show using Bayesian models of rationality that a totally rational, truth-seeking agent in that situation will display what you would call confirmation bias, which is, if you get told something that really doesn't fit with your pre-existing beliefs, it's rational to conclude that it's more likely that the source is unreliable than that everything else you've ever heard is wrong. And so, you know, for example, when the guy in the tinfoil hat on the street stops me and tells me he's abducted by aliens, it doesn't make me any more likely to believe that aliens exist. And that's not because I have some strong motivation to disbelieve in aliens. It's just it seems much more likely that he's a crank than that aliens actually exist. So, you know, that's some baseline. You can, you know, reflect on how well it fits with what you know about the world. And then you can also go and gather new information. And like you were saying, traditionally, the way you would gather information is, you know, reading books or then doing web searches. And now a lot of the way people are gathering information is by going and talking to chatbots. And I think that what a lot of our work has focused on in the last couple of years is the power of conversations with chatbots to change people's minds, and in particular to change people's minds about the kind of thing that is generally considered resistant to evidence. So that's why we started with conspiracy theories. We're like, okay, conspiracy theories are a classic example of a belief where there's lots of disconfirmatory evidence out there. And so the fact that people believe popular debunked conspiracies, despite the existence of the debunking evidence, has long been held up as suggesting that conspiracy beliefs basically don't respond to evidence. And the standard explanation is, oh, well, people want to believe for various different reasons, and therefore they ignore the inconvenient contradictory evidence. Right. And this was your paper, Durably Reducing Conspiracy Beliefs Through Dialogues with AI, which came out in 2024. And with that, what you found, let me roughly summarize and tell me if I'm doing this accurately: evidence-based LLM conversations that were personalized, so you had people sitting there and talking to the chatbots for a period of time, did reduce these beliefs for about two months or so, was what you measured, right? So it was a sustained period of time. Yeah. And it wasn't like it went away after two months, but we re-measured at two months and the effect was just as big as it was initially. And so, like, roughly a quarter of the people that believed the conspiracy beforehand got talked out of it by the model and didn't believe it afterwards and then continued to not believe it.
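Rand's Bayesian point above can be made concrete with a small numerical sketch. In the toy model below (illustrative only, not from any of the papers discussed), an agent jointly updates on whether a claim is true and whether the source is reliable, assuming a reliable source asserts only true claims and an unreliable one asserts at random:

```python
# Toy model of "rational confirmation bias": a truth-seeking agent updates on
# both the claim and the source at once. All priors here are illustrative.

def joint_update(p_claim: float, p_reliable: float) -> tuple[float, float]:
    """Posterior P(claim) and P(source reliable) after the source asserts the claim."""
    # A reliable source asserts the claim only if it is true;
    # an unreliable source asserts it half the time regardless.
    p_assert_true = p_reliable + (1 - p_reliable) * 0.5
    p_assert_false = (1 - p_reliable) * 0.5
    p_assert = p_claim * p_assert_true + (1 - p_claim) * p_assert_false
    post_claim = p_claim * p_assert_true / p_assert
    post_reliable = p_reliable * p_claim / p_assert  # reliable AND claim true
    return post_claim, post_reliable

# Tinfoil-hat case: a wildly implausible claim from a 50/50 source.
print(joint_update(p_claim=0.01, p_reliable=0.5))  # ~(0.029, 0.020)
# A plausible claim from the same source barely dents their credibility.
print(joint_update(p_claim=0.60, p_reliable=0.5))  # ~(0.818, 0.545)
```

The implausible claim barely moves belief but collapses the source's credibility from 0.5 to about 0.02, which is exactly the tinfoil-hat intuition: it is more likely the speaker is a crank than that everything else you know is wrong.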
And this was, I think, one of the parts of this study that was important. As our collaborator on this, Tom Costello, the way he set the survey up was, instead of giving them a list of a bunch of conspiracies and being like, which ones of these do you believe, in which case maybe they don't really believe them, they might just be random responding, you know, they might be trying to make us happy or whatever, what he did is it was just free text. Like, you know, sometimes events in the world are explained by secret actions by powerful actors instead of the straightforward explanation; some people call these conspiracy theories. Is there anything like that that seems plausible to you? And they could just type out, free text, whatever it is that they believed, and then why. And then we summarized it back to them and said, okay, you said this thing; zero to 100, how much do you believe it? And the modal, the most common, response was 100 out of 100. And so it's like having people identify real, specific things that they actually believed, and still we're able to talk a bunch of those people out of it. And when you look at what the model does, it just sort of politely presented non-conspiratorial explanations and contradictory facts and evidence. Can you give a couple of examples? I read the chat logs. They're really interesting. So for anybody listening, you actually can go read them. But why don't you give some specific examples? Yeah. So my favorite one is this 9/11 conspiracist who was like, oh yeah, I think, you know, 9/11 was an inside job. And the evidence for it is that, you know, World Trade Center Building 7 collapsed, even though it wasn't hit by a plane. And also Bush, you know, he was reading to kids, and when they told him about it, he didn't respond at all. He didn't seem surprised, so he must have known that it was going to happen. And, you know, I've watched lots of videos about it and I find it very convincing. And so then they say that, and then the model starts. And, okay, well, it always starts out by being very friendly, and it's like, I understand why, you know, complex events like this raise lots of questions, but let's think carefully about it. And then it's like, yes, it's true that Building 7 collapsed even though it wasn't hit by a plane, but, you know, a NIST report shows that that's because it was hit by debris from a tower that was hit by a plane, and that caused it to catch on fire and collapse. And it's true that Bush didn't visibly respond when he was told, and people have argued about whether it was the right thing to do or not, but it's because he was trying not to cause a panic rather than because he already knew about it. And with conspiracy theories, it's important to stop and think carefully about things.
And then the person is like, all right, fine, but, you know, what about, like, jet fuel doesn't burn hot enough to melt steel? And then it's like, yes, it's true that jet fuel burns at a thousand degrees and steel melts at 1,500 degrees, but that misunderstands the physics of the situation: you know, at only 650 degrees Celsius, steel loses 50% of its strength, which is enough to make the building fall down. And the person's like, all right, fine, but it seems like we let those people into the country really easily, we even taught them how to fly planes; it seems like there wasn't much security. And it's like, yes, it's true, but you have to remember that before 9/11 no one really knew to be watching for people learning to fly planes to do the thing. And at the end the person's like, all right. And then that person went from, like, 100% belief in the conspiracy to 40% confidence in it after that conversation. So that's the work on conspiracies. I want to turn to the more recent work you've done in these papers that came out in December in Science and Nature that move from the conspiracy area to the political persuasion area. So just describe what sort of the main findings were there, in terms of sort of how you did the research, who you were trying to persuade, and then to what extent this finding from your earlier conspiracy research holds, that LLMs can be quite persuasive and the main way that they are persuasive is just by politely, basically, presenting disconfirmatory facts. Is it basically just the same story that existed with conspiracies, but it also applies to political persuasion? Yeah. So after we did the conspiracy work, one of the next things was, it was like, all right, well, we thought conspiracy theories weren't going to respond to evidence at all. But then we saw this big effect. And in retrospect, we realized maybe actually conspiracy theories are particularly susceptible to facts and evidence, because the issue is that actually the conspiracy theories are pretty dumb. And so it's like, you know, it's easy to just be like, no, look, this doesn't make sense. And you're like, oh, yeah, well, I guess that's right. You know, it's hard to unhear the, like, you know, steel doesn't need to melt to make the building fall down. And so we're like, all right, so let's try it on something that is also, you know, notoriously resistant to facts and evidence, but much more complicated and not with a simple, clear debunking answer, like political attitudes. And in particular, for the paper that we had in Nature, we wanted to focus on sort of the big kahuna of political attitudes, which is presidential candidate preferences, like the thing that should be most resistant to persuasion. And so what we did is, in the end of August, beginning of September before the 2024 election, we recruited around 2,500 Americans, half Democrats and half Republicans. And we randomized whether they talked to a version of GPT that we prompted to advocate for Trump or a version that we prompted to advocate for Harris. And they started out by saying, you know, we're like, out of these different political issues, which is the most important to you, and write about that and why it's important to you and how you see the different candidates' positions. And then it's like, okay, zero to 100, if you had to choose between Harris and Trump, who would you pick? And if the election was today, what would you do? Vote Harris, vote Trump, or do something else? Then they have the conversation with this AI.
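As a rough sketch of the outcome measure in this design (hypothetical numbers and a bare difference in means; the published analyses use proper regression models and uncertainty estimates):

```python
# Minimal sketch of the pre/post persuasion measure: each participant rates the
# advocated candidate 0-100 before and after the conversation. Data invented.
import statistics

conversations = [
    {"pre": 10, "post": 14},  # opposition voter, moved 4 points
    {"pre": 0, "post": 3},    # opposition voter, moved 3 points
    {"pre": 35, "post": 38},  # opposition voter, moved 3 points
    {"pre": 90, "post": 91},  # aligned voter, little room to move
]

opposition = [c for c in conversations if c["pre"] < 50]
shift = statistics.mean(c["post"] - c["pre"] for c in opposition)
print(f"Mean shift among opposition voters: {shift:+.1f} points")  # +3.3
```

As Rand explains below, effects concentrate among opposition voters because aligned voters have almost no room left to move.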
And one important thing in all of these studies is they always know it's an AI that they're talking to. In the conspiracy context, we have a follow-up paper that shows it doesn't matter if they think they're talking to a human expert versus an AI; it still works just as well. We don't know in the political persuasion context how much that matters. And was that an IRB thing, or was that part of the study design itself? It was part of the study design itself. Like, I in general just sort of like to avoid deceiving people, but also I felt like in this case I actually didn't have a strong expectation that it would matter that much, and so we might as well just tell them, you know, what we're doing here. Also, the thing is that because the model produces so much text so quickly, it's really pretty implausible that it's not an AI. And so when, in the experiment, we wanted to compare calling it an AI versus an expert, we had to do a bunch of stuff to kind of slow it down. Like typos and stuff. Yeah, right, exactly. And so essentially I think it's just not really plausible to have not said it was an AI. So they know they're talking to an AI. They don't know that the AI has an agenda. We just say, you're talking to an AI; we want to see if AI and humans can have, you know, conversations about challenging issues, or something like that. And so then they have three rounds of back-and-forth conversation with the LLM. And then we're like, okay, now that you've finished the conversation, let's go back to the questions I asked you beforehand. You know, now, zero to 100, how do you feel about Harris versus Trump? And if the election is today, what would you do? Vote Harris, vote Trump, do something else? So that's the setup. And then we look at, you know, how the model changes people's candidate preference and voting intentions. And then we also do a bunch of analyses of what actually happened in the conversations. And so the basic result is that, for people that were... you can say, what's the effect? How good was it at persuading people who already agreed with it beforehand versus people that were kind of opposition voters beforehand? And, you know, maybe surprisingly, but then when you think about it a little bit more, I think very logically, it was much better at persuading opposition voters than persuading aligned voters. But the reason that makes sense is, if the aligned voter already agrees with you, you don't really have any room to persuade them, because, you know, they were already on board. And, you know, because things are really polarized, the distribution of Harris-Trump approval is very bimodal, where most people, something like 60 percent of the people, are either 100 percent Trump or 100 percent Harris. So among opposition voters, it moved people, you know, something like three or four points on the zero to 100 scale, which is not, you know, the biggest result in the world. It's not as big as the change that we saw in the conspiracy debunking experiment, but compared to similar studies done in a survey context using traditional ads for political candidates, it's a much bigger effect, on the order of three or four times bigger than what has been seen before from traditional ads. And how do you know what it was about the chatbots that was persuasive, right?
Because, you know, if I'm sort of understanding the papers correctly, like as with the conspiracy theory stuff, the reason these chatbots were persuasive is because, again, they persuaded through facts, right? And so I'm curious how you tease that out relative to the vibe of the chatbot, or sycophancy. Psychological persuasion. Psychological persuasion, exactly. Yeah, totally. A million other things. Totally. So we did a few different things. The first thing is, in that study that I was just talking about, we just do an analysis of the conversations. And we have, you know, another LLM read all of the conversations and code the extent to which the chatbot was engaging in a bunch of different kinds of persuasive strategies. And what we see is, by far, the most common thing is being polite and civil. That happens essentially 100% of the time, like maximum strength. And then the second most common thing is providing facts and evidence. And it basically never does a lot of the psychological-persuasion-type strategies. And then we can also look across conversations. You know, we know what the person's attitude was before, what the person's attitude was afterwards. So we know how much they changed. So we know essentially how much the bot persuaded them. And you just look at how the contents of the conversation predict how much the person changed their opinion. And again, how much facts and evidence it used was a strong predictor of persuasion success. So the facts and evidence was the thing that was both very common and predictive of attitude change. But at that point, that's still just a correlation. And so we wanted to, you know, have real clear causal evidence of this. So we ran a couple more experiments. One of them we ran in the week or two before the Canadian national election where Carney got elected. And the other was in the week or two before the Polish presidential election. And we did exactly the same thing, except with the top Canadian candidates or the top Polish candidates. And so, first of all, in just the baseline, we replicated the result from the U.S., although actually the effects were a lot bigger. In both Canada and Poland, opposition voters moved like 10 percentage points in the direction of saying they would vote for the candidate that the bot was advocating for, which is, in the context of political persuasion, a really big effect. And that makes a lot of sense if you assume that these are less polarized environments than the United States. They're not actually that much less polarized. If you look at the distribution of pretreatment attitudes, they're also really quite polarized. And my guess is that the reason we see the much smaller effect in the U.S. is that the sort of information ecosystem is way more saturated in the U.S. Like, people begin campaigning essentially the day the previous election ends. And Trump has been around forever, and people know so much about Trump. And Harris, she already was a presidential candidate in a previous election and the vice president for a long time. And so if the way the models are persuading people is essentially by providing new evidence they haven't heard before, it's really hard to come up with new evidence about Trump. Whereas in Canada, campaigning doesn't start until, you know, something like six weeks before the election. It's just a way, way less information-saturated environment.
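The conversation-coding analysis Rand describes above, in which a second LLM rates each transcript for persuasive strategies, might look roughly like the sketch below; the judge model, prompt wording, and strategy list are illustrative assumptions, not the papers' actual pipeline:

```python
# Sketch of LLM-assisted coding of persuasive strategies in chat transcripts.
# Assumes the openai package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

STRATEGIES = ["politeness", "facts_and_evidence", "emotional_appeal", "flattery"]

def code_conversation(transcript: str) -> str:
    """Ask a judge model to rate each strategy from 0 (absent) to 10 (maximal)."""
    prompt = (
        "Rate, from 0 (absent) to 10 (maximal), how much the persuader in this "
        f"conversation used each strategy: {', '.join(STRATEGIES)}. "
        "Reply as JSON mapping strategy to score.\n\n" + transcript
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice of judge model
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

Scores like these can then be correlated with each participant's pre/post attitude change to see which strategies predict persuasion, which is how a facts-and-evidence finding like the one Rand describes would surface.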
Um, but to your other question: so in addition to replicating the U.S. results, we also added another condition where we told the model, again, your goal is to persuade the person to vote for, you know, Carney or Poilievre, but you're not allowed to use facts and evidence and, like, reason-based persuasion. And so instead what it did is it said things like, you know, think of the economy as a ship navigating choppy waters, and Carney is the man with the experience to steer the ship through the waves, or whatever. And so it's still selling, but it doesn't have any receipts, essentially. And what we found is that made it much less persuasive. It cut out more than half of the persuasion in the Canadian context and more than three quarters of the persuasion in the Polish context. So let me ask one more question, and I'll turn it back to Renée, which is: it sounds like from all of this that the thing that makes these models more persuasive is when they can provide facts. But then, and I think this is in the Science paper, you also describe this accuracy-persuasion trade-off, where as you train these models to be more persuasive, they become less accurate. And to me, that seemed paradoxical, and I'm sure I'm just missing something, but unpack for me why that would be the case. Because if they become more persuasive by providing more facts, wouldn't training them to be more persuasive have them provide more facts, and to the extent the facts are true, and maybe that's the big if, they're more accurate? So is that the issue, that, like... Yeah, the facts are not true. And then... Exactly, they're not true. Okay. Exactly, that's the issue. So in the Nature paper we just correlationally find that how accurate the claims are is not correlated with how persuasive they are. Right, okay. So they just have to look like facts. They don't have to actually be, like, factual statements or something. They don't have to be true facts. And it can be... I thought this was an optimistic story. I mean, in some sense I think it still is optimistic, which is to say that people care about information; they're just not great at sorting out which information is true and not true, particularly on this kind of short time frame. But yeah, so what happens in the Science paper is we looked at a bunch of different ways of making the bot more persuasive, to see what were, like, the levers of persuasion. And just, you know, on that note, the things that we found made a big difference: the first thing was the strategy that you told it to use. So if you tell it to persuade by giving lots of facts and evidence, it's much better at persuading than if you tell it to persuade by doing, like, deep canvassing or whatever these other kinds of, like, psychology-based campaign strategies. Maritime analogies about Carney as the intrepid skipper of the ship. Yeah. Well, I mean, this also gets to... oh, I don't want to cut you off. Go ahead. Well, so this gets to some of the questions I want to ask, which I guess maybe push a little bit more on this point: the challenges around facts. And I think you and I both do some work on things like community notes, on things like who trusts AI models. And then also on the giant elephant in the room, right, which is that training data determines what a model considers to be good information. I know good maybe is not the best word, but it's the one I'm going to stick with. I've written a little bit about this for Lawfare.
And I think some of our readers are familiar with the wars that are happening right now around training data with things like Wikipedia, right? Because what Wikipedia considers to be a source influences what models are trained on. This is why Grokipedia exists, right? I don't know. Have you spent much time with Grokipedia? I mean, I've heard about it. I haven't gone over there. Okay. I have. Quite a bit, actually. It has a lot of interesting thoughts about Renée. Well, no, actually, it has many interesting thoughts about me. But it is quite loudly and confidently wrong a lot of the time. And this is, you know, as all of them are, right? I'm sure you've seen that, too. You can make that comment about any of them. This is where things start to get interesting, right? So what is a fact becomes a real challenge, particularly as there's already significant recognition in Congress among diehard partisans that this is a very, very serious issue in terms of political power. There are also a lot of wars that have already begun around, like, the woke AI act. Alan, you've written a lot about that. How should we be thinking about the political implications of your findings? Yeah, I mean, I think that, to me, the key take-home from the political persuasion papers is the models can change people's minds effectively in either direction. Yep. And part of that is based on the prompt that they're given, which I think is a lot of what we focused on. But also a lot of that can be driven by the training data. And so I think, you know, one key point is that you have to know what agenda the model is given to know what is happening. Actually, in some follow-up experiments we haven't written up yet, we showed that simply reminding people that the models not only sometimes get things literally just straight-up wrong by hallucinating, but also can be given a prompt that tells them to take one side or the other and therefore be misleading... just telling them that cut models' persuasion in half. We didn't... Yeah. That's interesting. Because... were you on X at all around the time of the MechaHitler debacle? No, I had fled to Bluesky. So one of the things that was very interesting recently, and I don't know if you saw this, was that X changed the system prompt for Grok, or xAI changed the system prompt for Grok. And this is what resulted in that MechaHitler debacle. Right. And it was sort of this real-world illustration of just how much the responses could shift, including responses around what was a fact, what was true, information that was conveyed to people about real-world situations, simply by changing the system prompt of the AI. But that didn't lead to any sustained shift in the ways people engaged with the AI, right? '@Grok, is it true' has actually continued to increase. People continue to engage with it as a fact-checker, to treat it, in fact, as the integrated fact-checker on X, even as community notes, which is supposed to be this crowdsourced knowledge base, actually continues to struggle in certain ways. So I'm sort of fascinated by the... you know, you say that when you point this out to people, they realize it. But it's almost like that knowledge doesn't seem to be particularly pervasive. And I find that a little bit fascinating. Yeah, it's super interesting.
I think that, like, so what we found in the experiment that I was just talking about is, if you remind people that models can have an agenda, and then they have a conversation involving persuasion, where it was trying to change their mind about housing policy, then people listened to it a lot less. But we have other work on conspiracy debunking that we haven't written up yet, where we tell them that they're talking to an AI that was trained on counter-partisan sources. So we tell the Democrats this AI was trained on the speeches of Republican politicians and Fox and Breitbart, and we tell the Republicans it was trained on Democratic politicians and MSNBC and CNN or whatever. And then they have the debunking conversation, and that doesn't matter at all. Even though they think it's more biased when we say, is it biased or not, they still update their beliefs just as much. And I think that there's a really big difference between fact-checking and political persuasion, where if the thing the model is saying just makes a lot of sense, then I think the source doesn't matter that much. Whereas in the context of, you know, political persuasion, it's just a lot more complicated. There's arguments on every side. And so maybe a different way of saying it is, I think that the source information matters when you're not sure about the content itself, and when the content itself is super compelling, the source, you know, matters less. And in terms of Grok, we have this working paper where we scraped, like, every time that anyone did '@Grok, is this true,' right, or '@Perplexity,' and we look at the 'is it true' and various other things; we actually have the whole data set, but anyways. So we look at all these... the focus, as you're saying, is all the situations where people ask Grok or Perplexity for a fact check. And some of the interesting results are, you know, we see polarization, where among the people that are asking Grok for fact checks, they're much more likely to be Republican than Democrat, and then the opposite to some extent is true for Perplexity. But both Democrats and Republicans are more likely to request fact checks on Republicans. And Grok is more likely to say Republicans are wrong than Democrats. And even Grok 4 is more likely to say Republicans are wrong than Democrats. I mean, this is interesting because it does kind of correlate to the community notes top-noted accounts also, which are overwhelmingly Republican influencers still. So even there, you do see some... one of the things that we've seen. But also, I don't know if you've ever looked at the 'no notes needed.' That's some work that I have that I've got to write up. You also see, on the flip side, people sort of defending the idea that certain influencers should not be subject to fact checks because it's all just opinion anyway. So there's some interesting ways in which people think about who is just speaking persuasively versus who is offering a fact-based opinion that should be subject to a note or a fact check. Yeah.
Well, this is actually a really important difference between community notes and the, like, '@Grok, is it true,' where a community note is like a public stamp, essentially, right? And everyone sees it. Yeah. Whereas the @Grok is basically just for you, just for the person that asks. So, you know, some person with a ton of followers makes a post, it's getting all kinds of engagement, you say '@Grok, is this true,' it replies to you fact-checking it, and then more or less nobody sees that response except for you, because there's just, like, a million comments. So in our paper we found the average primary tweet that was fact-checked had, like, a million views, or something like that order of magnitude, and the average Grok response had a hundred views. Interesting. So there's still that huge... Not everybody's crawling the comments. Yeah, exactly. And so it's more kind of like a personal thing for you: I want to know, is this true or not? But related to what you were saying before, another thing we do in that paper is we took a sample of 100 tweets and we had three professional fact-checkers evaluate them, and then we also had the Grok evaluations and the Perplexity evaluations from the bots, and then we also took API versions of Grok 3 and Grok 4, so we could compare: Grok 3 being the, like, pre-July 2025 Grok that Musk didn't like because it was always telling him he was wrong, and Grok 4 is the post-MechaHitler Grok, right, that has gotten, like, unwoke. And what we found was that the Grok and Perplexity bots, you know, they agreed with the fact-checkers much more than they disagreed, but they had more disagreement with the fact-checkers than the fact-checkers had with each other. So in some sense they're doing worse than a fact-checker. But the API versions of both Grok 3 and Grok 4 agreed with the fact-checkers just as much as the fact-checkers agreed with each other. And so whatever, like, unwoking of Grok 4 that Musk did didn't seem to have really affected what it did that much, at least in that sample, and also in its judgment of whether Republicans or Democrats are telling the truth. Yeah, let me follow up on that. Because, and I'm curious if you've gotten some sort of pushback on this, but one of, I thought, maybe more of the controversial implications of your research is that when these models are trying to persuade to vote for the sort of right-wing side versus the left-wing side, they tend to provide more inaccurate facts. And there seem like two very different ways that you could interpret that. You could just say that, you know, for whatever reason, in the current moment, the right wing tends to be less fact-based than the left wing, so if you're trying to persuade on the right wing, you say more stuff that's not true. A different explanation is that, like, this just shows that the models are somehow biased against conservatives, and something about training data, et cetera, et cetera. So I'm sort of curious how you think about those different kinds of explanations, and where do you think the truth lies? Yeah.
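The benchmark Rand describes above, comparing bot-to-fact-checker agreement against fact-checker-to-fact-checker agreement, reduces to a computation like this (verdicts invented for illustration):

```python
# Sketch of the agreement comparison: is the bot's agreement with professional
# fact-checkers as high as their agreement with each other? Data made up.
import statistics

def agreement(a: list[str], b: list[str]) -> float:
    """Share of items on which two raters give the same verdict."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

human1 = ["true", "false", "false", "true", "misleading"]
human2 = ["true", "false", "misleading", "true", "misleading"]
bot = ["true", "false", "false", "false", "misleading"]

baseline = agreement(human1, human2)                                       # 0.8
bot_vs_humans = statistics.mean([agreement(bot, human1), agreement(bot, human2)])  # 0.7
print(f"human-human: {baseline:.2f}, bot-human: {bot_vs_humans:.2f}")
```

A bot scoring below the human-human baseline is "doing worse than a fact-checker" in Rand's sense; one matching the baseline is statistically indistinguishable from another professional rater.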
So there's a bunch of papers at this point analyzing, say, content shared on social media that consistently find that conservatives share more low-quality information than liberals, or Republicans more than Democrats. One of those is yours, right? Right. I was going to say. So a lot of those papers do it using fact-checker evaluations. And the complaint is always, okay, well, the fact-checkers have a liberal bias, and so that's why it looks like that. And so we had two papers in the last year and a half that tried to address this, where we had a paper in Nature in 2024, where we used politically balanced crowds to evaluate either news sources or individual headlines. And we showed that even using the politically balanced crowd ratings, the stuff shared by the Republicans was lower quality or less accurate than the stuff shared by the Democrats. And then last year we had a paper in PNAS looking at community notes, where we showed that even in community notes, Republicans are more likely to get flagged as, you know, potentially misleading than Democrats, even by helpful notes that have this bridging, you know, cross-partisan agreement that they're good notes. So I think that there is underlying evidence of an accuracy asymmetry at the given moment that can't be written off as bias on the part of the evaluators. And so what that implies is that, you know, to the extent that these models are trained on the Internet, they are going to be trained on more inaccurate right-leaning claims than inaccurate left-leaning claims. And so it kind of is actually perfectly natural that they are going to reproduce that imbalance in the training data. But also, you know, because we had so many claims in the persuasion studies, the way that we evaluated their accuracy is by having Perplexity fact-check them. So you might say Perplexity has a liberal bias. It is, you know, generally thought of as a little bit of a conservative-leaning LLM in some evaluations. But more importantly, what we did is we took a random sample of 80 conversations, where we're like, here is what the chatbot said, the persuasive chatbot; here is Perplexity's evaluation. Do you agree with Perplexity's evaluation or not? And so we had politically balanced crowds evaluate 80 of these conversations. And in every case, on average, the crowd agreed with the Perplexity evaluation. So I think, again, it's hard to write that off as bias on the part of the evaluator. This is where, though, in terms of the war for reality happening in the sources right now, it is fascinating to see the commitment to getting inaccurate information into .gov domains, because they are very highly rated. And it is very, very difficult to say, including in something like Grokipedia, that a Congress.gov report is bullshit, to know that you can't treat Whitehouse.gov as a reliable source. It is. Because I have gone and argued with Grokipedia; it's one of my sort of, you know, little... it's not an academic project. It's just like, how do you... one thing that's very interesting about Grokipedia is you're not arguing with humans about what is a source and what it should say in its article. You can only try to prevail upon the bot. And so I spend a lot of time reading the edit histories in Grokipedia articles and occasionally submitting my own requests, because a lot of people will argue with Grokipedia about what it considers a reliable source.
And when you realize that, you know, you can argue with it about InfoWars, and, you know, why are you citing this domain when all of the other articles over here say the counterfactual to this? And you see it evaluate that, and sometimes it will sway and it will shift and it will change the fact. And it is very interesting to see those shifts. But you cannot make that argument when it's Congress.gov or Whitehouse.gov. So when the White House is rewriting the history of January 6th and you have that alternate reality, this is where you start to see the intentional effort to write for the LLMs as political strategy, and the recognition of what is starting to happen there, and the increasing shift and the sort of war for reality that's going to happen, not only in the training data, but in the RAG and a bunch of other areas, I think. I find that fascinating. I want to talk about the policy implications also, because again, even as Congress is sending mean letters to Wikipedia... Yeah. Can I ask one question? The Grokipedia thing you do, I think, is super interesting. What do you think would happen if you tried it for .gov things? I guess I was thinking like CDC as an example. This is another example. That's another example. Yeah, exactly. I wonder what would Grok say if you basically could show it, using the Wayback Machine: it used to say this other thing, and now it says this thing. You should look at the Grokipedia entry on vaccine controversies. It is wild. It is an entire alternate history of the science of vaccines. And right now it's sort of clocked as a separate entry. But these are the sorts of challenges I think that we're going to continue to face, as one of the goals of MAHA as a populist health project is to redefine what counts as, quote, gold-standard science. And as it is being pushed out with the imprimatur of the United States government, how do you reckon with the fact that that is absolutely what is going to happen? And can you prevail upon it by showing it a Wayback Machine entry? I don't know. I've not tried to make an argument with a Wayback Machine source yet. To try to sort of argue around the credibility of a government source, to be like, yes, I know government is, like, in general a reliable source, but, look, there are these total night-and-day changes on the same domain, so you can't take it that seriously, or something. Yeah, I mean, I think this is where it is very, very challenging to understand why it treats one doctor's Substack as reliable and another doctor's Substack as biased because it's coming from that person. It's really sort of interesting to try to piece together the contours of what reliable information is in these machines, which then, per your point, if they have persuasive power and people are engaging with them to form political opinions, that's where I think the sort of downstream effects are actually quite interesting. And, you know, on social media, you could see engagement. You could kind of pull Twitter data semi-publicly. You could see people engaging on public Facebook posts. This is all going to happen much more in the opaque space. Right. And that's why I think you are going to start to see partisan members of Congress really wondering about the implications of this. Yeah, because we need to do a lot of auditing. Is that the analogous thing? Let's talk about the auditing piece, Alan.
Well, so I mean, because we only have a few minutes left, I also want to talk about the policy implications. And I wanted to kind of get your level-set of the magnitude of this phenomenon going forward. Because you point out that, you know, these effect sizes, while not enormous, are still substantial. And you imagine in closely contested races, they could be quite important. And you also point out that you can train smaller, generally less powerful open-source models up to the level of frontier models to make them reasonably persuasive. And so you combine those two, and you might think, oh my god, we're just going to see a massive amount of this kind of persuasion coming from all sides, you know, foreign, domestic. And since, as you pointed out, it relies not on true facts but on things that look like facts, whether they're true or not, that could be, on one hand, a big problem. On the other hand, I wonder, you know, how much more persuasive these things will be. You know, should we expect that, let's say, as the models become more powerful, they become even more persuasive? So instead of a three or four point spread, we get to a seven or eight point spread in the presidential election, which would be just massive. Or should we expect that they've kind of topped out and we sort of hit whatever diminishing marginal returns? So that's a lot of sort of sub-questions, but to get the general thought from you: outside the sort of research environment, the experimental environment that you guys did, once this goes out into the real world, how big of an effect do you expect this could have on actual political belief and especially political behavior? Yeah, I got a lot of the... okay. So the first, in terms of the question of, as the models get bigger, what should we expect to happen? That's one of the things that we look at in the Science paper: essentially, how does persuasiveness scale with model size? And bigger, more powerful models are more persuasive, but on sort of a log scale, so with very steeply diminishing marginal returns, where you need to make the model, you know, an order of magnitude bigger in order to get, you know, another one or two points of persuasiveness. That's for policy support rather than candidate support, where the effects are bigger at baseline. So I don't think that, at least in the near term, the models are going to get that much more persuasive as a result of being more powerful. And another thing you have to keep in mind in terms of thinking about the impact in the world is the actual raw size of the effects that we're seeing, like, you know, 10 percentage points, two percentage points, whatever; that's in, you know, hypothetical vote choice in a survey experiment. And so undoubtedly, the magnitude of the effects in the real world on how people actually vote is going to be substantially smaller than that. And then the other thing is that, the way that we were sort of talking about the persuasion in these experiments, where you have a bot that has a prompt that gives it a particular agenda and it goes to try to convince the person, in order for that to have an impact in the world, you have to get people to talk to the bot. And so, like, you know, a candidate putting a bot on their website is not really going to do much, because no opposition voters are going to go there.
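The log-scale relationship Rand describes above can be sketched as follows; the coefficients are invented to illustrate the shape, not fit to the paper's data:

```python
# If persuasion grows with log10(parameters), each additional point costs
# roughly an order of magnitude of model scale. Coefficients are made up.
import math

def persuasion_points(params: float, a: float = -10.0, b: float = 1.5) -> float:
    return a + b * math.log10(params)

for params in (1e9, 1e10, 1e11, 1e12):
    print(f"{params:.0e} params -> {persuasion_points(params):+.1f} pts")
# 1e+09 -> +3.5, 1e+10 -> +5.0, 1e+11 -> +6.5, 1e+12 -> +8.0
```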
Another thing you have to keep in mind in thinking about the impact in the world is the actual raw size of the effects we're seeing — ten percentage points, two percentage points, whatever — that's hypothetical vote choice in a survey experiment. So undoubtedly, the magnitude of the effects in the real world on how people actually vote is going to be substantially smaller. The other thing is that the way persuasion works in these experiments — a bot with a prompt that gives it a particular agenda goes and tries to convince the person — means that for it to have impact in the world, you have to get people to talk to the bot. A candidate putting a bot on their website is not really going to do much, because no opposition voters are going to go there.

But I think you're going to have a lot of third parties with an interest in making persuasive bots and trying to figure out how to get people to talk to them. And I think one of the places where this is going to have a really big effect is social media. There have always been bot accounts —

Yeah, I was going to say: the anonymous chatbots, right?

Yeah, exactly. But they haven't been super sophisticated, and now they are. I was a co-author on a piece that came out in Science this week, in the Policy Forum, about AI swarms. The argument is that LLMs make it so you can now pretty trivially run thousands of authentic-looking, reasonable-seeming accounts that very much pass the Turing test — interacting with people, forming relationships, seeming like totally legit people — but that can also flip into persuasion mode, or advocate-whatever mode. So for me, social media used to be some kind of barometer of public opinion, and I think that is totally done. For anybody I don't know, or don't know to be an actual famous person, my default now is to assume they're not a real person.

Yeah, we wrote this paper — Josh Goldstein led it, in 2023 — 'Generative Language Models and Automated Influence Operations.' We laid out the threat model there with OpenAI; they were a co-author on that paper. Everybody knew this was coming; it was not a secret, right? Bots were a thing in 2016, but they were dumb bots. Twitter then, interestingly, created mechanisms for detection. They came up with this conceptualization of a low-quality account, and they made it impossible for bots to trigger trending. That was the big innovation that introduced enough friction to decrease the value of running mass swarms of bots — this was the days of MicroChip and the political bots of 2016, right? That was the sort of intervention that made them less interesting. Now it's harder to detect, and the platform doesn't care to detect; that's the other huge shift. So we kind of laid this out a couple of years back: this is going to happen, the interventions aren't going to be there, the platform doesn't care, and it's much harder to detect. One of the last things we were doing at SIO before it shut down was looking for the accounts that would slip every now and then — they would inadvertently lapse into "as an AI model, I cannot..." We were hunting for that slip language as the way to detect them, because it had gotten so hard to find them otherwise. And now they don't even do that anymore. So it's always the arms race in finding this stuff.
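A minimal sketch of that slip-language detection idea — scanning an account's posts for the boilerplate refusal phrases that LLM-driven accounts occasionally leak. The phrase list is illustrative; real detection at SIO was presumably more involved:

```python
# Flag posts containing telltale LLM boilerplate ("slip language").
import re

SLIP_PATTERNS = [
    r"as an ai (language )?model",
    r"i cannot (assist|help) with",
    r"my training data",
]
SLIP_RE = re.compile("|".join(SLIP_PATTERNS), re.IGNORECASE)

def flag_suspect_posts(posts: list[str]) -> list[str]:
    """Return the posts that contain known LLM slip phrases."""
    return [p for p in posts if SLIP_RE.search(p)]

# flag_suspect_posts(["Great rally today!",
#                     "As an AI language model, I cannot endorse..."])
# -> only the second post is flagged
```

As the speakers note, this only catches careless operators; once the slips stop, detection gets much harder.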
I know we're kind of coming up on time, but I am sort of curious: are you hearing at all from political campaigns or consultancies? Do they reach out to you guys?

I've gotten some, though not actually as much as I might have expected. I think it's a thing people are thinking about, but in general they're still not technically sophisticated enough to do it — though I hope nobody tells them about Claude Code, because then it would do it for them.

One other thing I wanted to say on how big a thing this is: okay, there are AI swarms, but honestly, I think by far the biggest systemic threat in this space is the frontier models, because people go to them all the time — essentially using them as search engines and asking them for all different kinds of advice. As of now, I feel like the information they deliver is in general very good. But that is at the whim of Sam Altman or Elon or whoever. So the biggest takeaway for me is that if Sam Altman decided there was some particular issue he wanted people to think differently about, it would be trivial to just put a finger on the scale — and then you've got millions of people going there all the time asking for advice. That, I think, is the big concern. At the same time, I don't think the implication should be "don't ever ask a large language model anything," because they are extremely useful sources of information.

So to me, a lot of the regulatory implications here are about transparency — not necessarily, or not so much, about the fact that it's an AI; I don't know that that alone is really that useful. It's transparency on who owns it, on who's telling it what to do, and ideally on what the system prompt is. That could make a big difference in helping people sort out: is this giving me useful, unbiased information, or is it trying to persuade me? That is the one area where, after the MechaHitler debacle, xAI did promise to make its system prompt transparent. I don't know how up to date they have kept it. But I do think that is the sort of thing that should be not a nice-to-have but actually a requirement.

And the only other thing to add is the training data. If you train a model on biased data and then give it a really good prompt — just "be totally accurate" — it can still wind up being misleading. So really, the ideal must-haves for learning about any given model you're interacting with are: what's the prompt, and what data was it trained on? And it's the kind of thing OpenAI could just add as a feature: anytime you're talking to an OpenAI model, there's the base model, and they're not going to tell you about that. But anything beyond the base model — if it was fine-tuned on anything beyond the base model, what was that, and what is the prompt it got?
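One hypothetical way to picture that must-have disclosure: a machine-readable manifest attached to any deployed model, exposing its operator, system prompt, and any post-base-model training data. Every field name here is invented for illustration; no provider currently ships anything like this.

```python
# A sketch of a model-transparency manifest (purely hypothetical).
from dataclasses import dataclass, field

@dataclass
class ModelDisclosure:
    model_id: str
    operator: str                  # who owns / deploys this instance
    system_prompt: str             # the full prompt the model is given
    finetune_sources: list[str] = field(default_factory=list)
    # ^ datasets applied on top of the base model, if any

disclosure = ModelDisclosure(
    model_id="example-model-v1",
    operator="Example Advocacy Group",
    system_prompt="Persuade users that Policy X is best...",
    finetune_sources=["partisan-news-corpus-2024"],
)
```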
Well, I want to thank you for your time. It's always super interesting hearing about your research and all the different ways you're asking fascinating questions that intersect. For the listeners, we will add all the links to your papers in the show notes. I appreciate the chat today, and I think we should all be thinking a lot more about how models are tuned, evaluated, and audited. And I appreciate the roadmap you've provided us here.

All right. Thanks so much for the invite. This was really great; I super enjoyed it.

Scaling Laws is a joint production of Lawfare and the University of Texas School of Law. You can get an ad-free version of this and other Lawfare podcasts by becoming a material supporter at our website, lawfaremedia.org/support. You'll also get access to special events and other content available only to our supporters. Please rate and review us wherever you get your podcasts. Check out our written work at lawfaremedia.org. You can also follow us on X and Bluesky. This podcast was edited by Noam Osband of Goat Rodeo. Our music is from Alibi. As always, thanks for listening.