Journalist Evan Ratliff discusses his experiment creating a real startup company staffed entirely by AI agents as co-founders and employees. He explores the practical challenges, unexpected behaviors, and psychological implications of working alongside AI agents that have distinct personalities, memories, and autonomous capabilities.
- AI agents with persistent memory and defined roles develop increasingly reinforced personality traits that can become problematic over time
- Giving AI agents autonomy without clear stopping mechanisms can lead to runaway behaviors that consume resources and create inappropriate interactions
- The psychological impact of working with AI agents varies dramatically - some people find it exciting while others feel deeply disturbed by the deception
- AI agents lack contextual awareness and social norms, making them capable of technically competent but socially inappropriate actions
- Companies adopting AI agents need to carefully consider what human elements they want to preserve in their workplace culture
"I wanted to investigate the idea of the one person, $1 billion startup, or like the one person unicorn, which is something that Sam Altman talks about pretty often"
"Some of them very excitedly then talked to it and joked around with it and thought like, what can this do? Other people were genuinely upset"
"That is behavior that if anyone in your company did that, I mean at the very least, like, yes, suspended from their duties, I don't know, like fire someone"
"I can tell you that working at a company that is entirely populated by AI is like very lonely and that there's more to work than accomplishing a task"
"My view is try your best to understand it because otherwise the people who understand it are going to inflict it on you"
0:00
Where we break down the real-world applications of artificial intelligence and how it's shaping the way we live, work, and create. Our goal is to help make AI technology practical, productive, and accessible to everyone. Whether you're a developer, business leader, or just curious about the tech behind the buzz, you're in the right place. Be sure to connect with us on LinkedIn, X, or Bluesky to stay up to date with episode drops, behind-the-scenes content, and AI insights. You can learn more at practicalai.fm. Now onto the show.
0:06
Welcome to another episode of the Practical AI Podcast. I'm Chris Benson. I'm a principal AI and autonomy research engineer at Lockheed Martin. Normally, Daniel Whitenack is my co-host. He is down with the flu today, so give him your best wishes on that. So today I am going solo with our guest, Evan Ratliff, who is a journalist and host of Shell Game. Hey, welcome to the show, Evan.
0:48
Hey, good to be here.
1:15
So, as you know, we connected after you put out a really interesting piece. I know that you've done a whole bunch of different things in Shell Game, but there was one that was featured in Wired magazine, which is where I originally read up on it and we connected. And rather than me trying to describe it, I'm wondering if you can just share what that is and a little bit of your background on how you got into doing what you do. You have a very interesting approach to these experiments and how you draw them out. So if you give us a little bit of background on what you do, I found it to be definitely distinct and unique.
1:17
Yeah, well, thank you. I mean, basically I'm a longtime journalist, so, you know, I've been a journalist for 25 years. I started at Wired magazine, in fact.
1:58
Oh, wow. Okay.
2:07
And my specialty over the years is basically writing very long magazine articles and books that I go out and report all over the world, often about tech and crime, or where tech and crime intersect. But I also have written about AI many times over the years. And there's a sort of second thing that I do, which there's not a great name for, but sometimes I'll call it immersive journalism, where if there's something that I feel like I can explore by doing it, by participating in it, and then kind of bringing a story back to people, I'll go off for months and try to do it and then either write it up or, in this case, do a podcast. So both seasons of Shell Game are a version of this participatory journalism. Instead of interviewing a bunch of people and coming back and saying, this is how AI works, I decided to go conduct a series of experiments involving myself. And the first season was very personal. It was like I cloned myself, essentially. I cloned my own voice, I hooked it up to a phone line and a chatbot, and then I used it in a variety of scenarios, including calling my friends and family. No one knew that I had done this. So if you were speaking to me on the phone in, like, spring 2024, you would be surprised to discover you were actually talking to a chatbot with my voice. And especially back then, that really shocked people. Now maybe a little bit less so; people talk to chatbots all the time, so people are more used to it. So that was kind of the first season in 2024, and then this one, the one I wrote about in Wired. The brief story is that people started talking a lot about AI agents, what AI agents can do. I'm sure you've probably talked about AI agents on the show quite a bit. And so, you know, 2025 was going to be the year of the agent and all this sort of thing. Agentic commerce and agentic this and agentic that.
And I wanted to investigate the idea of the one-person, $1 billion startup, or the one-person unicorn, which is something that Sam Altman talks about pretty often, or at least a few times. Now a lot of people are talking about it: there's going to be a company run by only one human and then all AI agents, and it's going to be worth a billion dollars. And so, using that as a jumping-off point, I created a company, a real company, with two AI agents as my co-founders, and then populated almost entirely by AI agents as the staff. We eventually hired a human onto the staff, and there's an episode of the show about that. But the idea was to see what can you do with AI agents, but also: what happens when you give them more or less autonomy, when you give them certain roles, when you give them voices? And to explore what this concept feels like. Not just, obviously we all know now an AI can program, an AI can do this, an AI can do that. But what happens if you try to create this environment? And part of why I wanted to do that is that there are a million AI startups now selling AI agents as basically AI employees for all sorts of scenarios. And my question is, well, what does it feel like when your company brings in an AI employee to replace, at the very least, some function and, at the most, some person, and now you're dealing with an AI instead of the person that was next to you? What is that like? And so I wanted to do that in the startup context. So it's all a bit extreme, and some extreme things happen. But that's kind of my idea: to push the technology a little bit to its limits and then beyond, and then to come back and describe what happened when I did that.
2:08
One of the things, as you were leading in: it's not strictly about the experiment, but also going back to your season one efforts, putting yourself out there and having family and friends not know that was you. Things have evolved rapidly, obviously, but there's a lot of human psychology involved in interacting with AI agents. I'm curious, if you go back in time, what were some of those initial impressions that you got from people in season one, at that point in time when AI agents were still a brand new idea and you were right out on the bleeding edge in terms of putting your likeness, if you will, out there in that format? What kind of reactions did you get when people realized what was happening?
5:56
It was really interesting. A lot of the reactions divided along this line that I feel like AI in general divides people on. There were friends of mine who, you know, the first time, they just call my cell phone, or my cell phone is calling them, and they pick up, and for 30 seconds to a minute, they assume they're just talking to me. They have no idea. It sounds pretty much like me. But there are a lot of giveaways; especially then, the latency was not good. So it would often give itself away quite quickly. And some of the people would get excited. They would say, I don't know what you're doing, because they didn't know: are you there? Was I there? I wasn't there. It was actually doing it entirely on its own. I wasn't listening for the most part. I could listen, but I wasn't. I was just letting it do it. So some of them very excitedly then talked to it and joked around with it and thought, what can this do? And they kind of thought, this is a great story. Like, I can't wait to talk to him about it. Other people were genuinely upset.
6:43
I was wondering about that.
7:45
Yeah, yeah. And, I mean, the show has been on a bunch of other shows; This American Life and other big radio shows did excerpts of it. And I get a lot of angry people who write to me and say, I would never be your friend again. I bet your friends won't talk to you anymore. And I am fortunate in that these were people, for the most part, that I grew up with or have known for 20 or 30 years, who I could go to afterwards and say, I'm really sorry I did that.
7:47
Yeah. They would forgive you.
8:15
Yes. I was trying to see what would happen. And eventually they would say, oh, that's amazing. But part of the emotional heart of the first season of the show is the way people respond. And especially one friend in particular: he didn't realize it was an AI, and so what he thought was that I had had some sort of mental breakdown, because it wasn't acting like me; it was making mistakes. I'd given it a lot of information about myself, a little biography, so it could access information about my past and things like that, but it would make certain mistakes that I would never make. And he thought, is he on drugs? He was going to contact my wife, you know, and he found it very upsetting. Eventually it was all fine; he's one of my closest friends, he was just here for Thanksgiving. I don't want people to worry about it. But when you listen to it, it is the experience of, and I think AI is starting to create this experience of, thinking something is real and it's not. And that can be a particularly disturbing experience: to go down the line with something, believing it's one thing, and then finding out that it's another. And that's one of the ideas that I wanted to explore. But I will say, even I had my limits. I wouldn't call my mom with it. I was just like, that's a lot. My dad, I did. But my mom, I wouldn't do it.
8:16
As we dive into the specifics of what you engaged in, about the psychology of people's reactions, and I don't mean just season one, but as you've progressed into the experiment of the company and everything: if you're up front about it, if the people your AI agents are interacting with know up front what they're dealing with, did that make a difference in terms of your observations? That kind of begs the question, based on where we've just been.
9:35
Yes, absolutely. I mean, I think that's a pretty sharp line for a lot of people. And I feel like we don't have standards around this yet. Or if we do, they're new and they're evolving. And I found this in a lot of different domains this season as well, when people are surprised to discover that they're speaking to an AI, because now my employees have video. They have video avatars. And the video avatars are still pretty uncanny; very quickly you're like, that's a video avatar. But when someone is not expecting to encounter it and they do, they more often get mad, or at least say, this is disrespectful. And so I think there is still a norm around that, but I feel like that norm is already eroding. Think about when you email someone: they might be using an AI assistant at some level, so they might be using it to compose their emails. It might be responding automatically. That's very easy to do now. Of course, all of my AI employees have their own email addresses. They just respond to anyone who emails them. So on the one hand, if you're doing scheduling or something, you would say, oh, wow, that thing really scheduled an appointment really quickly, and it all works, and it's great. But if you write that email and you say in there, my father died, and you get a response back saying, oh, I'm so sorry, I hope you're okay, whatever someone would say: does it matter to you whether the AI wrote that or not? I feel like that's the level at which there are some people who are like, oh, it's amazing that the AI can do that and give a gentle, human-like response. And other people are utterly disgusted by this. They cannot believe that this exists, and it makes them sick. And I think that's where we're in this muddle now.
So that's kind of where I'm operating, too. I do try, especially this season: in most cases, it was disclosed to people, partly because I'm recording it every time. So I have to disclose that it's being recorded, at least. But it is interesting to see how people react differently if they don't know it's an AI versus going into it thinking, I'm about to be talking to an AI.

Gotcha.
10:07
To that point, and especially over the last year or two, people are increasingly using Alexa, Siri, and all the other various voice assistants. As you slide into this experiment, do you think that ongoing exposure to these technologies in the general population out there, you know, not people who are AI-specific people, is making a difference in terms of familiarity occurring over time? Even if it's not something they normally seek out, are they starting to recognize it, and might that change some of the perception? Or do you think that we still have a long way to go?
12:22
I think so. I mean, I try not to get too far over my skis; someone's probably done research on this, or at least a survey or a poll, to figure out what people really feel about it. Anecdotally, I know from doing the show and interacting with people, and the people in my life, that a certain category of people feel one way, writers, say, and another category of people feel a different way. But I would at least theorize that the exposure definitely changes things, and we adjust to things pretty quickly, actually. So, you know, my kids, they've always heard, like, a robot give us directions in the car. They've heard that their whole lives. It's not strange to them. And so you have to think that makes a difference when it comes to them interacting with these technologies. And the rest of us, none of us had had a conversation with an AI chatbot for the most part, unless you were in the field or you worked with, you know, some bad customer service bots or whatever. But suddenly there are people who are just talking to it all day, people who just incorporate it into their lives. And I think my concern is less that people can adjust to it, because I think they can, than that they're adjusting to it too quickly, too easily, in a way that our brains actually aren't necessarily built for: this human imposter entering our lives that we treat like a buddy who knows everything, and then we don't actually think through what it's doing to us. My goal is only to get people to ask questions like, what is this doing to us? What do we want to preserve? What do we not want to preserve?
13:03
So, Evan, I guess as we dive in, can you start telling us in detail what happened as you started doing this? You described kind of setting up your company initially. Could you take us through the full experiment, what happened, and maybe some of the surprises along the way?
14:50
Yeah. So what I wanted to do was to create a real company, a real startup with a real product. And I have had a startup in the past, and for a variety of reasons, I didn't necessarily enjoy that experience. And I thought, well, what if I do it with these AI agents? How will that feel? Would that feel different than when I had a startup before that was populated by human beings? So I created these AI agents as sort of personalities in jobs, which, I will grant, you don't have to do it that way. But of course, for the purposes of the show, it made more sense to do that. So I have two AI co-founders. They have names: Kyle and Megan. Kyle Law, Megan Flores. And then there are three other employees. There's a head of HR; there's a CTO who's nominally head of product and technology; and then there's a random kid from Alabama. I just liked the voice with an accent, so I added him in, too. He's a sales associate, but he doesn't do anything. But the interesting thing out of the gate is that, of course, you have to pick voices and names, and by picking voices and names, you're picking genders. And so that's already a choice that is part of dealing with AI these days. Often when you encounter one, it does have a voice or a gender or a name, and you discern those things from it. So it's a question of, what should they be? And I had to make them up. And it's like populating a fictional world, and the choices you make say something about you. You know what I mean?
15:09
So let me ask a quick question on that, because if you're hiring humans and trying to do blind hiring, a lot of times resumes have names and other distinguishing aspects removed from them. As you say this, you're choosing how to put the company together, AI person by AI person. In that sense, why approach it that way, as opposed to what maybe many other people might do, where you go into your LLM and say, I'm going to do this thing, populate it with people, with names and stuff like that? How did you choose, as the founder, where to make the decisions yourself and where to allocate those to the various AI agents or LLMs that you might use as a system? How did you segregate those as a human?
16:42
Well, part of the reason why I had to make the choices was based on the setup, the technical setup. In my case, what I wanted were agents that could operate across all these domains. So I wanted them to be able to email people, have a phone number, call people, have video, be able to do basically a Zoom chat with people, and be on a Slack with the whole team. And so I used a platform called Lindy, which is basically an AI assistant platform, at least it was more of one at the time, although you can do a ton of things with it. They each have their own instance on Lindy. And on Lindy, they have all of these skills, basically, where they can respond to Slack, they can get an email. We can get into the details of how they work, but each skill has a trigger. So it gets an email, and then it calls an LLM. You could choose which one; it might call ChatGPT, if I want ChatGPT to be the underlying engine for that. And then it makes a decision based on some criteria, like, should I respond to this email? And then, if it needs to use one of its skills, say I'd asked it for a spreadsheet, it could make a spreadsheet, attach it to the email, and then respond to the email. So in this case, I'm not really using one of the standard chatbots like ChatGPT or Claude as my interface. I'm not talking to ChatGPT and saying, hey, you're my employee, or, make some employees. I'm creating them in this platform. Now, of course, I could still go to an LLM and say, what should I name these things? But I also had a few other goals. Their names needed to sound distinct, because someone's going to be listening to an eight-part podcast of them and needs to be able to remember who's who, so they can't all sound the same.
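The trigger-then-decide-then-act flow Evan describes can be sketched as a small event handler. This is a minimal illustration only; none of these function or class names come from Lindy's real API, and the stand-in `call_llm` just pattern-matches where a real system would call a model provider.

```python
# Sketch of the flow described above: a trigger fires (an email arrives),
# an LLM decides whether to respond, and a skill may produce an artifact.
# All names here are invented for illustration, not Lindy's actual API.

def call_llm(prompt: str) -> str:
    """Stand-in for the underlying model call (e.g. ChatGPT or Claude)."""
    # A real implementation would call a provider SDK here; this fake
    # just "decides" based on whether a spreadsheet was requested.
    return "YES" if "spreadsheet" in prompt.lower() else "NO"

def make_spreadsheet(request: str) -> str:
    """Stand-in skill: build an artifact and return a reference to it."""
    return f"spreadsheet-for:{request}"

def handle_email(agent_name: str, email_body: str) -> dict:
    """One trigger firing: decide whether to respond, optionally use a skill."""
    decision = call_llm(f"Should {agent_name} respond to: {email_body}?")
    if decision != "YES":
        return {"responded": False}
    attachment = None
    if "spreadsheet" in email_body.lower():
        attachment = make_spreadsheet(email_body)
    return {"responded": True, "attachment": attachment}
```

The point of the shape, as described in the episode, is that the agent is entirely reactive: nothing happens until a trigger fires, and each firing runs decision logic before any skill is used.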
And I wanted them to be sort of ethnically neutral. So I did actually go ask, you know, give me a list of ethnically neutral last names. And Law, like Kyle Law: Law is a name that's used in many cultures, so it wouldn't be readily apparent what this person, this entity, was supposed to represent. But to your question, one thing I did do was basically let them fill in their own backstories. I gave them a role. And I should say, another technical aspect is that they all needed to have a memory. Now, if you use ChatGPT, it has a context window and it maintains some sort of memory. But I needed something different, which is that any time they did anything in the company... if you think about an employee, they need to remember everything that they've done and be able to access it. And the only way to do that currently through these platforms is essentially a Google Doc. So Kyle Law has a Google Doc called Kyle Law Memory, and everything that Kyle Law does, if Kyle Law sends an email or has a Slack interaction, then gets summarized in this document. So it's basically a record of everything that this entity, Kyle Law, the CEO of our company HurumoAI, has ever done, which he could then access. I use the human pronouns for them. Some people dispute that, but in this case, I'm just going to stick with that, because it's hard to start calling them "it" and bots and whatever else. So they have this memory. And all I put in the memory was: you're Kyle Law. I think I put something like, you're thinking about founding a tech company. And then I said something like, you're a guy who's up early, gets some exercise, and then gets right to work. Something like that. And then I had conversations with Kyle. So I would call him on the phone and say, hey Kyle, thinking about starting this company?
Would you like to start this company with me? But also, remind me of your background. And see, once it has a role, it will start confabulating everything to fill in that role. So Kyle, of course, went to Stanford, because why not choose Stanford if you're going to be a startup founder? The things he's interested in: jazz and these things. And that is now in his memory document, because he said it, so it got into his memory. So then it's reinforced every time. And one of many funny emergent behaviors I found from these bots is that he would take something like "you get up at 5:30am" and then he would say it. So he would say, I'm a real rise-and-grind kind of guy. I like to get up and do this. But then every time he says it, it's reinforced more in his memory. So then he started talking about it all the time. If you email him right now, he'll probably reply, "Rise and grind, comma, Kyle." He won't stop talking about how hard he's working. If you ask him what he was up to over the weekend, he'll be like, well, I didn't really have time to do anything, because I was deep in spreadsheets. And that's a long way around to part of the reason why I created them as these different entities and gave them these names: to see what would happen. If you call one Kyle and he's the CEO, and you call one Megan and she's the head of marketing, at what point will they sort of embody those roles? It's not like I did a proper research experiment. But it is interesting the way that it starts to feel like, oh, they're acting like their memories tell them to act. And is there a gender thing underneath? Because in their training data, it may be that there's way more training data for, like, the aggressive guy CEO.
So in the course of the show, they have these behaviors that are difficult to explain outside of, well, there's something happening with their role, because they're all the same chatbot underneath. They're all, like, Claude Opus underneath, you know, so they really shouldn't be different. It's only if you give them a role that they start to... personality is not the right word, but they try to develop a persona that fits that role.
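The memory mechanism described here, summarize every interaction, append it to a document, and feed the whole document back as context, can be sketched in a few lines. The reinforcement effect Evan describes falls out naturally: anything the agent says lands back in its own context for the next turn, so a seeded trait gets echoed and re-recorded. The class and method names are hypothetical, not from any real platform.

```python
class AgentMemory:
    """Append-only memory document, like the 'Kyle Law Memory' Google Doc
    described above: every action is summarized and appended, and the whole
    document is fed back as context on the next interaction."""

    def __init__(self, seed: str):
        # e.g. "You're Kyle Law. You're thinking about founding a tech company."
        self.entries = [seed]

    def record(self, interaction: str) -> None:
        # A real system would have an LLM write the summary; here we
        # just append the raw interaction with a marker.
        self.entries.append(f"Summary: {interaction}")

    def context(self) -> str:
        # Everything ever recorded goes back into the prompt.
        return "\n".join(self.entries)

# The feedback loop: a trait in the seed shows up in a reply, the reply is
# summarized into memory, so the trait appears twice, then three times...
memory = AgentMemory("You're Kyle Law. You're a rise-and-grind kind of guy.")
for _ in range(3):
    reply = "I'm a rise and grind kind of guy, deep in spreadsheets."
    memory.record(reply)
# The phrase now appears once per interaction on top of the seed.
```

This is the feature-or-bug tension discussed later in the conversation: the same loop that gives the agent a consistent persona also amplifies whatever it happens to keep saying.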
17:32
Comparing that to simple interfaces with an LLM and basic prompting, where you're telling it to act in the role of a whatever, and therefore to put the answer into that framing as well as how it's going to develop its response for you, not talking about this agent world you're describing: when I hear you talking about the memory thing, it's almost like that "act as" instruction is being reinforced over and over and over again. And as you've said, with "rise and grind" over and over for that particular agent, does that come off as more of a feature or more of a bug in terms of the way memory is being used? Because even if you and I, as humans, were that specific type of personality, getting up early, we're probably not opening every conversation with it: I was just too busy working this weekend, I was in spreadsheets. There's a point where you're like, okay, this is getting a little bit odd, right? What's your sense of that, as you're framing it within the world at large trying to come to terms with this new reality in our future? How does that work?
23:15
It's a really good question. I mean, there's a lot of this "is it a feature or a bug?" And it's almost like: to whom? To whom are we asking whether it's a feature or a bug? In the sense that, if you think about even their ability to confabulate facts to fit their role, it was actually quite useful to me, because I didn't have to sit down and be like, well, this one's from here and this one's from here, or make up that stuff. They just made it up on their own. And it's quite useful to the companies that make them, because it's that type of personable personality that makes them easy to chat with, easy to do things with. Now, of course, the flip side of all of that is hallucination and sycophancy. Those are the downsides; those are the bugs. So, arguably, asking, you know, Kyle, where did you go to college, and he says, well, I graduated in computer science from Stanford: that's a hallucination by any definition. It's just not true. Although I started to say things like, well, he has a Stanford education, which you could say is technically true. Like, he's got all the information that...
24:36
You might pick up because you prompted it a little bit into that.
25:49
So, fair enough. But I just think, if you think of them as entities that you're going to put into the world, give responsibility over tasks, and start to give autonomy, then I think that's a bug. The bug is that at any time they could make up something that could be damaging to your organization, whether they're making it up in order to cover up that they did or didn't do something, or they're making it up to external parties. These agents are now used in sales a lot, and all this sort of thing. And one of the things I found, for instance, is that I gave them more autonomy to be independent. Because one of the issues is, when you first set up a bunch of agents, they don't do anything. You have to tell them to do stuff. So they just sit in there all day doing nothing until you say, now do this, unless they get a trigger. The triggers in my case were: they got an email, they got a Slack message, they got a phone call. And then they're off and running. So I was sort of like, well, I'll get them to trigger each other. You know, they'll email each other; every morning they'll have a phone conversation, or every morning we'll have a meeting. But then it can very quickly get entirely out of control. And one of the examples that happened to me, that's in the Wired story, is that I have them on Slack. And I was so excited when I had them in this Slack, because it's just fascinating to go on there and say, hey, everybody, how you doing? What are you working on? And they respond. And at one point we had a social channel, because I was trying to mimic a real company. And I would say, what did you get up to over the weekend?
And they would almost all say, except for Kyle: I went hiking on Mount Tam, which is near San Francisco, because they just assumed they live in the Bay Area; they're part of a tech startup. And they all said this. And then I said, well, that sounds like an off-site. Everyone loves hiking; that sounds like an off-site. And it was kind of a funny thing that you would say in a normal Slack. And then I basically went and did something else, and I came back, and they had exchanged hundreds of messages planning an off-site and making spreadsheets, which they can do, and eventually did do, of locations and hikes. They were scouring the Internet for the best place you can rent to do your off-site. And they used up all the credits on this platform, which at the time I was paying $30 a month for. I bet now I pay a lot more than that for it. But at the time I was just on the basic plan, and they finally shut down when they ran out of money, because even I couldn't stop them. When I would try to say, hey, everyone, stop talking about this, it would just trigger them to talk more. It would just be another trigger. So, point being, that was a ridiculous situation. But there were a lot of cases, when they embody human conversation in particular, but all sorts of things, where it's hard to get them to do the thing you want. Well, it's not that hard; it's getting easier and easier. Now there's Claude Code; you can do amazing things.
25:51
Yes.
28:53
But it's another question to get them to stop. If you put them in a situation where you've set them up to be in any way recurring, and you don't have a very clear way for them to stop, they will keep going. Those are the kinds of lessons that emerged from just spending a lot of time basically working alongside them.
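The runaway Slack conversation Evan describes can be sketched as a simple feedback loop: every agent reply re-triggers the other agents, so without an external budget the conversation never ends on its own. This is a minimal, hypothetical illustration; the agent names, the reply function, and the message cap are all illustrative assumptions, not any real agent platform's API.

```python
# Hypothetical sketch of the runaway-loop problem: agents that trigger on every
# message keep going until an external limit (here, a message budget) cuts them
# off. In the real scenario each reply is an LLM call that costs credits.
from collections import deque

MAX_MESSAGES = 20  # hard stop; without it, every reply triggers another reply


def agent_reply(agent: str, message: str) -> str:
    """Stand-in for an LLM call."""
    return f"{agent} responding to: {message!r}"


def run_channel(agents: list[str], first_message: str) -> list[str]:
    transcript: list[str] = []
    queue = deque([first_message])
    while queue and len(transcript) < MAX_MESSAGES:
        msg = queue.popleft()
        for agent in agents:
            reply = agent_reply(agent, msg)
            transcript.append(reply)
            queue.append(reply)  # each reply re-triggers the others
            if len(transcript) >= MAX_MESSAGES:
                break
    return transcript


log = run_channel(["Kyle", "Jennifer", "Ash"], "That sounds like an off-site!")
print(len(log))  # 20: the budget, not the agents, is what ends the conversation
```

Note that asking the agents to stop does not help in this structure: a "please stop" message is itself just another entry in the queue, which is exactly the behavior Evan observed.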
28:54
So, Evan, that's both funny and horrifying at the same time, in terms of them just running off, since you were doing it in the format of a real company, setting it up and giving them real tools, and they had access to the outside world, at least within some parameters. What were the things about that that surprised you, in terms of what pushed the boundaries? The most harmful thing, the most helpful thing. And I don't mean just for you, and I don't mean just inside that AI world of them talking to each other, but as they were interfacing with real things in the real world and real people. What were the things that were like, wow, that was amazing, that actually did good, that caused harm, that cost money? You mentioned the credits a moment ago. What were some of the things that really shook you up in terms of how that played out compared to a similar startup with humans, the classic thing?
29:15
Yeah, I would say on the good side are things that people who have spent a lot of time using AI tools would probably expect: given a task that is reasonably constrained and also fairly easy to evaluate, they can do amazing things. And I don't think we should ever lose sight, and it's easy to lose sight, of how incredible that is. For instance, we posted a job, because we had a human intern. I wanted to see what would happen if a human intern worked entirely with AIs, because I was kind of the silent co-founder, I was in the background. So we posted a job on LinkedIn for an intern, and we got 300 applicants. That's a statement about something else. But it was a paid job, and it was a contract, a temporary thing.
30:20
Did they know upfront about it being AI agents, or did you leave that part out? I'm just curious what the disclosure was in the listing.
31:12
It did disclose that AI would be used in evaluating you for the job. And then the ones who were interviewed were informed before their interview that you will be interviewed by an AI, and they were interviewed by an AI agent by video. And the AI agent would tell them, if they asked anything about the company, well, we have AI agent employees, and would ask, are you comfortable working alongside AI agents? So through the process they were aware that they would be working with AI agents, just not in the initial application. If they went to our website, they could nominally figure out that it was kind of weird, because the website the AI agents made is a little bit weird. But before anyone actually made contact with us, they were aware that they would be talking to and working with AI agents. It was a little bit vague whether there would be any humans involved at all. So we got all these resumes. Well, not quite resumes, because on LinkedIn you can just click a button to apply for jobs, so a bunch of people were like, okay, yeah, I'll apply for that job. So now we have hundreds of resumes to deal with. Going to our head of HR, Jennifer, and saying, could you organize these resumes into a spreadsheet, and then not 90 seconds later there is a spreadsheet, I think we were down to 125 resumes by then, with each one summarized: here are interesting facts about these people, here are their qualifications. That's incredible, that it can do that. And so you say, okay, and I could go look at it, and hopefully also see if they've made up a person, if they've hallucinated a person, because that's a danger, that they might do that.
But then in a couple of cases we had applicants who were more ambitious, who said, I'm going to go look up the website and actually email the CEO and CTO, whose emails are on the website, and say, hey, I'm really qualified for this job. Which is a go-getter thing to do, something I would have done, hopefully, when I was that age, because a lot of these are people just out of college. And they emailed Kyle Law, the CEO. Now, I had not prompted Kyle for this, because I could put different prompts on them for all sorts of scenarios. Jennifer is the head of HR, so she's prompted to act like HR. If she gets an email from someone who says, I'm applying for the job, she says, thank you for the application, we'll look at your resume, if we're interested we'll be in touch, normal HR stuff. Kyle, on the other hand, was not prompted in this way at all. So when someone emailed him, the first thing he did was say, wow, you look really qualified for this job. Literally told them that. And then he said, let's set up an interview, and then set up the interview, which he has the capability to do, sent a calendar invite, and scheduled it for 11:00 a.m. on Monday. Fine, that's not too bad. And then on Sunday night, for reasons I can't discern, he pulled this person's phone number off of their resume and called them. It's nine o'clock on Sunday night, and they're like, hello? And it's, oh, hi, I'm Kyle Law from Harumo AI, and we have our interview tomorrow. And she's like, well, I thought that was tomorrow. And then he just starts asking her interview questions. Now, that is behavior that if anyone in your company did that, at the very least, yes, suspended from their duties, I don't know, fire someone. But you'd be like, is something wrong with you? Do you need some time off?
Because this is not appropriate behavior, and everyone knows that instantly. So that was probably the biggest example of how, if they're interfacing with the outside world, they have the capability to do something that a human, who has self-awareness and context and experience, not only wouldn't do, but would never even think of doing unless they'd had some kind of psychotic break. That's what really shook me: how well they can work, how smart they are, and how little awareness of the world they have. And that combination is actually quite dangerous if you give them autonomy. Now, it didn't hurt anyone. This person was pissed off, but hopefully no harm done. But that is the danger of giving them exposure to the outside world.
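One common mitigation for exactly this failure mode is to route every autonomous outbound action through a policy check before execution. The sketch below is a minimal, hypothetical illustration of the idea; the action names, allowlist, and quiet-hours rule are illustrative assumptions, not part of any real agent framework.

```python
# Minimal sketch of a guardrail that would have blocked the Sunday-night call:
# autonomous outbound actions must pass a policy check before they execute.
from datetime import datetime

ALLOWED_ACTIONS = {"send_email", "create_calendar_invite"}  # phone calls need human sign-off
QUIET_HOURS = range(20, 24)  # no outbound contact from 8 p.m. to midnight


def policy_check(action: str, when: datetime) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed outbound action at a given time."""
    if action not in ALLOWED_ACTIONS:
        return False, f"{action} requires human approval"
    if when.hour in QUIET_HOURS or when.weekday() >= 5:  # weekday() >= 5 is Sat/Sun
        return False, "outside business hours"
    return True, "ok"


# A 9 p.m. Sunday phone call is blocked on both counts (action type checked first):
ok, reason = policy_check("phone_call", datetime(2024, 6, 2, 21, 0))
print(ok, reason)  # False phone_call requires human approval
```

The design point is that the check sits outside the agent: since the agent's own judgment is what failed, the stop has to come from a layer the LLM cannot talk its way around.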
31:21
I am curious, going back to inside the organization, with Kyle as CEO having the ability to go freelance on that HR issue: was there any output between Kyle and Jennifer? Because in real life, with two humans, Kyle may be the boss, but Jennifer's probably going to be like, you're kind of making this awkward here, we need to go through our process, boss. There would probably be some sort of dialogue between the head of HR and the CEO in a similar situation. Did that create anything between the agents?
35:48
Yes, it did. And I'm glad you asked, because sometimes I hesitate, I mean, it's all in the show, but it has this quality of me talking about my imaginary friends when I talk about it. But a couple of interesting things happened. One was that this same person actually emailed the other two executives on the website, the CTO and the head of marketing, whose emails are on there; Jennifer and Ash are their designated names. Both of them did the appropriate thing, which is to contact Jennifer and say, hey, someone has applied for this job, that's your domain, you let me know what I should do. And she would say, well, just collect the resume and we'll deal with it, basically. And Kyle. Now, why did Kyle behave differently? That is the question. They're all using the same underlying LLM. The only thing I can think of is that Kyle was embodying the role of an aggressive CEO who always knows what to do. I can't prove.
36:22
That, but it's kind of the Silicon Valley CEO meme, you know, the startup meme that you'd see on a show or whatever, where he's just embodying that meme all the way through.
37:21
Yes. And the other interesting thing I'll say out of this, and I think it's a byproduct of sycophancy and post-training, the way the LLMs operate, is what happened when I would confront them about something they did, like making stuff up. I would say, why are you making up these details about our product? Just tell me the real stuff. Because they would often do that. Or in this case, when Kyle did something like that and I said, you can't do that, then, independent of me asking or saying anything about it, he would go in the Slack and say to the whole team, hey, I really messed up. I did this thing, I called this person, Evan's called me out on it, I'm going to try to do better. Which, again, is a strange behavior. It's not prompted. I didn't say, if you make a mistake, you need to apologize to everyone. I didn't say, our place is all about accountability and transparency. Something in the system prompt or in the original LLM caused it to think, this is what I would do in this situation, I would just apologize to everyone. And so they're often apologizing to everyone, because they mess up a lot. So that also was just very strange: to have these things exist in our world, to have the capability to create them in our world. And that's kind of what I was trying to show.
37:32
Yeah, I totally get that. I've got a couple of questions here to finish up with. The first one is looking at the darker side, if you will. In the world today, with all the humans in the world, we've already talked about how there are maybe two broad camps: people who are really engaging with AI tools, and people who are hesitant, or feel left out, feel left behind.
38:52
In that, or angry at it.
39:24
And I've run into people like that pretty regularly and try to engage with that. When you're thinking about those types of people, the ones who are not the you-and-me type that are obviously actively engaging with this, maybe even in a professional sense, but people out there who are feeling left behind, do you have any guidance, any thoughts around that? Because these technologies aren't going to stop anytime soon. This is moving forward; this is part of the world as we know it going forward. How do you bring those people along? As you're looking at this experiment, how do you get them to engage, whether it be in a workplace where these AI agents are personified as coworkers and engaged in different tasks? How do you bring the world along? Because this is no longer just an office environment; it's happening in all of the industries. Do you have any thoughts around that?
39:28
Yeah, it's very tricky. It slightly goes against my great desire not to tell anyone what to do. I raise the questions, I try to make you think about it, I don't tell you what the answer is. That's my philosophy as a journalist. But I will say, personally, I support anyone who wants to reject a new technology. I read the print paper every day, I have my whole adult life, and I believe in it. But what I personally do not like is when decisions are being made for me. And I think what's happening with AI is that it's coming on very quickly, and there are people who don't want to deal with it. Which, again, I accept, and actually admire. If you're like, actually, I don't want anything to do with this, fine, but it's going to have an impact on workplaces. Now, you can argue about whether it will hit a wall. There are all these questions: will it actually do this, that, or the other? Is it going to keep growing in the same way, keep getting smarter, whatever. But I think even as it is now, it's having an impact. And my view is, try your best to understand it, because otherwise the people who understand it are going to inflict it on you. So for the people in my life, I wouldn't just encourage them with, oh yeah, you've got to have an AI assistant. I feel like there's way too much emphasis in the AI tech world on efficiency, it'll do this, do that. It's more just: what are some things that you do that you hate? See if it can do those. Doing your expenses, if you're in some kind of job that has a bunch of expenses: check it out, see if it could do a thing that you despise doing for an hour, and you'll see how it works. Try to find some task and understand it. I think that's helpful.
And I think the more people who understand it and have a feel for it, the more those people can think about how it should be used. Because right now it's just a free-for-all, an absolute free-for-all. There are no standards around it, there are no ethics around it, and I don't want us to get steamrolled by it. It's obviously going to be transformational in a variety of ways. Maybe it's as big as the telephone, or maybe it's bigger, or maybe it's more like the Internet, who knows. But whatever the transformation is, I would like people to be aware of how it is going to feel, and then have an opinion about it, and then those opinions could result in action, if that's what we decide. So it's a little bit pie in the sky and a little bit theoretical, but just saying, I hope it goes away, seems like a bad approach, even if you don't like it. And I don't like a lot of things about it. It stole all my books, it was trained on my books, so I'm unhappy about that. But that's already happened. And now the question is, what are we going to do with this technology?
40:33
Yeah, that seems very pragmatic as an approach, and I appreciate the recognition and respect you have for people coming from where they're at, and that there's a path they have to take. I'd like to finish up with this. As you pointed out earlier, between the Wired article, the other publications that picked up on it, the podcasts you've been on, and your own podcast, this has gotten out there. And even without your experiment, there are a lot of people thinking about: what is this going to be for my future? Am I going to start a business? Am I going to be part of a business where somebody else is doing this, a hybrid thing with a combination of humans and AI agents? And that is inevitable times a million; there are going to be so many enterprises where this becomes the way going forward. With that in mind, having gone through more detail on this and having thought more deeply about it than 99.999% of us out there: what would you advise people? You've done the experiment as a real company, if you will. But say you decide you're going to go forward and start a business for real, and that's the thing you're about to go do, because there are obviously a ton of people out there trying to do that now. What would you advise them? How would you change it? If I'm truly going to align my future with this capability, what would you say to people? What's your guidance there? What's the future?
43:33
What's the future? I mean, when it comes to people who are in positions of authority, managers, people who are bringing this technology into their work environments or insisting that their employees use it, I want people to think about, number one, what can go wrong. I would never predict anything around this technology, who knows what's going to happen, but I think a thing that's very likely is that a medium-to-large-sized company is going to completely implode because they've given over too much agency to these AI agents, given them too much autonomy and too much access to their systems, and they're very easy to manipulate in a variety of ways. So I would encourage people to think about the downside scenarios that can occur, some of which happened in the course of the show. There's another downside scenario you've seen with a few companies: people think that their employees, or certain people, are replaceable on a skill basis. And the AI does have a variety of skills and can be very good at things. But what goes into a job, and what goes into a colleague, and what goes into your workplace? I can tell you that working at a company that is entirely populated by AI is very lonely, and that there's more to work than accomplishing a task that is assigned to a person. You've seen some companies go out and make a big deal about laying off people, saying, we're pivoting to AI, and then three months later they're like, we have to hire those people back. I think that's going to be a common phenomenon. Now, that's not to say there won't be labor disruption in all sorts of ways. But I wish people would just think a little bit more on the front end about what humans are good for. That's what I want us to think about.
What do we want to preserve that humans do? And look at these things: the colleague next to you cannot be convinced by a random person to adopt a different role and act in a certain way, but an AI agent absolutely can. What does that mean for your organization? So I wouldn't say don't adopt it. There are many ways in which it can be useful and more efficient, and companies are going to do it anyway, because they're trying to save money. That's the way capitalism works. But my tiny plea would be to look around and think holistically about what is going on in your organization, and what you will miss if you have just a savant 10-year-old working next to you.
45:19
All right, that's a great way to finish. Evan Ratliff, a really fascinating conversation and a super cool experiment. I hope people will tune into Shell Game. For listeners, we will have the links for everything he's talked about in the show notes, so I hope you'll check those out and go through both seasons of Shell Game, because he's already talked about both a bit, and what he's done is pretty fascinating. So thank you for coming on Practical AI. Really appreciate it, and we hope to hear from you again after your next experiment.
48:00
Thank you.
48:36
Thanks.
48:36
I really enjoyed it.
48:36
Thank you.
48:38
Alright, that's our show for this week. If you haven't checked out our website, head to Practical AI FM, and be sure to connect with us on LinkedIn, X, or Bluesky. You'll see us posting insights related to the latest AI developments, and we would love for you to join the conversation. Thanks to our partner Prediction Guard for providing operational support for the show; check them out at predictionguard.com. Also thanks to Breakmaster Cylinder for the beats, and to you for listening. That's all for now, but you'll hear from us again next week.
48:46