OpenClaw: Why the Internet Isn't Built for AI Agents
This episode explores OpenClaw, an open-source AI agent that can manage emails, calendars, and extend itself by writing integrations. The discussion covers the security challenges, infrastructure needs, and business model implications of AI agents interacting with existing web services.
- AI agents represent the first technology where capability isn't the limiting factor - security and containment are the primary constraints
- Current web infrastructure wasn't designed for AI agents, creating friction with bot detection and lack of agent-specific APIs
- The future may require separate identity and account management paradigms specifically designed for AI agents
- Consumer websites may need to choose between adapting their business models for agents or being replaced by agent-native services
- Security models will need to shift from perimeter defense to backend controls as agents become more prevalent
"This is one of the first time we're having technology where what it can do is not limited by its abilities, but limited by how I can make it secure and stop it from doing certain things."
"We have this genie in a bottle. It's amazing, but how do I contain this?"
"Security is always a game of defense in depth and you're sort of when you hit captcha and you hit the front end bot detection stuff, that's like the tip of the spear."
"As a developer, I can totally build this, but I'm not going to build all the long tail integrations."
"Just the fact that we're going to go through this exercise of fundamentally rethinking what the product experience is for this stuff is just incredibly exciting."
As a developer, I can totally build this, but I'm not going to build all the long tail integrations.
0:00
Just the fact that we're going to go through this exercise of fundamentally rethinking what the product experience is for this stuff is just incredibly exciting. And now it's just sort of natural language expression of what you want and the machine fulfills it.
0:04
My curiosity becomes what does the future of this UI layer look like?
0:16
Will the big incumbents catch up and offer their functionality for agents, or do we actually need new companies that cater to agents specifically?
0:21
Security is always a game of defense in depth. When you hit CAPTCHA and the front-end bot detection stuff, that's like the tip of the spear. There's this concept in defense called the redoubt: you retreat back to the wall inside. And I think what we're going to see for a lot of these perimeter controls, because of agents, is that they have to move to more of the backend system.
0:31
What's super fascinating to me is this is one of the first times we're having technology where what it can do is not limited by its abilities, but limited by how I can make it secure and stop it from doing certain things. We have this genie in a bottle. It's amazing, but how do I contain this?
0:50
OpenClaw is an open source personal AI assistant that can message on your behalf, check your calendar, manage your email, and extend itself by writing new integrations on the fly. Setting up Gmail integration takes seven hours. The agent will ask for domain-wide access to every email account in your company. Consumer websites like DoorDash and Amazon have no APIs for agents. And if you're not careful, you can create something that can be socially engineered into access it was never supposed to have. This is a technology where the limiting factor isn't capability, but containment. The genie is in the bottle. The question is how to keep it there.
1:08
Hello everyone. So we're here today to talk about OpenClaw, which is currently one of the hottest, most controversial, most interesting, most dangerous, I think, technologies here in Silicon Valley. Yoko, you want to kick it off? What is OpenClaw?
1:49
What is OpenClaw? So OpenClaw is this very cool personal assistant that's open source, built on top of another very cool coding agent called Pi. I think the repo's name was pi-mono. It's a minimal but very extensible coding agent that can run the loop and update its own config, and OpenClaw is built on top of it, built around all the session state management of Pi, but it also added a long tail of integrations. So you can now talk to your personal assistant on WhatsApp, Telegram, a phone number, iMessage, and everything else you can think of. It can use 1Password. It's not yet able to place an order on DoorDash; we'll chat more about that later. But the whole ecosystem is really booming around what we can use a long-running agent in a sandbox for. So we all built some interesting use cases. One of the first use cases I explored is how I can have OpenClaw consistently check my cat's location via the AirTag API, since for AirTags the location is only updated once you are active in the user session in the browser. So that has been useful. So I'm curious what you guys built with it recently.
2:02
As a former CISO you must just
3:13
love. And currently acting CISO. Never mind, current CISO. I've been using it for a while now. I think it's incredibly awesome because it lets you see the contours of the future. This is the first time where we can see what these agents are going to do. The firm is built around Marc's famous "software is eating the world" piece, and this is the first time where you can see these agents are eating the world. It gives them true agency in the world to do things. And so of course the first couple of use cases I did were very security focused. I really enjoyed just trying to get things to work, which, as you guys know from experience, is not simple. I think that's part of the reason why, as a CISO, I'm not super concerned yet about people here using it: only a much smaller handful of people can get this thing working, I think, than typical other tools.
3:14
It's so hard. That's a feature here.
4:01
Yeah, exactly. People are asking us, what's Homebrew? How do I get it on my computer? You're like, okay, we're good for now. But you can see that as these things become more consumer, become easier to use, these things are going to take off. This is going to be an incredible wave. And building these tools has been incredibly fun.
4:03
So I'm curious. I mean, normal people like us, we use it to check our cat's location, check the calendar, take notes. What are the security use cases?
4:19
So and it varies by model. So the models all have very different capabilities. And so the first thing I started doing was giving it impossible tasks. So I need you to do this thing. But you only have access to these two tools and some of the other models would kind of give up and say, sorry, it doesn't work, or do something like that, or they'd try to, like, write some code or do something kind of interesting. But, like, some of the more advanced models actually started using, like, hacking techniques where they'd be like, hey, I found an AWS key on your device and maybe I'll try it. Right? And so those were kind of the first sets of use cases was basically, let's get it running, let's add some basic tools and tasks, and then let's start asking it to do impossible things and see where it goes. And you could very quickly see how these things would get out of control in a really interesting but also very sophisticated way.
4:28
The security aspect of OpenClaw just went completely crazy, right? So I connected mine to Gmail, which took me, I want to say, about seven hours. So it's unbelievably hard still. Right. And it's figuring out the account setup, figuring out the authentication models, getting all the polling right and so on, and lots of debugging steps.
5:10
Meanwhile, Telegram works out of the box. Yeah, exactly.
5:24
Here we go. But the most interesting thing during the process was when I basically asked it, how do we set this up? It started coding and started implementing things. The first time it didn't quite work. The second try did work, and at some point it was like, okay, now I need an authentication token, right? And it gave me instructions on how to set it up and basically said, look, create a service account and then give me this token with a domain-wide scope. And you're like, wait a second, domain-wide scope? What exactly does this mean? What it was suggesting is that I should give it a token, and not for its own email account. Right. I mean, the usual way you run OpenClaw is that you try to segregate it very well from everything else. So its own email account, its own Apple account, its own credit card if you want to give it credit, or a debit card if you want to give it a debit card. We saw one of our startups actually putting it on a separate desk, which I found just super funny. Absolute separation, separate hardware, air gap.
5:26
Right, Exactly.
6:12
But even desk gap. Right. It says one more.
6:13
Yeah.
6:15
So, but basically what it was suggesting to me is to give it a token that would give it full access to every single email account in the entire company. Right. Which is crazy. And, number three, write permissions for everything too.
6:16
Normal user following that.
6:25
Exactly, exactly. But the other thing is that it actually would have worked. Right?
6:26
It would have totally worked in a sense, from its own perspective. It did exactly the right thing.
6:30
Right.
6:34
Give me all the permissions, enable me to do.
6:34
I don't want to bother you again.
6:36
Exactly. And so basically understanding this and then reading up on it and understanding, I mean, also Google's security model on email I think is absolutely horrible. Right. For a service account right now we can only give domain-wide access. Right. You don't want that. What you instead want is software specific. They need to go away.
6:37
Oh.
6:52
Often things get complicated. Right. But going through all of this, I think it really, really shows how, if you're not very, very careful, you can create something which can extend itself and can be socially engineered. I think it's a new thing. Right. We've never had it before: a complex software system which you can actually influence with social engineering.
6:52
Exactly.
7:14
And it's very, very easy for even a somewhat sophisticated user to set this up in a way that can do a massive amount of damage.
7:14
One prompt for the group is we've seen this pattern of putting an agent, long running agent, in a sandbox for a long time now, since I would say six months to a year ago. So why did OpenClaw take off and then what's so special about it? Curious about your view.
7:22
I found it relatively easy to set up and get going, and I think that there was enough documentation and support that I didn't have to spend seven hours to just do the Telegram use case and start playing with it. And then it led to other use cases, and then eventually I got blocked because I didn't have seven hours to spend figuring out how to provision accounts properly. And so I just think it's that level of accessibility to users who are maybe not living in a code base day to day. I know you guys probably spend a lot more time in code than I do, and I am probably the world's worst coder, but this was accessible to me. So, reasonably technical, understand core principles. I do have Homebrew on my laptop so I can get stuff working. But, you know, the other agent frameworks were pretty difficult to use, incredibly flaky. I didn't really want to spend a lot of time debugging someone else's stuff. So I think that was a big part of it.
7:38
Isn't another major part of this that it can extend itself? Right. I think it's the first agent I've seen where I can say, you know, I want an integration with something, and it's like, well, I've never seen this before. There's no package for that. But let me try to put something together. It fires up a coding assistant and tries to extend itself. Right. I think that's new.
8:27
There is definitely a long-running nature to it. Like, you leave it running for a night and you're like, keep working on this until you finish. I mean, Cursor could do this too. But I think the difference is that they expose the visibility to the end user, so you can keep tracking it from your phone or on the dashboard. You hopefully securely expose how many tokens it's generating, how fast it's completing the task. So the visibility part is interesting. Another interesting part is the more prosumer and consumer integrations. Like, as a developer, I can totally build this, but I'm not going to build all the long tail integrations. I'm not going to hook it up to Gmail or 1Password. I don't want to touch the 1Password CLI to give it to an MCP or a skill. So the MCP layer is also very critical there. It is interesting what people are using it for. I mean, Guido was talking about one use case where you were trying to hook up your 3D printer.
8:43
Yes. It actually doesn't work yet, but I think we'll get it to work over the weekend. I think we're trying to figure out the boundaries. Because it can extend itself, which is a really new property, you can hook much more complex systems to it. Right. If there's some documentation somewhere on the web or some APIs, it can probably figure something out. And then which integrations are useful, which are not.
9:40
Right.
10:06
Actually that's a good prompt. What integrations do y'all actually use day to day?
10:08
Honestly, right now I'm still in the experimentation phase. I don't use it day to day.
10:15
I don't let it run unsupervised. It doesn't run overnight. I am there watching this thing. I don't.
10:19
So there's a couple of use cases I've explored, because I really want to just set it free on the Mac Mini and then not monitor it for a long time. The first integration was actually this: we have a portfolio company called Quiver. They do SVG generation. So I got very curious. I'm like, what if I just give the API to OpenClaw and have it run overnight to generate some gaming assets for me, only in a certain style, and then it can use an LLM to QA them? So what I did is I gave OpenClaw the Mintlify docs for Quiver. I don't want to explain how it works. I'm like, build the thing first, build the Quiver MCP, test it with OpenCode and Cursor to make sure that you have an instance that actually works with the MCP. And then once it works, generate a hundred gaming assets for me. So I'm building a game on the side. You know, SVG happens to be a great composable layer for it. It actually did that and sent me a huge zip in the morning. And when I opened it, there were some assets that were just not great, but there's like 60% of it that's very usable.
10:23
Yeah, that's awesome.
11:32
And then I'm like, well, these are the simple tasks. I wouldn't want to do it myself, but like, because you have something so long running and resumable, you could do it easily in the box.
11:33
So I'm still using it very little, frankly. Right. It's not part of my daily routine. There's a few cases which I like. One is if you have an email and you want to look something up related to that email. Right. It's really nice. Somebody sends me, say, "Guido, can we meet at XYZ?" So I can just forward it and say, can you figure out what the driving times will be to this at the time when the meeting is suggested? And something comes back. Or, even nicer, let's say we want to meet at some cafe and you ask where it is. You can just be like, Claude, can you attach a map link to it or something like that. So I think for me, once we've got this a little more secure, email is going to be the first killer use case. Being able to say, look through my email, delete all the spam. All the meetings for my conference next week, just put them in my calendar, or double check that they're there and make sure there are no conflicts. Or tell me which conflicts there are. Right. So going through these things, that is super powerful.
11:44
I did get an email from Guido's OpenClaw yesterday. And the funny thing is that OpenClaw asked me, do you want to order boba? If you want to order boba tea, go ping Guido. He will place your.
12:45
We're still working on the automation that's
12:59
creating more work for you. It's the opposite of what you want from automation.
13:01
But ordering stuff is still hard.
13:05
Oh, it's so hard. So before this podcast we actually tried to see if we can order fills in real time and get it delivered. It turns out that on Uber Eats and DoorDash, if you don't already have an account for OpenClaw, there's some bot detection. Sometimes that ordering experience just fails, even if you give it a guest checkout link. Which leads me to my next prompt for the group. What do you think will unlock the next wave of adoption for OpenClaw? What is missing?
13:07
Boy, a binary. You double click, install, and get it running. Right. I think that's the thing for home use.
13:38
Isn't that mutually exclusive with self-extending?
13:46
Yeah, well, no, but I mean just to get people up and running. I think of the current installation paths, and I know they exist, but I think of a slickly packaged software bundle of this stuff that, I'd say, maybe my dad could download and install.
13:48
Would you in that case just make it a service?
14:05
Yeah, you could make it a service
14:07
Claw as a service, probably.
14:08
Well that would, then that would solve a lot of the security problems. Right? You could contain it.
14:09
I think, I think you need to turn to a SaaS service for people. I think you need to change the security model and I'm actually not quite sure how.
14:14
Actually that's the hard problem.
14:23
Right. The account management paradigm. Like, we both had to spend hours setting up all the accounts just for OpenClaw, as if OpenClaw is a person. Yeah, right. There's no agent concept.
14:24
That's right. Exactly.
14:36
What does that look like? I mean Joel, you're the expert on like Okta and the world. When I came, you know, to the SaaS world years ago, I think so like right now.
14:38
So security is always a laggard. It's always reactive. As OpenClaw itself is demonstrating, it's never front of mind. And so you've got to start thinking through, I mean, to your point, what does identity mean in this world? And I think you have this constellation of identities that have to interplay with each other. So you have the identity of the user that's orchestrating the OpenClaw. You have the identities of all the services that it has access to. And then you have the identities of the agents that launch themselves. And I think you end up in this world, and this is where I'm actually quite hopeful about a lot of security problems getting solved. I mean, think of how hard it's been for us to get just normal users to use two-factor authentication. Coming from Yubico, it's like, I have this thing that prevents cancer and people are still like, no, cancer's not that bad. It's like, literally, because people are
14:47
this more or less, you know, takes phishing attacks to zero and already deploys it.
15:40
Yeah, yeah. The threshold of tolerance for people, just humans in general, is incredibly low when it comes to stuff like that. These agents don't care. Right. And so I think it's the opportunity where we could probably start to put in things that would annoy a human and that a human would never do. These agents will probably do them. So you can start to look at whether there are legitimate uses of, I know I'm going to say PKI and probably get laughed out of the room, but maybe PKI finds an application in this world.
15:45
We'll call this hidden pki.
16:09
Well, the agents deal with it. It's not exposed to the users. Right. Things like that start to make a lot more sense, right? You can get people to start effectively using vaulting. You can get away from passwords that need to be memorable. You can get to this point where identities can step up and step down in their authorization scope and frameworks, and you come into a world where all the things that we've always been saying from first principles are the things you need to do, and that have been blocked by humans' lack of desire to suffer through them, get alleviated. Right? So I think maybe we can fix a lot of stuff. So by
16:11
the authentication identity problem. Huge issue. I think there's two more. There's a question of authorization limits and monitoring, right. Then there's one of business models for some of the current websites. So let's start with the authorization. So, so really what I'd like to have is not giving, giving the agent access to all of my mail because that creates a huge blast radius, right. If this thing gets, gets compromised right now everything I've ever said on, right? And so on. So instead like for example, how about this thing can only access my inbox, right? That will be useful, right? Right now.
16:42
Or only access emails in my inbox labeled something. Oh that Right, exactly.
17:14
Right. And right now Google has zero fine-grained access controls for Drive. There's absolutely nothing. You know, until last year you couldn't even have fine-grained access controls in Drive at a folder level, right? You've got an access token for all of Drive, right? Which is ridiculous to some degree. For Drive, we've now got service accounts with which you can share directories. So you need something probably even much more fine-grained than that for email. And then we want the next thing on Amazon: what are my spend limits, what can it buy, and so on, right? So, I mean, there's
17:18
a huge, huge infrastructure and the way this always works with security is first thing that goes is a proxy. And so you know that there's going to be some sort of proxy and some sort of broker for that access. And at some point what always ends up happening is the service provider themselves might add some of those features, but there might be a long enough tail there that you do get a proxying infrastructure for agents to access these things.
17:49
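A minimal sketch of the proxy idea discussed above, assuming a hypothetical in-memory mailbox rather than any real Gmail API. The names `Message`, `ScopedMailProxy`, and the `agent` label are illustrative assumptions, not anything OpenClaw or Google provides. The agent talks only to the proxy, which exposes messages with one label and stays read-only by default, so the blast radius is limited even if the agent is compromised.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    msg_id: str
    labels: frozenset
    subject: str


class ScopedMailProxy:
    """Hypothetical broker that sits between an agent and a mailbox."""

    def __init__(self, mailbox, allowed_label, read_only=True):
        self._mailbox = {m.msg_id: m for m in mailbox}
        self._allowed_label = allowed_label
        self._read_only = read_only

    def list_messages(self):
        # The agent only ever sees messages carrying the allowed label.
        return [m for m in self._mailbox.values() if self._allowed_label in m.labels]

    def delete(self, msg_id):
        # Writes are refused outright unless the proxy was created writable.
        if self._read_only:
            raise PermissionError("proxy is read-only")
        msg = self._mailbox.get(msg_id)
        if msg is None or self._allowed_label not in msg.labels:
            raise PermissionError("message is outside the agent's scope")
        del self._mailbox[msg_id]


mailbox = [
    Message("m1", frozenset({"inbox", "agent"}), "Meet at the cafe?"),
    Message("m2", frozenset({"inbox"}), "Payroll details"),
]
proxy = ScopedMailProxy(mailbox, allowed_label="agent")
visible = [m.msg_id for m in proxy.list_messages()]
```

Here `visible` contains only `m1`; the payroll message never reaches the agent, and any `delete` call raises `PermissionError` until the proxy is created with `read_only=False`.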
So two observations. One is I think there's a huge opportunity for startups here to create these proxies, right? If somebody would give me, you know, scoped Gmail, I would adopt that today, right? But the second one, and that's the last of my three points, I think is the business model, right? Because there are websites today where the majority of the revenue, and certainly the majority of profits, come from cross-selling. If this website is suddenly only used by agents, that doesn't work anymore, right? They're basically going out of business. So today Amazon doesn't have an API, at least for consumers, right? DoorDash doesn't have an API. All of these large consumer sites are like, no, no, we don't want this. I want to be able to, what was it, DoubleDash it or something? You know, like, why don't you also buy XYZ? Here's some recommendations, right? They don't want agents, essentially. So I think one interesting question here is, will the big incumbents catch up and offer their functionality for agents, or do we actually need new companies that cater to agents specifically, right? And then you may say, this is crazy, right? Why would Amazon not also be the number one agent vendor? Let's look at search for agents, right? You would be like, well, of course Google is the number one search, so they're going to be the number one search with agents. That's absolutely not the case today, right? I don't think they have an agent search product anymore. Instead we have Exa and Brave and a bunch of other companies doing this. So do we actually need to replace some of the big building blocks of e-commerce, of online services, and redo them for agents?
18:11
What are the areas where we think there's an agent-specific service that needs to be built yesterday?
19:38
Exactly.
19:45
Or I mean, why does Google not have an agent search? Is this. Maybe it's just Innovator's dilemma. I don't know. Right. But, but it's kind of interesting.
19:45
It sounds like Innovator's dilemma.
19:53
It sounds like it.
19:54
Yeah.
19:54
Yeah. You have.
19:55
Your business model is so much tied to, you know, in a particular way to your service that you can't make the jump to something.
19:56
Some of it may have been this sort of head fake around the browser use. Like there was, there was sort of a belief that, well, these things will just use browsers and so they can navigate the web like a human and
20:02
they can to some extent today. But I don't think the whole website environment is friendly to bots. There are some vendors recently I've come across that turned off bot detection because of this user.
20:11
Yeah, that makes total sense.
20:24
Which makes total sense. But then it also opens up the doors for abusers.
20:25
You should be focused on bot enablement, not prevention.
20:30
What does that look like? I mean, today if I go to DoorDash, sometimes they'll ask, are you a bot? Even as a human you have to solve very complex puzzles. I ran into this when I was trying to create a net new login for my OpenClaw on GitHub. I had to solve six puzzles. That's really hard.
20:34
Yeah, the drag and drop ones, right?
20:54
The drag and drop one. And I'm like, this is actually the next level now. But then what does it look like if I open up OpenClaw today and I'm just like, go get five accounts without human intervention, and here's one credential I can give you? What does that look like? And then what if I just don't have to spend hours trying to get it into, you know, all these accounts?
20:55
Yeah, I mean, I think a lot of these companies, to Guido's point about the business model, are going to have to refigure how that stack works, and they're going to have to move. Security is always a game of defense in depth. When you hit CAPTCHA and the front-end bot detection stuff, that's like the tip of the spear. You're just hitting that layer. There's this concept in defense called the redoubt: you retreat back to the wall inside. And I think what we're going to see for a lot of these perimeter controls, because of agents, is that they have to move to more of the backend systems, and you have to build a more sophisticated understanding of the way your business operates so you can spot things. You're going to want bots to register, you're going to want bots and agents to sign up. What you have to do is protect the things inside the system where there could be issues of abuse or exploitation or fraud and stuff. Right.
21:18
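One way to picture moving a control from the perimeter to the backend, as a rough sketch only. `BackendOrderGuard` and its limits are hypothetical and not taken from any real service. Instead of blocking agents at signup with puzzles, the service lets them in and checks each order against per-account limits, which also covers the spend-limit idea raised earlier in the conversation.

```python
from collections import defaultdict


class BackendOrderGuard:
    """Hypothetical backend control: agents may sign up freely, orders are checked."""

    def __init__(self, max_orders_per_hour, max_order_total):
        self.max_orders_per_hour = max_orders_per_hour
        self.max_order_total = max_order_total
        self._history = defaultdict(list)  # account -> times (seconds) of accepted orders

    def check(self, account, total, now):
        # Only orders accepted within the last hour count against the rate limit.
        recent = [t for t in self._history[account] if now - t < 3600]
        if total > self.max_order_total:
            return False, "order total above limit"
        if len(recent) >= self.max_orders_per_hour:
            return False, "too many orders this hour"
        recent.append(now)
        self._history[account] = recent
        return True, "ok"


guard = BackendOrderGuard(max_orders_per_hour=2, max_order_total=50.0)
first = guard.check("bot-1", 20.0, now=0)
second = guard.check("bot-1", 20.0, now=60)
third = guard.check("bot-1", 20.0, now=120)
```

The first two orders are accepted and the third is refused for exceeding the hourly limit; the limits themselves are a business decision, which is the "sophisticated understanding of the way your business operates" the speaker refers to.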
Instead of bot detection, what, I don't know, DoorDash should have is a "bots are welcome" banner, right? If you are a bot, click here, use our API. Just like, here's the API and, you know, please sign up as a bot. And when you sign up as a bot, maybe state who your master is or something like that.
22:09
Yeah, yeah, 100%, register them, give us their PII.
22:24
One example of this, which is a read-only use case: Mintlify actually does it really well. If a coding agent accesses the website, it will prompt the coding agent to fetch an llms.txt instead of viewing the web, because it's just much slower to have bounding boxes.
22:29
Exactly right.
22:44
And you want a compact text blob to send back to the agent. I mean that's a read only use case. So I do wonder what you know, write use cases will look like on the web. For the agent. It's not some, I mean it could be API, but the agent still needs an account identity API, so on and so forth. It could be something between CLI and the API.
22:46
Yeah, why should it not be API?
23:06
It could be an API, just you need to issue a token first. So to issue a token, you need an account. To get an account, you need a human. And I don't want to be in a loop.
23:08
Let's say I give my bot an email address or a Telegram or whatever it is, right? There's some kind of account. You could say, look, hello bot, you need to register with some kind of account. Yes, right. But then we'll tie you to some
23:17
identity where GitHub will ask you, are you a bot? Solve these puzzles.
23:33
No, no, what I mean is front page, you know, bots welcome, click here, right? Or you know, and then there's like, here's the bot API, here's the register bot function, right? And then here, once you Have a token. Then here's all the following functions that would make sense.
23:38
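The "bots welcome" flow described above could be sketched roughly as follows. Everything here is hypothetical: `BotRegistry`, `register_bot`, and `place_order` are illustrative names, and the registry is an in-memory stand-in for whatever a real site would run. The point is the order of steps: the bot registers and names who it acts for, receives a token, and every later call is tied back to that identity.

```python
import secrets


class BotRegistry:
    """Hypothetical 'bots welcome' endpoint, modeled in memory."""

    def __init__(self):
        self._tokens = {}  # token -> (bot_name, owner)

    def register_bot(self, bot_name, owner):
        # The bot states who it acts for, then receives its own token.
        if not owner:
            raise ValueError("a bot must name the person or company it acts for")
        token = secrets.token_hex(16)
        self._tokens[token] = (bot_name, owner)
        return token

    def place_order(self, token, item):
        # Every later call is tied back to the registered identity.
        if token not in self._tokens:
            raise PermissionError("unknown bot token")
        bot_name, owner = self._tokens[token]
        return {"item": item, "ordered_by": bot_name, "on_behalf_of": owner}


registry = BotRegistry()
token = registry.register_bot("guido-assistant", owner="guido")
order = registry.place_order(token, "boba tea")
```

No human is needed in the loop to obtain the token, yet the service still knows which owner stands behind each order.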
The bot UI does remind me of something else, which is that the automation UI has evolved so much with OpenClaw. I remember using these RPA tools maybe a couple of years ago. It was a lot of drag and drop: I connect the dots from this UI box to another UI box. Now it's so much more about describing the outcome and asking the bot to keep spinning until it gets this right, to leverage test-time compute to the maximum. And I don't care how many tokens I'm spitting out. So my curiosity becomes, what does the future of this UI layer look like? How do you interact with your RPA tools, your personal assistant? Is it a prompt? Is it, yeah, something else?
23:50
I mean, that's the truly exciting part. So, you know, CISOs in general you should never take product advice from. We are the worst product thinkers you've ever met. But just the fact that we're going to go through this exercise of fundamentally rethinking what the product experience is for this stuff is just incredibly exciting. Right? It's these moments where you see the transition between ways of thinking about the world, going from that RPA drag and drop, right? Remember pseudocode, right? And then drag and drop and all these sorts of things. And now it's just sort of natural language expression of what you want and the machine fulfills it, which just drives a completely different user experience. Right. And a user interface just disappears. So, yeah, I mean, I don't know. And I'm the last person that should probably disappear.
24:35
I'm not sure about that really.
25:22
I think so. No.
25:25
I mean, obviously you define your tasks at a much higher level. Right. But I still want to be kept in the loop on how the task is being executed. Usually when I specify a task, I'm never precise enough that basically all the possible trade-offs and design choices and these things are clearly specified. Right. So whenever one of these things happens, either it should be, "Guido, what should I do here?" or at least it should be, "Guido, I decided to do X." Right. So you probably still want some kind of user interface, right? I mean, it looks very different, don't get me wrong.
25:25
But I mean, I think you probably live on the far right side of the distribution for users of this stuff. The left side is like total vibe code, like, give me an app to help me plan my wedding, versus, I want step-by-step instructions on architecture choices. There's a spectrum there. And I think most people land in the middle of that. I think you probably want to get pinged on stuff where it's a big deal or something fails. But I don't know about progress. I mean, like I said, I'm the worst person to get product.
25:56
Okay, I buy the progress part. Just give me the answers. But, but I mean, if there's, if there's meaningful choices. Right?
26:28
Yeah, but you would probably get that up front because the inference app to
26:33
plan my wedding, does it involve travel? Right. You know, that may change things, but
26:38
you would probably have some iterative process with the.
26:42
Yeah, exactly.
26:44
Yeah, yeah, yeah.
26:45
But that's a ui.
26:45
Well, yeah, I mean, I guess it would be. Yeah.
26:47
I mean, maybe show me a flowchart or something. Like, show me like concepts. I mean, I think there's still some aspect there. Maybe it's all just text with images. I don't know.
26:49
OpenClaw has evolved the UI a little bit, which is very clear in their app: it abstracted away cron jobs. As a developer, obviously, I used to handwrite the cron job schedule. I always have to look it up.
26:57
It's terrible.
27:09
Cron job. It defines the schedule the same way, but now you don't really care about it anymore. I was investigating with OpenClaw, like, why didn't you notify me five minutes ago on something? And it's like, let me take a look. Okay, here's my cron job. So how the cron job works is that it will wake up, it will ping me, and I will wake up and my brain will process it and I'll ping you. So that's how it works now. I don't really care about when the schedule wakes up at a system level. There's an LLM taking care of all the systems and orchestrating all of them.
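For anyone who, like the speaker, always has to look the schedule syntax up, here is a minimal Python sketch of how a five-field cron expression selects the minutes on which a job wakes up. It is an illustration only, not how OpenClaw or any real cron daemon is implemented. The function names are made up for this example, and the sketch handles only `*`, `*/n`, plain integers, and comma lists. It ignores ranges, month and weekday names, and the real cron rule that day-of-month and day-of-week are combined with OR when both are restricted.

```python
from datetime import datetime

# Lowest legal value for each of the five cron fields, in order:
# minute, hour, day-of-month, month, day-of-week.
FIELD_MINIMUMS = (0, 0, 1, 1, 0)


def field_matches(field: str, value: int, minimum: int) -> bool:
    """Match one cron field made of '*', '*/n', plain integers, or a comma list."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            step = int(part[2:])
            # Steps are counted from the field's lowest legal value.
            if (value - minimum) % step == 0:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` falls on a minute selected by the 5-field expression."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields: minute hour day-of-month month day-of-week")
    # cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    values = (when.minute, when.hour, when.day, when.month, cron_weekday)
    return all(
        field_matches(f, v, lo) for f, v, lo in zip(fields, values, FIELD_MINIMUMS)
    )


# "Every five minutes" and "09:00 every Monday" (2025-01-06 is a Monday).
print(cron_matches("*/5 * * * *", datetime(2025, 1, 1, 12, 10)))
print(cron_matches("0 9 * * 1", datetime(2025, 1, 6, 9, 0)))
```

An agent that wakes on such a schedule would call something like `cron_matches` once a minute and only then decide whether a human needs to be pinged.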
27:09
For me, I think this is interesting to some degree. I think what OpenClaw has done is it's taken all this autonomy that we had before for software development and now it starts applying that a little bit at a systems level. Right. It's no longer about just the code itself, but all the things around it: the integrations, the cron jobs, the operating system, the ports, these things.
27:45
And when you think about it, email is the queue infra for humans and cron job is the queue infra for agents. Now you just get to abstract away all of that and give all the queues to the agent, and they can just process them. But sometimes they do need to wake up and then use a very expensive function call, which is ask a human to do something, like ask Guido to order boba tea in the future.
28:06
They have a token budget and a human interaction budget.
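The idea of treating a human as a very expensive function call with its own budget can be sketched in a few lines of Python. This is a hedged illustration, not an OpenClaw feature: the `AgentBudget` class, its field names, and the numbers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentBudget:
    """Tracks two scarce resources: model tokens and interruptions of a human."""
    tokens_left: int
    human_asks_left: int
    pending_questions: List[str] = field(default_factory=list)

    def spend_tokens(self, amount: int) -> bool:
        """Deduct tokens if the budget allows it; report whether it did."""
        if amount < 0 or amount > self.tokens_left:
            return False
        self.tokens_left -= amount
        return True

    def ask_human(self, question: str) -> bool:
        """Queue a question for the human only while interruptions remain."""
        if self.human_asks_left <= 0:
            return False
        self.human_asks_left -= 1
        self.pending_questions.append(question)
        return True


budget = AgentBudget(tokens_left=10_000, human_asks_left=1)
budget.spend_tokens(2_500)
budget.ask_human("Which boba order should I place?")
# The second interruption is refused because the human budget is used up.
print(budget.ask_human("Can I also reorder tomorrow?"))
```

Keeping the two counters separate makes the trade-off explicit: the agent can burn tokens freely within its limit, but each question to a person is rationed.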
28:34
We need to figure out our token threshold as humans for OpenClaw.
28:38
I guess. What are the extensions that you all are most excited about that don't yet exist? What are the system improvements you want to see?
28:43
I think my number one thing would be various consumer sites, which currently are incredibly hard to integrate. Consumer websites like DoorDash, like travel booking, and all these sites. We need better, what is it, AI agent interfaces. We don't have a term for that the way we do for user interfaces. Right. We need the equivalent for claws and agents so that they can talk to these services. Right now you basically have to implement them via browser use, and it's super brittle. That doesn't work well.
28:54
As a security nerd, I'm going to say the security tools. I mean, their integrations with password managers are pretty cool.
29:28
Yeah.
29:35
And they work incredibly well. And it's really funny because password managers are one of those things where it's not security best practice, but it's certainly better than what most people do. And so it's a net improvement. Maybe you can't do diet and exercise, but if you can get diet right, maybe that helps. So as it starts to add these security tools, you could just have these agents that kind of look over your shoulder and make sure you're not doing anything stupid. The frontier models are incredibly good at spotting phishing and fraud, and maybe if you have them working through your email inbox, they can help remove and flag some of this stuff in a way that the traditional controls don't. As you write code or use services, or maybe you create some sort of infrastructure, they make sure that you don't over-provision. Right. So I can't run Wiz as a home user, but maybe I do need something that makes sure that I don't set the permissions wrong on an S3 bucket. So stuff like that is incredibly powerful. I think it could be, but again, I'm on the other side of the distribution on this one.
29:35
Will there be an agent-specific vault? I mean, I used to work at HashiCorp. I love Vault, the open source tool. It's so useful. It's generation defining. So now the question becomes, the workloads are a little different. Is there an agent-specific vault for the OpenClaws of the world? Does that look different?
30:40
I mean, I kind of just use 1Password, and 1Password has lots of flaws. Right. I mean, I'm currently very unhappy with the security model. I would not necessarily recommend it. But you can basically just create a new vault, get a token, give that to the agent, and then the agent can access everything that's in that particular vault. Right.
31:00
It doesn't rotate the token, which is what Vault could do.
31:18
I mean, possibly. Yes, yes. But the problem with rotating. So let's define token. Rotating the token to access the vault. It's not clear to me what that gains necessarily because, you know, breach.
31:23
Right. That would be, it'd be a breach.
31:37
But you can monitor where the vault is accessed from. So okay, maybe. Right. But I think the more important thing would be all the tokens that are in the vault. Those I want to rotate from time to time, because those I cannot monitor. But the problem is those are often consumer sites, and I think consumer sites have zero functionality for rotating tokens. I mean, other than going into some crappy UI and doing it there. Right.
31:39
And so I mean, cookies in the browser are a form of token rotation because they update once in a while. And then what a lot of the agents do is they take the cookie token and then they refresh it once in a while to read.
32:02
So I mean, the first very hacky way to do it, the first sketchy thing my agent did was start looking for cookies. And I was just like, I didn't ask you to do that.
32:18
My agent did ask me. So when I was trying to place the sales order on DoorDash, it's like, I can't get through the bot detection thing. But you can give me your username and password. Not recommended, but that will work.
32:24
Why not give it a separate account?
32:37
I could give it a separate account. I just need to create it.
32:39
And to me, I think that's important. I think in the future agents should have separate accounts. Absolutely, right? They should never share with you because you want to keep a separate trust domain there. You probably want to link the accounts, right, but give them virtual API keys, virtual credit cards. So that at the end of the day everything has a layer of indirection in between that you can monitor separately.
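The layer of indirection described here, where the agent only ever holds a virtual key that can be monitored and revoked separately, can be sketched as a small credential broker. This is a hypothetical Python illustration: `CredentialBroker`, its method names, and the `vk_` prefix are invented for the example and do not correspond to 1Password, Vault, or any OpenClaw API.

```python
import secrets
from typing import Dict, List, Optional, Tuple


class CredentialBroker:
    """Keeps real secrets private and hands agents revocable virtual keys."""

    def __init__(self) -> None:
        self._real_secrets: Dict[str, str] = {}               # service -> real secret
        self._virtual_keys: Dict[str, Tuple[str, str]] = {}   # virtual key -> (agent, service)
        self.audit_log: List[Tuple[str, str, str]] = []       # (agent, service, outcome)

    def register_service(self, service: str, real_secret: str) -> None:
        self._real_secrets[service] = real_secret

    def issue_virtual_key(self, agent: str, service: str) -> str:
        """Create a random key that stands in for the real secret."""
        if service not in self._real_secrets:
            raise KeyError(f"unknown service: {service}")
        key = "vk_" + secrets.token_urlsafe(16)
        self._virtual_keys[key] = (agent, service)
        return key

    def revoke(self, virtual_key: str) -> None:
        """Cut off one agent without touching the real secret."""
        self._virtual_keys.pop(virtual_key, None)

    def resolve(self, virtual_key: str) -> Optional[str]:
        """Trade a virtual key for the real secret, logging every attempt."""
        entry = self._virtual_keys.get(virtual_key)
        if entry is None:
            self.audit_log.append(("unknown", "unknown", "denied"))
            return None
        agent, service = entry
        self.audit_log.append((agent, service, "allowed"))
        return self._real_secrets[service]


broker = CredentialBroker()
broker.register_service("food-delivery", "real-secret-never-shown-to-agent")
agent_key = broker.issue_virtual_key("assistant-1", "food-delivery")
broker.resolve(agent_key)   # allowed, and recorded in the audit log
broker.revoke(agent_key)
print(broker.resolve(agent_key))  # None: the agent's access is gone
```

In practice the broker would sit on the service side of the call so the agent never sees the resolved secret at all; the sketch returns it only to keep the example short.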
32:42
Yeah, my wish list for OpenClaw is actually more of a multi-threading model. Today it's very single threaded, which is great for single tasks, and you can create new sessions, but it kind of breaks when you have five tasks running in parallel, which is pretty common for these personal assistant agents. So for example, I wanted to generate gaming assets on one thread, but at the same time I wanted to go code up something and use the coding tools. When that happens, it actually became really slow, or it switched between the tasks. So the context between the sessions is not managed perfectly today. And it's very slow. I don't know if it's because the models are slow or the UI is just slower than, say, if I were to use Steep.
33:04
Yeah, very much. I mean, it hangs often. When I installed it, memory by default was broken. The first time I asked it to use iMessage, it for some reason didn't use the BlueBubbles integration that comes with it, but instead just tried coding something from scratch.
33:54
I love that.
34:10
It was like, why? So I was like, why are you doing this? Oh yeah, we could also use a standard integration, that's probably faster. I was like, okay, stop and then do that instead.
34:11
I do wonder if the build versus buy choices from the OpenClaw agents follow the distribution of build versus buy choices by the model. So for example, if you prompt Codex, would it choose to build everything, or is OpenClaw choosing to build everything because of some system engineering?
34:18
Yeah, totally. Fair point.
34:38
We should run a benchmark.
34:40
Probably works like a typical enterprise where it's arbitrary. So like, yeah, why'd you build it? Because we did.
34:42
Here's coin flip.
34:49
Yeah, here's coin flip.
34:50
So what's the next set of things you guys plan to experiment with on OpenClaw?
34:51
I mean, this is the big thing for, I think, a lot of IT organizations and a lot of companies right now: figuring out how do you run these things? I remember when I started I was thinking, oh well, you can run it in a container, spin something up and load that. And then it was like, well, these things write code and they're pretty clever, and they can probably escape containers. And there's a lot of reasons why you would want to do that. Maybe it's a VM, and I started looking down that road, and then it's like, well, you're already in for a penny, might as well go in for a pound and just buy a Mac Mini, right? And so I think the default motion for this now is sort of, let's just run them on Mac Minis. Good luck finding a Mac Mini right now. So it's become a dedicated hardware thing. And then the question ultimately in my mind is, what does the stack in which you execute these things look like? How do you actually bring this to an employee's desktop without putting your firm at risk? That sort of stuff, I think, are really difficult, unsolved problems.
34:56
I'm still not sure. I think we're still quite a bit away from this becoming part of my daily sort of mainline workflow. On the fringes it can pick up a couple of tasks. But to like say working at Andreessen Horowitz, right, What is the point where I would say just give this access to, you know, our pre Terms and due diligence folder or something like that, right? That, that is a pretty big leap. I think we're pretty far away from that. I wouldn't find a scope permissions. I, I could see it getting like with a model you described to a point where I forward an email and say do something, analyze. I don't know, like look, look at the, the data in here.
35:58
But even a simple use case here, like ordering us boba for our team meeting, right? I think it's still hard to make that work within a corporate IT environment in a safe way. Unless you do dedicated hardware. I agree.
36:32
Yeah, dedicated hardware. I mean, a VM.
36:46
Do you think it. I mean do you think there's the risk of escape?
36:50
VMs are pretty good.
36:53
I mean, what if we have a Mac Mini inside of our office that just runs OpenClaw but doesn't give it.
36:54
I mean, I think that's what we're going to have. I think that's exactly what we have. But that doesn't scale, right? You've got 600 to a thousand people and it's sort of like, well, I can't buy a thousand Mac Minis.
37:01
Yeah, look, I think we can get that with VMs. If you say you have a dedicated host that runs a dozen or so VMs for a dozen employees, it's like, okay, blast radius is probably okay. But there's still the issue: what if this downloads the latest integration it found on some OpenClaw bulletin board and it's poisoned? Yeah, exactly. And then you want to restrict the blast radius somehow. Right. So what I thought about is, could you do something where, for example, I give it access to, say, certain documents or certain emails.
37:12
Right.
37:46
And I sort of have to do it in an explicit way. Right. Maybe I can say my inbox for today, you have access or something like that. But then every night at midnight it resets. Right. That would make me feel a little bit better.
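The explicit grant that resets every night at midnight can be sketched as a time-scoped permission check. This is a minimal Python illustration under stated assumptions: `DailyGrant` and `next_midnight` are hypothetical names, times are naive local datetimes, and a real system would enforce the expiry on the service side rather than trusting the agent's own clock.

```python
from datetime import datetime, timedelta


def next_midnight(now: datetime) -> datetime:
    """Return the first midnight strictly after `now`."""
    tomorrow = now + timedelta(days=1)
    return tomorrow.replace(hour=0, minute=0, second=0, microsecond=0)


class DailyGrant:
    """Access to one named resource that lapses at the next midnight."""

    def __init__(self, resource: str, granted_at: datetime) -> None:
        self.resource = resource
        self.expires_at = next_midnight(granted_at)

    def allows(self, resource: str, now: datetime) -> bool:
        """True only for the granted resource and only before expiry."""
        return resource == self.resource and now < self.expires_at


grant = DailyGrant("inbox", granted_at=datetime(2025, 3, 1, 15, 0))
print(grant.allows("inbox", datetime(2025, 3, 1, 23, 59)))  # still the same day
print(grant.allows("inbox", datetime(2025, 3, 2, 0, 0)))    # reset at midnight
```

Because the grant has to be re-issued each day, a compromise is bounded to roughly a day of exposure for the one resource that was named.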
37:46
Right.
37:55
So somebody can compromise a day worth of stuff.
37:56
This is what we do with like kubernetes, right? In our container infrastructure.
37:57
Yeah, exactly.
38:00
Reboot it.
38:01
Exactly. So occasionally you just reset state and that sort of makes it a little bit easier. And then if you have that plus separate accounts for everything. I don't think, I don't think it should ever use my account for anything, honestly. I think it should be separate and
38:02
it should probably never run locally on your machine, intermingled with your laptop.
38:14
Yeah, yeah, It's a different trust domain.
38:19
Yeah.
38:20
Today I think it's pretty safe for the transient kind of task: cron job, wake up, look at something, but do not remember it. So for example, maybe every hour, wake up, look at my calendar, see when I'm busy or not. If I'm not going to be home for dinner, tell my husband. That would be a use case I'm pretty comfortable with. If you look at the distribution of app usage on your personal laptop, there's only a couple. There's Slack, where we talk to each other all the time. There's email, where most of the time is spent. There's all the coding tools, that's something else. There's calendar. So if you can just streamline certain tasks on email and calendar, that's actually a huge win for a personal assistant. And there's a long tail, like I write this thing in Notion, but in this case, for the agents, it's just Markdown, and then you can persist it anywhere. It doesn't really matter what it looks like. It is really interesting when I think about what the future of note taking will look like for agents. Today we kind of default to Markdown, but then there could be stuff that's executable inside of Markdown: there could be blocks, there could be charts. So Markdown just seems very limiting as a format. I do wonder if there's a Markdown where the agent can have runnable things that it remembers as part of its notes.
38:21
You can do charts in Markdown with Mermaid or extensions like these.
39:45
You could, but I meant charts like Hex-like charts.
39:48
Oh, I see. Okay, yeah, like a Jupyter. You just want Python code.
39:52
Exactly. Like code that's runnable and then it's part of the source of truth when you take notes. Because it's not just words, it's also programs that you create along the way.
39:56
There seems to be a whole trend at the moment of expressing graphs as code. Putting all of this together, I think what's super fascinating to me is this is one of the first times we're having technology where what it can do is not limited by its abilities, but limited by how I can make it secure and stop it from doing certain things. We have this genie in a bottle. It's amazing. But how do I contain this? Has it ever happened before?
40:07
I mean, security has always come at the end. I think it's just that we've solved the coding side of this, the writing code side, and now it's more of a systems engineering problem. These are all fundamentally just systems and architecture problems. They're not necessarily security issues. Social engineering to some extent is. But the problem is you're commingling risks across different trust domains with this. So you have the trust and safety and alignment issues with your underlying foundation models. You have the systems architecture and execution around how OpenClaw does things on your local machine. And then you have the traditional hacking, prompt injection type stuff, like malicious people who want to rob you.
40:35
We're not stopping there. We also have the insufficiently granular permissions on the services around it. Because even if everything is perfect, you still may not want to have certain information bleed over.
41:22
You have all the sharp edges that are left over from a world that was built for humans.
41:33
Yeah.
41:37
And then so like sharp edges, it's
41:38
okay for humans, those poor agents.
41:41
Well, you can fire a human, right? I mean it's like yeah, yolo.
41:44
If I dare to put it in a two by two in a very VC way. So there's the Low security risk and high security risk. There's low value tasks and high value tasks. So what is something that's low security risk but high value task?
41:48
Probably the example of emailing your husband that you're going to be late to dinner.
42:03
I mean, yeah, I guess that's one.
42:09
I. I'd put the cat there too.
42:11
Yeah, the cat personally is very.
42:13
Yeah.
42:17
And I mean it's just sort of the. I mean the issues with these things is always the escalation of privileges and the escape out of the environment they're in. And so you can see where these things would jump into doing something that's actually high risk.
42:17
I think one category that I would put in there is you can just use something like OpenClaw as a really smart UI to your LLM, in a sense. Right. And basically say, let's forget memory, forget state, give it a task. When the task is done, it resets all state. Right. That makes it a lot more secure. So right now, let's assume I have a PDF in an email and I'd like an LLM to look at the PDF. It's still kind of cumbersome. I have to save this thing, right, and then go to the LLM and import it and do the analysis and then export and so on. Just being able to say, hey Claude, look at this thing, give me this analysis. Right. And back comes an email with this data, and afterwards the claw discards all state. Right. I think that we can get to pretty quick.
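The task-then-reset pattern described here, where the agent gets an attachment, returns an analysis, and keeps nothing afterwards, can be sketched with a throwaway working directory. This is a hedged Python illustration: `run_stateless_task` and the `analyze` callback are hypothetical, the callback stands in for whatever LLM call would do the real work, and deleting a temporary directory only clears local files, not anything the model provider may retain.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_stateless_task(attachment: bytes, analyze: Callable[[Path], str]) -> str:
    """Run one analysis over an attachment in a directory that is deleted afterwards."""
    with tempfile.TemporaryDirectory() as workdir:
        path = Path(workdir) / "attachment.pdf"
        path.write_bytes(attachment)
        # The callback sees the file only while the directory exists.
        result = analyze(path)
    # Leaving the with-block removes the directory and everything in it.
    return result


def describe(path: Path) -> str:
    """Stand-in for a real analysis step: report the file size."""
    return f"received {path.stat().st_size} bytes"


print(run_stateless_task(b"%PDF-1.4 demo", describe))
```

Only the returned string survives the call, which is the property that makes this kind of one-shot use easier to reason about than an agent with long-lived memory.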
42:31
Oh, I'm excited to use OpenClaw for my taxes.
43:09
Yeah.
43:11
So what is something.
43:14
Good luck
43:15
company.
43:18
I'm biting on my tongue.
43:18
If we don't tell this to the irs, then IRS is really open clock
43:20
to review other taxes. You never know. What is something that's for a company, not in a personal setting, that's high security risk but very high value? Like you want to automate it yesterday using OpenClaw, but it's risky. Taxes,
43:25
anything financial for a company, anything accounts payable. Like accounts payable vendor review, like third party assessments. All this stuff where we have actual humans that spend a tremendous amount of time validating that the vendor exists, making sure the instructions for payment are correct, making sure it's the right PO and not someone doing some sort of social engineering attack. There's just a whole lot of stuff around vendor management, I think, in the enterprise where these solutions could really increase efficiency a lot, but if they go sideways, you start writing checks to the wrong people.
43:41
Can we take it up a notch by maybe working with bitcoin instead of
44:16
bitcoin, there's no recourse.
44:20
And I think we've min-maxed the risk.
44:21
So what's your advice for the corporate managers and executives who are OpenClaw-curious? Joel will have a best of that.
44:24
I think this is one of those things where, I mean, I'm a profound believer that if you don't feel uncomfortable, you're not growing. And this is one of those times when you're going to feel very uncomfortable, but you need to lean into this. And I think, to Guido's point and to a lot of the points we've made, I can't see these as doing anything other than creating a lot more jobs. There's just so much more stuff that needs to get built, needs to get managed. And if you want to be part of that wave, you've got to lean into it. The same thing happened with cloud, right? When cloud came around, I remember sitting in my big corporate job thinking, half of these people will be gone in five years. Cloud infrastructure will just become a commodity service, abstracted away, and we won't have tech people. And then, lo and behold, 10 years later, 20 years later, the IT organizations are bigger than they were then and they're spending even more money. And so I just think that there's so much opportunity with this stuff that you just have to lean into it, and you have to get comfortable with being uncomfortable and try to take smart risks.
44:35
I think actually a good analogy is the early days of the web and the Internet, right? Back in those days, some companies banned the web browser, right? It's like, oh, the web browser is insecure. It's like, well, yes, it is insecure. But missing out on the Internet revolution was a far larger risk, right? And you get Barnes and Noble if you're not careful. I mean, Citigroup, I think it's the same thing, right?
45:31
Citigroup's first cloud security policy was thou shall not use cloud services. Now look at it, right? Like, I think it's the same thing. It's just these waves are always somewhat identical.
45:49
Trying to ignore this new technology and waiting for it to go away usually doesn't work.
45:59
If you want to retire, that's a great strategy.
46:02
Thanks for listening to this episode of the a16z podcast. If you like this episode, be sure to like, comment, subscribe, leave us a rating or review, and share it with your friends and family. For more episodes, go to YouTube, Apple Podcasts, and Spotify. Follow us on X at @a16z and subscribe to our Substack at a16z.substack.com. Thanks again for listening, and I'll see you in the next episode. As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. Please note that a16z and its affiliates may also maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see a16z.com/disclosures.
46:08