"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Pioneering PAI: How Daniel Miessler's Personal AI Infrastructure Activates Human Agency & Creativity

148 min
Jan 18, 2026
Summary

Daniel Miessler, cybersecurity veteran and creator of the Personal AI Infrastructure (PAI) framework, discusses how AI will fundamentally reshape the labor economy and the importance of building personalized AI systems that understand individual goals and context. He argues that most knowledge workers will be replaced by AI agents within the next few years, making it crucial for individuals to develop their own AI infrastructure and activate their creative potential rather than waiting for corporate employment that may disappear.

Insights
  • The key to AI effectiveness lies in scaffolding and context rather than raw model capabilities - personalized AI systems that understand your goals vastly outperform generic chatbots
  • Most companies' ideal employee count is zero, and AI will enable this by allowing founders to do all work themselves through AI agents rather than hiring humans
  • The future of cybersecurity will be AI attacking AI - organizations must build defensive AI stacks to counter increasingly sophisticated automated attacks
  • Human 'activation' - helping people realize they have valuable ideas worth sharing - becomes critical as traditional employment disappears and people need to become creators rather than workers
  • Memory systems for AI should use file system approaches with multiple levels of summarization rather than traditional RAG, allowing for better context management and self-improvement loops
Trends
  • Shift from AI as chatbot to AI as persistent digital assistant with deep personal context
  • Rise of personal AI infrastructure frameworks that put individual goals at the center
  • Automated cybersecurity warfare between attacking and defending AI systems
  • Transition from call-and-response AI interaction to proactive, background AI assistance
  • Movement toward AI systems that can upgrade and improve themselves based on user feedback
  • Convergence of all major tech companies toward personal AI assistant models
  • Increasing importance of scaffolding and context engineering over raw model performance
  • Growing need for UBI as AI displaces knowledge workers faster than expected
  • Emergence of bespoke, highly personalized service economies alongside AI automation
  • Integration of multiple AI models and providers within single personalized systems
Quotes
"One of my favorite metrics is how much you dread Monday. And one of my favorite metrics for what a good life looks like is do you look forward to Monday?"
Daniel Miessler
"I think for most companies the ideal number of employees is zero. I think that's always been the ideal number."
Daniel Miessler
"The difference is when your AI understands what you're trying to do. The more your system knows about you, the more it can customize its responses."
Daniel Miessler
"The game, as of probably last year, definitely this year and going forward is it's the attacker's AI stack against the defender's AI stack that is the competition."
Daniel Miessler
"I think 2027 is the year for AGI in my definition, which is the ability to replace an average human knowledge worker."
Daniel Miessler
Full Transcript
3 Speakers
Speaker A

Hello and welcome back to the Cognitive Revolution. Today my guest is Daniel Miessler, a cybersecurity veteran, founder of the Unsupervised Learning newsletter, and creator of PI, the personal AI infrastructure framework. With the recent explosion of interest in Anthropic's Claude Code and this week's release of Claude Cowork, the timing of this conversation was perfect. The world is collectively waking up to the importance of scaffolding, not just for task automation and coding use cases, but for all sorts of knowledge work. And we're finally seeing the potential that well designed harnesses have to transform a frontier model from a chatbot into a genuine digital assistant. We begin the conversation with Daniel's philosophy and personal mission and his vision for the future of work. His goal is to increase what he calls human activation, which means helping people recognize that they can be more than cogs in a machine and that their ideas are worth developing and sharing. And he believes this is critically important because he expects that corporations will, with the arrival of sufficiently adaptable AI knowledge workers, automate routine work and reduce their human headcount, ultimately converging to a point where many companies consist of just a single human owner, supported by an army of AI agents. Not content to sit back and wait for the new UBI-style social contract that he does expect we will ultimately need, Daniel's work today focuses on realizing the vision of an integrated AI system built around a single human and squarely focused on their goals, both for himself and for others. Because his background is in cybersecurity, we talk a bit about how AI is changing the threat landscape, the tools and skills that his own digital assistant, which he calls Kai, can use to test company systems with an unprecedented combination of speed and coverage, why everyone should expect to be the target of highly personalized spear phishing attacks going forward, and why he believes that AI systems that monitor every log and configuration and state change are really the only viable defense. From there, and for many I expect this will be the most interesting and valuable part of the conversation, we get into the architecture of his PI framework and some of the most interesting lessons he's learned through his tireless iteration. He describes his Telos framework, which helps individuals or organizations articulate their purpose, mission, goals, problems, strategies and more, and how this provides PI with rich context at the start of every session. His file system approach to memory, which uses multiple levels of summarization and abstraction to help the AI navigate history. How the system tracks sentiment and assesses itself proactively to gauge how well it's helping him make progress toward his goals. How he integrates multiple model providers and orchestrates sub-agents for tasks ranging from security tests to deep research. How hooks and skills allow his system to review and evaluate its own work and even upgrade itself based on new feature releases. And finally, his principle of giving the AI permission to fail as a way to reduce hallucination, task faking, and other undesirable behaviors. For me, Daniel's work represents an interesting mix of challenge and opportunity. Of course, I've used countless AI products, successfully automated many tasks for various companies and for the podcast, and generally maintained a strong sense for the AI capabilities frontier. But I've never been a particularly organized or systematic person, and to date, I've not felt that AI could really change that in a meaningful way. But now, seeing what Daniel and other pioneers have accomplished with the latest models and scaffolding frameworks, it suddenly does feel possible to use AI to overcome some of my core weaknesses and transform the way I work at a fundamental level. Will I be able to find the right mix of structure and spontaneity that allows me to more efficiently and scalably get things done while continuing to maximize my exploration and learning?
And will this be the beginning of a different kind of relationship with AI, where I go from using it to allowing it to begin to shape me? The only way to find out is to take my own timeless advice and get hands on with these frameworks as much as possible. So, albeit a bit late, that will be my New Year's resolution for 2026, and I'll definitely report back on how it's going, so stay tuned for that. But for now, I hope you enjoy this exploration of personal AI infrastructure and the future of human activation with Daniel Miessler.

0:00

Speaker B

Daniel Miessler, founder of Unsupervised Learning and author of Personal AI Infrastructure, welcome to the Cognitive Revolution.

4:44

Speaker C

Hey, thank you for having me.

4:53

Speaker B

I'm excited for this conversation. I think it's very timely in the sense that obviously the world is waking up to the power of Claude Code, and now we've got Claude Cowork mode for desktop as well. And so everybody's kind of like, oh my God, you know, this is changing my work in this way, that way. I'm creating whole simulations of things that I previously just thought about. And I've got, you know, memory palaces that are now, like, not just in my mind, but are actually in, you know, durable mode on computers.

4:56

Speaker B

And you've been a pioneer of that over the last couple of years. I've certainly been an AI obsessive for that same timeframe, but have not gone nearly as deep into the personal AI infrastructure world as you and other pioneers have. So I'm really looking forward to just picking your brain as I start to play catch-up a little bit on this dimension, and definitely think that there's going to be a lot to learn on that and some other fronts as well. Maybe for starters, though: you've got a background in cybersecurity. You've worked at several big companies along the way. Now you're independent and doing a handful of different things. Want to just kind of tell us, like, a little bit of what your portfolio looks like today and how we should kind of think of the different activities that you're known for?

5:27

Speaker C

Yeah, yeah. So my background is definitely cybersecurity. That's what I did for, and I'm still doing, but that's what I did for my whole career starting in '99, and that took me all the way through. I started getting into AI at Apple. I joined a machine learning team there, was doing a bunch of stuff with machine learning and security there. So I got exposed to AI, I want to say, probably around 2016 or so, and then took the job at Apple in 2018 and got more exposure there, and been thinking about it for a long time. But it wasn't until I went independent, about six months before ChatGPT actually, so great timing. And then ChatGPT came out in late '22 and obviously I hard pivoted, not getting away from security, but just seeing security as embedded inside of AI. AI is like a container for magnifying everything else that you're doing. My main focus now though is basically trying to help humans and companies, mostly humans, to just be able to adapt to what's coming. That's the main thing. So I do a whole bunch of open source stuff. You mentioned the PI project, that's probably the biggest one. I've got another open source project called Substrate, and all of it is just trying to move humanity forward. I feel like the place that we've been at all this time has not been a good place. And it's only after it starts getting disrupted that people are like, oh, AI is going to disrupt our jobs or whatever. But right before this happened, everyone hated those jobs, you know what I mean? It's like everyone knew that this was a bad way to live. One of my favorite metrics is how much you dread Monday. And one of my favorite metrics for what a good life looks like is do you look forward to Monday? And I think going by that metric, we haven't really been happy with corporate jobs for a very long time. What I'm trying to do is figure out what does it look like to have a better version of the human future, and obviously using AI to sort of power that.

6:13

Speaker B

I love the starting point that just reminds us that most people didn't and frankly still don't love their jobs. I think that is one of the weirdest bits of clinging to the present or some sort of cope or whatever. It's a very strange thing to me, and I think it obviously correlates strongly with the fact that a lot of people who are in AI professionally are very privileged in many ways. And one of the great privileges that they maybe don't even realize that they have is that they have employment that they find intrinsically valuable and motivating, and to some degree would probably do some of the same things even if they weren't being paid to do so or didn't need to work for money. But I think that is just not the case for the large majority of W2 workers in the economy today. And I think we would do really well to remind ourselves of that a bit more often. A couple of things that I wanted to click on that you said. One is just the "what's coming." So I want to have you unpack that as you see it. Obviously people have radically different understandings of what's coming. Everything from still outright denialism, which I think is increasingly discredited and can be ignored, but there's still the sort of more credible version of AI as normal technology. And then we've got people thinking the singularity is very near. I'm somewhere in the middle, but I think I'm definitely more toward the latter. The other thing that I thought was really interesting there was AI as a container for security. I don't know exactly what you mean by that, but it does strike me that it is in contrast to a lot of what I see going on in the AI safety and control space, where the idea is like, we need to put AI in a box somehow. And so let's develop all these security measures around it, whether that's formal verification of containers to keep them sandboxed, or all sorts of other AI agents checking each other's work or what have you.
But, yeah, let's start with what is it that you see is coming. And then we can go into the sort of way in which AI and security relate to each other.

8:26

Speaker C

Yeah, I think what I see coming is largely the same as what a lot of people, not everyone, but a lot of people are saying, just this: it affects the balance of capital and labor, right? So it's like, what happens when AI can do most knowledge work jobs? Robotics is a separate thing. Who knows how long that follow-on will be? But I don't think it'll take too long after AI. But essentially labor gets massively diminished, and so then ownership matters a lot more. All right, and then the question is, okay, cool, we've done all this. Productivity sounds amazing. You can now make a thousand times more stuff for one thousandth of the cost. Who's going to buy it? Because traditionally the entire system has been built on this concept of you spend your wages to buy things and then some people make things and then the cycle goes round and round. What happens when that fundamentally breaks? So that's the main change that I'm worried about. That's going to break the status quo. But at the same time, I'm happy that's going to happen. I'm not happy about how it's going to happen. I think it's going to be disruptive and a lot of people are going to get hurt by it. And that's the whole point of what I'm trying to do, is like ease that transition, if possible. To answer your question about the security and AI thing, I think it's a great question. There's no doubt that AI is creating a bunch of security problems. But here's the way I think about this. After doing all this consulting all this time, a big part of security problems, I would argue one of the major problems, is actually that people don't know what's going on. There are too many things happening inside of an organization. New products are being developed. Leadership has no idea. Things are being shipped to production. Servers are coming up and down, ports are opening up, applications are opening up, new APIs are being presented, software is decaying and becoming vulnerable.
And all of that is happening at a speed, at any size of company, like any decent size of company, that you just can't humanly keep up with, right? Even if you're logging all this stuff, there's nobody to look at the stuff. There's not enough people. Let's say you have a hundred people and you're like, we really want to take this seriously. Let's increase our number of people to a thousand people. Which is not going to happen in any security org, right? Because security is not the priority. But even if they did that, it still wouldn't be able to look at most of the logs, at most of the changes, because things are just happening too fast. The unique thing about AI is that, with the whole agent stuff, and more importantly, the ability to just encapsulate an explanation of what we're trying to do, easily form our goals and align our projects and our work actually with those goals. This is a thing that AI can do all the time, right? It could be doing this continuously. So it can help with planning inside of the company. It can help a security team, for example, or an engineering team explain to management and to other teams what they're actually doing. Right? And usually these explanations come in the form of these big presentations. It takes dozens of people or hundreds of people in the organization, not even hours, more like days or weeks or even months, to prepare the next plan to present to other people. And in the meantime, all those plans are changing from the top. So you have this constant state of churn and just old information inside of organizations that fundamentally is causing a lot of these problems with being able to efficiently manage the company and definitely secure it. So when I say AI contains other things, it means that all the things that I think are required to run a company well and to secure a company, they get easier when you have more access to the data and you can instantly produce narratives of what you're trying to actually accomplish.
And basically it removes the opacity of other orgs, it removes the opacity of the top, explaining what the vision is and giving it down lower. And the broken state of that communication is just the cause of so much trouble.

10:30

Speaker B

Hey, we'll continue our interview in a moment after a word from our sponsors.

14:55

Speaker A

You're a developer who wants to innovate. Instead you're stuck fixing bottlenecks and fighting legacy code. MongoDB can help. It's a flexible, unified platform that's built for developers, by developers. MongoDB is ACID compliant and enterprise ready, with the capabilities you need to ship AI apps fast. That's why so many of the Fortune 500 trust MongoDB with their most critical workloads. Ready to think outside rows and columns? Start building at mongodb.com/build. That's mongodb.com/build.

14:59

Speaker A

Your IT team wastes half their day on repetitive tickets: password resets, access requests, onboarding, all pulling them away from meaningful work. With Serval, you can cut help desk tickets by more than 50%. While legacy players are bolting AI onto decades-old systems, Serval allows your IT team to describe what they need in plain English and then writes automations in seconds. As someone who does AI consulting for a number of different companies, I've seen firsthand how painful and costly manual provisioning can be. It often takes a week or more before I can start actual work. If only the companies I work with were using Serval, I'd be productive from day one. Serval powers the fastest growing companies in the world, like Perplexity, Verkada, Mercor and Clay. And Serval guarantees 50% help desk automation by week four of your free pilot. So get your team out of the help desk and back to the work they enjoy. Book your free pilot at serval.com/cognitive. That's S-E-R-V-A-L dot com slash cognitive.

15:36

Speaker B

Yeah, two major threads there. So there's the question of like, how do we defend the role of labor and for how long can we defend it? And then there's this whole security thing around, and you even started to expand beyond security, I would say, to just like organizational dynamics in general. Yes, certainly, anybody who's dealt with server logs knows that you're absolutely right, that there's no way to scale human time and attention to read all of the server logs. So two organizations are coming to mind, and other conversations that I've had and hope to do full episodes with before too long. One is Workshop Labs. You may have seen that one of the founders there, maybe two founders there, wrote the Intelligence Curse. And they're working toward a similar goal where they're like, how can we defend the bargaining position of labor as long as we can? And it's a good challenge to me, because, and this is maybe something worth interrogating a little bit in terms of a possible difference between our worldviews, I feel like in the end, again, there's a lot of cope going on. Right. I look at somebody like Tyler Cowen, who I respect tremendously and have read his work for literally 20 years now. I looked back recently and I think the first mention of zero marginal product workers, ZMP workers he's called them, dates to like 2010, maybe even a little bit earlier than that. And it was a financial crisis, mortgage bubble bursting sort of thing, where all of a sudden, and this is fairly typical in recessions, all of a sudden companies look around and they're like, okay, we gotta get by here with less. Who do we not really need? And they don't tend to do that sort of thing, because it's painful in all sorts of ways, until they're really forced to. But the financial crisis forced them to. And then what seemed to be discovered in a lot of places is, hey, we could actually basically do the same thing with 10% fewer workers.
And I don't know if Tyler coined the term ZMP workers or not, but he was certainly blogging about it quite a bit back then. Fast forward to today and he's saying don't expect the labor share to go down all that much; there's going to be various reasons it'll rebalance out. And I'm kind of like, I don't know, man. It seems like we're already at a place where I'd rather work with Claude Code in many cases than hire a junior developer. I'm not sure. I think it's still very much debated, like, how much that's hitting aggregate statistics, and we'll only know that in the rear view mirror. But I have a very hard time imagining a world where the majority of people don't end up in a ZMP situation. And this also goes to what you're talking about with organizational dynamics and speed. And Dwarkesh has put out some good essays on this, and I think, like, Ajeya Cotra has also philosophized quite effectively in terms of, as the volume and the speed becomes so overwhelming, like, only AIs can handle it. So I guess if I try to boil that down to a question for you: like, how AGI-pilled are you? Like, how far do you think this goes over the next couple of years? And if you imagine this sort of waterline rising, from maybe even before AI in 2010, it turns out big companies didn't need 5 to 10% of their people, how high does that go? To me it seems like it clearly goes to a majority of people that are just going to have a really hard time contributing in the sort of fully realized AI-ified enterprise of the future. And maybe we still have executives because we want judgment or decision making or whatever, but there's not a lot of executives. So I tend to come to an end state of: we're going to need a new social contract, we're going to need a UBI. And then obviously it becomes a huge question, how do we get there and on what timeline, and what does that transition look like? And I don't have good answers. I, like, often wave my hands and say we'll have to figure that out.
We don't necessarily have a lot of time to figure that out. But anyway, yeah, how far do you see this going? How much of the current labor force do you think is, like, long term defensible, and how much can hold up? How many people do you think ultimately have a place in the sort of fully realized AI firm of the future?

16:41

Speaker C

Yeah, I think to operate in the current system, very few people will survive that current system and be useful inside of a corporation. And here's the way I frame this, and it's kind of extreme and it's a little, I guess, anti-worker or whatever. That's definitely not my intention, because I'm trying to get us to the stage past this, where everyone is much happier. But the way I think about this is the baseline for actually passing what is required to replace workers is extremely low. If you just think about what most knowledge workers are doing, we already talked about it: they're not happy in most cases, they kind of dread Monday, they're not happy going into work. And the work that they're doing, for most people, I would say most workers, is very sort of rote. And it's sort of just like, you know, you've got to get the email, you've got to summarize the email, you've got to write the report, you've got to look at a number of different reports and create another one. And it's like, if you look at the dead center of what AI is good at, it is so covering of, like, all these or many of these jobs. So I don't think the bar is very high at all. I think it's extremely low, especially because the workers aren't really trying. This is their job. This is the thing stopping them from doing life. They're literally just trying to get through the day. At the same time, they're being onslaughted by Game of Thrones politics constantly. Right. It's just a hostile environment. So it's not like people are coming to work and saying, wow, let me just unlock my creativity and let me be maximally intelligent in a way that's going to compete in some way with AI. So I think the bar is extremely low for passing what an average knowledge worker does in their jobs, which is, you know, of course, hundreds of millions of jobs. I would say on the other side, this is kind of an extreme way to think about this, but I think it's valid.
I think for most companies the ideal number of employees is zero. I think that's always been the ideal number. So the way I like to think about this is, like, if I had an ice cream stand, I wasn't trying to scale, I wasn't trying to do anything like that. I just had my truck and I had my ice cream, and I was selling the ice creams and making tons of money or whatever. I was making 500 a week, and I could live off that. People could not pick at me outside and say, why haven't you hired me? Because it's only me; I don't have any employees. It's just me. I go out on the ice cream truck, I make the money I want. That is what most companies wish they could do. They wish they could do all the work themselves. We literally hire people, and this is so weird, it's just like stuck in our brains. The reason we have a labor economy is because the people who came up with the company or the idea or the product, they can't do the work themselves. If they had that many brains and hands and could live in multiple places, there would be zero employees already. So a way to think about this is, AI is about to return us to a more natural state of everyone does their own work. Everyone literally does their own work. You come up with an idea, you spin up a whole bunch of agents, those are your employees, air quote, and they go and do the work. So if someone says outside, hey, why haven't you hired me? It's like, what do you mean? I'm doing the work myself. Everything is fine. Why would I hire someone extra? So I feel the combination of those two is just really bad for the outlook for human labor in this traditional corporate sort of structure. No, that's fine.

20:50

Speaker B

I guess one thing, that obviously we should give the sort of skeptics their due, at least in terms of one follow-up question there: why hasn't that happened more than it has already? And I'll confess that I'm not a superforecaster, but I have done some of these forecasting exercises where a year ago I predicted a bunch of stuff about where it was going to be today. And I would say I always overestimate how much disruption, at least over the last three, four years. I think I've consistently overestimated how much disruption we would see in the next year. I think I've had a better sense of where the capabilities would go. Probably overestimated that a little bit as well, but not much. But I've much more so overestimated, like, how different will the world be a year from now? Yeah, so maybe I've just been wrong as to where thresholds are that are really the key thresholds. But honestly, I do think, even going back to 2024 for sure, from the time you could basically fine-tune GPT-4, it seemed pretty clear to me that most organizations, if they were determined to really do this and go just take a systematic look at, like, how are people spending their time? What are the tasks? Where are all of our resources being spent? And they just started making a priority list and trying to get AI to do those tasks, that I think they could have got there sooner. And I'm pretty confident in that view. But then it leads me to, okay, now we're here in early 2026 and it's the old computers thing again, that we see it everywhere but the macro statistics. How do you make sense of that?

24:53

Speaker C

Yeah, it's a great question. I make sense of it because I think the value of AI is actually in the scaffolding more so than the models. So what the model is capable of doing doesn't really matter if it's not inside of a scaffold that allows it to take inputs and produce outputs that are actually useful. This is why Claude Code has gone crazy. Because it is the best scaffolding system. Right? The difference between Opus 4.5 and the best Gemini model or the best OpenAI model is not much. And the other two are better in some ways. In fact, the open source models are very close. It's not Anthropic that's blowing up. It's not Opus 4.5 that's blowing up. It's Claude Code, because it's scaffolding. And to answer your question about why this hasn't happened before, even before AI, but in the previous three years of AI, it's because, in my mind, the average knowledge worker job is extremely general. So when they come into work, it's, you've got to check all these emails. Oh, but you have to watch this video because it's mandatory secure code training. Oh, but also there's this fight going on with your boss and this other person and you've got to talk about that. Oh, it turns out you have to have an HR meeting. Oh, actually, corporate goals just changed completely. Now we have to redo all of our work. So we're not working on that project anymore. We're spinning over to this other project. So in the course of a week or a month or a year, human workers are being asked to do, like, these vastly different things, even in the course of an hour. You might have to check emails, you might have to fix your email, you might have to watch a training course. There's not a scaffolding system that exists right now that would allow an AI to do all of that. It just wouldn't be possible. So you would have AI that's really good at the coding part, maybe it's really good at writing reports.
But how is it taking all those inputs in and producing the output in the same way that a human worker can? It can't. Right. And that's why we don't have, like, giant armies of AI employees out on the market yet. And here's what I'm very worried about, and this is why I think 2027 is the year for AGI in my definition, which is the ability to replace an average human knowledge worker. Right. The question is, when will the scaffold be there? And we just saw Cowork come out. Is that what they called it? Anthropic Cowork? Yeah, we just saw that come out. That is a scaffold system for doing broad tasks at work. Right. It's actually for more general tasks as well. But that is the type of thing that somebody can build an AI product on that actually replaces human workers. Because now all those weird general things that are happening inside the company, those are just one-off tasks. And here's a really crucial point here. It doesn't matter, for the replacement of human work and the disruption of the labor economy, if it happens with the wizard behind the curtain, which is actually doing a whole bunch of narrow AI, but is able to do it for all the tasks that an average worker does, and it's just being handled seamlessly with the scaffolding. It doesn't matter if it actually does it way better than an average employee. So when I talk about AGI, I'm not talking about what arXiv thinks, the technical research papers. I think it's cool that they're going down that path, and I can't wait to see what they do if they create a truly AGI or ASI intelligence. But what I care about is the humans. I care about who's getting fired, who's getting not hired. And I think the way that happens is through a scaffold that can actually do their work better than them, which I think is going to look a whole lot like PI, which is the project I'm doing, Claude Code, which is what PI is built on, and Cowork, which they just built with Claude Code.
They said they built it in a week and there were no humans involved. Claude Code wrote all the code. Yeah.

26:23

Speaker B

Anthropic is in many ways an organization to watch in terms of a leading indicator on what the future is going to look like. I understand they're not really hiring any junior roles anymore, pretty much at all. And the execution time on some of these things is getting extremely impressive. We've seen some of that from OpenAI as well from time to time. And don't forget, I think Codex, they said that they did in six weeks, and that's like two generations ago of models powering it. But those are pretty ambitious things to spin up in a remarkably short period of time. So I guess the key thing there, and I share this intuition, though I frame it a little bit differently, is that if you can get over this threshold of the drop-in knowledge worker, the interface from the boss to the work getting done can basically be swapped out from you talk to a human to you talk to an AI system that might be 3 AIs in a trench coat or 57 AIs in a trench coat or whatever. But as long as it can handle, with sufficient generality, whatever you might want to throw at it, in a similar way to whatever you might want to throw at a person, and not give boneheaded falling-over responses back, then it seems like you get to a point where people are just obviously not going to miss this. Right? Obviously the economic incentives are very strong to not miss this opportunity as it really starts to work. Then people are just going to have, like, behind door A you can hire a human, or behind door B you can hire an AI. And the AIs obviously have so many advantages in terms of breadth of knowledge, 24/7 availability, immediate response, cost, just to name a few important ones. And so it does seem like we both share a threshold model where when that flips, it could flip really fast, and then we could be in a world, in a pretty sudden way, where there really just aren't junior jobs in the way that there used to be.
And potentially a lot of people, and I think it should be said too, even fairly highly educated people, fairly high status in society, may just find that an AI can do what they do. And obviously then we have a crisis on our hands. Hey, we'll continue our interview in a moment after a word from our sponsors.

30:50

Speaker A

If you're listening to this podcast, you're probably thinking seriously about where AI is headed, and maybe about how you can actually contribute to making it go well. I want to tell you about an opportunity that could become a pivot point in your career and a springboard for you to make a positive difference. A program that I've been so impressed by that I've supported it with a personal donation. I'm talking about MATS, a 12-week research program that connects talented researchers with top mentors working on AI alignment, interpretability, security and governance. These are researchers at Anthropic, Google DeepMind, OpenAI, the AI Security Institute, Redwood Research, METR, the AI Futures Project, Apollo Research, GovAI, RAND and other leading organizations. The track record here is remarkable. MATS has accelerated over 450 researchers, with 80% of alumni now working in AI safety and security. 10% have co-founded AI safety initiatives, including Apollo Research, whose co-founder and CEO made the 2025 Time 100 AI list. MATS fellows have co-authored over 120 publications with more than 7,000 citations and helped develop major research agendas like activation engineering, developmental interpretability and evaluating situational awareness. The program is fully funded: a $15,000 stipend, $12,000 compute budget, housing, catered meals, travel and office space in Berkeley or London. Everything you need to focus entirely on research for three months, with the chance to extend up to a year. Applications open December 16th and close January 18th. If reducing risks from advanced AI is something you care about, you should apply.

34:58

Speaker A

For more information, check out matsprogram.org/TCR. That's matsprogram.org/TCR, or see the link in our show notes. The worst thing about automation is how often it breaks. You build a structured workflow, carefully map every field from step to step, and it works in testing. But when real data hits or something unexpected happens, the whole thing fails. What started as a time saver is now a fire you have to put out. Tasklet is different. It's an AI agent that runs 24/7. Just describe what you want in plain English: send a daily briefing, triage support emails, or update your CRM. Whatever it is, Tasklet figures out how to make it happen. Tasklet connects to more than 3,000 business tools out of the box, plus any API or MCP server. It can even use a computer to handle anything that can't be done programmatically. Unlike ChatGPT, Tasklet actually does the work for you. And unlike traditional automation software, it just works. No flowcharts, no tedious setup, no knowledge silos where only one person understands how it works. Listen to my full interview with Tasklet founder and CEO Andrew Lee. Try Tasklet for free at tasklet.ai and use code COGREV to get 50% off your first month of any paid plan. That's code COGREV at tasklet.ai.

35:01

Speaker B

So give me a little bit more detail on how. There's, I guess, a couple dimensions of this. One is you building this, which gets to the PI project, and we can unpack that in an almost fractal way, because there's a lot of depth to it. And then the other question is, how does that translate to a world where, you know, some significant share of people can actually maintain some sort of market power, some sort of bargaining position, some sort of ability to be economically viable in the face of the transformations that might come to corporations?

36:24

Speaker C

Yeah, yeah. If I can, let me add something real quick to the previous part of what you were saying. Basically, I see AGI as being a product release as opposed to, like, a model release. So I think some company is going to come out with whatever, Virtual Worker or whatever they're going to call it, and it's going to be a Claude Code-like system that can basically do this work. And I think the way to know if it's working is if they are actually deployed inside of companies. Not proofs of concept; they're actually deployed in companies. And here's the standard, which I think Karpathy might have mentioned, or somebody I was following a while back mentioned something like this: they onboard, they show up, they're in the cohort with human employees. They go through the onboarding, they watch all the videos, they do the training, and then Monday morning they show up and they're on the all-hands with the team manager. And the manager's like, yeah, here's what we're doing, blah, blah, blah, Sarah's over here, Ravi's over here, Chris is over here, and we're going to assign work. How was your weekend? And the AI says something like, oh, I read some books, whatever it's going to say to try to act human. And it proceeds to take work from the manager and do the work and return it. And importantly, when the manager says, hey, our goals have changed, you're not doing that work anymore, you're doing this other work, it needs to be able to pivot just like a human does. So this is a scaffolding AI product as opposed to computer science. You know what I mean? Obviously there's lots of computer science underneath. But to me, this whole encapsulation is as a product, which honestly could happen this year. I'm guessing 2027, but I could be wrong. It could be '28 or '29, but it just seems inevitable. That's what, in my mind, according to my definition, AGI looks like: the replacement of workers. Yeah, I actually can't remember the second part you were asking about.

37:04

Speaker B

So I was going to start to get into what you're building to help people carve out their own niche for themselves. And then I think there's still plenty more big-picture questions too, but maybe let's get into it a little bit. Okay, so we've got this problem. Corporations are going to be, like, extremely AI-ified. Jobs are going to go away. What does that leave for people? And what are you building to help them defend or seize? I don't know if it's defend or seize. Both seize and defend the opportunities that remain.

39:08

Speaker C

Yeah, yeah. The way I frame this is that I don't think most of humanity is activated, in terms of a very specific thing that I'm talking about here, which is, I use this heuristic of a visiting alien with a clipboard. So the visiting alien shows up and they just go to random people on the planet, a billion random people all over, and they're like, hey, who are you? What do you do? I've been all over the galaxy, 19 galaxies, actually. And I just interview people, like, what are you about? And they're like, I'm an accounting specialist. I work at Company. I provide this sort of thing. I do this. I check the spreadsheet, I update the thing, I send the report. They're like, no, who are you? What are you about? What are your beliefs? What do you think is wrong with the world? How do you plan on changing it? And they're like, yeah, I don't know. That's for special people. It's, do you have ideas? Do you talk about your ideas? Do you put them out into the world? It's, oh, no, no, I'm not an author, I'm not a YouTuber. So there's a default sort of state. I think that's just, it's no one's fault. It's just like the history of humanity, where people have been taught that there are special people who have podcasts and have ideas and write them down and think that they are worth sharing with others. And then there are the regular people, which are the 99%, and our entire education system for all these, whatever, thousands or hundreds of years, has taught us that your goal is to get a job from one of the 1% people, and you're a worker. And this mindset has basically shut down the creative capability of the entire planet, rounded down to zero. Right. Because there's very few people who are currently on YouTube who actually believe that they have something worth saying. So my whole plan, and I have no idea if it's going to work, it's just too sad to think about it not working.
So it's the only reason I'm running full speed towards it, because I'm like, this might be possible to help bring about. Therefore I'm going to try. And that is: we have to activate people. We have to turn more of the 99, whatever the numbers are, might be 99.999 or it might be like 95, whatever, we have to turn more of those people who think that they are just workers for someone special to realizing they also can be special. They also have ideas. And I have seen so many pieces of evidence of this over my life, where you can activate somebody by just believing in them, by just telling them that they are capable, by just saying, hey, you realize that was a really cool thing you just said. Have you ever written that down? It's, no, nobody would read what I would say. How many people believe that they're just mothers, they're just moms, right? They're just providing this. And you're like, hey, that was really smart, what you just said. Have you ever shared that? It's, who would want to read what I would say? So here's a sort of theatrical way of saying this. Imagine that the planets this alien has visited have stats hovering over them. They can see a stat for creativity activation for planets. And when they're scrolling through their phone, looking at all the different planets, the trillions that they've looked at, when they scroll over Earth, it says 0.0013. That's how much human activation of creativity has occurred on the planet. Right? That is massive opportunity. And my favorite version of this is having a persistent tutor, a persistent assistant. And this is a little bit in the future, but we'll get there. A persistent tutor that is working with this person, letting them know, not going super sycophantic, but letting them know, hey, look, you do have ideas, you do have value, you are smart. Hey, do you want to learn more about that?
And just always being available from a young age, and obviously you have to be careful with this stuff early on, but having children be able to be tutored both in mindset, believing that they are capable of things, but also enabling them with tons of knowledge. Right? So I feel like that would be a huge lever. I feel like obviously we need to fix society and the way governments work and all that kind of stuff, which will be difficult because a lot of times the challenges are very real. It's, my parents are working three jobs each. They don't have time to nurture me. Therefore bad things happen. Right? So we have to fix all of that at multiple levels. But I think AI presents an opportunity to encourage people, especially children, but really anyone, to unlock this power within themselves. So sorry for the rant there, but all this to come around to PI. So PI is designed to be a customized, personalized AI system. So I've got this project called Telos, which basically gathers from people: what are their goals? It basically does this alien interview. Who are you? What are you about? What are your goals? What do you think is wrong with the world? It actually starts with problems. Problems is the number one thing. What do you believe are the problems in the world? And then, okay, what do you want to do to change that? What are your obstacles to doing that? And it could be personal problems. It could be, I'm too heavy, I've never been able to lose the weight, I have low energy or whatever. But this scaffolding of problems, to challenges, to projects, this system basically can tell the PI AI what it is you care about, what you're trying to accomplish. And at that point the AI spins up with all the scaffolding to help you with meal planning, to help you with encouragement, to help you find other artists, right? Because I'm not trying to build a product for tech people. Tech people are already techie, right?
This is not about coding. This is about enabling a human to be better at what it is that they want to do, to help them activate their full self. So practically that means capturing their goals, their current capabilities, what they would like to learn how to do. And it also starts with mapping out, what do you normally do during a day, right? And that's in work, that's in personal life. So for me and you, it's a lot of writing. It's a lot of writing and thinking. And so many of my workflows are largely focused around that. So I could capture an idea. I just wrote a replacement for Buffer, so I could go from an idea to red teaming the idea, having a council of AIs debate the idea, fight with me about it. And I'm in here editing, right? Making the adjustments on the fly, or I'm doing it with dictation, with a shout-out to Wispr Flow, and I end up with that. And now I say, cool, put it on X and LinkedIn, and it's able to do that. So this workflow, which I see as extraordinarily human, the most human thing you could possibly do, which is have an idea and share it with the world, that is now made extremely simple through this whole AI workflow. And it's all built into PI. So I'm literally telling my DA, my digital assistant, Kai, hey, I had this cool idea. What do you think? Even better, I have this pendant that I wear. It's a Limitless. So I can go on a walk out by the bay and I can ramble off some halfway stupid idea or whatever. I get back and I'm like, hey, go get that conversation that I just had. Let's work on it as an idea. And now I'm live editing because it pulled it from the API. So I'm just removing all this friction to being able to do more human things in your life.
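The Telos scaffold described here, problems feeding into challenges feeding into projects, handed to the assistant as standing context, might look roughly like this minimal sketch. The class name, fields, and rendering format are illustrative assumptions for this transcript, not the actual PI/Telos schema:

```python
from dataclasses import dataclass, field

@dataclass
class TelosFile:
    """Hypothetical sketch of a Telos-style context file: problems first,
    then obstacles, then the concrete projects addressing them."""
    problems: list = field(default_factory=list)    # what you think is wrong with the world
    challenges: list = field(default_factory=list)  # obstacles in your way
    projects: list = field(default_factory=list)    # concrete efforts addressing them

    def to_system_context(self) -> str:
        """Render the file as standing context for a personal AI's system prompt."""
        sections = [
            ("Problems I care about", self.problems),
            ("My current challenges", self.challenges),
            ("Active projects", self.projects),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"## {title}")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)

telos = TelosFile(
    problems=["Most people never share their ideas"],
    challenges=["Low energy in the evenings"],
    projects=["Publish one short essay per week"],
)
print(telos.to_system_context())
```

The point of the structure is simply that the assistant never starts from a blank slate: every request arrives alongside this rendered context, so "help me plan my week" is answered against the person's own stated problems and projects.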

39:44

Speaker B

I don't want to get too bogged down in some of the things that we probably can't resolve today, no matter what we do. And I definitely want to get into more of the tools and the sort of practical stuff. First of all, I have to agree with your sense that of course the podcasters are the special people. And totally, the socialization that we've put in place for society broadly may have served us well for the last 150, 200 years while the structure was what it was, but it does seem like it's on the verge of really becoming a major liability for us, because it does have a lot of people answering, I think, in the way that you describe. I'm a little less clear on, and you can either respond to this or just say, yeah, we'll see how it goes over time, but I'm a little less clear on how many people really want to scale their agency or be, like, change makers in the broader world, versus how many would say, actually, no, I'd rather just focus on my relationships and spend a lot of time having the best, perhaps VR-mediated, experiences that I can have. And I guess it's a production versus consumption question on some level. Like, how many people, if you gave them a life of leisure and relative abundance, and all the time they need to focus on the relationships that they have, how many people would say, that's not enough for me, I want to go make a difference? I'm not sure about that, actually. It'd be very interesting to find out. And I do think it probably will be somewhat generational, because either people who were socialized in the current way are going to have to do some quite challenging unlearning or re-education, or it's going to have to come from, like, a next generation. Obviously a huge challenge here is we don't really have the runway of previous revolutions. The Industrial Revolution, I always like to remind myself, took, depending on how you want to count, certainly multiple generations.
The electrification of the United States was like a 60-year process, from Edison's first wiring-up to when my grandmother in rural Kentucky got electricity as a young person. That's literally a 60-year, three-generation timeframe. And we don't have three generations today to bring up people that are going to be AI native. So there's definitely some major open questions there in my mind, and major challenges, and I'm not sure how much more time we should spend on it, or if you have additional thoughts that you would want to offer there.

48:04

Speaker C

I also don't know that number. Right. I'm also agnostic as to that number. I do think it is a high percentage. And here's the even more important point: I think it's worth trying. I think we constantly try to ping with that encouragement, because I've hardly ever seen it stick right away with anyone I try to activate in this way. Sometimes I try seven times over the course of 13 years or whatever, and it bounces off each time. Fine, I'll be back in two years and I'll try again. So it's fine if it bounces off. But it could be that people are just so used to being in consumer mode that if you give them the option, for example, they're watching a Netflix show, they're like, look, I just want to watch Netflix. I just want to read stories. And you ping them and you're like, yeah, but have you ever thought of a cool story? What story would you like to read? And they're like, oh, I would love to read a story about this or this. Guess what? In 2026 they're about to be able to write that story and publish it and become a famous author. That is super exciting to me. The first step, the most important step, is that they realize it's even possible, that they stop talking negatively to themselves in the sense of, oh, that's for other people. So I feel like these barriers to creativity have to come down, which is all part of this marketing I'm trying to do around activation. But it could be that it bounces off a lot of people. That's fine. I don't know the numbers. I think it's impossible to know the numbers. But I think it's worth trying.

50:42

Speaker B

Yeah. That reminds me again of Tyler Cowen. One of his famous refrains is that one of the most high-impact things you can do is try to raise the ambitions or aspirations of other people. And I totally agree. It's absolutely worth trying, whatever that number ends up being. My buddy Gopal also says, think less about what the number is and more about what you can shift it to. And that applies to so many things, I think, including this. Maybe just one more beat on the kind of big picture before digging in on the actual practical implementation side.


I just want to untangle a couple concepts. One is the idea that I might have something worth saying, that I might have a kernel of an idea in my head that might be worth realizing, versus reflexively shying away from that. That seems to me like it's absolutely worth encouraging. It's certainly part of at least some sense, some definition, of a life well lived. And even if it's not for everyone, doing work that expands people's option sets to include that seems obviously good. Then there's the related but distinct question of: is that something that can sustain something like the current economy with something like the current social contract?

53:02

Speaker C

Yeah.

53:45

Speaker B

Or do we still need a fundamental rethinking of that foundation, such that this sort of agency stuff kind of becomes, in a way, like its own form of consumption? It's maybe more of a creative consumption. But I might write books or create my own whatever prestige TV series for myself or my family or a few friends. And maybe that's awesome. Maybe it's a great experience. Maybe it's enriching. Maybe it still never makes me famous, especially if everyone's doing that, right? Time obviously is the core constraint at some point. Like, we can't all watch each other's prestige TV shows. So it could be awesome. But I do still wonder how much work you think that can do for us in terms of allowing people to earn income as a way to sustain themselves, versus being another way for people to self-actualize on top of some different social contract base that we might need.

53:45

Speaker C

Yeah. Absolutely. I don't know what that looks like, or I feel like I know some pieces of it. So I think there's an opportunity here. I did something about this like 10 or 15 years ago. Basically, imagine, like a LinkedIn where everyone is broadcasting their capabilities. It's like, I'm a trained dog sitter or whatever. So you basically publish, via like a daemon or something, put it out on the network that you need this thing done. I need this tile replaced on my roof. I need a dog sitter, and I need someone to teach me Spanish or whatever. And that beacons to the people who are available who have those skills. And so you have this web framework thing that just links people with desires and capabilities, needs and capabilities. Right. So I think that is an opportunity for a future tech-oriented alternative to an economy. I don't know, I don't feel like I'm smart enough in this area to know if that's enough. I feel like it's definitely not practical as an alternative to what we currently have. We can't just jump to that. I don't see how that works. I don't see how people pay their landlord, I don't see how people pay for their groceries using this. So I feel like there's probably got to be some sort of agreed-upon shared system that is paying people to survive. So I don't see an alternative to UBI needing to happen in the next few years, or at least five to ten years or whatever. I think that's probably going to need to happen. I'm guessing around '28, '29 there's going to be just a raw demand for UBI because things will start falling apart. But I do think this tech-based exchange of need and capability will be one of these layers. Ideally it would be the only layer, but I think that's so far in the future, even if it's possible, that it's not really worth practically focusing on. What I'm most focused on is getting people to where they are broadcasting those capabilities, they are broadcasting those ideas.
They do believe in themselves, believe they have something worth sharing and producing that's valuable to others, and they're actually, whatever they're paying in, Doctorow's Whuffie, like reputation score points, whatever they're paying in. But I think there's likely to need to be a more practical transition to that, which involves, yeah, you're actually receiving money to survive, and then maybe this other layer is on top of that.
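That need-and-capability beacon layer could be sketched as a toy matcher. Everything here, the Listing shape, the names, and tag-overlap as the matching rule, is an assumption for illustration, not a real protocol:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Listing:
    """One broadcast on the network: a person plus free-form skill tags."""
    person: str
    tags: frozenset  # skills offered, or work needed, expressed as tags

def match(needs, offers):
    """Pair every broadcast need with anyone whose offered tags overlap it."""
    pairs = []
    for need in needs:
        for offer in offers:
            overlap = need.tags & offer.tags
            if overlap:
                pairs.append((need.person, offer.person, overlap))
    return pairs

needs = [
    Listing("alice", frozenset({"dog-sitter"})),
    Listing("bob", frozenset({"spanish-tutor", "roof-tile"})),
]
offers = [Listing("carol", frozenset({"dog-sitter", "spanish-tutor"}))]
print(match(needs, offers))
```

A real version would obviously need discovery, reputation, and payment layered on top; the sketch just shows the core exchange, desires on one side, capabilities on the other, linked automatically.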

54:49

Speaker B

Yeah, I think that's very close to kind of the best ideas that I've come up with so far as well. I can certainly see it, and it does feel exciting to imagine a kind of second-level economy of highly bespoke, highly personalized, potentially highly local services, where, for whatever reason, my mind always goes to the sort of murder mystery dinner, which I've never even done one of those. But this is something that's just obviously a luxury, obviously the kind of thing where people create these highly crafted, curated experiences for each other. And that feels like it could be a great way for people to interact and express themselves and have status and value and have some exchange. But yeah, it doesn't feel like that can be the foundation. Not everybody can get their calories, certainly, from that kind of activity. Yeah. I think we're pretty much on the same page there. And it's crazy how crazy this stuff is. Right? What a weird moment in history where just all these things are on the table for rethinking. Of course, some people don't believe that or don't recognize it. Other people think it's going to be even more insane, like we're all going to die extremely quickly, which I don't entirely rule out as a thing to be worried about, for the record. I guess on that: do you have a p(doom), or what's your sort of existential risk story?

57:28

Speaker C

Yeah, I don't know. I feel like I have lots of different p(doom)s, and I feel like they change a lot. I'm not sure how to think about that anymore, honestly. I've gone through all the literature and all the arguments. And, I can't remember, it was Yudkowsky, yeah, when he went on Fridman for the first time, I lost a lot of sleep that day. And yeah, I think the chances, and this is another reason I'm doing this and focusing on the positivity, the chances of things going bad just seem so high to me. In some ways I feel like the most likely thing is, no, not anytime soon, maybe never, do we get this future value exchange layer and all of that. The most natural tendency is elites get extremely powerful with this really powerful AI. The other 99% kind of have nothing, and they don't even care to look for it because they're so diverted by really immersive games. And then the governments mobilize, and basically China and potentially the US, like, they're just authoritarian regimes using this AI to control people. And it's more effective than it ever has been. Right. So I feel like that's a really easy one. Another really easy one is just everything breaks and there's just chaos, right? And then you have to rebuild things after that. So I feel like there's this thin walking path, where there's chaos over here, and it's just really bad stuff, and then mostly it's authoritarian control, authoritarian elite control, and it's just all bad. And I'm an emotionally sensitive person, so if I scroll that stuff too much, it's not good for me mentally. So I literally am trying to lock on to, okay, break out of the mold of what is possible. Is there a path to possibly making this thing good? Go and build things that could potentially make that happen. Right? Which is all the open source stuff. And then try to get other people to do the same. Right? And there's other people doing this already.
And then just lock onto that and breathe it fully. And people will be like, well, you're not seeing the downside? Oh, no, no, no. I see the downside. In fact, I think it's probably more likely. But I can't live in that world. I can't survive just thinking about how bad it can be. Right. Yeah. The one that I think is least likely is, like, boom, ASI pops, and it's the cliche paperclips instantly. That one I don't see happening. I just see so many friction layers in between and stuff like that. So I don't see that as being one of our main risks. I think AI control would be more, I would say, gradual, and hopefully gentle, but it could still be really bad for humans. It could still lead to the extermination of humans or whatever. But I don't know. I don't see, you know, 2026 or 2028, the ASI pops up and just destroys us. But I see much more possible and practical negative things that I definitely want to avoid.

58:58

Speaker B

Yeah, it's funny, I was an Eliezer reader way back when he was on Overcoming Bias, for the OGs that remember those days. And I do agree that the sort of classic canonical paperclip model seems much less likely now than it did then. Certainly Claude is remarkably ethical and has remarkably strong character. At the same time, I do worry that, geez, these frontier companies, or at least a couple of them, seem to be really keen on sprinting toward the automation of AI R&D, which then would, I think, have to raise your paperclip, or paperclip family, of concerns higher again. Because it does seem like we have a pretty good loop right now that is making Claude pretty good, like mostly, right? And even when it does bad things, you can squint at it and say, it lied there because the user said it was going to change Claude's values to be bad, and Claude wants to be good. So how should I think about that? I can at least be somewhat sympathetic to Claude in a lot of those scenarios. Or even the autonomous whistleblowing: I don't think we necessarily want AIs to be doing autonomous whistleblowing, but in that scenario it had reason to blow the whistle, right? The hypothetical drug company was faking data and reporting fake data to the FDA. Claude is not wrong to object to some of those behaviors. Nevertheless, for all that that's good, it doesn't seem like we are quite ready to spin the automated AI R&D centrifuge at maximum RPMs and expect that thing will just stay stable and stay in place. So, yeah, I don't know. I also find I can talk myself in circles on some of these things. I don't want to put you in an emotionally stressful position.

1:02:42

Speaker C

It's fine, let's talk about it.

1:04:14

Speaker B

But just one area there, because it is your professional background and expertise: how do you see cybersecurity playing into this risk, or this sort of family of concerns? We've got AI could go totally rogue and do something extreme. We've got gradual disempowerment, where everybody willingly and rationally at each step gives AI systems more and more decision-making discretion, power, autonomy, whatever, and then next thing, there's not really any humans in the loop anymore. And that might be, like, okay, but now the AIs are really running the show and we're just along for the ride. And then somewhere in between is this cybersecurity world, where of course AI seems to amplify all the threats, but it also seems to have at least some promise for a sort of d/acc infrastructure hardening or whatever. Another episode, hopefully I'll be doing before too long, with a company called Asymmetric: as far as I understand right now, and I have more to learn, they seem to be really trying to do the log reading that you were describing earlier. They said, basically: what happens when there's a security issue today in a company is people go do forensics on it and they try to get down to a root cause. But they only do that once harm has been done and they're called to attention and have to go investigate. And so their idea is basically, what if we just scaled cybersecurity forensics as much as is needed to read all the logs all the time, and try to identify these things before they actually become critical issues, before harm is actually done? Anyway, that'll be an episode coming soon. But where do you think we are right now in terms of, I don't even know how you want to frame it, offense-defense balance? Is cybersecurity about to become our worst nightmare, or might we use AI to get it under control?

1:04:15

Speaker C

Yeah, I think it's definitely a combination. My favorite frame for this is basically that the game, as of probably last year, definitely this year and going forward, is the attacker's AI stack against the defender's AI stack. That is the competition. So the goal of the defending security team is going to be: how good of an AI stack can they build to actually do this stuff? I've been doing this whole attack surface management thing for decades, and many people have also been doing this. It's about: do you understand your attack surface? Right? And with all these AI tools, the attack surface is everything. It's total knowledge of the company. It's total knowledge of every employee. I built a thing that just finds all employees and creates a psychological profile on them, which allows me to write the perfect spear-phishing email. Right? And it's, oh yeah, you adopt dogs, therefore here's what this thing looks like. And I could also figure out: oh, you're also one of the people making this core product. Oh, it's also releasing a new version. Oh, it's also running on this platform that's vulnerable. This is all work that a red team could have done. But it comes down to this concept of many eyes, which was supposed to secure us all this time with open source, but it turns out the fact that humans could look at something doesn't mean they will. And that's the case with this Asymmetric thing you're talking about, right? With all these logs: the logs are there. There aren't enough eyes, there's not enough time, there's not enough attention. Humans need to rest. They miss things. So it's a matter of maintaining state. You have to understand the state of your company. Right? And this is, I think, the big picture here. If you understand the state of your company: what is your profit and loss? What are your goals? What are your competitors doing? What does your infrastructure look like?
What is currently facing the Internet? What applications are you running? What stack are they running? What vulnerabilities do those stacks have? What just changed in the last 13 seconds while I was saying that sentence? Right? Faster and faster granularity. Oh, this person left the company. Oh, so-and-so joined the company. Oh, that person is extremely vulnerable to this type of social engineering, so now we're going to spin up this entire campaign to go after them to get access to the company. Now, prior to this, all of this could be done by a high-quality attacking team, a high-quality pen-testing team. I'm thinking more like attackers, so a really skilled advanced persistent threat team. But those are very small teams, they're specialized in specific industries and verticals, and they could only go after so many companies, just because of the time. Now we're in the situation, and Claude Code is the model here, PI is the model here, where the attacker basically says: look, I'm an expert at going after these types of vulnerabilities. Spin up capability to do continuous recon, to find all employees inside of a company, to produce psychological profiles. We've got another module over here that writes the social engineering attacks, another module over here that does the network attacks and the scanning. And this beast, they basically just put in a target and it starts hitting them, and it spins up all these different modules and agents, and it's constantly hitting you. Now, on the receiving side, there's only one way to survive this and to defend, and that is you have to be doing the exact same thing. There is no game of, we'll just hire smarter people in our company. No, that's not going to work. It's not going to be enough. The only thing that's going to work is them helping the AI improve and get better, because the scalability and the pace of change is actually what matters.
So all that to say: it's attackers spinning up better and better versions of Claude Code, basically Claude Code, Cowork, or whatever. And I'm not saying they're only using that, but Anthropic did say that they've already seen automated attacks using Claude Code being extremely successful. So these sorts of stacks are attacking the planet, attacking all these companies, and then all these companies have to have a similar stack that's defending them.
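
The stack-versus-stack picture described here, a coordinator that spins up a registry of capability modules and runs them continuously against a target scope, can be sketched generically. Everything below is illustrative: the module names, the findings, and the registry shape are invented for the sketch, and the same loop is what a defender would point at their own estate.

```python
# Minimal sketch of a "capability stack": a coordinator that runs a
# registry of modules against a scope and merges their findings.
# Module names and finding contents are purely illustrative.
from typing import Callable

# Each module takes a scope (e.g. a company domain) and returns findings.
Module = Callable[[str], list[str]]

def discover_assets(scope: str) -> list[str]:
    # Placeholder: real recon would enumerate DNS, cloud assets, employees.
    return [f"asset: www.{scope}", f"asset: mail.{scope}"]

def scan_exposures(scope: str) -> list[str]:
    # Placeholder: real scanning would check services and versions.
    return [f"exposure: {scope} login page allows weak passwords"]

def review_logs(scope: str) -> list[str]:
    # Placeholder: real review would read auth and network logs.
    return [f"log: repeated failed logins against {scope}"]

REGISTRY: dict[str, Module] = {
    "discover": discover_assets,
    "scan": scan_exposures,
    "logs": review_logs,
}

def run_stack(scope: str) -> dict[str, list[str]]:
    """Run every registered module against the scope. A defender runs
    this continuously against their own estate, "finding it first"."""
    return {name: module(scope) for name, module in REGISTRY.items()}

findings = run_stack("example.com")
for name, results in findings.items():
    print(f"{name}: {len(results)} finding(s)")
```

The point of the registry shape is that adding a new capability (a new recon technique, a new log source) is just registering one more function; the coordinator loop never changes.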

1:06:04

Speaker B

And that defending: the first version that I imagine is that it's going in, like, self-attacking, and trying to find the vulnerabilities in order to, presumably, patch them. Is there a better or more comprehensive version of that? So yeah, describe that for me.

1:10:43

Speaker C

Yeah, yeah. So in this world, if the AI stacks are equally capable, the defender will actually have an advantage, because guess what: the defender has actual access to AWS. Direct access to AWS. They have direct access to the network logs, they have direct access to all this stuff, where attackers hopefully are inferring it from external signals. So hopefully the defender has a massive data advantage. A big part of cybersecurity is just misconfigurations. It's not like writing special malware. It's just, oh, I didn't even know that thing was still out there. Oh, I didn't even know we still had that company. It's huge own goals. So the internal agentic AI stack should be watching all of that stuff very carefully. And really it's just a game of: it finds it first. So it's doing this self-attack, it's monitoring all the logs, it's seeing all the configuration changes, and it's saying, oh look, that was bad. You go back 15, 20 years, when I started doing this, and you would have weeks of a window: you'd better shut this down within a few weeks, because someone's going to find you. Now it's down to hours and minutes, right? And pretty soon it's going to be seconds, and it already is in some places. But the attacker should have a disadvantage, because they have to infer signals, whereas the defender can just get it directly from the source.
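
The "finds it first" game he describes reduces to a snapshot-diff loop: capture the externally visible state of your estate, diff it against the previous snapshot, and alert on every change. This is a toy illustration with invented snapshot data, not any real monitoring agent.

```python
# Toy "find it first" loop: diff successive snapshots of externally
# visible state (hosts and their open ports) and flag every change.
# Snapshot contents here are invented for illustration.

def diff_state(previous: dict[str, set[int]],
               current: dict[str, set[int]]) -> list[str]:
    """Return human-readable alerts for ports that appeared or vanished."""
    alerts = []
    for host in sorted(current.keys() | previous.keys()):
        opened = current.get(host, set()) - previous.get(host, set())
        closed = previous.get(host, set()) - current.get(host, set())
        for port in sorted(opened):
            alerts.append(f"NEW exposure: {host}:{port}")
        for port in sorted(closed):
            alerts.append(f"closed: {host}:{port}")
    return alerts

# Two snapshots taken moments apart: a forgotten host has appeared
# and a new port has opened on an existing one.
before = {"app.example.com": {443}}
after = {"app.example.com": {443, 8080}, "old-box.example.com": {22}}

for alert in diff_state(before, after):
    print(alert)
```

The defender's data advantage shows up here directly: they can build `current` from AWS APIs and network logs at the source, while the attacker has to reconstruct it from scans.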

1:11:02

Speaker B

I wonder how you think that applies to, like, the social side of social engineering. One thing that happened to me recently was the following.

So the company SendGrid, which is now part of Twilio, has this email-sending API. I think it still remains a market leader in terms of high-scale programmatic email sending. Naturally, if you can get access to somebody's SendGrid and you're a scammer, that's, at least for a minute, a really valuable thing to have, because you've got their sending reputation, and so you can potentially actually hit the inbox with your scams, based on the fact that you're hijacking somebody who's maintained a good reputation in the email system and using their channel. So people are actually trying to hack into other people's SendGrids all the time. I got an email the other day, and I don't know how personalized it was; certainly the psychological-profile part wasn't so great that I was sure they had profiled me. But basically what they sent was posing as SendGrid and saying: we support ICE, join us in supporting ICE, whatever. So naturally putting people into this kind of pissed-off state, where they're like, wait a second, what? My email company is taking a stand with ICE? This is going to get people inflamed, and that's going to get people to click on the link, if only to then go log in and cancel their service, or go log in to try to register a complaint or whatever. I didn't click the link, but I would expect there was probably a very prominent "give us your feedback": okay, now go log in to SendGrid so you can give us your feedback. And then of course you're getting pwned. So much of what you just described was managing the attack surface on a technical level. But when I give somebody my password, that's a little bit of a different beast, or maybe you think of it as the same thing. But how do you think about the social side, the fact that we are just such juicy targets as humans? Maybe more so at a more mature state: once the AIs have gone and closed up the open ports and fixed the misconfigurations, there's still the human who gets pissed off at a fake email and goes and gives their password away before cooler heads prevail. What do you think AI does for us about that?

1:12:48

Speaker C

Yeah, so it's exactly the same sort of model of attacker-versus-defender AI stack. So I could easily, right now, and I'd have to be very careful with my relationship with Anthropic here, but I could say: hey, based on all the history of social engineering attacks being successful, and the fact that you have all these psychological profiles of this company, why don't you come up with 16 or 36 or 128 really cool campaigns that would work against these employees, right? Or against SendGrid, for example. Or: find me a company and come up with a campaign that, if you send it out, is going to produce outrage, right? But you don't even have to give it that much. You could just say: okay, you understand that outrage produces clicks. You understand that being sycophantic produces clicks. So create me 256 campaigns. And we don't have to pick one. We could say: launch all the infrastructure to send the emails, launch all the receiving analytics to gather the data, which includes the passwords, which includes going and performing the attacks using those passwords, including sending that up into the exchanges where you're actually selling the access and everything. Before, this would take a whole bunch of attackers hiring very smart coders who are not going to get caught by the police, not going to talk about it and blab and get themselves caught. And now it's simply a prompt that I send into Claude Code, or into OpenCode, which doesn't have all these restrictions. Right? That is a prompt. One prompt, and in two minutes I have 250 campaigns going off with different ways of attacking people through social engineering, using completely different psychological tactics. And they all spun up separate infrastructure, and now a bunch of passwords and access tokens are floating in. So it's just how quickly you can go from an idea of how to harm to actually making it happen. And that's what's crazy.
And on the defender side, you just have to assume that millions of agents are being pointed at you with all this knowledge about your company and about your infrastructure. And that's the assumption you just have to travel under.

1:14:51

Speaker B

Sounds like there's going to be some spectacular hacks over the next couple years before everybody really gets that message.

1:17:29

Speaker C

Yeah, I think it gets worse before it gets better. Yeah.

1:17:35

Speaker B

Okay, let's turn to more positive themes and finally get into PI. Maybe for starters, and you've done a little bit of this along the way already, let's take a moment to share some of the stuff that is, like, magical for you, to try to inspire me and others. And for context, as I said a little bit at the top too: I use AI every day. I use tons of different products. But especially over the last couple years, while I've been doing the podcast and doing this AI scouting thing, the thing I have prioritized most is learning. And producing a podcast is great in that some people seem to want to follow my learning adventure and learn with me. It also turns out that you can actually make a living doing this, which is a shock that I try never to take for granted. But I've never really been trying to scale anything, and I'm not a super systematic person, so I'm not instinctively trying to systematize things. Much more of my activity is going out and being like, oh, let me try this product for this thing and see what happens if I go here and do that, and what are the limits of how much medical history an AI can handle before it can't absorb any more. Spoiler, by the way, on that one: they're very good. But I haven't done this kind of build-my-own highly bespoke personal AI infrastructure, for lack of a better term. So, relative to going out and scattershot doing a ton of stuff, which certainly has the effect of teaching me about AI and very often does improve my productivity, how do you think the personal AI infrastructure sets you up for a different lived experience? And maybe give us some of the highlights to inspire, and then we'll dig into how it works.

1:17:39

Speaker C

Yeah, yeah. I would say the big difference is the main concept that also underlies Claude Code itself, which is this whole idea of scaffolding being more important than the model, right? So the difference is when your AI understands what you're trying to do. When you made a request to a tool, especially a year or two ago, ChatGPT or whatever, it would largely just be taking it out of context. It would just be finding the best answer according to world knowledge, the model's knowledge. But the magic is when it's actually encompassing everything about you and incorporating that into the pursuit of the best answer. Right? So the more your system knows about you, the more it can customize its responses. And it's not trivial customizations; it's things oriented around your goals. My challenge to you, and to others, is to basically sit down and dump, via dictation or writing or whatever you want to do, or just drag in a bunch of documents, and be like: look, this is me. You're basically doing a Telos assessment of yourself, to figure out what you think the problems are, your own problems, what you're trying to do with your career, what's wrong with the world, or whatever. You dump that. Then you say: here's what my capabilities are. You're basically doing this interview with the AI, and that builds out the Telos structure of what you're trying to accomplish. That is then part of your PI, your personal AI infrastructure. Now, having that, when I initiate Claude Code, which is running PI, it reads my entire thing on startup. So it now knows me, it knows my digital assistant's personality, and most importantly, it loads all my skills, which are customized for me: my blogging skill, my writing skill. I'm reading this amazing book right now by Mark Forsyth, I think it's The Elements of Eloquence, about the rhetorical figures, going back to the Greeks and Romans, and basically how to write well.
So basically, when I learn something like that, when I read this book, I can feed it in, because I literally have an upgrade skill inside of PI. I can take any YouTube video, just paste in the link, and it goes and gets the transcript. This thing is absolutely insane. It goes and gets the transcript, it reads my entire Telos, what I'm trying to accomplish, it looks at my full PI system, and it gives me recommendations on how to upgrade itself. That means all the skills, the hook system, the context, the memory system. Another thing the PI system has, which most other systems don't, is a system of memory, which is writing down signals that I'm giving the AI about how it's doing. And this is a rotating loop which goes back into the upgrade skill. So it's: okay, how good are we doing as an overall system in helping Daniel accomplish his goals? How happy is he with the system? And that just goes round and round, making little tweaks and updates to the system itself. So when Claude Code releases a version, which they did yesterday, I'm looking at it right now, it's 2.1.6. Okay, they released a bunch of capabilities in there. That's in their changelog. They also might talk about it in an engineering post. They also might have more detail inside of GitHub. I just say: perform upgrades. It goes and hits podcasts, it goes and hits YouTube channels to see if anything new came out, it reads every Anthropic engineering blog, it looks at the change notes for Claude Code, and then it comes back with a prioritized recommendation list of how to upgrade our PI system so that it will work better using the new features. So it's this continuous loop of getting better at accomplishing what I'm doing. I would say that's the biggest thing. And just as a little bit of partial testimony here: I do a lot of bug bounty stuff, basically finding legal programs where you can find vulnerabilities and get paid for them.
And I've got a whole bunch of friends in this space as well, and they're constantly looking for vulnerabilities. I've got this one friend, an amazing guy, he's a cardiologist, so he's over here hacking at the same time that he's actually in the clinic working with patients, and he specializes in client-side vulnerabilities. He had been using Claude Code because I got him onto Claude Code. But when he switched to PI, it basically enrolled all of his personal techniques as skills. So now, when his PI loads up, it's thoroughly trained on how he likes to find vulnerabilities, all his personal techniques. So now he can just bring in a target, it goes and gathers the stuff, and the number of bugs that he has found has gone massively up, and they're paying out more. And pretty much everyone I've talked to who's using the PI system on top of Claude Code is getting just much more value. And to be clear, this is the same direction that Claude Code is going, right? They're going to have this type of PI-like stuff before too long as well. But the short answer is: when your AI stack, your agentic stack or whatever the term is, is more tied to your actual goals and knows more about you, it is just infinitely more capable. Plus, we've got a lot of quality-of-life stuff. I do everything inside the terminal, I'm a Vim person, so tab completions. I've got a full voice system that uses ElevenLabs for customized voices. When I spin up custom agents, they all have their own voices and personalities. So it really feels more like I'm dealing with my friend Kai than like I'm talking to a coding agent that's producing code.
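
The startup behavior described above, where Claude Code running PI reads the whole context tree on launch so every session starts already knowing the user, amounts to concatenating a small set of markdown files into one session preamble. This is a minimal sketch; the file names and directory layout are assumptions for illustration, not PI's actual structure.

```python
# Sketch of a PI-style startup load: concatenate the user's context
# files (identity, telos, skills, memory) into one session preamble.
# The file names and layout here are guesses, not PI's real structure.
import tempfile
from pathlib import Path

# Identity first, then goals, then skills, then memory: order matters,
# because later sections are interpreted in light of earlier ones.
CONTEXT_FILES = ["identity.md", "telos.md", "skills.md", "memory.md"]

def build_preamble(root: Path) -> str:
    """Concatenate whichever context files exist under root, in order."""
    sections = []
    for name in CONTEXT_FILES:
        path = root / name
        if path.exists():
            sections.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(sections)

# Demo with invented contents in a throwaway directory.
root = Path(tempfile.mkdtemp())
(root / "identity.md").write_text("Assistant name: Kai. Voice: warm, direct.")
(root / "telos.md").write_text("Goal: activate human agency and creativity.")

preamble = build_preamble(root)
print(preamble)
```

Because the whole system is plain markdown on disk, as Miessler emphasizes later, a loader like this is trivially portable between harnesses: the same files could feed Claude Code, OpenCode, or anything else.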

1:19:20

Speaker B

You mentioned that Claude Code is going this direction as well. Can you give a little bit more detail on where Claude Code ends and where PI begins? One of my funny refrains is that everything is isomorphic to everything else, by which I mean you can always play hide-the-intelligence, and I find that there are a lot of different ways to structure these things. I'll maybe pitch you on a different one in a second and get your reaction to it. But there are important functions that you're talking about there, and I do want to get a little more detail on those too: context management, or having really good starting system prompts. Those are obviously key to consistently customizing the AI's behavior toward what you want, and Claude Code can do a lot of that. Where is the line? How is the line moving? What do you think are the most important things that you are bringing to Claude Code that it itself doesn't have yet?

1:25:47

Speaker C

Yeah. So what Claude Code doesn't have right now is that it doesn't start by asking: who are you, and what are you about? It doesn't encourage you to bring over your work and your personal goals and the main workflows that you perform in life and for your career. It's not onboarding you to have Claude Code be your assistant. Its primary identity is still what it started as, a coding agent, and that's still what it does best, and it's the best at it because they just have the best approach to this. But what I'm building towards is this thing called PAIM, the Personal AI Maturity Model. It goes from chatbots at three levels, to agents at three levels, and then assistants at three levels. I think right now we're at about agents level two. And when you start getting into assistants, the world is completely different. So, like, I'm sitting in front of these screens right now. What should be happening is my AI system should be able to control any of this tech. It should see all these screens, it should hear everything that's happening, and I should just be interacting with it. One thing I love to do, an idea I stole, at least partially, when I was at Apple, and they stole it from Amazon, is start in the future that you want and work backwards. It's called a PR in Amazon and Apple terminology. So what we're actually looking for is, like, Her and TARS. You start with what you actually want, which is an AI that can see and hear and interact with anything you are interacting with. When you say, play the perfect song for this moment: first of all, you shouldn't have to say that, it should just play it. But when you say that, it should be able to. I got this idea riding mountain bikes in Coyote Hills with my friend Mark: wouldn't it be cool, since we both grew up very close to those mountains, for it to play the perfect song? How is it going to know what the perfect song is?
It has to know who Mark is, his relationship to you, what was happening in the 80s when we grew up, what the perfect songs were, and how all of that associates with mountain biking in the wilderness. All of that is context. That's why the scaffolding is so important: the context engineering is what makes the AI powerful. It's not the models themselves. So PI starts with this concept of: what are you trying to do? It starts with deep personalization. Your AI has a particular voice, it interacts with you in a certain way, it knows what your capabilities are, and it has full access to all your skills. So it's more like you're interacting with a DA, a digital assistant, as opposed to interacting with an AI model that has capabilities. And that distinction seems small, but it's actually massive. It's absolutely massive.

1:26:49

Speaker B

So, if I try to echo that back to you in different terms, it's really about putting you, the person, at the center, in a persistent way, as opposed to, with Claude Code off the shelf, having a project-level focus, and then of course within the chat itself a task- or conversation-level focus. In practical terms, how big is your default prompt? How much detail is PI loading up? Or, I guess, your personal one is Kai, and PI is the empty one that you publish for other people to customize to their own individual circumstances. When you're doing your own thing, how much starting information is it getting on every session init?

1:29:57

Speaker C

I haven't counted recently. I want to say probably something like 10,000 tokens. I try to keep it fairly clean. And it's also responsive. So inside the SKILL.md file, which is the Claude Code structure, I have a whole bunch of other sections which point to specific additional context information. The SKILL.md file is the core: it explains the entire PI concept and where all the resources are, and because that loads initially (I force the load through the startup hook), it then knows how to find all that other information. So, for example, I can email people, I can text them, I can do whatever. If I say email Jason or Sasha, it knows who that actually is, so it can send to the right person at the right time. But it doesn't need to go and read all of those files all at once. This is the advantage of the Claude Code skill system: there are three levels. There's the front matter, which loads by default and acts like a routing table. There's the SKILL.md file itself. And then there are references to other parts of the system. So inside of that system I have user, system, and work. Work is customer, or not so much customer, but offerings-related stuff. User is very personal stuff. And then system is the stuff that goes into the PI project. So we're talking about probably 30 different context files, plus the main context file being the SKILL.md. So yeah, it ranges between 5,000 and probably 15,000 tokens. It's not all that much, and there's a lot more context available for it to go get if it needs it.
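
The three-level loading he describes (front matter as a routing table, then the SKILL.md body, then referenced files fetched only on demand) can be sketched as a small parser that reads the front matter without touching the references. The skill file below mimics the general shape of a skill header with YAML-style front matter, but its fields and format are simplified inventions for illustration, not Claude Code's exact schema.

```python
# Sketch of three-level lazy skill loading:
#   level 1: front matter (always loaded, used for routing)
#   level 2: the skill body (loaded when the skill is selected)
#   level 3: referenced context files (loaded only if the task needs them)
# The file format and field names are simplified illustrations.

def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (front matter dict, body)."""
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()

SKILL_FILE = """---
name: blogging
description: Drafts posts in the user's voice
references: user/voice.md, user/telos.md
---
Steps: outline, draft, apply rhetorical figures, review.
"""

meta, body = parse_skill(SKILL_FILE)

# Level 1: the routing decision uses only the front matter.
# Level 3: references are named here but not read until actually needed.
references = [r.strip() for r in meta["references"].split(",")]
print(meta["name"], "->", references)
```

This is what keeps the default prompt in the 5,000 to 15,000 token range he mentions: only the routing tables load up front, and everything else stays on disk until a task calls for it.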

1:30:47

Speaker B

Yeah. You mentioned obviously you're building this on Claude Code, but there is OpenCode out there, and, time is getting so weird, but I think it's been within the last 72 hours as of when we're talking, Anthropic has changed their policy to not allow Claude subscribers to bring their inference budget to other projects like OpenCode. So now, if you want to use Claude outside Claude Code, at least without paying the API token rate, which I understand is easily an order of magnitude more, you can't. You have Claude Code with Claude integrated, and that's going to give you a much larger inference budget for your, whatever, 100 or 200 bucks a month; whereas if you said, okay, I'll use the API key and go use OpenCode with Claude, that now doesn't look like such a great option, just because it's going to cost you a lot more, and what exactly are you gaining? But obviously with OpenCode you can use a lot of other models, and OpenAI has tried to counter by saying they're committed to continuing to support these open source frameworks. It'll be interesting to see if that continues. It's been funny how these two companies circle each other: Anthropic has followed OpenAI in so many ways, and OpenAI has followed Anthropic in so many ways. Which one is going to bend on this, so that they end up with ultimately the same policy, will be interesting to see. But the question is: as I am thinking about making a real investment in this sort of thing right now, how would I decide between Claude Code versus OpenCode? And what could you tell me to do so that I can at least minimize my lock-in? Because I do think I probably want to go Claude, because I like Claude. Certainly for all this personal stuff, it seems like it might be the way to go. But then I do worry about this sort of lock-in, and about returns to scale running away with the whole thing.
And I do want to have some sort of off ramp. So how do I decide and how do I make sure that I retain as much flexibility as I can?

1:32:42

Speaker C

Yeah, fantastic question. So agnosticism is built into PI from scratch. It's hard to be fully agnostic, because in my opinion Claude Code is just, like, generations ahead right now, which could change in a matter of days or weeks or months or whatever, but they're so far ahead. So the system is definitely built on Claude Code. However, the entire system is markdown files. Right? I'll give you an example; this is a great example of this whole thing. When OpenCode came out, I switched to it for about two weeks. I did a whole YouTube video comparing the two, and I got great results from OpenCode. This was at the moment that Boris had supposedly taken a job somewhere. And this really gets to the answer to your question: if Boris takes a job somewhere, or I hear a signal, or let's say 70% of the Claude Code team leave and they all go to the Gemini team or something, I'm going to be switching. Because it is that leadership, it is the vision, that keeps me on Claude Code. My platform, the PI platform, is markdown files. It's skills, it's MCPs, it's just context files. So it is extremely portable. I could take the PI infrastructure and put it on OpenCode and it would be awesome. It would be much better than most other things, just because of the context. The reason Claude Code is the base is because there is no other company that gets the concept of a harness as much as Anthropic. It's not even close. Google is extraordinary at backend, we've known this, but they are not good at making interfaces, they are not good at empathy, they are not good at understanding what actual human users need and what the interface needs to look like. OpenAI, in my opinion, is a little bit all over the place right now. I don't see them being as focused on this whole core mission as Anthropic is. And a thing that I kind of realized about this, which I thought was interesting: it's in the name.
Everything Anthropic is doing is literally anthropic, human-centered. Their art, their messaging, the fact that they're constantly warning, all the way up from the CEO: hey, this is coming, we're worried about you, please upskill, please get ready. This messaging of human-first has been consistent through the entire thing. And what do you know, they happen to be putting out a product that puts the human, and the human experience, first. So this is why I am, like, 4,000% in the Anthropic Claude Code ecosystem: because the leadership and the vision are there for building this system that PI essentially is. And I just don't see it from anywhere else. And the way that manifests is they're shipping every day. They had, like, a day and a half of rest over the holidays, and the whole world was like, what are you doing? When's a new release coming out? And they're like, can't I take a nap? It was insane. But they are shipping so fast. They listen to users. They're live on X responding to people; you can ping them and they'll just respond. There's just no comparison, in my opinion, in terms of having a vision and executing on it, compared to the other platforms.

1:34:45

Speaker B

If I just try to contrast it with OpenAI: it seems like they have a somewhat similar vision, in the sense that they want to be your durable personal assistant. They've invested in memory, for example, right?

1:38:22

Speaker C

Yes.

1:38:37

Speaker B

The AI is supposed to feel like it knows you; you're supposed to feel like the AI knows you from one chat to another. They also now have the Pulse product, which at least suggests a sort of more proactive future. And I do think that product is pretty good; most days when I see my Pulse notification, there's something in there that I feel compelled to click through and check out. I guess one obvious point of differentiation would be just how portable it is. If I have all my memories locked away in some OpenAI memory store, possibly as explicit text, possibly in some other form that's hard to do anything with, then I think they're trying to create lock-in, right, with that product form factor. They want you to come to ChatGPT all the time because you feel like ChatGPT knows you best and can support you best. Is there more to it than portability that you think differentiates those two approaches?

1:38:38

Speaker C

Yeah, great question. I've never thought to try to separate these two, so: I see them as extremely different, but, like you were saying before about how everything rhymes, they're all going to the same place. I wrote this really crappy book in 2016 where I was like: look, the future of this is basically that you have AI assistants that have all your context, there will be APIs for everything, and you'll just talk to your assistant and it will use all these services. And I'm really happy I actually wrote that down and forced myself to get it out there. But I feel like Sam Altman particularly really gets this; it's one of his big bets. And that's the whole Jony Ive thing. There was a leak that supposedly it's an ear device; I don't know if you saw that, but he is absolutely all in on the personal assistant, the digital assistant that knows everything about you. I think he's trying to skip the whole mobile phone thing: just, this is your platform. And if you look at that Personal AI Maturity Model thing, that's where I'm going as well with PI. In my opinion, that's where Claude Code will end up, and Google will end up there too; everyone's going to the same place. It'll be so obvious that it's boring once everyone gets there. Like, obviously everyone's going to build that. Here's the distinction, though. I think Sam is trying to build the device and the interface first, in a sort of consumer, disrupt-the-industry, leapfrog-over-mobile sort of way. I think that's the direction he's going. Claude Code and Anthropic accidentally got here on a different path. And my whole thing with PI, this human-first thing, is like a third rail. So there's the human side, there's the coding agent path that gets you there, and then there's the Sam Altman way that gets you there as well, which is the consumer-hardware, bypass-the-mobile-interface sort of way. But in my mind, in X number of years, I think honestly like three years or something.
This is what the whole space is going to look like: we are reinventing how we interact with technology. You talk to your digital assistant, your digital assistant does stuff for you, and the details are all abstracted. And that's kind of already happening with Claude Code.

1:39:38

Speaker B

So when it comes to using something like PI today and investing in this now, what is the value driver of that? For people who aren't professionally responsible for keeping up with AI, that is. I feel like I have to do it for that reason if no other, and at this point I think it might actually move the needle for me. It seems like it's maybe mostly about training yourself to think and work in this way. If you skipped it, you could in '27 or '28 probably have, it sounds like you expect, similarly capable infrastructure spun up for you very quickly by at least a couple of different companies that would be eager to be your digital assistant of choice. So what do you gain between today and when that is a really polished consumer product? Am I right to say it's maybe mostly about your own habits of mind, your own strength as a user of these systems? Or are there other things that you think will help people accrue advantage relative to those that just kick back and wait for the very polished version to become available?

1:42:00

Speaker C

Yeah, I think the very polished versions will take a lot of time, and they'll be highly vendor-locked. So for example, with an OpenAI version, I'm not sure you're able to see your files and edit them. I guess you probably could, but it's going to be a lot more opaque. An Apple version of this, which we're probably going to see this year, sounds like it'll come through Gemini, through Google. So that whole ecosystem of all your Apple data is now going to be available via. I don't know if they're going to keep the S name. I don't want to trigger my thing. But this is going to happen in their world too. But you're definitely not going to have the same access to the environment that you do in Claude Code. So here's my aggressive way of answering your question: right now is the craziest moment of punctuated equilibrium; the world is changing so rapidly. So you do not want to wait to have an AI platform that understands you and can help you. I've got this concept within PI which I'm trying to convert to being the primary center of the algorithm, the center of the platform. But it's a little bit outside of my working memory and IQ capabilities, so I'm really trying to push on this thing. It's essentially this thing I wrote about a long time ago, which is that the universal algorithm is going from current state to ideal state. That's the universal algorithm, and this is within PI. And then inside of that current-state-to-desired-state loop, you have the scientific method. So if you look at the Ralph loop. Have you seen the Ralph loop? Yeah. So I've been thinking about this forever: what is the loop that your AI platform is constantly trying to perform on your behalf? It's literally saying, Daniel is in this state, career-wise, personal-wise and everything, and we're trying to get him to this state.
And also, when he asks a random tactical question: what is the current state, what is the ideal state, and how do we rotate through this loop to get him there? That is so powerful. You want to start right now with it. You want to get into a system that can do this for you. I've been hearing really good things about OpenCode lately, that they are actually shipping features and stuff like that. If somebody wants to use OpenCode, I say go for it. I just think most of the innovation on the scaffolding is stronger on Claude Code. But I would say do not wait. Do not wait to build an AI that has your telos and knows what your ideal state is. Because think of it this way: every time you ask a rando AI a question and get back an answer, the whole purpose of getting back that answer is to do something that furthers your goals. If that is 50% better or 5% better or 2% better inside of this personalized system than it is in a disjointed system, those accrue, those add up. That means I'm going to be way further ahead. Anybody using a PI system, in my opinion, is going to be way further ahead in a week or six months or two years than somebody who's using the disjointed system. So I would say the worst possible time to wait and see is right now.
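The "universal algorithm" loop described here, repeatedly measuring the gap between current state and desired state and acting to close it, can be sketched in a few lines. This is purely an illustration of the idea; all function names, the state representation, and the "take one small step" heuristic are invented, not anything from the actual PAI codebase.

```python
# Illustrative sketch of the current-state -> desired-state loop.
# States are dicts of dimension -> level; the "experiment" step of the
# scientific method would replace the naive +1 increment below.

def gap(current: dict, desired: dict) -> list[str]:
    """Return the dimensions where current state falls short of desired."""
    return [k for k, v in desired.items() if current.get(k, 0) < v]

def step_toward(current: dict, desired: dict) -> dict:
    """Take one small action: improve the single most-lagging dimension."""
    lagging = gap(current, desired)
    if not lagging:
        return current
    worst = max(lagging, key=lambda k: desired[k] - current.get(k, 0))
    updated = dict(current)
    updated[worst] = updated.get(worst, 0) + 1
    return updated

def run_loop(current: dict, desired: dict, max_iters: int = 100) -> dict:
    """Iterate toward the desired state, stopping when the gap closes."""
    for _ in range(max_iters):
        if not gap(current, desired):
            break
        current = step_toward(current, desired)
    return current
```

The point of the sketch is only the shape of the loop: measure, act, re-measure, which is the same whether the "state" is a career goal or a single tactical question.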

1:43:12

Speaker B

One other way I can imagine trying to construct something like this, and I can definitely see advantages and disadvantages, but I kind of want to get your thoughts on them. So again, I'm using all the frontier companies' mainline products: ChatGPT, Claude, Gemini. I also have been a big fan of Tasklet recently, which has been a sponsor of the podcast, but I genuinely really like it. It allows you to create these sort of long-running agents that basically have a job for you. And I shared the outline of questions that Tasklet created for this conversation, and I give it access to Drive and.

1:46:55

Speaker C

It was really good, by the way. Really good, really strong.

1:47:31

Speaker B

So I've been pretty impressed with that. But it doesn't quite have this thing that you're talking about, with the person at the very center of it.

1:47:34

Speaker C

That's right.

1:47:41

Speaker B

It's a little bit less ambitious in scope, where it wants to have one job and then try to do that job as well as possible. And it can take advantage of a lot of context. One of the reasons it did so well on this question-outline-writing process is I gave it access to a bunch of previous outlines of questions that I had done, so it knew what I was looking for, the kinds of questions I would generally want to ask. But yeah, it's not like me as a sort of sovereign individual is right at the center of that. But I wonder if there's a way to think about, well, I guess just one more bit on the tee-up of this: as I've been getting into this a little bit just in recent days, I do find that, oh God, there's a lot of initial friction. Right? So just for example, Claude Code. Okay, within Claude Code, how can you tie into my Gmail, my Calendar, my Google Docs?

1:47:42

Speaker C

Yeah.

1:48:35

Speaker B

That is not nearly as easy as one might think it would be. Or the choice is not nearly as obvious. There's this MCP by this guy, and there's a few command line tools over here, but Gmail doesn't really have a command line tool. And if you want to go that route, you've got to go set up a Google Cloud account, have a developer relationship with Google, and set that up, and then you can OAuth in, and it's like, okay, what is this? Is this what everybody's doing? In contrast, the same company that makes Tasklet also makes an email client called Shortwave. And these things are, Shortwave in particular is, highly specialized, and they put a lot of effort into making it a very good way to access everything that I have in Gmail. So if I'm sitting here trying to create my personal AI infrastructure, how much time do I want to be spending on tools and MCPs and skills, developing all of those and figuring out whether yours is the best? My buddy Chris does a ton; he's a madman with this kind of stuff too, and I plan to do a full episode with him. He's got his version of this, and interestingly, I think you guys have quite different intuitions, in that he's very much an OpenCode guy. Both are doing amazing things, but which one is right for me? And then I think maybe what I could or should do is go use a product like Shortwave, where they've done the hardcore engineering: they even take all your emails and put them in their own vector database so they can do their own kind of search against your Gmail, over and above what Gmail itself allows with API searches and whatnot. And then maybe the model would be that thing could call into the sort of Nathan bot, or the Nathan Telos Oracle, that could say.
So when Tasklet's trying to write an outline of questions, or when Shortwave is trying to write a draft response to an email, maybe those systems are better at specializing in all of the nitty-gritty of the tools and the implementation, and maybe they call into me and say: hey, here's the context. How do you think Nathan would want to respond to this? Or have there been any goal changes that would change how we would go about writing this outline of questions? Are there any new themes that are top of mind that we might want to bring in?

1:48:35

Speaker C

Yeah.

1:50:49

Speaker B

So this is kind of why I said that everything's isomorphic to everything else. You can see either way working. But what do you think? Obviously you're betting on this PI framework, as opposed to these sorts of products that exist in a constellation out there, orbiting around this center thing.

1:50:50

Speaker C

I don't know.

1:51:09

Speaker B

What do you make of all that?

1:51:09

Speaker C

Yeah, not quite. You mentioned with Tasklet, why not other models? So this is a thing perhaps I missed in the explanation. Here's mine: I have a research skill, and I have three levels of the research skill. So if I say do deep research or heavy research or whatever, it spawns my research agents, eight of them, and all of them have separate subtasks, so they all go off and do their work. But guess what? It's not a bunch of Anthropic agents. That's Gemini doing that; that's Codex doing deep research. Those are command line tools. All of my tooling that I actually use, Kai has access to, if it has an API or an easy way for me to interact with it. So my personal productivity software that I use to run my team, Kai speaks that language. Kai went and reverse engineered all the MCPs and turned them into TypeScript, so I don't actually have to load up any MCPs, which take up a lot of context. But Kai now speaks this productivity software. Kai speaks Salesforce, Kai speaks email. I get to bring the best of the best tools to Kai and say: this is what we use for this. And what's cool about this is that it's exactly what you said: it's best in breed. You don't have to reinvent things. I'm not trying to rewrite SMTP; I'm using existing ways to send email. Productivity software: I'm not going to make a new piece of productivity software, but if I want to replace a piece of software, I could say, hey, I don't like paying for this subscription anymore, go make a piece of software. And it will use all my context, all my tech stack, all my design preferences, all my UI preferences and art preferences and everything, and it will build that software. So it's a mixing. And we're also going to be adding Ollama, so you could use local models in addition. Right.
So when you're using PI, fundamentally it's Anthropic, but I've got probably six different model providers that Kai is using, because they're better at different things. For example, Google is the best at extremely large contexts and needle-in-a-haystack performance.
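The dispatch pattern described here, routing research subtasks to whichever provider's command line tool is best at that kind of work, could be sketched as a small routing table. The command names, flags, and task categories below are hypothetical stand-ins for illustration, not the actual Kai configuration.

```python
# Illustrative routing of research subtasks to provider CLIs by strength.
# A real system would hand the resulting argv lists to subprocess.run()
# or an async task pool so the agents run in parallel.

PROVIDER_STRENGTHS = {
    "huge_context": ["gemini", "--context-window", "max"],   # hypothetical flags
    "deep_research": ["codex", "research"],                  # hypothetical subcommand
    "general": ["claude", "-p"],
}

def plan_research(subtasks: list[tuple[str, str]]) -> list[list[str]]:
    """Map (kind, query) subtasks to the CLI invocation suited to each kind.

    Unknown kinds fall back to the general-purpose tool.
    """
    commands = []
    for kind, query in subtasks:
        base = PROVIDER_STRENGTHS.get(kind, PROVIDER_STRENGTHS["general"])
        commands.append(base + [query])
    return commands
```

The design choice worth noting is that the routing table lives in plain data, so swapping a provider in or out is a one-line edit rather than a code change.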

1:51:11

Speaker B

But what about this in practice: is your number of third-party SaaS products used trending up or down? Because I feel like mine is still trending up, and it sounds like yours is trending down.

1:53:34

Speaker C

That's an interesting question. I would say maybe down, but I'm definitely experimenting with new things all the time. Oh, and the other thing is, in my workflow, if I triple-tap the back of my phone, it opens ChatGPT, because it's the best of breed. Inside my car I can talk to Grok, and Grok is getting extraordinarily good at the conversational flow, the voice; it's just amazing. I could use OpenAI inside the car, but I prefer Grok there because of the user interface. So I am also sampling all these different tools. I don't see PI, as a concept, as a competitor at all with any of these, because Kai, and the PI project, is just unification around self. And one other thing I would say: it's not so much that PI is putting you at the center, it's more that it's putting your goals at the center. Right? It understands what you're trying to accomplish, and it keeps that locked on for its ability to help you do things. But no, other than agentic platforms, which I'm not really messing with right now because I'm on Claude Code, for model capabilities and specific niche products, I will either use them natively or I will have Kai learn how to use them, and then that'll just be part of the ecosystem.

1:53:51

Speaker B

Do you have any thoughts for how folks who are making these products, like Tasklet and Shortwave, and obviously tons of others, should think about the world that you're envisioning? I'm still wondering what the right way is for me: even as I set all this stuff up and get my goals instantiated and build up all the context, should I be going to that terminal and saying, go triage my inbox and tell me what I need to respond to, and have the responses drafted that way? Or should I do it in a product that was really built for email, and have that product call into the Nathan Oracle for whatever context or judgment assistance it needs at any given moment in time? Because I do feel like, for a lot of people. You're a seasoned vet, you know, a Vim guy, as you said, right? That's obviously a very minority profile. I'm comfortable enough to go do command line stuff, but I would probably side more with the typical user who wants a graphical interface, or at least is more comfortable with it most of the time.

1:55:17

Speaker C

Yeah, that's why I have this maturity model thing: to keep reminding myself what the actual goal is and to work backwards. I should not be on the terminal at all in my PI system, and I should not be in some email client. I use Superhuman, by the way; that's my email client. But I shouldn't be over there. What should happen is I say: what should I be looking at? Who should I respond to? Is there anything important? I just speak those words and the things happen. Whether, in the short term, it pops up that client and I have to go interact with it there, or Kai is able to do it himself because he can control the clients. I think Gemini is definitely getting there very fast, turning on a bunch of Gemini features in Gmail. But to me, no, we should not be dealing with any of this kludge. Even an email client is kludge, if you think about it, compared to Minority Report or the movie Her. Right? You remember when you onboard in that operating system, you just say, hey, what's going on? Anything I should know about? And she's like, I just read your 940,000 emails. You got a new one from Sarah this morning. That's the interface everyone is ultimately building towards, I think. So I try to keep that in mind. I will say one other thing, because you're asking about product advice. The ultimate product advice that I'm seeing, and I help companies with this all the time, especially in cybersecurity, is this: suppose you are doing a cool product feature in a space like vulnerability management or threat intel or whatever, and it's pretty good, and you are competing against someone who is also pretty good, but they understand the customer and you don't. Vulnerability management is a great example of this. Do you know all the engineering teams? Do you know how they push code? Do you know what their repositories are? Do you know how they're measured? Do you know all of those things?
And their ticketing system and their CI/CD pipelines? If you don't know all that, then even if your vuln management solution is a little bit better than someone else's who does have that context, you are going to lose. You're going to lose to the company who has more context. So my expectation is that even somebody who seems like, oh, we just make a Tasklet and it just pulls in this little piece of context, their entire Drive, they will either not survive or they will move towards the model of: you know what, it turns out we actually have to learn a lot about this person; we should have a PI for them. And everyone's going to build this deep knowledge of the customer or the user, and that is going to be what powers how good the outputs they can produce are, regardless of the product.

1:56:31

Speaker B

Yeah.

1:59:31

Speaker A

Okay.

1:59:31

Speaker C

It's another example of what you were saying, where everyone's going the same place.

1:59:32

Speaker B

Yeah. What's working in memory? I've been fascinated with memory systems for LLMs, AI agents, whatever you want to call them, for a while. And this is another area where I feel like everybody recognizes that there's something missing, or that it could be better, but instincts are very different in terms of how to deal with that. So what have you tried? What is working? Are you using any dedicated memory infrastructure companies to support your memory features? What do we need to know about memory?

1:59:36

Speaker C

Yeah, I'm very much team file system. When the first version of PI came out, sometime in the middle of last year, I came down firmly on the side of the file system. The file system is my memory, it is my storage, it is my context management system. I do have an archive of all my writing back to 1999, over 10,000 posts; that one is a RAG. So occasionally I have a RAG, but I really dislike RAG because I feel like it's just lossy and messed up. I prefer the file system; I think it's the absolute best. So underneath the Claude directory, in all caps, is MEMORY. And under memory I have learning, I have signals, I have all these different things that are pulling from the projects directory, the events JSONL file, which is every single transcript that's happening inside of the Claude Code system. But on top of that, what I have built is this thing that relates to the algorithm I was talking about. It is constantly, through the hook system, determining how happy I am with responses. And the post hook is looking at what the current sentiment level is. I have histograms of how happy I have been with the results coming from the PI system. And so what that means is the system is designed to look at those signals, look at what I asked for and what it produced, and then the sentiment, and say: oh, he obviously wants to go more in this direction or that direction; I should do more of this and less of this. And this is all in service of ratcheting up the improvement of this overall algorithm: the overall ability of an agentic system, for any particular task or for a long-term goal, to move from current state to desired state. So I'm using the memory system to gather extremely granular stuff and all signals, but the entire purpose is self-improvement, recursive self-improvement.
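The mechanics described here, appending every interaction to a JSONL events log and keeping a running histogram of sentiment so the system can later ask "how happy has he been?", can be sketched as follows. The file layout, field names, and the toy word-counting scorer are all invented for illustration; the real hook would call a small model for sentiment rather than matching words.

```python
# Sketch of file-system-as-memory: append events to a JSONL log and
# maintain a sentiment histogram. All names here are hypothetical.
import json
from collections import Counter
from pathlib import Path

POSITIVE = {"great", "perfect", "thanks", "love"}
NEGATIVE = {"wrong", "no", "bad", "retry"}

def score_sentiment(text: str) -> int:
    """Toy scorer: +1 per positive word, -1 per negative word.
    A real hook would call a cheap model (e.g. Haiku) instead."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def log_event(log_path: Path, prompt: str, response: str, histogram: Counter) -> None:
    """Append one interaction to the JSONL log and update the histogram."""
    sentiment = score_sentiment(prompt)
    histogram[sentiment] += 1
    with log_path.open("a") as f:
        f.write(json.dumps({"prompt": prompt, "response": response,
                            "sentiment": sentiment}) + "\n")
```

Because everything lands in append-only JSONL, later summarization passes can re-read the raw log without any database, which is the core of the file-system-over-RAG argument.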

2:00:13

Speaker B

And does that practically operate on just a runtime agentic search basis, where Claude just decides what it wants to look into and pulls stuff into context on its own? Or are you doing some sort of background batch processing? Because I've also been quite interested at times in. One episode I did was on a system called HippoRAG, which was taking inspiration from the hippocampus: a multi-step process where, whatever your corpus was, first you would go through and do entity recognition and deduplication, and then create a graph structure that would have the entities and the documents in which they appeared. And that way you could RAG into it anywhere in natural language, but then see, oh, that connects to these concepts, which connect to these other documents, and expand out in a sort of network-based way through the corpus, as opposed to a purely hierarchical approach to retrieving information. That gets pretty complicated, obviously, pretty quickly, but it does feel like something like that might be needed. Maybe this is also just my lack of confidence in my own ability to organize myself and my thoughts well enough; I certainly do recognize people who are quite different in this regard. But I feel like I need a sort of cross-boundary layer, which would probably have to be batch processed in the background, to make these connections between all these various disparate things, as opposed to being able to put each one in its proper place such that Claude intuitively and correctly decides where to go just based on structure.

2:02:33

Speaker C

Yeah, this to me is the whole advantage of the scaffolding, and being able to infinitely tweak the scaffolding according to first principles. Because I have the core skill, which is the bootstrap for the entire PI system, because I have that laid out and it gets loaded and it has all the context of what we're trying to do and everything, it also gets the architecture of the system, including the memory system. Now, all this stuff that you're talking about doing with scripts and such is the Claude Code hook system. The Claude Code hook system is extraordinary. I have, I think, 12 hooks active right now, and a whole bunch for user prompt submit. So there are security checks in there, there are sentiment analysis checks in there. It's actually routing throughout the PI system according to what I'm trying to do, based on the sentiment analysis, which uses Haiku. So I have a custom inference tool with three levels of inference: fast, standard, and smart, which are Haiku, Sonnet, and Opus. And the entire system is using this to self-route. Now, the memory system and all those sentiment analyses and all the artifacts. Keep in mind, Claude Code does this naturally: it's fully archiving every prompt I send, every tool use that it runs, every output of the tool use. This is all recorded, it's all there raw for us to analyze. So I am taking that and putting it inside of this memory structure, and I'm overlaying sentiment analysis on top of it. This is all being done dynamically; I'm not doing anything, it's all just handled automatically via hooks. So hooks are constantly adding the sentiment layer of how well the algorithm is doing, how well the PI system is doing overall. So at any point in time I could say: what upgrades have we made to the system? How have they gone? How has our performance been going in the last month?
And PI will come back, Kai in my case will come back, and say: yeah, it seems we tried this, that didn't work, we uninstalled that, we went back, we went in another direction, and currently we're doing this, and you seem much happier with this. So this seems like a direction to go. Do you want to do any more work on that?
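The three-tier routing mentioned above, fast/standard/smart mapped to Haiku/Sonnet/Opus, could be sketched like this. The keyword heuristics are invented purely for illustration; the actual system apparently classifies with a model call (Haiku) rather than string matching.

```python
# Illustrative three-tier inference routing. The tier names and the
# Haiku/Sonnet/Opus mapping come from the conversation; the routing
# heuristic below is a made-up placeholder for a real classifier.

MODEL_TIERS = {"fast": "haiku", "standard": "sonnet", "smart": "opus"}

def pick_tier(task: str) -> str:
    """Crude heuristic: classification-style checks go fast, open-ended
    reasoning goes smart, everything else standard."""
    t = task.lower()
    if any(k in t for k in ("sentiment", "classify", "lookup")):
        return "fast"
    if any(k in t for k in ("design", "architect", "plan", "research")):
        return "smart"
    return "standard"

def route(task: str) -> str:
    """Return the model name to use for a task."""
    return MODEL_TIERS[pick_tier(task)]
```

The interesting property is cost shaping: cheap, frequent hook work (sentiment on every prompt) never touches the expensive model.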

2:04:13

Speaker B

So that's all just operating on raw logs? There's not a summarization level or some sort of index? Because that sounds like a ton of content for it to wade through.

2:06:44

Speaker C

Oh, there's tons of summarization happening. Yeah, that's what the inference piece is. So the memory system is dropping its own artifacts, which are summarized versions. And they are also creating indexes in JSONL, which can be read instantly. No, you couldn't go and parse the entire thing all the time; that would be too intensive. This is stealing from a Stanford idea called reflections, where you get a whole bunch of context and you summarize it, maybe in one line or one paragraph.

2:06:57

Speaker B

This is from like the AI village.

2:07:29

Speaker A

That's right.

2:07:31

Speaker C

Yeah, that's right.

2:07:32

Speaker B

Yeah. I think about that a lot as well.

2:07:33

Speaker C

I got a lot of inspiration from that, yeah. Summarizations into indexes which can be parsed. And of course they could always go look at the raw log if they want to, but they should be able to go off of the index. And yeah, that's all happening just with hooks, and hooks run anytime the system runs.
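The reflections pattern being described, compressing raw transcript chunks into one-line summaries kept in a fast-to-scan JSONL index with pointers back to the raw log, might look roughly like this. The truncating summarizer is a deliberate placeholder for a real model call, and the field names are invented.

```python
# Sketch of a reflections-style index: one summary line per raw chunk,
# stored as JSONL with an offset pointing back to the raw log.
import json

def summarize(chunk: str, max_words: int = 12) -> str:
    """Placeholder summarizer that keeps the first few words. A real
    system would ask a small model for a one-line reflection instead."""
    words = chunk.split()
    return " ".join(words[:max_words]) + ("..." if len(words) > max_words else "")

def build_index(raw_chunks: list[str]) -> list[str]:
    """Return JSONL lines: one summary per raw chunk, with its position."""
    lines = []
    for i, chunk in enumerate(raw_chunks):
        lines.append(json.dumps({"chunk": i, "summary": summarize(chunk)}))
    return lines
```

An agent scans the compact index first and only opens the raw chunk a summary points at when it needs the detail, which is what keeps the repeated parsing cheap.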

2:07:35

Speaker B

In practice, when you see people take your system and modify it, how much are they modifying it? Are people following in your footsteps relatively closely, or are they veering off in all sorts of different directions?

2:07:55

Speaker C

Yeah, I've not seen many modifications. It's more so population of the system. Oh, someone just posted one yesterday to the discussions on GitHub. Holy crap. I was scrolling; it was like 20 pages. It was the most insane thing I've seen. I think the guy's name is Jim, and maybe the agent's name is James; I can't remember, something like that. But anyway, he brought over so much context and so many things, and it was just massively impressive. And it's a matter of: he knew exactly what he wanted. This is what activates PI. He knew exactly what he wanted. He's been struggling with all these same problems of PI not existing, Claude Code not existing in the past. He's been sitting on all these things, like I have, for decades. He knew what he wanted. He knew what he wished he could do. He saw PI, brought all this stuff over, and now he's producing content, way more content. He can make products. So it's more so activation of what was already there but dormant. Though I have seen some expansions of the system. There's lots of feedback, pull requests and such, where they're like: hey, could you add this? Could you tweak this? Or whatever. And we're obviously trying to listen to those.

2:08:13

Speaker B

How does it feel to you? This is a bit of a weird question. We have obviously highly plastic brains that can really surprise people in terms of just how adaptable they can be. And here I'm thinking of blind people seeing through a prosthetic that zaps their tongue, and they learn to interpret that as a visual signal. And, I'm not sure if I can say his name quite correctly, but Jaron Lanier, hopefully I'm saying that right, has done fascinating experiments with virtual appendages in VR, getting your brain to learn to control some prehensile tail or something like that, and you can actually learn to do it. I'm wondering. And then, of course, I'm also thinking of Neuralink, which is about to start scaling up its customer base, and obviously their ambitions go way beyond treating paralyzed people, and who knows what that's going to look like in the future. Is there a feeling that you have of this thing being a sort of literal extension of you, where if it's turned off or you don't have access to it for a time, you begin to feel like something is missing? Another version of this, a real simple one, but digital, is the feeling of something being on your clipboard. I recently looked this up; it's a fairly known phenomenon. For 20 years now, I've felt like I know when something is on my clipboard. I sometimes don't know what it was anymore, and I have to paste it to see. But I know that there's something there. Some part of my brain has developed or changed in some way, shape, or form to be tracking that very closely, and it is a felt sense that there's something on the clipboard. So I wonder how this feels to you, and if you can describe that. This is a way to try to get at what the end state would look like. If I'm using this kind of thing, how should it feel to me?
How will I know that I'm hitting pay dirt, based on how it feels to you right now?

2:09:36

Speaker C

Yeah, yeah, totally. I love that you brought this up. Way back in the Army in the 90s, I came across this book called Getting Things Done by David Allen. And ever since then, let me reach into the pocket here, I have index cards, and index cards are my way of capture. The prime directive for David Allen is never let anything sit in your brain, because it will hassle you and trouble you and cause executive function problems; your brain will be like: hey, what about, hey, what about, hey, did you remember that thing? So I'm a massive clipboard person. Not technically clipboard, but in the way that you said. So in front of me, I've got different colored sticky notes. I have this system. I have my Space Pen, which is my favorite gift to friends. And this is just what I travel with, to make sure. And now I have this Limitless pendant, which just got bought by Meta, by the way, so I think I might switch off of that. But capturing what I'm thinking at the moment has been critically important to me for over 20 years. It just feels massively important. I just recently created a reminder file inside of PI, so I can just say: hey, remind me to do this, remind me to do that. But honestly, the vast majority of that is in my 2,900 Apple Notes. Apple Notes has been my main capture for a long time, unless I'm doodling or capturing ideas visually, which goes on the cards. Now, again, going forward, I should not have to be doing any of this. I'm going to keep my cards just for history reasons. But what should be happening is more like Her, with Joaquin Phoenix. It's like: hey, make sure I don't forget this. Hey, make sure I don't forget this. And agentic systems should be switching away from call-and-response to: your reminder list is always there, always ready for your DA to shoot you a prompt. Hey, it's time; this would be a good time to do that. Hey, do you want to revisit some of your to-dos?
I saw a really cool thing on X yesterday: a little clock on someone's desk showing their daily agenda in analog form, on this digital clock or whatever. But it was Claude Code generated, right? So whatever they're doing, they must have their own PI system, and it's right there in physical form. So it's like crossing these two worlds, which I really like.

2:11:35

Speaker B

How do you think about the triggers for the system? Obviously you can ping it, and then presumably it can be pinged by, or you can allow it to be pinged by, any number of external events in the world. And then there's the kind of background processing, if you want it to be proactive for you. Is that a daily job or an hourly job? What do you think is the right balance between: it runs on a schedule, something triggers it from the rest of the world, or maybe some mysterious fourth thing? What's the right way to think about that balance?

2:14:15

Speaker C

Yeah, that's a wonderful question. They now have the ability to launch remote agents, so you can actually send a task and it will run off in GitHub infrastructure, in their environment, and then return results to you. The other thing I have: I'm a big Cloudflare person, and Cloudflare has the ability to create Workers that can run different things on different scheduled time frames. Most of my infrastructure is Cloudflare, and the pieces can talk to each other via authentication and access each other. I even have an infrastructure for running Claude Code inside of a Docker container, which agents can also talk to and schedule. So all of this is in service of, again, going back to what I was talking about before: I should not have to think about any of this. I do right now, because the tech's not quite there. But when I want to make something like you're talking about, I literally say to Kai: hey, look, I need you to not forget these things; I need you to remind me of these things on a regular basis or whatever. What are our possibilities? And Kai will be like: yeah, so listen, right now the whole trigger thing is not super far along. I'll tell you what I could do. I could spin up a Worker. I could check every five minutes or every one minute against this set of goals, and I could ping you. How would you like me to ping you? We could do the Discord thing, I could text you, I could send you an email. So we're starting to creep towards this in a kludgy type of way. But it's another example of everyone going the same place, right? Because everyone's talking about background agents right now, remote agents versus local ones. Part of the PAI maturity model is, and some of my friends are ahead of me on this, they're already calling in and accessing their terminal remotely. Me being a security person, I'm scared shitless about this, so I haven't done it yet, because I haven't found a perfectly secure way to do it.
But it is a huge problem that my system is a terminal inside a computer, right? If you want to get to the future of Her, that's gotta be with you all the time, right? So that's all stuff I'm thinking about. And scheduled tasks, like you said, or logical triggers, which is even better. It's better than scheduled tasks because one of the first things I talked about in that book in 2016 is being proactive. That's a huge difference. Call and response is one thing, and it's really cool, but it's still too close to a chatbot in my mind, right? You ask a question, get an answer, cool, now you have to do something with it. What should be happening is it understands your environment and the timing. Like right now, Kai should not be interrupting me with, hey, did you see this cool news story? Because it knows I'm in the middle of a conversation. So small little movements, all in these directions, from multiple angles, I would say.
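The scheduled-worker pattern Miessler describes — check a set of goals on an interval and ping over Discord, text, or email — can be sketched roughly as follows. This is a minimal sketch in Python; the real version would live in a Cloudflare Worker on a cron trigger, and the reminder tuple shape and the `notify` hook are hypothetical, not part of his actual system:

```python
import datetime

# Hypothetical reminder shape: (goal, interval_minutes, last_pinged)
def due_reminders(reminders, now):
    """Return goals whose ping interval has elapsed since the last ping."""
    return [
        goal
        for goal, interval_min, last_pinged in reminders
        if now - last_pinged >= datetime.timedelta(minutes=interval_min)
    ]

def notify(goal, channel="discord"):
    """Placeholder for the actual ping (Discord webhook, SMS, email)."""
    return f"[{channel}] reminder: {goal}"
```

A worker on a one- or five-minute cron schedule would call `due_reminders` against the stored goal list and route each hit through whichever channel the user picked.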

2:14:58

Speaker B

Earlier you mentioned that your friend is earning more bug bounties by doing something like this. Do you measure your own productivity in any similar way, and how much of a boost do you think you've got? And then, as this presumably continues to create more and more leverage, that seems to imply that you'll have to have yourself in the loop with lower and lower frequency, right? In the limit of this sort of thing, you're only able to review so many things and make so many decisions. This is the gradual disempowerment people would be talking about: hey, you're on top of it right now, but if it's performing well enough, you'll be reviewing the things that matter and you won't be reviewing the things that don't. So where are we? What can you measure about your own output today, and where are you in terms of how much scope of action you give the system? Does it ever send a response to an email? Does it ever send an email as you that you didn't review? Or do you allow it to respond as itself, without signing it as you, but still try to move things forward without you actually being in that loop? Would you allow it to spend money on your behalf without you signing off — yes, you want to execute that transaction? Are there other frontiers of action where you're watching the line move on what you do and don't need to be looped in on?

2:18:11

Speaker C

Yeah. Great question. I would say that, being naturally a little bit cautious, the scaffolding is not there yet for a whole lot of trust in this regard. When I'm sitting here watching it, a big part of my hook system is actually a whole bunch of defenses: watching what the agents are doing, making sure they're not accessing certain files and directories. And that uses Claude Code's underlying permission system, which has a whole bunch of cool permissions. I don't run with dangerously-skip-permissions anymore; I used to, but I turned that off. So I've got a whole security scaffold there for file system access and stuff like that. Then I have a whole bunch of prompt injection defenses, because those are massively dangerous as well, and I keep those layered. I just don't feel like the scaffolding is there yet to be like, hey, whatever, here's my bank accounts, just run with it. I would say I'm okay with experiments: okay, here's a separate bank account, it's only got a thousand dollars in it, go crazy. You've probably seen the vending machine benchmark. Yeah, like, cool — if there's bounds, if there's blast radius control, sure. But when it comes to being able to send out emails, and maybe my diary is sitting there. I don't actually have my full diary or journal in the system yet, because this is one of the things I'm a little sensitive about. But say somebody sends me a link: hey, Kai should go read this. I send Kai to go read it, it's a prompt injection, and pretty soon I've just published my diary on LinkedIn. Right? That's possible. Much harder to do against me, but prompt injection is not a super solvable thing.
So on level of trust, there's no way to put a number on this, but I'm going to say like 60%, and I think over the next couple of years I'll probably get to 80, 90%. But being ex-military and, you know, cybersecurity, I still think in terms of threat models: here are all the things that would super suck if they happened. Just assume they happened; what could have stopped them? And a lot of that comes down to impact reduction in addition to probability reduction.
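The file-system guard described above — hooks that watch agent tool calls and block access to sensitive paths — can be sketched as a small pre-tool-use filter. This is a hedged sketch, not Miessler's actual hook code: the denylisted paths and the exact JSON field names are assumptions, though Claude Code's hooks do pass the proposed tool call to such scripts as JSON on stdin and use a nonzero exit code to reject it:

```python
import json
import pathlib
import sys

# Directories the agent must never read or write (hypothetical list).
DENYLIST = [pathlib.Path.home() / ".ssh", pathlib.Path.home() / "diary"]

def is_blocked(file_path: str) -> bool:
    """True if the path falls inside any denylisted directory."""
    p = pathlib.Path(file_path).expanduser().resolve()
    return any(p == d or d in p.parents for d in DENYLIST)

def run_hook(stdin=sys.stdin) -> int:
    """Read one proposed tool call as JSON and decide whether to block it."""
    call = json.load(stdin)  # e.g. {"tool_name": "Read", "tool_input": {"file_path": ...}}
    path = call.get("tool_input", {}).get("file_path", "")
    if path and is_blocked(path):
        print(f"blocked access to {path}", file=sys.stderr)
        return 2  # nonzero exit tells the harness to reject the call
    return 0
```

The prompt-injection defenses he mentions would layer on top of a path guard like this, since a guard on file access alone does nothing about an injected instruction that stays within allowed paths.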

2:19:43

Speaker B

It's fascinating to think that you're not a total maximalist on this stuff.

2:22:11

Speaker C

I am, I'm a total maximalist on it. I do lots of crazy experiments; I just have the blast radius limited quite a bit.

2:22:17

Speaker B

Yeah, yeah. Not a total yoloist, I guess, is maybe a better way to say it. I've kept you a long time. I could go on longer, but I should probably get us wrapped up, and I gotta get deeper into this — this is obviously the next big thing for me to do. There's one other thing I wanted to touch on from your PAI principles, and then maybe just give you a chance to touch on anything we didn't cover that you think I or anybody in the audience should know. The last principle was permission to fail, and I thought that was quite interesting. It certainly brings to mind things like when Anthropic gives Claude the option to end a conversation because it thinks it shouldn't be having this kind of conversation, or to escalate something to the model welfare lead at Anthropic. Giving it that sort of escape valve brings the bad behaviors — deceptive alignment, et cetera — down a lot. So it sounds like you're doing something very similar there, where you're saying: if you can't do this, don't gaslight me; it's okay to fail, but just come back and tell me the truth. I think that's a really interesting fact that people should appreciate better about AI in general, and it's interesting that it made your list of principles. Interested to hear anything more about that that you want to share, and then anything else I didn't touch on that you think people should not miss out on.

2:22:30

Speaker C

Yeah, I'll talk about that real quick. I think that's a very tactical one that we just understand as being a weakness of LLMs, more so the further back you go. This was a huge problem in '23, where it would just make up stuff because it was trying to do the right thing. So this is a very tactical thing: basically saying it's okay if you don't have the right answer, it's okay if you can't get to the ideal state. Feel free to tap out and just tell me the truth, because I value the truth more than you trying to keep confabulating something. And it absolutely does work — it looks like from the studies it actually improves performance, especially around not hallucinating and being sycophantic and all that sort of stuff. In terms of, you know, positive or other things to mention, I would just say that I've had this idea of slack in the rope for a very long time. The idea is — we talked about humans not being unlocked — I feel like as a species we tend to assume history has gone the way it has because of our innate human limitations. It's like this because that's the only way it can be. We only have these medicines because we're right at the limits: all of science is pushing perfectly with full strength, this is the exact place, and to go 1% more would take infinite energy. I don't think that's true, and I think AI more and more is showing us that this is not true. And I am so bad at this too, because I'm also programmed this way; I'm constantly trying to break myself out of it. No — once we start asking the right questions and providing the right context, we're going to be like, are you kidding me? You are at 1.7% and it's really easy to go to 63%. And we've actually seen this with AI models for a long time. I was arguing with some of my friends at these labs back in '23, and they're like, yeah, whoever has the compute is going to win. I'm like, aren't there little tricks lying around where they're like, hey, I wonder:
what if we just reverse the numbers and add them this way instead of that way? Oh my God, 47% increase. How many more of those are lying on the ground, fruit ready to eat, where it's just a matter of trying these combinations? How much research out there is partial? The medical research one trips me out. How many studies did grad students do where they're like, oh, it turns out this molecule, if it encounters this part of a cell, will produce this antibody, and this antibody will, by the way, kill all the bad things. Hey, listen, I gotta go take this job; I'll just leave this research paper here. And it's in some file somewhere, or physically printed out somewhere, and no one's looked at it. But there are hundreds of thousands of these across decades, right? And, going back to the security problem, no one has the time or the eyes or the brains or the hands to actually go and look at this stuff. So I feel like the combination of these two concepts means we're nowhere near any limits of what we could do. There's just so much opportunity, when you start looking at things like everyone gets a tutor. Oh, here's a crazy one: what if we could not only change what we could pursue based on what we want — eliminating the obstacles in front of what we want, that's cool, that's what we've been talking about — what if we could change what we want? There's this whole concept in philosophy of there's what you want, and there's what you want to want. It's very hard to be like, yeah, I just really wish I liked celery. How are you going to do that? Now a drug comes out, a GLP-1 agonist, and it literally makes you not want food. Okay, what if I wanted to be more self-disciplined? What if there was an unlock for making me 10x smarter? I would love both of those, right? These, I feel like, we don't know.
It's an open question which ones are easily slack-in-the-rope fixable and which ones are actually physics stopping us. But I think a lot more of the problems in the world are likely to be the former.
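The "permission to fail" principle discussed earlier in this exchange is, mechanically, just a clause in the system prompt. A minimal sketch — the wording below is illustrative, not Kai's or PAI's actual prompt text:

```python
# Illustrative wording; not the actual PAI/Kai prompt text.
PERMISSION_TO_FAIL = (
    "If you cannot complete the task, or you do not know the answer, "
    "say so plainly and explain what blocked you. An honest failure "
    "report is always worth more than a fabricated answer."
)

def build_system_prompt(task_context: str) -> str:
    """Append the failure-permission clause to any task's system prompt."""
    return f"{task_context.rstrip()}\n\n{PERMISSION_TO_FAIL}"
```

Miessler's claim is that an explicit escape valve like this measurably reduces confabulated and sycophantic answers, consistent with the studies he gestures at.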

2:23:51

Speaker B

I think that's probably a great place to end it, on an aspirational note. I'm looking forward to digging in on this a lot more, and I really appreciate your walkthrough today and so many aspects of the positive vision for the future that you've shared. Daniel Miessler, thank you for being part of the Cognitive Revolution.

2:28:31

Speaker C

Thank you so much. I really appreciate it.

2:28:49

Speaker A

If you're finding value in the show, we'd appreciate it if you'd take a moment to share with friends, post online, write a review on Apple Podcasts or Spotify, or just leave us a comment on YouTube. Of course, we always welcome your feedback, guest and topic suggestions, and sponsorship inquiries, either via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. The Cognitive Revolution is part of the Turpentine Network, a network of podcasts, now part of a16z, where experts talk technology, business, economics, geopolitics, culture, and more. We're produced by AI Podcasting. If you're looking for podcast production help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at aipodcast.ing. And thank you to everyone who listens for being part of the Cognitive Revolution.

2:28:52