Every Agent Needs a Box — Aaron Levie, Box
Aaron Levie, CEO of Box, discusses how AI agents will transform enterprise workflows and why every agent needs secure data infrastructure. He explores the challenges of implementing agents in Fortune 500 companies, including access controls, identity management, and the need for enterprises to adapt their workflows to make agents effective.
- AI coding has succeeded because developers adapted their workflows to agents, and all knowledge work must undergo the same transformation
- Enterprise AI adoption faces unique challenges including fragmented data access, complex permission structures, and lack of documentation practices
- Agent identity management is critical - agents need oversight and can't have the same privacy rights as humans since creators remain liable
- Context engineering is the key bottleneck, with enterprises needing to bridge millions of documents down to limited token windows
- The number of agents will be 10-100x the number of people, creating massive infrastructure opportunities for data governance and security
"Every agent needs a box"
"We are changing our work to make the agents effective. In that model, the agent didn't really adapt to how we work. We basically adapted to how the agent works."
"All of the economy has to go through that exact same evolution right now."
"If a really, really smart human could not do that task in five or ten minutes for a search retrieval type task, your agent's not going to be able to do it any better."
"RBAC does seem to be dead"
Like, you don't write code. You talk to an agent and it goes and does it for you, and you maybe, at best, review it. That's probably largely not even what you're doing. What's happening is we are changing our work to make the agents effective. In that model, the agent didn't really adapt to how we work. We basically adapted to how the agent works. All of the economy has to go through that exact same evolution right now. It's a huge asset and an advantage for the teams that do it early and that are kind of wired into doing this, because you'll see compounding returns. But that's just going to take a while for most companies to actually go and get this deployed.
0:00
Welcome to the Latent Space pod. We're back in the Chroma studio with Chroma CEO Jeff Huber.
0:37
Welcome.
0:42
Returning guest, but now guest host.
0:43
It's a pleasure.
0:45
Wow.
0:45
How did you get upgraded to that?
0:46
Because he's like, the perfect guy to be guest host for you.
0:48
That makes sense, actually. We love context. We both really love context. We really do.
0:51
And we're here with Aaron Levie. Welcome.
0:56
Thank you.
0:58
Good to be here.
0:59
Yeah. So we've all met offline and, like, chatted a little bit, but, like, it's always nice to get these things in person, in conversation. You just started off with so much energy. You're super excited about agents.
1:01
I love agents.
1:10
Yeah. OpenClaw just got bought by OpenAI. Not bought, but you know. You know what I mean?
1:11
Some, some, you know, acqui-hiring.
1:16
Executive hire.
1:19
Executive hire.
1:20
Executive hire. That's my term.
1:20
Okay.
1:22
What are you pounding the table on on agents? You have so many insightful tweets.
1:24
Well, the thing that we get super excited about, which I think should be relatively obvious, is we've built a platform to help enterprises manage their corporate files, the permissions of who has access to those files, and the sharing and collaboration of those files. And all of those files contain really, really important information for the enterprise. It might be your contracts, your research materials, your marketing information, your memos. All that data has predominantly been used by humans, but there's been one really interesting problem, which is that humans only really work with their files during an active engagement with them, and then they kind of go away, and you don't really see them for a long time. And all of a sudden, with the power of AI and AI agents, all of that data becomes extremely relevant as this ongoing source of answers to new questions, data that will transform into something else that produces value in your organization. It contains the answer for the new employee that's onboarding and needs to ramp up on a project. It contains the answer to the right thing to sell a customer when you're having a conversation with them. It contains the roadmap information that's going to produce the next feature. So all that data that previously we've been just storing and occasionally forgetting about, because we're only working on the new active stuff, becomes valuable to the enterprise. And it's going to become extremely valuable to end users, because now they can have agents go find what they're looking for and produce new value and new data on that information. And it's going to become incredibly valuable to agents, because agents can roam around and do a bunch of work, and they're going to need access to that data as well. And sometimes that will be an agent that is working on behalf of you, effectively as you.
And it is accessing all of the same information that you have access to and operating as you in the system. And then sometimes there are going to be agents that are effectively autonomous and kind of run on their own, and you're going to collaborate and work with them like you would another person. OpenClaw is the most recent, and maybe the first real version of what that could look like, the one that updated everybody's view of this landscape: okay, I have an agent, it's on its own system, it's on its own computer, it has access to its own tools. I probably don't give it access to my entire life. I probably communicate with it like I would an assistant or a colleague. And then it has this sandbox environment. So all of that has massive implications for a platform that manages enterprise data. We think it's going to transform how we work with all of the enterprise content we work with, and we just have to make sure we're building the right platform to support that.
1:29
The shorthand I use is: as people build agents, everybody's just realizing that every agent needs a box.
4:08
Yes.
4:15
And it's nice to be called box and just give everyone a box.
4:15
You know, if we can make that go viral, like, I think that terminology,
4:19
I think that's the tag like, every agent needs a box.
4:23
Every agent needs a box. If we can make the headline of this, I'm fine with this.
4:25
That's the billboard I want to.
4:28
Yeah, exactly.
4:29
Every agent needs a box. I like it. Can we ship this? Let's do it. My work here is done, and I got the value I needed out of this podcast. But, you know, the thing that we think about is, whether you think the number is 10x or 100x or whatever, we're going to have some order of magnitude more agents than people. That's inevitable. It has to happen. So then the question is, what is the infrastructure that's needed to make all those agents effective in the enterprise? Make sure that they are well governed, make sure they're only doing safe things on your information, make sure that they're not getting exposed to data that they shouldn't have access to. There are going to be incredibly, spectacularly crazy security incidents with agents, because you'll prompt-inject an agent and find your way through the CRM system and pull out data that you shouldn't have access to. I mean, that's going to happen all over the place.
4:30
Right.
5:20
So then the thing is, how do you make sure you have the right security, the permissions, the access controls, the data governance? We actually don't yet exactly know, in many cases, how we're going to regulate some of these agents. If you think about an agent in financial services, does it have the exact same sort of financial requirements that a human did, or is the risk fully on the human that interacted with or created the agent? All open questions. But no matter what, there's going to need to be a layer that manages the data they have access to, the workflows that they're involved in, pulling up data from multiple systems. This is the new infrastructure opportunity in the era of agents.
5:20
You have a piece on agent identities, which I think came out today; breaking news that a lot of the security people are talking about. Right. Basically, I always think of this as, well, you need the human you, and then you need the agent you.
5:57
Yes.
6:10
Well, I don't know if it's that simple, but is Box going to have an opinion on that, or are you just going to be like, well, we're just the storage layer, let Okta et al. handle that?
6:10
I think we're going to have an opinion, and we will work with wherever the contours of the market end up. And the reason that we're going to have an opinion, more than on other topics, probably, is because one of the biggest use cases for why your agent might need an identity is file system access. So we have to think about this pretty deeply. And unless you're in our world, thinking about this particular problem all day long, it might be like, why is this such a big deal? And the reason it's a really big deal is because sometimes people say, well, just give the agent an account on the system and treat it like every other type of user. The problem is that I, as Aaron, don't really have any responsibility over anybody else's Box account in our organization. I can't see the Box account of any other employee that I work with. I am not liable for anything that they do. And they have strict privacy protections on everything that they work on. Agents don't have those properties. The person who creates the agent probably is going to, for the foreseeable future, take on a lot of the liability for what that agent does. That agent doesn't deserve any privacy, because it can't fully be autonomously operated and it doesn't carry any legal responsibility. So you can't just be like, oh, I'll create a bunch of accounts and then I'll work with that agent and talk to it occasionally. You need oversight of that. And so then the question is, how do you have a world where you sometimes have oversight of the agent, but what if that agent goes and works with other people, and that person over there is collaborating with the agent on something you shouldn't have access to? We have all of these new boundaries that we're going to have to figure out.
So far we've been in easy mode. We've hit the easy button with AI, which is: the agent just is you. When you're in Claude Code, and you're in Cursor, and you're in Codex, the agent is you. You're authing into your services. It can do everything you can do. That's the easy mode. The hard mode is agents kind of running on their own. People check in with them occasionally. They're doing things autonomously. How do you give them access to resources in the enterprise without dramatically increasing the security risk, and the risk that you might expose the wrong thing to somebody? These are all the new problems that we have to get solved. I like the identity layer and identity vendors as a solution to that, but we'll need some opinions as well, because so many of the use cases are these collaborative file system use cases: how do I give an agent a subset of my data, and give it its own workspace as well, because it's going to need to store its own information that would be relevant for it, and how do I have the right oversight into that?
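The asymmetry described here, an agent as its own principal but with a liable human creator, explicit resource grants, and oversight instead of privacy, can be sketched roughly as follows. This is an illustrative assumption, not Box's actual API; names like `AgentIdentity` are hypothetical.

```python
# Illustrative sketch (not Box's implementation) of the asymmetry described:
# the agent is a separate principal, but every action is attributed to a
# creator who retains oversight, so there is no expectation of privacy.

import datetime

class AgentIdentity:
    def __init__(self, name: str, creator: str, scopes: set[str]):
        self.name = name
        self.creator = creator        # human who stays liable
        self.scopes = scopes          # explicit grants, not "act as me"
        self.audit_log: list[dict] = []

    def act(self, action: str, resource: str) -> bool:
        allowed = resource in self.scopes
        # Every attempt is logged and visible to the creator; unlike a
        # human account, the agent gets oversight instead of privacy.
        self.audit_log.append({
            "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "action": action, "resource": resource, "allowed": allowed,
        })
        return allowed

agent = AgentIdentity("research-bot", creator="aaron",
                      scopes={"/projects/roadmap"})
agent.act("read", "/projects/roadmap")   # permitted, logged
agent.act("read", "/finance/payroll")    # denied, still logged for review
print(len(agent.audit_log))  # 2
```

The design choice worth noticing: the deny path is recorded too, because in the liability model described, the creator needs to review what the agent tried to do, not just what it succeeded at.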
6:21
One thing I think is kind of interesting here is how humans work, right? I may not just give you access to the whole file. I might sit next to you and scroll to the one part of the file I want you to see: partial file access. Well, I'm just saying, I think RBAC does seem to be dead, right? If you want to say something is dead, probably RBAC is dead. And the auth story, to me, seems incredibly unsolved and unaddressed by the existing state of AI vendors.
9:00
Yeah, I mean, you're obviously taking it to the upper limit of what we probably need to solve for. We built an access control system that was kind of its own little world for a long time. The idea was this: it's a many-to-many collaboration system where I can give you any part of the file system, and it's a waterfall model. So if I give you something higher up in the system, you get everything below. That created immense flexibility, because I can point you to any layer in the tree, but then you're going to get access to everything below it. And that mostly works in this world. But you do have to manage this issue, which is: how do I create an agent that has access to some of my stuff and somebody else's stuff as well? And which parts do I get to look at as the creator of the agent? These are brand new problems. When there was a human there, this was really easy to do. If the three of us were all sharing, there'd be a Venn diagram where we'd have an overlapping set of things we've shared, but then we'd have our own ways that we shared with each other. But in an agent world, somebody needs to take responsibility for what that agent has access to and what it's working on. These are probably some of the most boring problems for 98% of people on the Internet, but they will be the problems that are the difference between whether you can actually have autonomous agents in an enterprise context that are not leaking your data constantly.
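The waterfall model described above, where a grant anywhere in the folder tree implies access to everything below it, can be sketched in a few lines. This is a hedged approximation; `Grant` and `can_access` are hypothetical names, not Box's real data model.

```python
# Hypothetical sketch of the "waterfall" access model described above:
# a grant anywhere in the folder tree implies access to everything below it.

from dataclasses import dataclass

@dataclass(frozen=True)
class Grant:
    principal: str   # human user or agent identity
    folder: str      # path granted, e.g. "/deals/acme"

def can_access(grants: list[Grant], principal: str, path: str) -> bool:
    """An item is visible if any grant sits at or above it in the tree."""
    for g in grants:
        if g.principal != principal:
            continue
        if path == g.folder or path.startswith(g.folder.rstrip("/") + "/"):
            return True
    return False

grants = [
    Grant("aaron", "/"),                 # full tree
    Grant("deal-agent", "/deals/acme"),  # agent scoped to one deal room
]

print(can_access(grants, "deal-agent", "/deals/acme/contract.pdf"))  # True
print(can_access(grants, "deal-agent", "/hr/reviews.xlsx"))          # False
```

The hard part the conversation points at isn't this check; it's deciding who is allowed to mint the `deal-agent` grants in the first place when the agent spans two people's trees.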
9:26
No, like, I mean, you know, I run a very, very small company for my conference and like we already have data sensitivity issues and some of my team members cannot see.
10:47
Yes.
10:55
The others. And, like, I can't imagine what it's like to run a Fortune 500 where you have to worry about this. I'm just kind of curious, because you talk to a lot of them. Like, 70, 80% of the Fortune 500 are your customers.
10:55
Yep, 67%. We're being very precise.
11:07
Sorry. Yeah, okay.
11:10
Okay.
11:11
Sometimes I'm rounding up.
11:11
Yes. For the government.
11:12
I'm projecting to the end of the year.
11:15
You do make it sound like we've got to be on this. Like, we're taking way too long to get to 80%.
11:18
No, I mean, so like how are they approaching it? Right. Because you don't have a final answer yet.
11:23
Well, okay, so this is actually the stark reality that, unfortunately, pours a little cold water on the party.
11:28
Yes.
11:36
We in Silicon Valley have the absolute best conditions possible for AI, ever. And I think we all saw the Dwarkesh podcast with Dario and this idea of: AI coding, why has that taken off, and we're not yet fully seeing it everywhere else? Well, just enumerate the list of properties that AI coding has and then compare it to other knowledge work. Let's go through a few of them. Generally speaking, you bring on a new engineer, they have access to a large swath of the code base. A brand new engineer comes on, and they can just go and find the stuff that they need to work with. It's a fully text-in, text-out medium; it's just going to be text at the end of the day, so it's really great in terms of what the agent can work with. Obviously the models are super trained on that data set. The labs themselves have a really strong, self-reinforcing, positive flywheel for why they need to do agentic coding deeply, so you get better tooling, better services. The actual developers of the AI are daily users of the thing that they're working on; versus, there are probably only like seven Claude Cowork legal plugin users at Anthropic on any given day, but there are a couple thousand Claude Code users every single day. So just think about which one they're getting more feedback on all day long. You just go through this list: everybody who's a developer is, by definition, technical, so they can go install the latest thing. We're all generally online, or at least the weird ones are. And we're all talking to each other, sharing best practices. That's already eight differences versus the rest of the economy. Every other part of the economy has six to seven headwinds relative to that list. You go into a company, you're a banker in financial services, you have access to a tiny little subset of the total data that's going to be relevant to do your job.
And you have to start to go and talk to a bunch of people to get the right data to do your job, because Sally didn't add you to that deal room folder, or the information is actually in a completely different organization that you now have to go and dig into. You have this endless list of access controls and security, as you talked about. You have a medium which is not just text, right? You have a Zoom call where you're getting all of the requirements from the customer. You have a lot of in-person conversations, and you're doing in-person sales. How do you ever digitize all of that information? I think a lot of people got upset with this idea that the code base has all the context. I don't know if you followed some of that conversation that went viral. It's not that simple; the code base doesn't have all the knowledge. But you're a lot better off than you are in other areas of knowledge work. We have documentation practices, you write specifications. Those things don't exist for 80% of the work that happens in the enterprise. That's the divide that we have: AI coding has reached escape velocity in how powerful this stuff is, and we're going to have to find a way to bring that same energy and momentum to all these other areas of knowledge work, where the tools aren't there, the data is not set up to be there, and the access controls don't make it that easy. Context engineering is an incredibly hard problem, because again, you have access control challenges, you have different data formats, and you have end users that are going to need to be trained on this, as opposed to adopting these tools in their free time. That's where the Fortune 500 is. And so I think we have to be prepared as an industry that we are going to be on a multi-year march to bring agents to the enterprise for these workflows.
And I think the thing that we've probably learned most in coding, that the rest of the world is not yet ready for, though they'll have to be, because it's just going to inevitably happen, is this: think about the practice of coding today versus two years ago. It's probably the most changed workflow maybe ever, relative to the amount of time in which it changed. Right?
11:36
Yeah.
15:40
Has any workflow in the entire economy changed that quickly, in terms of the amount of change, at least in any knowledge-worker workflow? There has very rarely been an event where one piece of technology and work practice has so fundamentally changed what you do. You don't write code; you talk to an agent, and it goes and does it for you, and you maybe, at best, review it. And that's probably largely not even what you're doing. What's happening is we are changing our work to make the agents effective. In that model, the agent didn't really adapt to how we work. We basically adapted to how the agent works. All of the economy has to go through that exact same evolution. The rest of the economy is going to have to update its workflows to make agents effective, to give agents the context that they need, to figure out what kind of prompting works, and to figure out how to ensure that the agent has the right access to information to be able to execute on its work. This is not the panacea that people were hoping for, where the agent drops in and just automates your life. You have to basically re-engineer your workflow to get the most out of agents, and that's just going to take multiple years across the economy. Right now, it's a huge asset and an advantage for the teams that do it early and that are wired into doing this, because you'll see compounding returns. But it's going to take a while for most companies to actually get this deployed.
15:41
Let me push back. I think that is the sort of thing a lot of technology consultants love to hear: to embrace the AI and get to the promised land, you must pay me so much money to adopt the prescribed way of conforming to the agents. And I worry that you will be eclipsed by someone else who says, no, come as you are, and we'll meet you where you are.
17:06
And what was the thing that went viral a week ago? OpenAI is hiring FDEs to go into the enterprise, and Anthropic is embedded at Goldman Sachs. If the labs have decided that they need to hire forward-deployed engineers and professional services, then that's a pretty clear indication that there's no easy mode of workflow transformation. So, to your point, I think this is actually a market opportunity for new professional services and consulting firms that are agent-pilled. They go into organizations, figure out how to re-engineer your workflows to make them more agent-ready, get your data into the right format, and reconstruct your business processes, so that you're not doing most of the work; you're telling agents how to do the work and then reviewing it. But I haven't seen the thing that can just drop in and let you skip those changes.
17:30
I don't know how that kind of sales pitch goes over. Like, you're saying things like, well, in my nice, beautiful walled garden, here's this beautiful Box account that has everything.
18:22
Yes.
18:32
And I'm like, well, most real life is extremely messy, and, like, poorly named and duplicated, outdated shit.
18:33
100%. And so, this is... no, I mean, we agree that getting to the beautiful garden is going to be tough.
18:39
Yeah.
18:47
There's also the other end of the spectrum, where it's just a technical impossibility to solve: the agent truly cannot get enough context to make the right decision in incredibly messy land. There's no AGI that will solve that. So we're going to have to land somewhere in between, which is, we all collectively get better at documentation practices, at having authoritative, relatively up-to-date information, and at putting it in the right place. Agents will certainly cause us to be much better organized in how we work with our information, simply because the severity of the agent pulling the wrong data will be too high, and the productivity gain that you'd miss out on by not doing this will be too high as well; your competition will just do it, and they'll have higher velocity. We see this a lot firsthand. We built a series of agents internally that have access to your full Box account, and you give one a task and it can go find whatever information you're looking for and work with it. And thank God for the model progress, because if you gave that task to an agent nine months ago, you were just going to get lots of bogus answers. It would say, hey, here are five documents that all kind of smell like the right thing, but you're putting me on the clock, because my system prompt says be pretty smart, but also try and respond to the user. And it responds, and it got the wrong document. You do that once or twice as a knowledge worker and you just never again. Never again. You're just done with the system.
18:48
Doesn't work.
20:19
It doesn't work. And so Opus 4.6, and Gemini 3.1 Pro, and whatever the latest GPT-5.3 will be: those things are getting better and better, using better judgment. With all of these updates to the agentic tool and search systems, we're seeing very real progress, where the agent can almost smell that something's a little bit fishy. We have this process where we have it go fan out, do a bunch of searches, pull up a bunch of data, and then it has to do its own ranking of what the right documents are that it should be working with. At the intelligence level of a model six months ago, it would just be throwing a dart: I'm going to grab these seven files, and I hope that's the right answer. And something like Opus, first 4.5 and now 4.6, is like, no, that one doesn't seem right relative to this question, because I'm seeing some signal that contradicts where the document would normally be in the tree and who should have access. It's doing all of that kind of work for you. But it still doesn't work if you just have a total wasteland of data. It's just not possible, partly because a human wouldn't even be able to do it. So basically, if a really, really smart human could not do that task in five or ten minutes, for a search-retrieval type task, your agent's not going to be able to do it any better. You see this all day long.
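The fan-out-then-rank process described here can be sketched as a minimal pipeline. `search_fn`, `rewrite_fn`, and `score_fn` are hypothetical stand-ins for a search backend, a query rewriter, and the model's relevance judgment; this is a sketch of the pattern, not Box's implementation.

```python
# Minimal sketch of the fan-out / rank / pack loop. The three callables are
# hypothetical stand-ins: a search backend, a query rewriter, and the
# model's relevance judgment.

def fan_out_search(question, search_fn, rewrite_fn, score_fn,
                   token_budget=60_000):
    # 1. Fan out: run several rephrasings of the same question.
    candidates = {}
    for query in rewrite_fn(question):
        for doc in search_fn(query):
            candidates[doc["id"]] = doc           # dedupe across queries
    # 2. Rank: judge every candidate against the original question.
    ranked = sorted(candidates.values(),
                    key=lambda d: score_fn(question, d), reverse=True)
    # 3. Pack: keep only what fits in the usable context window.
    picked, used = [], 0
    for doc in ranked:
        if used + doc["tokens"] > token_budget:
            break
        picked.append(doc)
        used += doc["tokens"]
    return picked
```

The new piece relative to classic search engineering is step 2: the model's own ranking pass gets to throw out candidates that "smell fishy" before they ever consume context budget.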
20:20
This touches on a thing that Jeff's passionate about, which is context engineering. I'm just going to let you riff on context engineering. You did really good work on context rot, which has really taken over as the term that people use and reference.
21:42
All we think about is the context rot problem.
21:57
Yeah, there's certainly a lot of ranking considerations. Agentic search, I think, is incredibly promising. Yeah, I was trying to generate a question, though. I think I have a question right now.
22:00
Yeah, no, but there was this moment, I don't know, two years ago, before we knew where the gotchas were going to be in AI, when someone was like, well, infinite context windows will just solve all of these problems, because you'll just give the context window all the data. And it's like, okay, maybe in 2035 this is a viable solution. First of all, it would simply cost too much. We just can't give the model the 5,000 documents that might be relevant and have it read them all. And I've seen enough to start believing in crazy stuff, so I'm willing to say, sure, never say never: ten years from now, we'll have infinite context windows at a thousandth of the price of today. Let's believe that's possible. But we're in reality today. So today we have a context engineering problem, which is: I've got 200,000 tokens that I can work with, or, I don't even know where the latest graph puts massive degradation, so okay, I have 60,000 tokens to work with where I'm going to get accurate information. That's not a lot of tokens for a corpus of 10 million documents that a knowledge worker might have across all of the teams and all the projects and all the people they work with. I have 10 million documents, times maybe five pages per document, so I'm at 50 million pages of information, and I have 60,000 tokens. Holy shit: how do I bridge the 50 million pages of information with the couple hundred that I get to work with in that token window? This is such an interesting problem. And that's why so much of the work is actually the search systems and the databases; that layer has to get so locked in. But models are getting better, and importantly, at knowing when they've done a search and found the wrong thing. They go back, they check their work, they find a way to balance appeasing the user versus double-checking.
We have this one test case where we ask the agent to go find 10 pieces of information.
22:08
Is this a complex work eval?
24:17
This is actually not an eval; it's just one of a bunch of internal benchmark scenarios we run every time we update our agent. I ask it to find all of our office addresses, and I give it the list of 10 offices that we have. And there's not one document that has this. Maybe there should be; that would be a great example of the kind of thing where, over time, companies start to maintain the canonical key areas of knowledge they need to have. We don't have one document that says, here are all of our offices. We have a bunch of documents that have, like, here's the New York office, and whatever. So you task this agent, you say, I need the addresses for these 10 offices. And by the way, if you do this on any public chat model, the same outcome is going to happen, just for a different kind of query. You say, I need these 10 addresses. How many times should the agent go and do its search before it decides whether or not there's just no answer to this question? Often, especially with, let's say, lower-tier models, it'll come back and give you six of the 10 addresses, and it'll just say, I couldn't find the other four.
24:18
It doesn't know what it doesn't know.
25:24
It doesn't know what it doesn't know. So the model is just like, when should it stop? Should it do that task for literally an hour and just keep cranking through? Maybe I actually made up an office location, and it doesn't know that I made it up, and I didn't even know that I made it up. Should it just keep going and read every single file in your entire Box account until it exhausts every single piece of information?
25:25
Expensive.
25:49
These are the new problems that we have. So something like, let's say, a new Opus model is like: okay, I'm going to try these types of queries. I didn't get exactly what I wanted. I'm going to try again. At some point, I'm going to stop searching, because I've determined that no amount of searching is going to solve this problem; I'm just not able to do it. And that judgment is a really new thing that the model needs to have: when should it give up on a task because it just can't find the thing? That's the real world of knowledge work problems. And this is the stuff that the coding agents mostly don't have to deal with, because you're usually creating net-new information coming right out of the model. Obviously it has to know about your code base and your specs and your documentation, but when you deploy an agent on all of your data, now you have all of these new problems that you're dealing with.
25:49
Our follow-up research to Context Rot is actually on agentic search. We've stress-tested frontier models and their ability to search, and they are not actually that good at searching. So you're sort of highlighting this, like,
26:38
Explore, exploit. Everything doesn't work well.
26:52
Somebody has to be.
26:56
Can I throw out one more thing that is different between coding and the rest of knowledge work, that I failed to mention? One other key point is that at the end of the day, whether you believe we're in a slopocalypse or whatever, if you've built a working solution, that is ultimately what the customer is paying for. Whether I have a lot of slop, a little slop, or whatever. I'm sure there are lots of code bases we could go into at enterprise software companies that are just crazy slop that humans wrote over a 20-year period, but the end customer just gets this little interface they can type into, and it does its thing. Knowledge work doesn't have that property. If I have an AI model go generate a contract, and I generate that contract 20 times, and all 20 times it's 3% different, that kind of slop introduces all new kinds of risk for my organization that the code version of that slop didn't introduce. So how do you constrain these models to just the part that you want them to work on, and just do the thing that you want them to do? And you can't be disbarred as an engineer, but you can be disbarred as a lawyer, and you can do the wrong medical thing in healthcare. There's no equivalent to that in engineering.
26:57
Do you want there to be?
28:15
Because I've considered civil engineering.
28:16
There is, right?
28:19
Civil engineering, sure. Oh yeah, for sure. But in any of our companies, you'll be forgiven if you took down the site. We'll do a rollback and you'll be in a meeting, but you have not been disbarred as an engineer. We don't revoke your computer science degree in the post-mortem. Yeah, exactly, exactly. So now maybe we collectively as an industry need to figure out what you're liable for, not legally, but in a management sense, with these agents. All sorts of interesting problems that have to come out. But in knowledge work, those are the real hostile environments that we're operating in.
28:20
I do think a lot of last year's 2025 story was the rise of coding agents. And I think the 2026 story is definitely knowledge work agents.
28:54
Yes, a hundred percent.
29:01
Right.
29:02
Like that would. And I think open clinical work are just the beginnings.
29:02
Yes.
29:05
Like, the next one's just going to be absolute craziness.
29:06
It is. And it's going to be, I mean again, this wave where we try to bring over as many of the practices from coding, because that will clearly be the forefront: tell an agent to go do something, it has access to a set of resources, and you're responsible for reviewing it at the end of the process. That to me is the template that goes across knowledge work. And Claude Cowork is a great example. OpenCloud's a great example. You can sort of see what Codex could become over time. These are some really interesting platforms that are emerging.
29:09
Okay, we touched on evals a little bit. You had the report that you were going to bring up, and then I was going to go into Box's evals. Go ahead, talk about your agentic search thing.
29:44
Yeah, mostly. I think a few of the insights are: number one, frontier models are not good at search. Humans have this natural explore-exploit trade-off where we kind of understand when
29:54
to stop doing something.
30:02
Also, humans are pretty good at forgetting, actually, and pruning their own context, whereas agents are not. If an agent knew something was bad, you can see it in the trace: hey, that probably wasn't a good idea. But if it's still in the trace, still in the context, they'll still do it again. And so I think pruning is also going to be really
30:03
It's already becoming a thing.
30:21
Right. But letting models self prune their context windows.
30:22
Yeah, so don't leave the mistake in there. Cut out the mistake, but tell it that it made a mistake in the past, so it doesn't repeat it.
30:24
Yeah, but cut it out so it doesn't get distracted by it again.
30:31
Because really, it will repeat its mistake just because it's in the context.
30:33
So those act like few-shot examples.
30:38
It's like, oh, this is a great thing to go try even if it didn't work.
30:41
Yeah, exactly. So there's like a bunch more just
30:44
Groundhog's Day inside these models. I'm going to go keep doing the same wrong thing.
30:47
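The pruning idea discussed above can be sketched in a few lines. This is a minimal, hypothetical illustration, not any particular agent framework's API: a failed tool call is cut from the history and replaced with a one-line note, so the mistake can't act as a few-shot example while the lesson is still preserved.

```python
# Minimal sketch of context pruning: replace failed attempts with a short
# "do not retry" note instead of leaving them verbatim in the history.
# The message schema (role/name/args/status) is hypothetical.

def prune_failures(history):
    """Replace failed tool calls with one-line 'do not retry' notes."""
    pruned = []
    for turn in history:
        if turn.get("role") == "tool" and turn.get("status") == "error":
            note = (f"Note: a previous call to {turn['name']} with "
                    f"args {turn['args']!r} failed; do not retry it.")
            pruned.append({"role": "system", "content": note})
        else:
            pruned.append(turn)
    return pruned

history = [
    {"role": "user", "content": "Find the Q3 contract."},
    {"role": "tool", "name": "search", "args": "Q3 contract",
     "status": "error",
     "content": "0 results, plus 10k tokens of irrelevant hits"},
    {"role": "assistant", "content": "Let me try a different query."},
]

pruned = prune_failures(history)
```

The point of the design: the noisy failure payload is gone from the context window, but the compact note remains, so the model is told about the mistake without being shown it.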
That makes sense, right? Some crude analogy: you're trying to fit a manifold in latent space, which is kind of doing program synthesis, which is one way to think about what LLMs are doing, right? Certain facts might be overfitting it to certain sectors of latent space instead of the full latent space.
30:51
Our editor adds a bell every time you say that.
31:07
You have to remove those. You should have a gong sound or something.
31:10
You have to remove those links to kind of give the freedom to do what you need to do. But yeah, we'll release more soon.
31:14
That's awesome. Yeah, that'll be cool.
31:20
We're a cerebral podcast that people listen to us and sort of think really deep. So we try to keep it subtle.
31:22
Okay, fine.
31:29
You guys do have evals. You talked about your office thing, but you've also been promoting APEX, agents, and complex work. Wherever you want to take this.
31:31
APEX is obviously Mercor's kind of agent eval. We supported that by opening up some data for them around how we see these data workspaces in the regular economy: how do lawyers have a workspace, how do investment bankers have a workspace, what kind of data goes into those? So we partnered with them on their APEX eval. Our own eval is actually relatively straightforward. We have a set of documents in a range of industries that we give the agent. We previously did this as a one-shot test of purely the model, and then we realized, based on where everything's going, it's just got to be more agentic. So now it's a bit more of a test of both our harness and the model. And we have a rubric of a set of things it has to get right, and we score it. And you're just seeing these incredible jumps in almost every single model within its own family: Opus, or Sonnet 4.6 versus Sonnet 4.5.
31:38
Yeah, we have this up on screen.
32:38
Okay, cool. So you're seeing it somewhere; I forget the exact number. It was like a 15-point jump, I think, on the overall.
32:39
Yes.
32:46
And it's just like you know, these incredible leaps that, that are starting to happen.
32:46
And Anthropic doesn't know any of it. It's completely held out from Anthropic.
32:50
This is not in any public data, which has, you know, benefits. This is just a private eval that we do, and we just happen to show it to the world. So you can't train against it. And I think it's representative of, obviously, reasoning capabilities, test-time compute capabilities, thinking levels, the context rot issues. So many interesting capabilities that are now improving.
32:54
One sector that you have that's interesting people are roughly familiar with health care and legal, but you have public sector in there. What's that like? What is that?
33:22
Yeah, and we actually test against, I don't know, maybe 10 industries. We usually end up cutting it down to a few that we think have interesting gains. So public sector is one: a lot of government-type documents.
33:29
What is that? What is a government type document?
33:39
Government filings... probably not tax returns. It would be more of what the government would be using as data. So think about research, those types of data sets. And then we have financial services for things like data rooms and what would be an investment prospectus. That one you can dogfood. Yeah, exactly. So we run the models now in more of an agent mode, but still with kind of limited capacity, and just try to see on a like-for-like basis what the improvements are. And again, we just continue to be blown away by how good these models are getting.
33:41
Yeah, I mean I think every serious AI company needs something like that where like, well this is the work we do. Here's our company eval and if you don't have it, well you're not a serious AI company.
34:14
There's two dimensions, right? There's how the models are improving, so which model should you recommend a customer use, which one should you adopt? But then every single day we're making changes to our agents, and you need to know if you regress. I've been fully convinced that the whole agent observability and eval space is going to be a massive space. Super excited for what Braintrust is doing, excited for LangSmith, all the things. And this is literally every enterprise right now: the AI companies are the customers of these tools today, but every enterprise will have this. You'll just have to have an eval of all of your work. You'll have an eval of your RFP generation, an eval of your sales material creation, an eval of your invoice processing. And as you buy or use new agentic systems, you're going to need to know the quality of your pipeline. So, huge market with agent evals.
34:24
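The rubric-style eval described above can be sketched simply: each task carries a list of checks the agent's answer must satisfy, the score is the fraction that pass, and comparing against a stored baseline flags regressions when the harness or model changes. The rubric items and thresholds below are illustrative assumptions, not Box's actual eval criteria.

```python
# Hedged sketch of rubric-based eval scoring with a regression check.

def score_answer(answer, rubric):
    """Fraction of rubric checks the answer passes."""
    return sum(1 for check in rubric if check(answer)) / len(rubric)

# Illustrative rubric for a contract-extraction task.
rubric = [
    lambda a: "effective date" in a.lower(),  # cites the key field
    lambda a: "$" in a,                       # extracts the dollar amount
    lambda a: len(a) < 2000,                  # stays concise
]

answer = "The effective date is 2024-01-01 and the fee is $50,000."
score = score_answer(answer, rubric)          # 1.0 for this answer

baseline = 0.66                               # hypothetical prior run
regressed = score < baseline - 0.05           # tolerate a little noise
```

Running this against every harness or model change is what turns a one-shot benchmark into the daily regression gate described in the conversation.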
Yeah. And I'm going to shout out your team a bit. Your CTO Ben did a great talk with us last year and he's going to come back again for World's Fair. Just talk about your team, brag a little bit. I think people take these eval numbers in pretty charts for granted. But no, I mean there's lots of really smart people at work doing all this.
35:22
Biggest shout-out is we have a couple of folks, Aditya and Siddharth, that kind of run this; they're a tag-team duo on our evals. Ben, our CTO, is heavily involved; Yasha, head of AI; a bunch of folks. And evals is one part of the story; the full AI and agent team is core to this whole effort. So there's probably, I don't know, maybe a few dozen people that are the epicenter, and then you have layers and layers of concentric circles: there's a search team that supports them, and an infrastructure team that supports them, and it's starting to ripple through the entire company. But there's that core agent team that's a pretty close-knit group.
35:39
The search team is separate from the infra team.
36:23
I mean, we have every layer of the stack to deal with except pure public cloud. We store, I don't even know what our public numbers are, but you can just think about it as a lot of data stored in Box. So you have every layer of the stack: how do you manage the data, the file system, the metadata system, the search system, all of those components. And they all have to understand that you've now got this new customer, which is the agent. They've been building for two types of customers in the past: they've been building for users, and they've been building for applications. Now you've got this new agent user, and it comes in with different, subtle properties sometimes, like, hey, maybe sometimes we should do embeddings, an embedding-based kind of search, versus your typical semantic search. You have to build the capabilities to support all of this. And we're testing stuff, throwing things away; something doesn't work and isn't relevant. It's just total chaos. But all of those teams are supporting the agent team, which is coming up with its requirements of what we need.
36:25
Yeah. We just came from the fireside chat you did, and you talked about how you're doing this. It's kind of like an internal startup within the broader company. The broader company is like 3,000 people, but there's a core team, like, well, here's the innovation center, and every company is kind of run this way.
37:28
I want to be sensitive; I don't call it the innovation center, only because I think everybody has to do innovation. There's a part of the company that is sort of do-or-die for the agent wave.
37:44
Yeah.
37:55
And it only happens to be more of my focus simply because it's existential that we get it right. Yeah. All of the supporting systems are necessary; all of the surrounding adjacent capabilities are necessary. The only reason we get to be a platform where you'd run an agent is because we have a security feature or a compliance feature or a governance feature that some team is working on. But that's not going to be the make-or-break of whether we get agents right; that already exists, and we need to keep innovating there. I don't know what the exact precise number is, but it's not 1,000 people and it's not 10 people. There's a number of people that are the kind of startup within the company that are the make-or-break on everything related to AI agents leveraging our platform and letting you work with your data. And that's where I spend a lot of my time. And Ben and Yash and Diego and Thierry, these are just people across the team that are working on it.
37:56
Yeah. Amazing.
38:49
How do you think about. I mean, you talked a lot about, like, kind of read workflows over your box data.
38:50
Right.
38:54
You know, agentic search, questions, queries, etc. But what about, like, write, or authoring, workflows?
38:54
Yes. I've already probably revealed too much, actually, now that I think about it. So I've talked about whatever you can. Okay.
38:59
Yeah. It's just us.
39:05
It's just us.
39:05
Okay.
39:06
Of course, of course.
39:06
So I guess I'll make it a little bit conceptual, because again, I've already said things that are not even GA. But we've kind of danced around it publicly, so. Yeah, okay. Just, like, hopefully nobody watches this episode.
39:07
It'll be for the highly engaged to go figure out exactly what your line of thinking is. They can connect the dots.
39:20
Yeah. I would say that, as a place where you have your enterprise content, there's a use case where I want to have an agent read that data and answer questions for me. And then there's a use case where I want the agent to create something, and use the file system to create it, or store off data that it's working on, or have various files that it's writing to about the work it's doing. So we do see it as a total read-write. The harder problem has so far been the read, because again, you have that kind of 10-million-to-one ratio problem, whereas with writes, a lot of that's just going to come from the model and we'll just put it in the file system and use it. So it's a bit of a technically easier problem. The one part that's not necessarily technically hard (it's just not yet perfected in the state of the ecosystem) is that building a beautiful PowerPoint presentation is still a hard problem for these models. These formats were not built for this. They're working on it, they're working on it. Everybody's working on it.
39:27
Every launch is like, well, we do PowerPoint now.
40:30
Yeah, it's getting a lot better each time. But then you'll do this thing where you'll ask it to update one slide, and all of a sudden the fonts will be just a little bit different on two of the slides, or it moved some shape over to the left a little bit. Again, these are the kinds of things that in code you could really care about, if you really care about how beautiful the code is, but the end user doesn't notice those problems. In file creation, the end user instantly sees it. You're like: paragraph three, you literally just changed the font on me. It's a totally different font, midway through the document. Those are the kinds of things you run into a lot on the content creation side. So we are going to have native agents that do all of those things; they'll be powered by the leading models and labs. But the thing that I think is probably going to be a much bigger idea over time is any agent on any system, again, using Box as a file system for its work. And in that scenario, we don't necessarily care what it's putting in the file system. It could put its memory files, it could put its specification documents, it could put whatever its markdown files are, or it could generate PDFs. It's a workspace that is sort of sandboxed off for its work. People can collaborate in it; it can share with other people. And so we're thinking a lot about what's the right way to deliver that at scale.
40:32
I wanted to come into sort of the AI transformation, or AI operations, things. One of the tweets... you want to talk about this? This is just me going through your tweets, by the way.
41:54
Okay. One by one.
42:03
You're the easiest guest to prep for because you, you already have, like, this is, this is what I'm interested in. I'm like, okay, well, are we going
42:06
to get to like, like February, January or something? Where are we in the, in the timelines? How far back are we going?
42:11
Can you describe Box as a set of skills? Right, that's like one of the extremes: well, if you just turn everything into a markdown file, then your agent can run your company. You just have to find the right sequence of words to do it.
42:17
Oh, sorry, is that.
42:33
I think the question is: what if we documented everything the way that you exactly said? Let's get all the Fortune 500s prepared for agents, and everything's golden and nicely filed away. What's missing, what's left? Right. You've run your company for a decade.
42:34
I think the challenge is that that information changes a week later, because something happened in the market for that customer, or for us as a company, that now has to get updated. So these systems are living and breathing, and they have to experience reality and updates to reality, which right now is probably going to be humans giving them the updates. And there was this piece about context graphs that went very viral. Yeah. I thought it was super provocative. I agreed with many parts of it. I disagreed with a few parts, around it not being as easy as, if we just had the agent traces, then we can finally do that work, because there's so much more other stuff that's happening that we haven't been able to capture and digitize. And I think they actually represented that in the piece, to be clear. But there's just a lot of work that has to happen. You can't have only skills files for your company, because there's going to be a lot of other stuff that changes over time. Yeah.
42:52
Most companies are practically apprenticeships.
43:54
Like every new employee who joins the team, like you spend one to three months like ramping them up.
43:58
Yes.
44:02
All that tacit knowledge is not written down.
44:02
Yes.
44:04
But like it would have to be if you wanted to like give it to an agent.
44:05
Right.
44:07
And so like that seems to me
44:08
like to be one. I think you're going to see, again, a premium on companies that can document things; there'll be a huge premium on that. Can you shorten that three-month ramp cycle to a two-week ramp cycle? That's an instant productivity gain. Can you dramatically reduce rework in the organization because you've documented where all the stuff is and where the answers are? Can you make your average employee as good as your 90th-percentile employee because you've captured the knowledge that's in the heads of those top employees and made it available? You can see some very clear productivity benefits if you had a company culture of making sure your information was captured, digitized, put in a format that was agent-ready, and then made available to agents to work with. And then you again have this reality that, at a 10,000-person company, mapping that to the access structure of the company is just a hard problem. Not every piece of information that's digitized can be shared with everybody, and so now you have to organize that in a way that actually works. There was a pretty good piece called "Your company is a file system." Did you see that one?
44:09
Nope.
45:21
Yes. You saw it? Yeah. And I'd actually be curious your thoughts on it. It's interesting; we agree with it because that's how we see the world.
45:22
We have it up.
45:31
Okay.
45:32
Yeah. But it's all about how we're basically already organized in this kind of permission-structure way, and these are the natural ways that agents can now work with data. So it's kind of an interesting metaphor, but I do think companies will have to start to think about how they digitize more of that data. What was your take? Yeah, I mean, like, the company is
45:33
probably like an ACID-compliant file system, which arguably Box is. Right. So. Yeah.
45:53
Yes. Yeah.
45:58
Which you have a great piece on.
46:00
But yeah, well, my direction is a little bit different: I want to rewind a little bit to the graph word. You said that; that's a magic trigger word for us. I always ask: what's your take on knowledge graphs?
46:01
Yeah.
46:11
Because, especially with every database person, I just want to see what they think. There have been knowledge-graph hype cycles, and you've seen it all.
46:12
So I actually am not the expert in knowledge graphs. So that you might need to.
46:19
You don't need to be an expert. I think it's just like, well, how seriously do people take it? Is there a lot of potential in
46:24
the hype? Well, can I understand first: is this a loaded question, in the sense of, are you super pro, super con, super anti?
46:30
I see pros and cons, but I think your opinion should be independent of mine.
46:40
Yeah, no, no, totally. I just want to see what I'm stepping into.
46:44
No, I know it's a. And it's a huge trigger word for a lot of people in our audience. And they're, they're trying to figure out
46:46
because like, why is this such a hot item for them?
46:51
Because a lot of people get graph religion and they're like, everything's a graph. Of course you have to represent it as a graph. Well, how do you solve your knowledge changing over time? Well, it's a graph.
46:54
Yeah.
47:03
And I think there's that line of work, and then there's a lot of people who are like, well, you don't need it. And both are right. Yeah.
47:04
And what do the people who say you don't need it, what are they arguing for?
47:11
Markdown files.
47:14
Oh, sure, sure.
47:15
Simplicity. It's structure versus less structure, right? That's all it really is.
47:16
The tricky thing is, again, when this gets met with real humans: they're just going to their computer, they're just working with some people on Slack or Teams, they're just sharing some data through a collaborative file system and Google Docs or Box or whatever. I certainly like the vision of most knowledge-graph, kind of futuristic, ways of thinking about it. It's just that it's 2026 and we haven't seen it play out yet. I mean, you remember... actually, I don't even know how old you guys are, but to show my age: I remember 17 years ago, everybody thought enterprises would just run on wikis.
47:20
Confluence.
48:02
I mean, Confluence actually took off for engineering, for sure; unquestionably, it was like everything would be in the wiki. And I think, based on our general style of what we were building, we were just like: I don't know, people just want a workspace. They're going to collaborate with other people.
48:03
Exactly. So you were anti on all this graph stuff.
48:20
Not anti. Not anti. I'm not anti, because I think your search system... I just think these are probably, like, two systems. I'm not in any religious war. I don't want to be in anybody's YouTube comments on this. This is not a fight for me.
48:22
We love YouTube comments. We're going to get into the comments.
48:36
Okay. It's mostly just by virtue of what we built; we just continued down that path and that was what we pursued. But this is not a kind of.
48:38
It's not existential for you.
48:51
Great.
48:53
We're happy to plug into somebody else's graph. We're happy to feed data into it. We're happy for agents to talk to multiple systems. Not our fight. But I need your answer.
48:53
Graphs are nerd snipes. Very effective nerd snipes.
49:06
This is one opinion, and I've.
49:10
I think that the actual graph structure is emergent in the mind of the agent in the same way it is in the mind of the human. And that's a more powerful graph because it actually evolved over time.
49:12
I'll figure it out myself.
49:20
Exactly.
49:21
Okay.
49:21
All right. And what's yours?
49:22
I like the wiki approach. Obviously, I spend some of my time at Cognition, which you know very well, and they've had a lot of success with DeepWiki. It powers a lot of Devin's brain. Super powerful, and it's useful for humans, but, oh my God, it's useful for agents.
49:23
Yes. Tell me if you think I'm wrong on this, but there's not much of an access control structure issue there, no? It's like you get the whole code base and everybody gets it.
49:38
Before I speak too much about that: there may be some enterprise controls on the enterprise DeepWiki offering that I'm not familiar with, and I don't have anything on the public side. But I think almost every agent should have its own wiki that it's updating, and that's persistent memory, and that is a very weak knowledge graph. And you could strengthen it if you want more structure, but you may not need it.
49:48
Markdown files having links in wiki style.
50:10
Right.
50:12
Very effective. Right, Lindy?
50:12
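The "agent wiki as a weak knowledge graph" idea above can be sketched concretely: the agent persists memory as markdown files with [[wiki-style]] links, and the link structure between files is the graph, with no schema imposed up front. The file names and note contents below are hypothetical illustrations.

```python
# Sketch of agent memory as linked markdown notes; the emergent graph is
# just the map from each note to the notes it links to.
import re
import tempfile
from pathlib import Path

LINK = re.compile(r"\[\[([^\]]+)\]\]")  # matches [[Note-Title]] links

def write_note(root, title, body):
    (root / f"{title}.md").write_text(f"# {title}\n\n{body}\n")

def link_graph(root):
    """Map each note to the set of notes it links to."""
    return {p.stem: set(LINK.findall(p.read_text()))
            for p in root.glob("*.md")}

root = Path(tempfile.mkdtemp())  # throwaway workspace for the example
write_note(root, "AcmeCorp",
           "Customer since 2023. Renewal terms live in [[Q3-Contracts]].")
write_note(root, "Q3-Contracts",
           "Holds the [[AcmeCorp]] renewal and its effective dates.")

graph = link_graph(root)
```

The design choice mirrors the conversation: the graph is emergent from links the agent writes as it works, readable by humans and agents alike, and you can layer more structure on later only if you need it.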
I like that as a general pattern. Okay, so last couple of questions.
50:14
Sure.
50:18
But feel free to jump in, or if you want any rants. I see you as a very interesting and unusual founder: you've been in the business and you're of two worlds. You're of Silicon Valley, but you're also of the Fortune 500s. And I feel like your kind of founder mode is very different from the Brian Chesky founder mode. I'm curious if you have reflections on how you operate as a founder. What would his founder mode be? Don't delegate.
50:18
Ah, right. And how would you put me?
50:44
You do delegate.
50:46
I see. I don't know that Brian and I would be that far removed from each other when you get to the specifics. There's a whole bunch that I delegate. 90% of the work that happens at Box is fully delegated; we've got great leaders running all that stuff. It's just too much for my brain to handle. And, I'm going to make up all the numbers here, for probably 70 to 80% of the work at Box, I only really need to look at about 5% of it, for some high-leverage decisions to be involved in. What's the marketing message that we think is going to resonate with customers? That's a high-leverage thing that we do in marketing, but most marketing activities I don't get involved in. What's our sales pitch? Maybe I'll be involved in that a little bit. Or roughly what investments or pushes we're going to do in certain verticals; that's about 5% of the total bandwidth of the key areas of sales or go-to-market. So for 70 to 80% of the company, I can just do about 5%, and operationally we've got great leaders and they're going to execute on that. And we collaborate on the 5% anyway; it's not like I'm just making up a decision and saying to go do it. Then there's the part that is the existential part of the business, which is: if we don't do this right, we're out of business. And by virtue of being a founder, you get sucked into that part of the work, because you can feel it. You can just see how the AI tsunami could wipe you out if you make just two, three, four, five wrong decisions in this space: a couple of wrong architecture decisions, a couple of wrong AI feature decisions, a couple of wrong API platform decisions, and you might be out of the game a year from now. And you just feel it in your bones. We feel this all day long in this space, given what's happening.
And so in that area, you can't delegate in a classic sense. You still need to make sure you've got great leaders and strong hires and people that have high agency, because they want to be able to own part of the strategy and the roadmap, or else you can't hire good people. But there are going to be a lot of little micro forks in the road that will compound to determine whether you succeed or fail. And so your founder energy automatically draws you into those, because they are the determining decisions of your company's future. And that's where I spend my time. And you have to do it in a collaborative way, again, because if you are only dictatorial, you eventually won't be able to hire the best people, because they won't want to work in that environment. But you also just can't abdicate all the responsibility, because the risks are simply too high. You have to somehow, obviously, add some value. And the value I add is that I've seen 20 years of this business; I think I can piece together what I expect the value propositions are going to be and how customers will react to certain things. So that's what I can bring to the table. And then you have this existential fear of: if I get it wrong, it's all on me anyway. I don't get to blame the engineer that was working on that project. At the end of the day, it'll be my fault if it doesn't work. So by virtue of that liability and responsibility, you just get pulled into needing to make sure it's all going according to how you think it needs to end up. I don't know how Brian would answer that, I guess. But yeah, it's a long essay.
50:48
It's an interesting essay. People should go and compare and contrast your answer versus his. I do think that systems have a way of letting entropy get to them.
54:24
Yep.
54:32
And if you step away for too long, you need to have a way to check in and go: well, do I need to come back in, or are we good? And people are going to tell you things are good when they're not good.
54:32
Yes, yes, 100%.
54:41
Yeah.
54:44
And that's actually where I'm a fan of process, for that 70 to 80%. For that 70 to 80%, the process is: you're going to do a quarterly business review, you're going to have a brand check-in, and you're going to make sure that you're seeing all the right episodes of what's changing and how it's evolving, and that it's going in the right direction. And then there are some areas where it's like, no, it's 24/7. I guarantee that after this podcast, at 11pm, I'll be doing a Zoom with Ben and probably some other people, because we're going to be talking about agents and new platform features. And that's just you in the cauldron, grinding, on that side.
54:44
Yeah, that's extremely realistic, what it's like. And I just wanted people to hear your perspective on that.
55:30
And this is the thing. You read the post about everybody having agents running on the weekend. I mean, first of all, anybody crazy enough to come to Silicon Valley: we don't bring good news about the healthiness of our environment right now; you have to know what you're signing up for. But there's a real issue, which is: shoot, do I have enough agents running?
55:38
And yeah, I made a meme that was semi-viral, for me, about exactly this. That's it. You can't even enjoy a party these days, working your tokens.
56:06
There's compute out there that you're not utilizing.
56:15
I know. I paid for the $200; I'm going to spend the $200. Yeah, I'm going to spend $6,000 out of the $200.
56:17
We need to make anthropic very unprofitable.
56:25
Yeah, yeah, we're not doing a good enough job. Cool. I have a closing question, if you. Unless you have a question.
56:29
I've asked this question in private before, but I'll ask it again. It's a question that Tyler Cowen asks his guests on his podcast: what is the Aaron Levie production function? And I love this question because there are so few people that I think are good at both executing and also distilling and putting good ideas into the ether. You put a lot of good ideas into the ether. So what is the Aaron Levie production function that allows you to do that versus others?
56:33
How do I get that information?
56:59
Or I can give you a variant, which is: what goes into Aaron Levie, what goes out, and how does it turn inside?
57:02
I'm just trying to think of. Because there's some very. I just read a lot of Twitter as well.
57:09
And you spend a lot of effort, too.
57:15
In contrast, you don't see, like, great mini essays from Brian Chesky every day.
57:16
But you do from you. Oh, yeah.
57:21
And you're kind of weird in that way.
57:22
Maybe he's healthier than me. Actually, we should just text him to see if, you know, he's got a... I think he does work out. He's got bigger muscles. Well, that's the thing: I work out less than him and I tweet more than him. So that's how we're balancing things out. Mostly the way I think about it is just, you know, there's lots of work happening in the business. I'm getting to see all the problems that we're running into constantly. And I'm trying to create a little bit of a flywheel between what we're doing internally and what we then talk about, getting a feedback loop on that, seeing other people's experiences of what they're doing, and bringing that back into the business. And so I just see my job as hopefully being able to connect the dots between what's going on in the world and what's going on at Box. And then I just happen to tweet about that along the way.
57:24
There's no editor. There's no.
58:25
There was a funny one. I tried to get an internship between freshman and sophomore year at this company, a film-student kind of production company in New York. And I got the internship, and then I emailed my liaison, the guy who sponsored me for the internship, and I said, hey, I'd like to do a blog of my summer internship where I blog about being an intern at a production company in New York. And about half a day, a day later, they emailed me back saying they'd rescinded the internship.
58:30
No.
59:04
Yeah, because I showed a lack of judgment on professionalism or whatever. Just even the idea that I would ask that question, red flags went up of, who the fuck is this guy? So anyway, I only say that to say that, to me, building in public is just a natural thing. And so I just go through the day. We deal with interesting problems. I tweet about them, I get information back in the process. I see your work, I see your work, you know, I see a bunch of folks, and I try to incorporate that back into Box. My job is to try to connect all these things together and make it useful.
59:05
And you're, I mean you're the number one spokesperson, right? So you do have to be out there.
59:44
Yeah, but I kind of would be doing it whether or not it was. Like, I don't really think of it as a job requirement as much as, I just like social media.
59:47
You're so good at it.
59:54
Yeah.
59:55
It's so hard to believe. So like.
59:55
Okay, sorry.
59:58
Do you get up at 5am with coffee? Is that your secret?
59:58
It's like how do you work?
1:00:02
Do you actually do this, like, on the way back? Do you do it that way? How do you do this? It's mostly that, though. I have a commute home each night. I try to see my kids most weekdays before I have to hop back online. So there's a 20-minute window there where I can distill the information from the day and be like, is there anything I learned today that would be interesting to throw out there? Or anything that I saw. And then probably somewhere between 7:30 and 9pm I finally get a chance to look through the feed and see, did anything crazy happen in AI? And then that will also kind of catalyze something. That's the best I can describe it. Yeah.
1:00:03
Okay, thanks.
1:00:50
Now I know your cutoff is 8pm. I will try to get AI News out before 8pm so I can help him.
1:00:52
Yeah.
1:00:57
Do his thing.
1:00:57
Basically, if I don't see it before 8 to 8:30, I'm not gonna.
1:00:58
Yeah.
1:01:02
I'm not gonna go, like, quote-tweet or something, because then I'm back on Zoom after that.
1:01:03
So I wasn't planning on asking this, but you've mentioned the film stuff.
1:01:08
Yeah.
1:01:13
And I know, from one of my favorite parts of doing research on you, that you got the idea for Box from, like, the Paramount lot, pushing paper. You're a film guy. You're big into film.
1:01:13
I would say I used to be more of a film guy.
1:01:24
Yeah. What are your favorites, if you want to list off any?
1:01:25
Kind of the classic wannabe-film-student classics. Are we talking Scorsese, Pulp Fiction, Magnolia, Requiem for a Dream? Basically, if there was an art-house film in the '90s to early 2000s, that was my genre that got me into, like, wow, wouldn't it be cool to do film? And then I thought maybe I could connect digital into it. Could you do film online? That just seemed too hard from a licensing standpoint. And then obviously Netflix kind of existed. So I never quite was able to fully connect the dots on these things. But the internship at Paramount was one catalyst for starting Box, because we were using just traditional enterprise software, and I was like, wow, it's really hard to share data, just files going back and forth. But the same thing was happening in school as well. And so that all led to Box, basically.
1:01:29
Well, A24 is kind of giving us back this sort of resurgence of the independent film, I guess, 100%, in the face of all the Marvel slop.
1:02:26
I was thinking about this the other day, and A24 is certainly the best example of this today. But, you know, it's hard to make a film like, you know, No Country for Old Men or There Will Be Blood. Like, what is that movie today? Yeah, like, what is a brand-new movie that is just, like, original? You just watch it and you're like, what did I just watch?
1:02:37
My, you know, movie benchmark is Forrest Gump.
1:03:03
Okay.
1:03:06
Which was iconic in its time.
1:03:06
Yep.
1:03:08
100%, never again.
1:03:08
Yeah. Yeah, we did not make... We don't know how to make Forrest Gump anymore.
1:03:10
They will try with the sequel, though, at some point for sure.
1:03:15
I would be fine with it. No, like, Forrest Gump has a kid. Yeah, he's still alive, right? Exactly. I think Forrest Gump Has a Grandkid would be, like, a good movie. Like, what is the grandkid of Forrest Gump doing in 2026? It goes topical. Yeah. But yeah, I definitely want to see more movies out there. You know, I'm a little bit conflicted on AI and film because...
1:03:19
Oh, let's see that.
1:03:42
Well, because the world does not need more AI slop entertainment, but I'm kind of in a mode where I think that AI is going to be generally a pure positive, because if I was me 25 years ago in high school, for sure I would be making a full-production film that had explosions and car chases, but then there'd be, like, people that would show up in it. So I think that ability to just, you get to be Spielberg, you know, is completely amazing, and democratizing that is incredible. And, you know, I'm concerned about, like, how do you make sure that we still get P.T. Anderson along the way, and can we make sure that those guys exist? And then, interestingly, and I never saw it, but Darren Aronofsky, I believe, has either put out or is going to put out an AI film. Even some of the best artists are starting to adopt this. But what I don't want is to just be in this TikTok feed of films, and it's just like, oh, this film about the car chase that does this thing. We don't need that. This should be a form of entertainment and art. And let's use AI to accelerate the production process, do the really hard CG work that you previously had to spend way too much money on, use it to test out all new kinds of plot ideas.
1:03:43
Yeah. Previz.
1:05:17
Yeah. Backgrounds, and that's incredible. All those things are super incredible. It's very nostalgic, but I still like the idea of: this is a camera, and a person, and a person that says action. And let's hopefully surround AI around that. But we'll see how that plays out.
1:05:18
Yeah. I think one of the things where Stability AI made an impression on me was: well, at least now we can remake Game of Thrones Season 8 like it was meant to be, not rushed.
1:05:36
Yeah. And then, I have a six-and-a-half-year-old, and you see a lot of these kid movies and you're like, yeah, that probably will be AI. I don't totally know the job math, because I don't know how many animators there are today, but I actually think, weirdly, we could be producing more high-quality, maybe even slightly educational kids' entertainment. And so maybe that's a positive: you could just have, like, a Pixar for things where kids learn stuff. And it used to be these very lo-fi kind of lesson things.
1:05:48
I mean we had Teletubbies so we
1:06:21
could have way more of that. And maybe every animator that today is making a Pixar film, we now fragment that out, but now they're responsible for more content and they've got AI agents running. So I think there are some optimistic scenarios on the entertainment side, as there are a lot of great use cases for being able to do generative media edutainment as well.
1:06:24
I guess one question, and it's kind of a self-serving one, almost an advice kind of question. One of the things I really enjoyed researching about you was that Michael Arrington had some influence in the Box journey, because you went to his house party.
1:06:47
Yes.
1:07:01
And that's how you got funding.
1:07:02
Yes.
1:07:03
One of Latent Space's. That's a deep cut, right?
1:07:04
Yeah, very deep cut. That's a '06 deep cut.
1:07:06
Yeah. I mean, do you want to tell that story? I don't know if you've told it.
1:07:09
It's not much of a story.
1:07:12
It's like a random intro.
1:07:13
Right? Well, he used to have house parties. TechCrunch had these house parties, and it was probably no different than somebody doing a house party in SF.
1:07:14
Just go.
1:07:24
You just go and you meet the VCs and founders, and, I'm going to make up examples so I don't get it wrong, there'd be, like, Chad Hurley over there pitching his YouTube to people. And that's just how it worked. And it was just like, wow. That was this era where all these new companies were emerging. And I met our first investor in Silicon Valley at one of these house parties, Emily Melton, who then brought us into DFJ. That became our Series A. So that was all because of Arrington's backyard party.
1:07:24
One of my aspirations for Latent Space is to be as helpful, influential, whatever, as TechCrunch was back in the day. What would a new TechCrunch look like today? What should I do? There used to be TechCrunch Disrupt. I could do that with my conference, but I haven't done it yet.
1:07:54
It.
1:08:11
Well, I mean, I think.
1:08:11
I don't know.
1:08:13
Well, you know, actually, interestingly, I would argue Disrupt came after that deep-cut period. So I think Disrupt ended up being, you know, catalyzing. I don't even... I think Cloudflare launched at Disrupt.
1:08:14
Yes.
1:08:29
Is that the story? Right.
1:08:29
They were runners up.
1:08:30
Okay. Okay. So, like, I think anytime you can be a launch pad, it's just great, because it draws in people that are in that creative moment. And whether it needs to be a contest, or just everybody gets five minutes and you're fundraising, who knows? But for what it's worth, I don't have that much advice, because I think you're already doing it effectively. I just watch the YouTube videos late at night from the events. I haven't been to one of your events, but from the camera angles, it looks like everybody's there. So what's great is that people are going to be in the audience as, like, two random people, and the next big AI company will come from, you know, people coming to a meetup because they were like, I came in from Chicago and I'm from Poland and let's go do a startup. That's the magic of the Valley.
1:08:31
Big Smarty found his co-founder at AIE. Oh, and I know of at least one marriage.
1:09:24
Wow, you have marriages already? Yeah, I never heard about that.
1:09:28
That's my favorite KPI.
1:09:31
Wow. We have AI marriages at the AI Engineer conferences.
1:09:33
These are both humans, to be clear.
1:09:36
Okay, that's a very good clarification.
1:09:38
I like that. You have to check.
1:09:40
Yes, that's a very good clarification.
1:09:41
No, but I think you're an insightful business leader with, like, a lot of thoughts on media. So I just figured I would ask.
1:09:42
I mean, media is such an interesting space right now because, you know, with the go-direct model, every company is going to have to be a media company.
1:09:47
You are the OG go-direct guy.
1:09:56
Yeah, but, you know, I think what you guys are doing, and I don't even know all the overlapping relationships, but I watch your guys' event videos. It's clearly, like, this is the new format, right? Companies have to become channels to communicate with audiences. I think the resurgence... resurgence maybe is a bad word because it implies decline, but DevRel is hot. Like, the hottest thing of all time right now. If you could produce a freaking factory of DevRel people, there are just, like, unlimited jobs right now on the other end of that, because everybody needs their services and APIs to be used by agents, and so we all have to find a way to say, hey, look at me, please come over here, agent. And that's a content game. Like, how do you get the agents to see your stuff?
1:09:58
Yeah.
1:10:50
And know your APIs. This is, like, a new world that we are in. And it's going to completely be a digital-marketing kind of world that we're in.
1:10:50
Yeah. For what it's worth, I'm trying to help by doing little writing boot camps, which have basically turned into a DevRel boot camp, because, you know... well, it's a demand-and-supply problem. There's this huge demand, and there's no supply.
1:11:02
Why is there no supply the one.
1:11:15
The really good ones work for themselves. The creator economy screwed you over.
1:11:17
So I see. So Substack and YouTube payouts. Is that it? They're making Patreon money.
1:11:21
Yeah. The most talented guys are making, you know, millions and just working for themselves.
1:11:28
We don't want them to make that much money. We need to be able to hire people.
1:11:33
I mean, I think, like, you know, do what some companies are doing. Not saying it's my situation exactly, but, like, give them equity, and it probably would be worth more than just sort of helping them out.
1:11:37
Well, they are getting... Oh, sorry. As full-time employees or not?
1:11:47
I'm part time.
1:11:49
You need full time.
1:11:50
I'm part time.
1:11:51
Yeah, but you're an n-of-1. Like, we also need people that are full time.
1:11:51
Yeah. My classic joke, or, like, observation, was from when HubSpot bought a newsletter business and then they bought My First Million, the podcast. You must know Dharmesh.
1:11:57
Etc.
1:12:08
So he's like obsessed with this guy. So my conclusion was like every company must either build or buy a media company.
1:12:09
Yes.
1:12:14
Right. And unless you realize that you have to take it that seriously, that you are running a media business in your company,
1:12:14
Yes.
1:12:21
You will never be good at it.
1:12:21
Yes, 100%. Yeah. No, we're very much taking that seriously. But still. And yet, DevRel... I mean, I've got to do one plug: we're hiring in DevRel.
1:12:23
Yeah. Like, no, these are all engineers here.
1:12:31
Like, yeah.
1:12:34
And, like, you've made it. And I just said every agent needs a box. Like, let's go, let's go.
1:12:34
No, that's the headline. We are hiring DevRel to make that happen. But yeah, I think DevRel is, like, the future job. So we're all just going to be doing DevRel in some form.
1:12:38
Okay. Yeah.
1:12:47
I mean, what is FDE?
1:12:47
Developers are ruling the earth. Yeah. What is FDE? I don't know.
1:12:48
No, it's Devrel.
1:12:51
Yeah. Okay.
1:12:53
No, you just, you're going.
1:12:54
Isn't it just, like, glorified consulting? That's it, right?
1:12:55
Sure. I mean, I guess nobody can actually fully define this, but I think it's micro-DevRel. You're in the company, you're helping them with the services, you're doing a little bit of extra implementation. But yeah, I think the thing that's gonna happen on the ledger of software is we're gonna produce far more output of code, and thus features, per dollar. But on the other end of this, we're gonna end up spending probably just as much on how you get all of that stuff to the customer. And that's going to create a new set of roles that we are all doing, partly because there's so much choice now that you have to fight for attention, or because the stuff is just changing so quickly that you have to technically help your customers along the journey. So this is why I always laugh when people say you don't need to be an engineer, don't do computer science. I actually think that is still one of the most protected job categories, because things are only getting more technical, things are only going to get harder, and anybody in a technical position is in the best position to get agents deployed, get them built, get them adopted, build the custom code for the IT system, all of that.
1:12:57
Yeah. My classic founding story of why I picked AI Engineer as a title, and as a theme for this podcast and for my conference, was that back in early 2023, someone non-technical came to me and said, I'm all in on AI, what should I do? And I just looked at her, and I was like, God damn it, there's nothing you can do. Like, engineers are about to get so much more powerful than you. You don't even understand.
1:14:16
Did you tell her that? Like, should she go and learn?
1:14:38
No, I didn't. I didn't say any of that to her. I'm not that honest.
1:14:40
I hope somewhere out there she went to some online academy.
1:14:45
Exactly. Learned to code. Yeah. But there are a lot of people who believe AI too much, and they're like, well, you don't need to learn to code, so I won't learn to code. And then there's a bunch of us who are just in that sweet spot of: we can code, and we can wield AI a thousand times more effectively than you can. And, well, who's going to win here?
1:14:48
I think this was another tweet, but it was the observation that software engineering for the past 30 years was really the primary career track for technical, high-agency people that wanted to have an outsized impact on the world. And software was a means to do that, right?
1:15:06
Effectively.
1:15:20
And so, yeah, with AI, is it that AI could eat software engineering, or that software engineering could eat all the other
1:15:21
kinds of domains and disciplines, with those same principles then getting applied to every other one,
1:15:27
and those same people, right?
1:15:31
Yeah, exactly.
1:15:32
I mean, GTM engineering is that.
1:15:33
Well, this is the thing. You know, anybody who believes that an enterprise, and I'm mixed on this, but if you believe that an enterprise is going to build its own software for all of its problems, then you must be the most long on computer science as a discipline of all time. Because guess what? Most of the economy does not have enough engineers to maintain all those systems, to update all those systems, to figure out the relationship between the business problem and what the code needs to do to actually manage that. And so that's a very pro-engineering-jobs argument for what the future is going to look like. I go back and forth on whether you're really going to build all these things versus buy prepackaged software, but no matter what, there's going to be 10 to 100 times more code. So I think you can be very long engineering right now, purely on the dimension that software is going to become increasingly more important once agents are, you know, turning everything into software.
1:15:35
Yeah. All right. Three software guys say software
1:16:38
not biased at all.
1:16:42
But Aaron, you're an inspiration. It's such a pleasure.
1:16:44
All right, good to be here.
1:16:46