Inside OpenAI’s Agentic Browser, Atlas

56 min

Feb 11, 2026

Summary

Ben Kovach and Darren Vengroff, veteran browser engineers from Netscape, Firefox, and Chrome, discuss ChatGPT Atlas, an agentic browser that integrates AI assistance directly into web browsing. They explore how Atlas changes user interactions with complex web applications, the technical challenges of building a modern browser, and how AI coding tools like Codex have accelerated their development velocity.

Insights

Agentic browsers represent a shift from browsers as invisible conduits to active assistants that help users navigate ambiguity and complexity in web applications
The primary unlock of integrated AI in browsers is automatic context awareness—users no longer need to copy-paste content between tabs to get AI assistance
Building a differentiated browser requires substantial engineering investment beyond Chromium's baseline, including complete UI rebuilds in modern languages like Swift
AI coding assistants have fundamentally changed prototyping velocity, allowing engineers to validate ideas in hours rather than days, though human judgment remains critical for complex systems
The future web will likely remain bifurcated between delegated agent tasks and human-driven exploration, rather than fully automated experiences

Trends

Agentic AI shifting from chat-based interfaces to deeply integrated browser experiences with automatic context awareness
Increased adoption of modern languages (Swift, Rust) over legacy C++ for browser UI development due to talent availability
AI-assisted code generation becoming standard practice, with majority of new code potentially authored by AI tools like Codex
Complex web applications (Google Docs, AWS) becoming primary use cases for agentic browser capabilities due to navigation complexity
Progressive disclosure of AI capabilities in consumer products to balance simplicity with power without overwhelming users
Declarative web standards (HTML, semantic markup) gaining renewed importance as models interact with web content visually
User agent strings and web standards evolving to accommodate non-human but personalized AI operators distinct from traditional bots
Browser engineering becoming more accessible through AI-assisted development, reducing barriers to entry for new browser projects
Hybrid human-AI workflows emerging where AI handles routine tasks while humans focus on judgment-based decisions
Tab management and workspace persistence becoming increasingly important as users maintain complex multi-task browsing environments

Topics

Agentic Browser Architecture and Design
AI Context Integration in Web Browsers
Complex Web Application Navigation (AWS, Google Docs)
Browser UI Development in Swift vs C++
Chromium Architecture and Customization
AI-Assisted Code Generation and Prototyping
User Agent Strings and Web Standards Evolution
Progressive Disclosure of AI Features
Tab Management and Workspace Persistence
Declarative Web Standards and Machine Readability
Computer Use Models and Visual Web Interaction
Settings Panel Automation Through AI
Web Memory and Context Retention
Cursor Chat and Contextual AI Assistance
Browser as Operating System Paradigm

Companies

OpenAI

Developed ChatGPT and Atlas agentic browser; created Codex AI coding assistant used extensively in Atlas development

Google

Chrome browser mentioned as reference point; Google Docs discussed as complex web app challenging for agent interaction

Amazon Web Services

AWS dashboard cited as canonical example of complex settings panels that agents can navigate more efficiently than hu...

Mozilla

Firefox browser mentioned as part of Darren's career history working on open-source browser development

Netscape

Early browser where both Ben and Darren began their careers together before Firefox and Chrome

The Browser Company

Founded by Josh and Hirsch; Darren previously worked there; developed the Dia browser, which host Dan Shipper used before switching to Atlas

AugmentCode

AI coding assistant for enterprise teams; sponsor providing context engine for understanding large codebases

People

Ben Kovach

Head of Engineering at OpenAI Atlas; veteran browser engineer from Netscape, Firefox, and Chrome

Darren Vengroff

Member of Technical Staff at OpenAI Atlas; browser engineer with experience at Netscape, Firefox, Chrome, and Browser...

Josh

Co-founder of The Browser Company; mentioned as friend of host Dan Shipper

Hirsch

Co-founder of The Browser Company; mentioned as friend of host Dan Shipper

Dan Shipper

Host of AI & I podcast; daily Atlas user; runs media company using Atlas for editorial workflows

Quotes

"The big unlock that I had with Atlas is I realized I never need to look at a settings panel ever again."

Dan Shipper

"It's not just the pace of development, because I think to get a feature to work right, it's always going to take a few iterations. It's how quickly you can decide that something is worth pursuing."

Ben Kovach

"The thing that you know, like, what do I do with this? This is something that we we hear a lot from people. But then also we hear some aha moments as they go on the same journey that you have and begin to figure out some use cases for it."

Ben Kovach

"I think as people use ChatGPT for more things in their life, they realize that maybe they should start more of their queries with ChatGPT, right? You start to learn that for yourself at a certain point."

Darren Vengroff

"A browser really is. But you want it to also be approachable and easy. And you got to think about like, what are the patterns people do and how can we meet them in those moments, right?"

Ben Kovach

Full Transcript

I think one of the things that has excited me about this world is it's not just the pace of development, because I think to get a feature to work right, it's always going to take a few iterations. It's how quickly you can decide that something is worth pursuing. The big unlock that I had with Atlas is I realized I never need to look at a settings panel ever again. You're not alone. If you work in large codebases, you know what it means to hold too much in your head. Which file imports what, what service depends on what database schema, and what will break if you change this one line. The bottleneck isn't writing code, it's holding the entire system in working memory long enough to make a decision. AugmentCode is an AI coding assistant that offloads context. Its context engine understands your whole codebase, including what's in your current file and the architectural shape of your entire system. It works with multi-language interactions, legacy code, and the dependencies that aren't documented anywhere. The system is what makes it work, and Augment documented it in their AI-powered engineering at scale playbook. Inside, it includes how to assess your current state, the four-phase framework for moving from individual experiments to team-wide deployment, ready-to-use checklists, and the specific workflows that produce 30% faster PR velocity and 40% shorter merge times at companies working on code where mistakes are expensive. This is designed for enterprise teams working on high stakes production systems. It's built for compliance, correctness, and maintainability. It's built for the moments you're not just prototyping, but shipping code that millions of people depend on. And teams are seeing measurable results. 30% faster PR velocity and 40% shorter merge times. Download the AI-powered Engineering at Scale playbook at augmentcode.com slash resources slash AI-powered Engineering at Scale. That's augmentcode.com slash resources slash AI-powered engineering at scale. 
And now, back to the episode. Ben and Darren, welcome to the show. Hey, thank you. Great to be here. Yeah, likewise. It's awesome. So for people who don't know you, you are both building ChatGPT Atlas, which is an agentic browser. Ben, you are the head of engineering. Darren, you're a member of the technical staff. I believe you both worked on Chrome originally. Is that true? That's right. That's right. We worked on a number of browsers together and for a long while. Oh, that's really cool. So I didn't realize that this is like an evolving partnership through many different products and companies. That's really interesting. We worked together first at Netscape, then on Firefox together for a few years, and then with Chrome, and now Atlas, which is super exciting. Absolute OGs. Okay, this is really cool. So I'm a daily Atlas user, and I switched from Dia. I know, Darren, you used to work at The Browser Company. I'm good friends with Josh and Hirsch, so if they're listening, you know, maybe there's a way to get me back, but Atlas is pretty good. What's really interesting to me about using Atlas, and really agentic browsers in general, is that for the first couple of days I was like, I have no idea what to do with this. Like, I know it has this power, but I can't think of a time when I might want to use it. And now I'm just like every single day, there's like 50 different things that if I had to like click through another fucking form or settings page, I would like blow my head off. But isn't that kind of the journey that people have with AI tools in general, like ChatGPT or these coding tools? You kind of don't really understand the power until you get into it. I think that is true. I didn't quite have that experience. Like the first time I just saw it, like GPT-3 writing stuff, I was like, whoa, this is crazy. But yeah, I guess that is true. 
I guess I'm curious from both of your perspectives, like if someone is listening and they're like, I know that agentic browsers are a thing and maybe I've tried it, but I actually don't even know why I would use this or like what it's useful for. What is the sort of vision for agentic browsers? And let's try to be more specific than like, yeah, it just does everything for you. You know, what are the real day-to-day things that agentic browsers change about how you might use the web? Yeah. So I think that, you know, maybe the future will get to a place where more and more of your workload can be automated. And I think we're making progress in that direction. But today we wanted to design Atlas with this idea that you could bring ChatGPT with you wherever you go on the web. And so, yeah, I mean, I think the thing that you know, like, what do I do with this? This is something that we hear a lot from people. But then also we hear some aha moments as they go on the same journey that you have and begin to figure out some use cases for it. This is something where we actually want to take some of that learning that we have from how people are using it and help offer more proactive advice to people in the product to help them figure out how to optimize use of the tool. But I think today, like one of the things that I noticed when I use Atlas versus when I go back and use a sort of pre-AI browsing environment, I find myself just able to ask a lot more questions and just be more knowledgeable about a topic. If I'm doing online shopping, I can feel confident that I'm getting the best deal or I have the right coupon code or all that sort of stuff. If I'm researching a topic that's of interest to me, I can sort of brainstorm different viewpoints on it. I can just sort of have this friend or advisor that comes with me, and I can just have this conversation with it. 
And that's just made the web a lot richer and more dynamic. Can you make that more concrete for me? Because I think some of those things, someone might be listening and being like, well, yeah, I could do that with ChatGPT now. That's what ChatGPT does for me. So what does it mean to have that in the context of your browser? It just means that you don't need to go. I think for anyone that's had ChatGPT in a tab, you probably have the experience of going and taking some content from another tab and pasting it in and asking a question about it perhaps. Whereas when you have a browser that's built with this at the core of it, that context is provided directly to the model. So you kind of don't need to keep repeating yourself. ChatGPT will just see what you're looking at and be able to offer its thoughts on that. I think that's really the big unlock and the power of this whole thing. It's like I think as people use ChatGPT for more things in their life, they realize that maybe they should start more of their queries with ChatGPT, right? You start to learn that for yourself at a certain point. You're like, why am I doing things the old way? That was very manual. But instead, I should ask this AI model. It will help me save some steps. And this browser puts that at the center of it. That's what the URL bar will guide you towards for your queries, right? It helps you get into ChatGPT with a lot lower friction. And as Ben was saying, you know, if you're on a web page and you're scratching your head about something, ask ChatGPT is right there. You can ask it. It has the context. You don't have to copy-paste and say, can you now answer this question? So it's just a lot more streamlined. That's kind of the core value proposition of this whole thing. And on top of that, we build features that people can opt into around web memories. So if the agent or the model is there on your journey, you can also query it later about things that it knows. 
And that can be very powerful to you as you're trying to get back to things or trying to make sense of just all the things in your world. And, you know, whatever kind of journey you're on, whatever research project you're on, whatever work you're trying to do, having it there sort of passively can be very powerful, too. I got to tell you, like, and hopefully maybe this can be like a little bit of a user research session, too, because I feel like I'm doing something with this that I'm very excited about. And I'm curious if you guys are doing it, if you're seeing other people doing it, how you're building for this. So the big unlock that I had with Atlas is I realized I never need to look at a settings panel ever again. You're not alone. Yeah. And that is such a refreshing feeling. I think it's both refreshing for users and for software developers. I think it's refreshing for software developers because you don't have to worry about adding another knob because someone like the agent is going to do that. So you can make software more customizable more easily. But for users, like I think the canonical example for me is looking at the AWS dashboard. I don't know if you like I assume you guys have both logged into that and it's like 50 different services. And then like you're the settings, the like the permissioning system is like it's like launching a nuclear like missile in order to like do anything. And I run a company and we have like 20 people. And so I'm sort of constantly being asked, hey, can we like add a seat to this? Or like, can you change the permission on this thing? And it's like some account that we set up five years ago that I don't even remember. You don't do these things so frequently. And so, yeah, it's like not top of mind how to do it again. Yeah, my example of this that I've been using was I used it to help me create Google Forms to do user research. 
And, you know, the Google Forms builder, I think, is maybe less complicated than the AWS control panel, but still, it's not something I use every day. And so I think for me to be able to ask the agent to go off and do that, and have it do that in a few minutes and come back so I can just submit, certainly allowed me to get to the meat of the problem much quicker. Yeah, it's one of those tasks where there's a certain amount of activation energy, and you don't have to spend the activation energy anymore. Darren, what were you going to say? I was just saying it's that time of year to go into Workday and figure out how to get my year-to-date pay stub so I can share that with my tax advisor. And I'm like, where do I go again? They moved it again, you know, and I don't go there often enough. So I think it's super powerful for navigating, like, web apps, especially complex ones, like you said, with AWS. And that's one of the definite superpowers of these things. So how are you seeing that evolve with your user base? Like, if you can share, what percentage of people have actually figured that out? Because it's super powerful, but I also imagine it's not necessarily a daily use case. It's like a couple times a week. It is a lifesaver. But other than that, I may not use it for this. Like I'm only going into settings like a couple times a week. So I'm curious, is that one of the use cases you can hang your hat on and are people really discovering it or is it still sort of nascent? I don't know if we have the exact stats on the agent-drives-the-browser kind of thing, but we do know that just in general, people interacting with that side chat is a main use case for the browser. And I think, you know, probably most people are using that on a regular basis just because it is kind of the main value add, so the main surface. 
In terms of what tools or capabilities people use from that, I don't think we've got that broken down quite the same way. You know, what we see is sort of what you'd imagine: when you first come to these tools, you don't know all the things that they can do. And that's definitely a topic for us. How do we introduce people to things, but not also overwhelm them at the same time? You know, you want to balance something that's familiar, simple, seems approachable, but also is powerful under the hood. So you get rewarded as you discover further. You know, I think that's kind of the nature of UX development. Right. You can have a very powerful, complex tool. A browser really is. But you want it to also be approachable and easy. And you got to think about like, what are the patterns people do and how can we meet them in those moments, right? What are some of the decisions that you've made to do that, like to enable the sort of progressive disclosure of complexity so that Atlas is really intuitive? But yeah. One of the features that is pretty powerful, or I should say one that we struggle with how to expose, is this feature called cursor chat. If you're interacting with a form field in the browser, you'll see a little icon, a little ChatGPT icon, and you can hover over it and then interact with the model in the context of that specific form field. We struggle with how in your face to make this, right? We want people to be aware of this power. It's actually really powerful. The people who use it are people who rave about this, helping them compose and that sort of thing. But actually, a lot of people don't discover it, even though we have this little hint. And so it's always a question, how big do you make that hint? How do you introduce this to people? Certainly during onboarding, we already have a lot of things we try to tell people about because this is an AI browser. 
There's new things to learn, fundamental things like web memories and capabilities like side chat and so on. But we can only tell you about so many things at once. So that's been a challenge for us from a design perspective, for sure. That makes sense. I've seen that little icon. I have not clicked it, so now I feel like I need to click it. It's one of those things where another advantage of having this fully integrated with your browsing environment, as opposed to just having ChatGPT in a tab, is that you can kind of summon it into the specific text field. So this is a feature that my wife uses quite often. She has to write emails. She's involved in a number of different things, and it just helps, like, speed up her workflow quite a lot having it there. And the thing is, it's not just, it is like your ChatGPT there. So it has your personalization, your, you know, custom instructions, all that kind of stuff, you know, behind it. So it, you know, writes the way you want it to write and all that. So that's pretty cool. Broadly speaking, we're really interested in the whole idea of how the model can interact with the web in the ways that we interact, and how it can dovetail with what you're already doing. So, you know, the agent that you invoke with slash agent in side chat is like a very all-in sort of manifestation of that, where you're asking it to take a task and go interact directly with the page and push all the buttons and do everything for you. 
And that's sort of like maybe the grandest representation of this kind of idea. But there's all these sort of smaller, in-the-moment kind of versions of that, you know, like we said with cursor chat, or just the fact that side chat has the context, and when you ask a question it can understand what you're doing. Yeah, I think that's one of the most valuable parts of it: because it's in my browser, it's logged into all my websites and it can act as me on any number of websites. And so even though it's not me, it's like it has all the same affordances and all that kind of stuff. And I'm curious about your opinion on how the web will evolve for that. Because right now, it's really designed for this bifurcation between bots and humans. There's a human experience, and there's a bot experience where you're presumed to be crawling, and there's, you know, bots.txt and robots.txt and all that kind of stuff. And this is sort of this in-between thing where it's personal and it's driven by you, but it is not you. And like, yeah, how do you think the web should evolve for that kind of thing? Yeah. So, I mean, this is a super interesting one. And I do think over time there'll have to be some notion of, you know, maybe a non-human operator that is acting nonetheless on behalf of a human for a specific request, because I see these things as quite different to, for example, web crawlers. You know, web crawlers are out there traversing websites and synthesizing across that for the benefit of many. Whereas this, you know, you could do the same thing, admittedly much more painfully, if you were to write a local shell script that would go off and obtain the content of a website, maybe issue the direct HTTP requests to the resources that you wanted and so on. And this is much closer to that, where there is your own personalized intent behind it. 
So I think just from how we think about these things conceptually, that's how I look at it. In terms of how things evolve, I think one of the most interesting things about it is at some level, like stuff doesn't need to evolve because we have, you know, computer use models that can just go off and like read the screen and click and do all that sort of thing. I think a lot of the evolution here will come from, are there ways to make that more seamless? Are there ways to make that higher performance so that we can do many things at once, just basically support sort of scaling this up? Because I think what we really want to do is have something that can do many things on your behalf simultaneously over the course of time. And that will just require a lot more sort of interesting evolution of the platform. And I think there are probably a variety of different ways to do that. One of the wonderful things about the web is that it's a very declarative medium. And so this is something that we've begun to tap into, but I don't think we've fully realized the potential of that interesting property of the web yet. Can you explain for people who are listening what declarative means and then why that is an interesting and important property? Yeah. So, you know, powering the web is this technology called HTML, Hypertext Markup Language. It's the way that all web pages are built: all of the UI that you interact with on the web today is a combination of just text formatted in this specific manner. There are these things called tags. So a button might be a button tag that encloses the text that is rendered on the button. And so what the browser does is it reads all of this and it knows that if it sees a tag that says button or input or something like that, there is a specific meaning to it. And then what's interesting, for example, with forms is that a form is the way that you do effectively like a call to a remote function with some data that the user provides. 
So when I fill out a form, for example, to run a search, there's text that I type into this field, and then I call some remote function with that text, and then I get another page. And so this is sort of inherent to the way the web is designed. And, you know, the browser itself is referred to as a user agent, specifically for this reason, in that the browser is designed to go and read all of these tags and figure out how to present it to the user in a way that is satisfactory to them. And so Atlas is sort of a user agent agent. That's right. Do you think we need different user agent strings? Is that one potential solution, or an extension of the HTML standard? I'm not sure. I think just looking at the way the web works, and thinking back to various browsers that we've worked on, there's a lot of subtlety to user agent strings. And also the situation I don't want to get into is where websites don't work because we've changed something about it. Sometimes there are sites that will check for very specific parts of that string and trigger behavior based off of that. And I know early in the Chrome days, for example, we would see, you know, behavior like that where it would cause sites not to render properly. And so in that sense, you know, with Atlas being predominantly Chromium, we feel like, from a developer perspective, they should build for it the same way that they build for any Chromium-based browser. But there's probably other signals or stuff like that that we will need to come up with over the course of time. It's just very early for us to figure out what that looks like. But your original question about how the web might change is a really interesting one. 
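To make the "form as a call to a remote function" idea above concrete, here is a minimal sketch in Python. It is not Atlas code or anything described in the episode: the markup, the example.com URL, and the `FormReader` and `build_request` names are all hypothetical. It reads the tags of a small search form and builds roughly the GET request a browser would send on submit.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Hypothetical page; the URL, form action, and field names are made up.
PAGE_URL = "https://example.com/"
PAGE_HTML = """
<form action="/search" method="get">
  <input type="text" name="q">
  <input type="hidden" name="lang" value="en">
  <button type="submit">Search</button>
</form>
"""


class FormReader(HTMLParser):
    """Collects a form's action, method, and named input fields."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # The form tag declares where the data goes and how.
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and "name" in attrs:
            # Each named input is one "argument" of the remote call.
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_request(page_url, html, user_values):
    """Combine declarative form markup with user input into (method, URL)."""
    reader = FormReader()
    reader.feed(html)
    data = {**reader.fields, **user_values}
    target = urljoin(page_url, reader.action)
    if reader.method == "get":
        # GET forms carry their data in the URL's query string.
        return "GET", f"{target}?{urlencode(data)}"
    # Other methods would send the data in the request body instead.
    return reader.method.upper(), target


method, url = build_request(PAGE_URL, PAGE_HTML, {"q": "agentic browsers"})
print(method, url)
```

Because the form declares its target and fields in markup, any user agent that can read tags, whether a browser or a script like this one, can work out the request without rendering anything.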
I think as more and more of the user agents are perhaps driven by agents or models, you know, that might end up having some bearing or impact on how developers create their content. I think at some point maybe there is an inflection point there. You know, it would be interesting to see how the ecosystem evolves. Right. You know, people create content for human consumption. In the past, we've had moments when we were pushing heavily on the semantic web. Make a web that's more understandable. Look at all the benefits that come from that. Screen readers will work better. Websites will be more machine understandable. What's happened now with these models is they're able to make sense of websites that aren't ordinarily very machine understandable. Because these models are interacting with it in the way humans do, they're able to glean the information just as humans do. And that's kind of a big unlock for the computer to help you, because it can understand these websites. But as that unlocks more and more model-based systems driving these websites, maybe, who knows, maybe the websites start changing as well. I mean, it reminds me of discussions about what happens when all the code is being created by coding agents and the coding agents are directing the coding agents. And, you know, where does everything go and what programming language ought they use and all these kinds of things? You start to wonder, maybe there's some sci-fi stuff there to kind of dream and imagine how things might evolve. I would be lying if I told you I know, but I can imagine things changing. 
Totally. That's kind of the interesting thing: right now, browser use is a really good way to bootstrap this because you don't have to change anything for it. But once you've bootstrapped and everyone's using agents, I'm sort of curious if that is actually the most efficient way. For example, Ben, you were talking about, you know, having it do multiple things for you at once. You know, watching Atlas scroll through websites is kind of slow, and there may be a more agent-native way to allow agents to interact with websites, like MCP, for example. Are you guys thinking along those lines, or are you still really just focused on the core stuff? We're thinking through a whole host of different technologies to help us drive, you know, web browsing. I think as well, beyond the Atlas team, just to think broadly about what ChatGPT is doing. Like we've also launched this app ecosystem around the product. And I think that that's sort of a very direct way in which we're encouraging developers to build for a more dynamically composed world. But that's of course in the browser, but it's maybe not like part of Atlas in particular. So I think with some of these things, we're gonna try a few things and see how it works out. Plus, you know, a lot of the technology that's powering the ChatGPT Atlas agent now, you know, has its roots in the original Operator tech preview that OpenAI put out. If you rewind the clock back to then and compare the performance then to now, you start to see, you know, sort of the rate of improvement. There's been like leaps and bounds improvements for the quality and performance. And we're kind of on that curve of figuring out how to optimize and make these things work a lot better. And I think there's a lot of exciting work ahead and opportunity ahead. And this was a meaningful step to share with people. 
And I think it opens the door to imagination and possibilities and for people to have some real things that it can help you with, like what you were talking about earlier. But there's so much more to come. You know, do you think agentic browsers will make the web unnecessary? And by that I mean, do you think there's a chance there's a future state where it actually becomes just better to stay inside of ChatGPT, and your agent is going off and doing all the browsing, and then, you know, maybe it's building like a custom website for you in real time based on what the brand or the writer wants you to see, but you're not actually seeing rendered HTML in the same way that you would have been, you know, five years ago? I don't think so myself. And maybe this is just me not being imaginative enough yet about where ChatGPT will go. But I do think that we will see people delegate a lot more to these tools, especially as they grow more powerful, and they're going to get like amazingly powerful over the next 12 months. But I still think that there's a lot of stuff that people want to do themselves. And whether it's, you know, even just things like entertainment or, you know, aspects of shopping or trip planning, I do want to be deeply involved with it. And it's probably going to start at least with some curiosity that I have. And I'm going to go out there on the web and find it. And I think one of the most exciting things about the web is it has so much stuff on it. And so I'm always like excited to explore it. And I don't think that will ever go away. Maybe it will be different. Maybe there will be folks that, you know, maybe they're like the kids today that haven't sort of lived in a world without some of the stuff. Like they may have a different view on it, but that's just mine. I don't know about you, Darren. I know. 
I think people like window shopping. I think people like browsing. I think people like that sort of thing. Or, you know, I love taking Waymo, but I also love driving my stick shift car. And, you know, there are going to be moments when both are important. There are moments when I want the Waymo, and moments when I want to be just driving myself. And I think the future is always going to be that way. And also, it depends on what you're trying to do. I think that these models can be just incredible at synthesizing things for you, and that might lead you on to the manual-mode part of it, right? And you're probably going to just incorporate these things in a very natural way in your life. You're going to go between them where it makes sense to you, and people are going to figure that out. But there's always going to be a need to interact with web apps, if you will, or applications. And the web is a tremendous medium to distribute those things. E-commerce, the web is an amazing medium for that. Yes, you could ask your model to please prepare you a shopping cart of items, but you're going to want to go look at it and you're going to want to go see things yourself. You're not just going to be like, yeah, buy that for me without seeing it, in most cases. And so, you know, I think there's kind of this blended world that we probably... There's an aspect of the AI as, you know, actually like a workmate or something that you can delegate to. And then there's an aspect of the AI as a thought partner or collaborator in that sense. And I think that these worlds, you know, it's neither one nor the other. It's kind of both. Yeah. Yeah, definitely as a thought partner. This is already the case for these models. You know, when you're researching something at home, you're asking the chatbot about it. It saves you some time. 
Figure, you know, just as an exploration, bouncing ideas off when I'm coding, I'm doing it that way. So many things I'm bringing, you know, ChatGPT in my life to help me sort through my thoughts and what kind of problem I'm working on. And I think that's sort of what I imagine, you know, I can imagine lots of parallels to that in the future. So here's something I'm curious about. When Josh and Hursh were first starting The Browser Company, I'd been friends with them for a long time. And so I actually talked to them for a little while about being the CEO. And so I spent a long time thinking about browsers for this specific thing. And one of the things that I was kind of interested in is the role of the browser in someone's life. It seemed to me at that point that mostly a browser was sort of like a taxi: it takes you from one place to another and it's supposed to get out of your way. It's very utilitarian. And we might be moving to a place where maybe it's more like a tour guide: it helps you figure out where you want to go and what you want to do, and then does some of it for you. But there's this interesting tension there, where inserting yourself between the user and what they want to do is sometimes super frustrating, and your tour guide is super annoying. And people think of browsers, I think in a lot of ways, as an invisible window pane: you don't even realize the browser is there most of the time. That's the point. How do you guys think about that? Do you think that dichotomy is useful or interesting? And how do you think about the trade-off of fulfilling the expectation that browsers are more or less invisible versus helping the user get more of what they want, even if they didn't necessarily know that they wanted that thing? Well, there's a sort of duality here present in Atlas. 
And I say this not as a punt maybe, but just to observe that we have tried to make our browser UI fairly streamlined and minimal so that you can focus on the thing that you're looking at. But then ChatGPT is sort of at the heart of the experience. So it is there. And then you can choose how much you want to engage with it. I think the value of it comes from, like, I think the big thing that most people struggle with in their day-to-day life is ambiguity sometimes. It's like, what do I do next in this situation to achieve whatever objective I have? And that's where ChatGPT is just incredibly amazing at helping with that. That was sort of the original idea that I had for this: I would just ask ChatGPT in my existing browser tab, what should I do to solve this problem? And then, like a friend, it would step through: you should do these three things. And then my question was, well, could you just do some of those for me? And it kind of went. You know, there are still a lot of things it can't do today, but we can make it do more of those things. Yeah, we get reports from users saying, hey, I asked ChatGPT through Atlas to do this thing for me and it didn't work. We're like, great, let us know. We will keep note of that and work on those things, you know. And so it is that kind of thing where you start to feel like, I should be able to ask it anything. I should be able to ask it to help me with anything. And so, you know, that's a nice start. I think one of the things about this form factor, though, is that it's very familiar to people. I think most people can kind of relate to a browser. They kind of know how to use it, that kind of thing. And so I think it's not a huge leap. I think if you go to a world where everything is intermediated to you by some other thing, you know, it's kind of hard to know what you can do with that. 
Whereas with the browser, you kind of know how to just start browsing the web and doing stuff with it. And then it's the opportunity presents itself in various points along the way, like that you can at your own choice, maybe with agent mode, or especially with agent mode, you choose when and how you want to use it. And then it's really on your terms. And of course, I think probably over the course of time, we'll find people will want to use it more and more. And so you want to help show them where that's going to work well. But yeah, our goal definitely is not to be annoying. I remember the sort of original mantra with Chrome was, you know, sort of trying to like really minimize the Chrome as it were and focus on the content. And I think we want to continue to have that be the case. But in this case, the content is whatever the user is trying to get done. It's got to be a good browser first and foremost, right? It's got to actually work the way people expect it to work. And that alone keeps us busy. And, you know, there's a lot of aspects to just that alone. I can imagine. Yeah. And then, you know, how do you sort of add on to that, right? What are the things that I might not realize about why that's hard? Because, like, I'm sitting here using your product all the time being like, yeah, browsers are basically solved except for this AI stuff. I guess that's true at some level. But, like, what makes it hard that if it's keeping you busy, like, what are the sorts of things that are keeping you busy? Oh, well, I mean, if you think about it, you know, browsers have definitely evolved over the years, right? You know, rewind back to Netscape and then think about Firefox and think about Chrome and think about when Chrome first launched. And then think about all the features that have been added since. And, you know, not everybody uses all of those features, but some people use them. And we hear from those people. 
And Atlas has a significant subset of those features from the get-go because we knew they were important. And building on top of Chromium meant that some of them we were able to expose. But many things we had to reimagine, rebuild, figure out how to build in a new way. And, you know, some things we have not yet done. For example, one of the things we heard about early on when we launched Atlas was, where are my tab groups? Right. And that's a feature that Chrome added a few years back, but it certainly wasn't there in the initial version of Chrome. And I know that when that first launched in Chrome, not that many people were excited about it or used it. It was sort of a small feature until eventually it became something that maybe a good number of people actually do care about. And we hear from those people because they want to carry their workflows over. So one way to think about a browser is that it's kind of like an embedded operating system. You might think of a browser as an app, but I think that's maybe not the right way to look at it. A browser is closer in complexity to an operating system. It has an app runtime. It has a window manager. It has, you know, various notification surfaces and launchers and other stuff. And so there's just a lot of complexity in building all of that stuff out. Now, you can short-circuit a bunch of that. I think Dan said maybe it's a solved problem. I think for a lot of browsers it is, including Atlas. Part of it is solved because of Chromium, the fact that Chromium is open source. It presents this just amazing, incredible baseline upon which to build, and you could stand up a browser very quickly that looks more or less like Chrome. I think our product ambition ran a bit deeper than that. I think we wanted to differentiate a bit more in our product UX. And so that caused us to take a different path, which we've written about. 
But that does mean that there's a bit more of this legwork for us to go and make sure all of this functionality that people expect works in the way that they expect. But we think that at the end of the day, that will give us a lot more ability to sort of shape the product in new and interesting ways. Yeah. There are various for-instances, but if you're familiar with the blog post that Ben was referring to, we run Chrome completely out of process. And so our app, the Atlas app, is a pure Swift app that presents all of the familiar browser UI through UI elements that we had to craft. Again, we're not just using the implementation from Chromium for any of the UI components. What we leverage from Chromium is the fact that it's great at rendering web pages, and all of the accessory support associated with that. You know, when it comes to various kinds of permission dialogs and whatnot, we hook into that and we present those dialogs, but in our own UI. And so there's just a lot of very table-stakes kinds of components there. Because of our choice to build the app wholesale in a Swift environment, all the UI components, I should say, we had to rebuild a lot of different things, and of course we had a prioritization there. Although the thing that's advantageous about this approach, actually a sort of fun fact about Chromium, is that much of the UI is built using C++ as a programming language, which is the thing that you did when you were building a Windows app back in the 2006 era. But it turns out to be hard to find engineers in this day and age that want to do UI development in C++. Why is that? I have no idea. Speaking as a long-time C++ developer, I'm very concerned. I love C++. What's the problem? But, yeah, there are a lot of iOS developers out there, it turns out, and iOS developers often know Swift and SwiftUI. And if you know Swift and SwiftUI, you can be a Mac developer. 
And so we take advantage of that, and it's worked really well. We've been very successful at building a team. And Swift's actually a remarkable language, very much like a modern alternative to C++. There's no garbage collector, so it's got a very streamlined sort of memory management setup, kind of like if you were just being really straightforward about using smart pointers in C++ and that sort of thing. So at any rate, I feel like this has worked out very well for us. And we're leveraging this to also bring the product to Windows. What percentage of your code is written by AI? Oh, man, I don't even have stats on that. But I know everybody's leveraging Codex and ChatGPT heavily as part of this project. I would say just like finger in the wind, if you had to guess. Majority of it, I would say. I can't pick the precise amount. It wouldn't surprise me if it was north of 75%, just in that most people's PRs start with Codex. Maybe there's some dialing in that you do through the process, but that just means in terms of raw volume, Codex has probably authored, you know, safely more than half of the net new code that we have at this point. You guys have been building browsers for many, many years. You started at Netscape, you worked together on Chrome. How does it compare being able to build a browser with Codex at your side in terms of team size, velocity, all that kind of stuff? Give me a sense for what's different, or maybe it's very similar, but yeah, how does it compare? Yeah, I was going to say we have a very small team, although we continue to grow to take on a bunch more possibilities. I think one of the things that has excited me about this world is not just the pace of development, because I think to get a feature to work right, it's always going to take a few iterations. It's how quickly you can decide that something is worth pursuing. 
And so there will be an idea that I'll have in my head, even as a team manager, where I want to see if the juice is worth the squeeze, as it were. And I will just run off and do that in Codex, and I'll have a build and I'll see if I like the thing or not. And if I do like it, then it makes sense to go and invest in that area. In the pre-Codex world, sometimes we'd spend a long time wondering if we should do this or that, because it takes so long even to prototype. Whereas Codex just makes prototyping a matter of minutes or hours for a lot of things. And, you know, for as long as we've spent in the Chromium code base across our careers, man, that thing's complicated and it's grown. And so being able to ask Codex questions about Chromium is just invaluable. And, you know, any kind of very large legacy code base is going to have so much complexity and layers to it. And so the ability to ask these agents questions about it is just unbelievably useful. But the same thing goes for figuring out how to build certain kinds of UI effects: constantly probing ChatGPT for what's the right way to set this thing up so I'll get a good animation or something like that. Just trying to learn some new strategies with Core Animation or something like this. So, like Ben said, a lot of our code is able to be created by Codex because, you know, there are a lot of straightforward aspects to what we're doing. But there are also very delicate aspects to what we're doing. We have to get in there and really study it. But these tools can be tremendous companions as we're trying to figure out, well, exactly what's the right strategy here, to kind of explore the solution space. I just can't believe how useful it is, but it's been such an accelerant for this project, for sure. 
On the topic of being able to prototype things more quickly, is there anything like weird or crazy that you have in your head that you've been wanting to try that isn't quite that, you know, you want to share with us? Oh, yeah. Let me tell you about something I've been working on. Yeah, just in the process of, so I'm like a heavy tab user and I nerd out on like the little details of how tabs work. So like the Chrome tab strip, like a lot of the way it behaves around like where tabs get inserted, what gets selected after you close them, how like the tab strip like reflows, animates when you move your mouse out of the way. So I worked on that like years and years ago, like, you know, it's almost 20 years ago at this point. And although I have had less of a direct engineering role in Atlas myself, I do like to poke at different things. And so one of the things I've been playing with, as Darren and the team work on tab groups, I have been exploring ways to just help make sure that the tab layout and scroll position remains stable as you switch back and forth between tasks. and you might be deeply buried down on the task. You might have like lots of tabs, lots of tab groups open. You might have scrolled your sidebar of tabs down to a certain position. And then I have this moment where I want to go back and check my Gmail and I like get a like a tracking link or something and I open it up. And all of a sudden my tab strip is flung back to the top, like it gets scrolled back to the top. And so this is what happens today in Atlas. And so I was able to go off and prototype a solution to that in Codex in about an hour. where I'm actually able to go and check on something without messing with the scroll position. And it's just like a transient world where I can go and look on something quickly. So that's the kind of thing where if you're interested in just making the app better, you can go off and just do a really quick exploration and determine that something makes sense. 
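The "transient" check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Atlas's implementation: the `TabStrip` class, its fields, and the `peek` / `end_peek` names are all hypothetical, and the sketch assumes a vertical list of tabs with a fixed number visible at once.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TabStrip:
    """Hypothetical model of a scrolling tab sidebar (not Atlas's real code)."""
    tabs: List[str] = field(default_factory=list)
    visible_count: int = 10          # how many tabs fit on screen at once
    selected: int = 0                # index of the active tab
    scroll_offset: int = 0           # index of the first visible tab
    _return_to: Optional[int] = None # tab to go back to after a peek

    def select(self, index: int) -> None:
        """Ordinary selection: scroll just enough to bring the tab into view."""
        self.selected = index
        if index < self.scroll_offset:
            self.scroll_offset = index
        elif index >= self.scroll_offset + self.visible_count:
            self.scroll_offset = index - self.visible_count + 1

    def peek(self, index: int) -> None:
        """Transient selection: switch tabs but leave the scroll position alone."""
        if self._return_to is None:
            self._return_to = self.selected
        self.selected = index

    def end_peek(self) -> None:
        """Go back to the tab that was active before the peek started."""
        if self._return_to is not None:
            self.selected = self._return_to
            self._return_to = None


strip = TabStrip(tabs=[f"tab{i}" for i in range(50)])
strip.select(30)   # deep in a task, sidebar scrolled down
strip.peek(0)      # quick look at the mail tab at the top
strip.end_peek()   # back to the task; the sidebar never moved
```

The design choice in this sketch is that `peek` never touches `scroll_offset`, so the layout the user left is still there when the peek ends, whereas `select` is allowed to scroll.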
Isn't that the best? Yeah, a lot of times we get feedback from people too about like, hey, I wish this thing or that thing or what if this is possible. And then invariably, somebody on the team will have gone off and tried it. And it's because it's not that expensive to try to Ben's point. It's really great. Do you all have mixed feelings at all? Like, I know a lot of professional programmers, even people that work at every, even people who are super psyched about AI, who are also like, it also is kind of a bummer that, you know, a lot of code isn't being written by hand anymore. And there's a certain craft to it that is maybe, you know, you just sort of like writing code. How do you guys feel about it? I like writing code, but I think I would, I like the sort of crafting aspect. There's something, they're almost like therapeutic about it, you know, just sort of, it's like art or something, you know, but I still feel like there's a lot of elements of that. But the way I really view this is it's a tool that will accelerate the mundane parts of the work. For example, I tediously did a refactoring across the code base that was a little bit tedious because each time each part was different. And I didn't really quite know how to prompt it through all of that. But then once I had done it, I needed to do another one. I was like, Codex, just do that for me. Do the other one. And it was a similar scale and it knocked it out within an hour. Right. And it was because it could follow my pattern for all the, all the times when I worked through all the quirks, it could just follow those quirks, those patterns. I thought that was amazing. And then, you know, like I said, if I'm crafting some animation or something like this, I'm, I, you know, Codex is going to be really useful to give me ideas, but I got to get in there and try it and see. And sometimes that, yeah, That's just how I work. But I find that it still is accelerating me quite a bit. 
And I still get that satisfaction of getting in there and crafting. I think maybe there's some version of this where we'll know that we've achieved some level of maybe even superintelligence with this stuff: if it can just go off and build something like Chromium or WebKit, something of that scale, with very minimal prompting. But I think we're a bit from that point. So I do think that there's an element where individual engineers have judgment that comes from experience, and that can sometimes see things that aren't evident in the code. Because what a coding agent is doing is it's reading the code, and it's oftentimes making really good choices about things. I'm surprised sometimes at how elegant some of the solutions that Codex can come up with are. But it doesn't always hit, because it doesn't always know some of the context that isn't stated there. And so that's why I think, you know, Darren talked about asking Codex questions about Chromium. I remember being on the Chrome team when everyone would ask Darren questions about how Chromium worked, and now Darren's asking Codex questions. There's still a need in many places, especially the more sophisticated, more subtle ones, for that judgment to be applied. But then once you have that judgment, you just go so fast, because you just tell it, like, I think you should create a cache in this format and you should put it in this place, in this package, and then it just goes off and does it much faster than you could have. And at least myself, I don't feel precious about typing that code. You know, it's more like the idea, right? One thing that's been an interesting phenomenon is that thanks to Codex, actually, we have a lot more unit tests, because the overhead of creating a unit test is greatly reduced when you can just prompt for what you want to have tested. 
And even the model is able to go and consider cases I didn't prompt for, because really I'm saying, can you unit test this API for me? I've been really impressed with this, because that's a mundane task, creating unit tests. Crafting the API, that's an interesting task. I'll work on that. And then once I have it: hey, Codex, can you create a bunch of tests for me? It's been a fabulous friend in that regard. And I think we've seen a lot of benefit from that. And tests are, of course, super valuable. Those tests help us not make further mistakes. So, you know, that's definitely been a sweet spot. Well, you were talking about getting feedback from users asking you to fix things. I have a quirk that I would love to know if there's a way to make better now, like just by me prompting better, or just to put it out in the ether: if it was fixed, it would change my life. So I run a media company. We publish articles all the time, and there's a lot of copy editing going on. And so I have an article that I wrote that's coming out tomorrow, and it's full of edits. And with the editor who does it, some of it really requires a lot of editorial judgment, but some of it is the equivalent of writing unit tests. It's just like, you know, the capitalization is wrong here and there's a comma missing there, and there's a bunch of copy edits basically that are constantly being made. And we do it in Google Docs. We have a whole style guide, and I've tried to have Atlas go through and suggest changes on the Google Doc according to the style guide, and it kind of happens a little bit, but then it just gives up and says, I did it. And it definitely did not; it did like one thing, you know. And I think partly it's that the structure of Google Docs is so complicated. It requires a lot of dexterity. But I'm curious, what do you guys think? And is that something that you could fix? Yeah. Yeah. 
Quick question for you. Are you using the agent mode to do that? Is that what? Yeah. There's been a known issue with our agent. We call it laziness, where sometimes you'll see it say things like, oh, this task is too time consuming, basically, I give up. And it's not just Google Docs; it happens for a variety of sites where the task might take very many steps or an extremely long time to run. And especially if it's having to scroll multiple times to get through, you know, imagine it could be tens, even hundreds of pages, it may give up under those conditions. So that's something that the team has been working on improving. But you're also right that Google Docs is a fairly complex web app. It's a bit different from a lot of web content. I talked before about the declarative web, where there's just, you know, a tag that you can read through and see everything. Whereas Google Docs is much more like a traditional app. It uses a canvas. It just renders text directly. When you scroll, it is the one drawing, not the web runtime. And that makes it a bit more challenging to get all of the context out. And so, yeah, I think the agent is maybe the right way to do complex things there. But the sort of laziness fixes should eventually help with that kind of thing. There are issues when the agent has to know if it should scroll, you know, things of this sort, which can be critical for a web app that's not just straight-up HTML. But I have seen it excel in some cases like this elsewhere. I've been impressed watching it tediously close ads in order to reveal the content below in order to then complete my task. So I can sort of see on the horizon where it's going to, you know, those are definitely cases of complexity where it has to interact. 
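The contrast drawn above between declarative markup and a canvas-rendered app can be illustrated with Python's standard `html.parser`. This is a sketch under assumptions: the two markup strings are invented for the example and are not Google Docs' actual DOM, and a real agent also works from screenshots rather than markup alone.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, which is all a markup-only reader can see."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(markup: str) -> list:
    """Return the non-empty text nodes found in the given markup."""
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return parser.chunks


# Hypothetical pages: one declarative, one that draws its text onto a canvas.
declarative_page = "<article><h1>Style guide</h1><p>Capitalize proper nouns.</p></article>"
canvas_page = '<div id="editor"><canvas width="800" height="600"></canvas></div>'

print(extract_text(declarative_page))  # ['Style guide', 'Capitalize proper nouns.']
print(extract_text(canvas_page))       # []
```

In the second case the document's words exist only as pixels drawn by the app's own code, so reading the tags yields nothing, which is one reason such apps push an agent toward slower visual interaction and scrolling.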
Ad-based businesses are quaking in their boots hearing about the ChatGPT Atlas agent clicking X on ads to get to the actual content that you want. Well, again, it's doing what I would have done. I agree, I agree. I'm here for it. We only have a couple minutes left. I think the one big thing that's left in my mind is, you guys have been doing this together for many years and been working on browsers for many, many years. Like, why do you care so much about this problem? Oh, God. It's the most interesting app in the world. Like I said, it's like a mini operating system, and it's all of this amazing content. So I got into the web when I was a teenager. I lived in New Zealand, which is like the other side of the world. And I felt very disconnected from the world of tech, at least at that point. I think New Zealand has grown a lot in terms of its technological prowess over the years. And the web was amazing because it felt egalitarian, in that anyone anywhere could get involved in it and they could publish a website. And then eventually, you know, when I got involved with Mozilla, you could actually go and help shape the thing, with open source and all of that. It's all kind of tied together. And I just love it. I wouldn't work on anything else. Yeah. I think for me, I have a somewhat different but similar origin story of getting involved in all this stuff. I found myself in college using Linux and feeling like, man, this system would work a lot better if the browser worked better. So I took a job at Netscape to try to make that browser better, you know? But it was so liberating. I remember that when I did things through the web, it meant that it didn't matter what computer I had, I could still do those things. I think it's sort of a fantastic idea. And it's sort of fantastic that we've had this thing. And it can be better. 
It's kind of like this thing where the web and browsers, they've been good and powerful and we depend on them, but you can always point to crufty aspects of them, the things that could be better. And so it just sort of feels like it's not done yet. It's felt that way for a long time. And so, you know, that kind of keeps me going, because there's more stuff to do. There's more to make better. Ben, Darren, this is awesome. Thank you so much for joining. Really appreciate all the work that you've done through the years, and thanks for making Atlas. It's great. Awesome. Thank you. Thanks for having us.