AI For Humans: Making Artificial Intelligence Fun & Practical

GPT 5.5 Just Dropped. OpenAI Accelerated The AI Race (Again).

29 min

•Apr 24, 2026 • 3 months ago

Summary

OpenAI released GPT 5.5, a faster and more efficient flagship model designed for long-running agentic tasks with improved reasoning capabilities. The episode covers benchmark comparisons with Anthropic's Claude, new Codex features, GPT Image 2 capabilities, and practical demonstrations of projects built with the new models.

Insights

GPT 5.5 excels at long-horizon tasks and iterative agent work rather than raw benchmark dominance, suggesting a shift toward practical capability over theoretical performance metrics
The competitive landscape now features specialized model philosophies (OpenAI vs Anthropic) similar to iOS vs Android, where different companies optimize for different use cases and user preferences
Speed-to-demo has become a critical competitive advantage—developers can now prototype functional games and applications in under an hour with minimal coding
Image generation models integrated with code generation unlock new workflows for solo developers and small teams, collapsing traditional design-to-implementation cycles
Iterative model development cycles are accelerating across the industry, with companies shipping incremental improvements faster than traditional quarterly release schedules

Trends

Shift from benchmark-driven model evaluation to real-world use case performance and user preference differentiation
Acceleration of iterative development cycles in foundation models, moving from annual to monthly or weekly improvements
Integration of multimodal capabilities (text, image, code) enabling single-developer full-stack application creation
Agentic AI systems becoming practical for consumer and business applications with improved reliability on long-running tasks
Open-source model community matching or exceeding commercial model release velocity with distilled and optimized variants
Cost optimization becoming competitive differentiator as token efficiency directly impacts end-user pricing and adoption
Browser-based and computer-use capabilities maturing to production-quality for web automation and UI interaction tasks
Democratization of AI application development through improved model reasoning and reduced technical barriers to entry

Topics

GPT 5.5 Model Architecture and Capabilities
Benchmark Testing and Model Evaluation Methodology
Long-Running Agentic AI Tasks and Reliability
Codex Features and Browser Automation
GPT Image 2 Image Generation Model
AI-Assisted Game Development
Iterative Model Development Cycles
Token Cost Optimization and Pricing
Multimodal AI Integration (Text, Image, Code)
Open-Source Model Competition
AI-Assisted UI/UX Design Workflows
Computer Use and Browser Control
Shared Agents in ChatGPT
Speed-to-Demo Application Development
Claude Opus 4.7 vs GPT 5.5 Comparison

Companies

OpenAI

Released GPT 5.5 flagship model with improved reasoning, efficiency, and long-task reliability; also released Codex u...

Anthropic

Competitor with Claude Opus 4.7 model; addressed user issues with cost overages and rate limits; positioned as altern...

HP

Sponsored episode by providing ZBook Fury workstation with Intel Core Ultra V9 Pro and NVIDIA RTX Pro 5000 Blackwell ...

Intel

Co-sponsor providing Core Ultra V9 Pro processor technology featured in HP ZBook Fury workstation for local AI inference

Vercel

Deployment platform used to host and share AI-generated web applications and demos created with GPT 5.5

People

Kevin

Co-host discussing GPT 5.5 features, benchmarks, and practical applications throughout the episode

Gavin

Co-host demonstrating GPT 5.5 capabilities including animal tournament game and image generation workflows

Jakub Pachocki

Discussed iterative development approach and medium-term improvements in GPT 5.5 development philosophy

Sam Altman

Posted tweets about iterative development, democratization goals, and competitive positioning against Anthropic

Peter Gostev

Created toy train set demo comparing GPT 5.4 vs GPT 5.5 capabilities showing significant quality improvements

Sébastien Bubeck

Created SVG unicorn demonstration using GPT 5.5 to showcase code generation capabilities approaching TikZ benchmark

Dan Shipper

Provided early user feedback on GPT 5.5 performance in creative writing and long-horizon task use cases

Jeff Ladish

Created Where's Waldo style image prompt demonstrating GPT Image 2 detailed generation and joke integration capabilities

Prins

Reported that GPT 5.5 thinking heavy delivers better answers in 2 minutes than GPT 5.4 heavy in 10 minutes

Quotes

"GPT 5.5, sometimes I become lazy and I kind of give it a very ambiguous task, but then it will figure it out."

Gavin•Early in episode

"We see pretty significant improvements in the short term, but extremely significant improvements in the medium term."

Jakub Pachocki (via hosts)•Benchmark discussion

"We love you and we want to win. We want to be a platform for every company, scientist or entrepreneur."

Sam Altman (via hosts)•Competitive positioning

"The speed to demo idea is pretty phenomenal. This was about a one paragraph prompt and I sent it away."

Gavin•Animal tournament game demo

"It's almost quicker and easier for me to make the full thing, have the designer annotate, make their adjustments."

Kevin•Image 2 workflow discussion

Full Transcript

GPT 5.5 has arrived. OpenAI's new flagship model has officially entered the chat. Smarter, faster, cheaper, thinkier. GPT 5.5 is better on long-term tasks, begins a new stage of iterative learning, which means much faster rollouts, and might just make us lazy as hell. Yeah, the updates are causing that. Before, previously, a lot of my prompts have to be very detailed or very instruction-y, kind of. Whereas with GPT 5.5, sometimes I become lazy and I kind of give it a very ambiguous task, but then it will figure it out. We will show you some incredible projects that people have already built with 5.5 and unveil Gavin's latest animal death match arena thing, which we haven't even looked at yet. That's right, there are new crazy ways that you can integrate GPT Image 2, the image model, into Codex, into the new model, and I did it and I'm gonna show you all right here. It's a ton of fun and it's GPT 5.5 day, and this is AI for Humans. That's my brain dying at this point. I got a little Superman curl. Oh, that looks good. Yeah, I know. It doesn't look like a tapeworm at all. Welcome, everybody, to AI for Humans, your twice-a-week guide to the world of AI news. And Kevin, what a week, what a crazy month we have had. The AI world continues to kind of get nuts again and again, and it's not going to slow down anytime soon. We have a new flagship model. Finally, Spud the giant potato has landed. Okay, uh, Gavin, hold on a second, where I go to my Codex and I click a check for updates and, uh, it's, okay, sorry, hold on, let me, it's not there. Let me go to my chat, oh, and let me go to my ChatGPT app. Let me go to my ChatGPT app, a new file, uh, check for updates and it's, uh, it's not, it's not there. It's not. I have an enterprise account and a pro account. Some of us can't, some of us can't be as early as others, Kevin. I did get access. I do have access. You found a magazine on your brother's floor under his bed? Yes, I did. It's rolling out to everybody today. I got it. Kevin does not have it yet.
We are recording this on Thursday. But it is a very cool new model. We need to dive in and really talk through some of this stuff. Kevin, I think the big thing that I was expecting from the get-go with this was there's a lot of hype around this. There's a lot of vague posting, as they say in the AI world. We saw the Mythos, like, kind of mythical benchmarks, the model that Anthropic says is too dangerous for anybody to use. And then we saw Opus 4.7 come out last week. So this is kind of OpenAI's answer to that. Now, just the basics right now, some very important things to know, and we're going to dive into the specifics. The number one thing that they are touting here is that this is much more reliable on long-running tasks. And later on the show, I'm going to talk about a thing that I just did an hour ago that I literally only had to kind of input a couple of things, and I got something useful out of it and it ran for about an hour. The other thing that I have seen a lot of people talk about is that it thinks more for cheaper and better. And I know that's kind of a lot to unpack there, but one of the things that was going on with the move from 4.6 to 4.7 on Opus was this idea that they were going to try to find a way to control the token costs and that it was going to get better thinking. And I'm curious to know what you think about that kind of idea now that we're at this place where things are getting better, but also these companies are a little bit trying to maybe control their costs on the other side. Well, I mean, they need to control costs for the users. Obviously they need to do it for themselves, but you know, the, the meme was the GPT 5.5 nail in Anthropic's coffin. And people were, you know, posting and reposting that and sharing that because with the latest 4.7 Opus, there was a whole bunch of users who felt regressions, right? It was, it was costing more, uh, their limits were being eaten up and 4.7 was supposed to help with that.
So when you're running these long-term agents that need to spawn sub-agents and go out and read documentation and explore the web and write things and test things, all of that takes compute. It takes tokens. And so it behooves the companies that serve this in some ways to make that more... It behooves them. It does. It behooves them. If they are minotaurs, if they are horse-like, this is, yeah. They behooved. They behooved. It's in their best interest. It's in some ways in their best interest to make these models more efficient, right? Because they can serve more, they can serve it faster. On the other hand, in some ways they're not incentivized because the more tokens these models take, the more the end user has to pay. So it's this delicate balance of trying to extract as much from the end users and their corporate bank accounts as they can while not extracting so much that people say, hey, Anthropic, we're done with you. We're now leaping to OpenAI. So this has been a huge issue. In fact, by the way, like not to get too in the weeds, but about 30 minutes after 5.5 was announced, Anthropic posted a big, hey, our bad. Y'all said... Oh, they did? I didn't even see this yet. Yeah, they said Claude Code was getting rough. A bunch of engineers that I chat with were like, hey, look at this. They basically found three major issues. So instead of gaslighting users, they said, actually, yeah, you're right. We had some issues on our end. We're gonna reset all of your limits, even though some of you might have already paid out the nose because of these errors. I digress. Let's talk. It's 5.5's day. Let's give OpenAI some flowers because yes, this model should be more optimized. Yeah. I mean, I think this is all part of the big conversation right now that we have to talk about as we talk about these new models is you would really have these two companies kind of neck and neck and getting into this. And I think, Kevin, it might as well, we might as well jump into it now.
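As a rough illustration of the token-cost trade-off described above, here is a minimal sketch of how per-task cost scales with tokens consumed. Every number in it (token counts and the price per million tokens) is a made-up assumption for illustration only; none of it comes from the episode or from any provider's actual pricing.

```python
def task_cost(tokens_used: int, price_per_million: float) -> float:
    """Cost of one agent run: tokens consumed times the per-million-token price."""
    return tokens_used / 1_000_000 * price_per_million


# Hypothetical figures, chosen only to show the shape of the trade-off.
PRICE_PER_MILLION = 10.0          # assumed price, same for both runs
less_efficient_run = 2_000_000    # assumed tokens for a long agentic task
more_efficient_run = 1_200_000    # assumed tokens for the same task, done more efficiently

old_cost = task_cost(less_efficient_run, PRICE_PER_MILLION)
new_cost = task_cost(more_efficient_run, PRICE_PER_MILLION)
savings_fraction = 1 - new_cost / old_cost

print(f"less efficient: {old_cost:.2f}, more efficient: {new_cost:.2f}")
print(f"user pays {savings_fraction:.0%} less for the same task")
```

At a fixed price, fewer tokens per task means the user pays less and the provider bills less per task, which is the tension the hosts describe.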
It's time for some benchmark boy conversation. Benchmark boys! Benchmark boys! There was a lot of people last time who were confused that it's benchmark boys and not bros, and we said benchmark bros. So just to clear that up from everybody's perspective. I think they're two separate warring factions, Gavin. Yeah, yeah. But let's get the boys off the bleachers. Let's get them in. You're a good game, but Pats, let's talk benchmarks. So I want everybody to know, first and foremost, benchmarks are a weird thing in that these are, as many people in this audience probably already know of, in case you don't, benchmarks are these numbers that are released that are testing these AI models on various specific tests and what they're good at. And every time they come out, they release a series of these numbers. And Kevin, the GPT 5.5 benchmarks are good. They are not as good as Mythos, right? And I think just to talk about this as a whole, Mythos had some higher numbers. The one thing I think that I was expecting a little bit from this, kind of how people were talking about this, was a larger jump. Because Mythos, when you saw the Mythos benchmark numbers, and again, all of this is about what it really feels like to use the model, which we'll talk about in a bit. These numbers are still very strong. We are talking about the GPT 5.5 thinking number is at an 82.7 and Opus 4.7 is at a 69.4. So that is a significant jump over that particular agentic terminal use number. But in some of the other benchmarks that have come out, like even on the one that OpenAI released, like the OSWorld-Verified, the number is almost the same, right? So anyway, this is a long way of saying it's another step. It is not the kind of thing where you're like, it's going to do everything for me, but I do think it's important. And again, I'll get to this later on to kind of talk about what I did with the model already
today, that the idea here is that you can give the model more stuff to do that's harder and it can go away and do it on its own. That is the life change that we're all looking at now. So a couple of things, like on the benchmark front, we talked about this before, there's bench maxing, which is where companies overfit their model to crush the benchmarks. And it typically takes a couple of days or a few weeks for the vibes to come through and people say, oh, this is what it excels at. And here's where it falls short. Looking at the benchmark numbers, as you said, there's a couple of places where even the Mythos is whatever. It's not out. Yeah, it's not out. So who knows? So comparing against Opus 4.7, which I have open right now in a terminal window, Opus bests this new model in some benchmarks. Yes. The early vibes coming out from like Dan Shipper and Every and whatnot, like the early vibes are that this thing is the best model in certain use cases. Yes. That for creative writing, it got a lot better. For longer term horizon tasks, which are more specific, it got better. But that for being a generalist, some people still prefer Opus. And so what I think is happening here is that companies have their own philosophies with how engineering should be done in general. Forget the way these models should work, right? And they tune the models to their preferences, to their tastes. And so we're getting like a Pepsi Cola or an Android iPhone sort of existence where it's like, look, iPhones are amazing. Android phones are amazing. Some people absolutely hate Android. Ooh, I hate Android. I hate Android. I don't want to ever see Android in my face ever. So there you go, Kevin. Oh, wow. That's right. Gavin will kick a clanker. If you're getting tacos delivered in a little rolly bot, he will kick it. He doesn't want an Android in his face. He said it. He's a clanker kicker. You're right, Kevin. Hashtag clanker kicker in the chat. No hashtag clanker kicker. I love clankers.
Put it in the comments. OpenAI CTO Jakub. I think his name is Jakub. Let me make sure I understand. Jakub Pachocki. Jakub Pachocki had something really interesting to say about this. And Sam kind of reiterated this in a couple of tweets. Basically, they are saying we see pretty significant improvements in the short term, but extremely significant improvements in the medium term. I would say the last few years have been surprisingly slow. So everybody at OpenAI is kind of saying this is a new way that they are developing. They're going to be much more iterative with rolling this stuff out, which we've also seen from Opus. And Kevin, there's been a little, there's a little piece of this in the blog post. But like this is another model that did a lot of work on itself. And I think this is just the speeding up of stuff. And as we've seen Opus ship all those features for Claude Code and other stuff, I suspect we're about to see the same thing with OpenAI as well. Please, let's go. Let's take off, friends. Let's do it. I mean, look, we even see it with like in the open source model community, right? A new Qwen model will drop and then you wait 30 minutes and then there's a distilled or fine-tuned one. And then a couple of minutes later, there's another one that's optimized for a different operating system or a different processor entirely. Like the pace of the evolution here is getting faster and faster. And it would make sense that as their foundational models get better, they're better at improving themselves as well. Yeah. And I want to call out a couple other tweets that are really interesting. Prins has had it for a little bit and said that the GPT 5.5 thinking heavy, that's, there's a different version of this, delivers better answers in two minutes than GPT 5.4 heavy delivered in 10. So like that's a little bit of what's going on here. The other thing I do want to shout out is Sam wrote a longer tweet, which was this idea about iterative development.
But then he also then said, we believe in democratization. We want people to be able to use lots of AI. We aim to have the most efficient models, the most efficient inference stack and the most compute, blah, blah, blah. So this is definitely a shot that feels like it's being taken at Anthropic. And then I love at the end of this, he says, we love you and we want to win. We want to be a platform for every company, scientist or entrepreneur and person. My whole career has been largely about magic of startups. And I think we're about to see that magic at hyperscale, but we love you and we want to win. So we have a combination of things going on here. This is a little bit of interesting stuff that's happening overall. The other thing we should talk about is Codex, right? So not only is this new model out, but Codex actually dropped a bunch of new features, which is really cool. And I just used Codex with these features. I'm sorry, Kevin, I know you don't have it yet. Keep updating and see if it arrives. Better browser use, better docs. One of the experiences I had with this, Kev, was in Codex in the past, I don't know if you've had this experience, when I'm trying to build something, the browser is kind of funky and the in-browser, which just came out like a week ago or a week and a half ago, sometimes pops up, sometimes it doesn't. This time it was really solid. Like it popped up. It showed me as it was working. I saw the little arrow moving around within the Codex window, all very clean. So to me, that's a pretty big deal. And this also follows up on the announcement that kind of didn't get enough hype earlier this week, which was about the shared agents in ChatGPT. Did you see this? Yes. Yeah. Yeah. Yeah. So that's another way that like, you know, you can open the door to specific agents that have use cases within either Codex or ChatGPT.
This whole world of like having things that can be spun up, it feels like to me there's a little bit of like a setting of the table for things like an OpenClaw-like world where you can go out and get all these agents that can do stuff for you, but maybe living within the OpenAI world itself. Well, that's exactly what it was. That's the OpenClaw-ification of the Codex app, was adding these agents. So if you want an agent that just does email triage for you, now you can easily set that up. If you're running a small business and you need a dedicated agent to look at your CRM and check the status of your A/B testing of your ads and your marketplace, like now you can have all of these dedicated agents that can talk to each other and be shared in the ecosystem. The browser and computer use, specifically the computer use on the Mac version of Codex, is incredible. I think it bests the Anthropic Claude plugin. It definitely does. I think it 100% does. Yeah. It seems way faster, seems way more capable. Odd to me that Sam Altman got on a live stream this week and it wasn't for GPT 5.5. It was for image generation. So that just goes to show you how powerful Tuesday's announcement was, how powerful the new Image 2 model is. Every day I'm seeing people generating wild stuff with Image 2. Like, generate a birthday cake that has code on it that when rendered actually makes an image of a birthday cake was one of the ones that I saw that kind of blew my mind. Or complex mathematical functions integrated into like children's rugs, like that they would play on. Like weird, weird stuff. And when you start pairing that with a model like 5.5, now you start unlocking some really incredible capabilities. I'm very excited to talk about that and show off some really cool examples of what's been made with 5.5 with the image model. But first, a message from a new sponsor. I'm about to do something I never thought I'd be able to do with a laptop.
And that's because I have this HP ZBook Fury workstation to work with. There are powerful computers. And then there is this. We are very thankful to HP and Intel for sponsoring AI for Humans this week and sending us this absolute beast of a PC. This thing is powered by an Intel Core Ultra V9 Pro processor and it came ready to go right out of the box. I've been using it for everything: local AI, AI video, running Claude Code, and even spinning up local LLMs for my own private research. It's that powerful. I'm going to spin up ComfyUI for local AI image gen right now. So I've installed a bunch of local models like Qwen and Flux, which are free to download and free to generate. And I'm going to start making something really important. Images for my new AI series, the Raccoon Bachelor. Here's why this matters. Because I'm doing this locally and the models are open source, I'm not paying per generation. I'm not waiting in a cloud queue and I'm not sending anything to anyone else's server. That's at least a subscription or two I'm saving per month and I can just make a lot more. And because this bad boy has an NVIDIA RTX Pro 5000 Blackwell GPU, you can see just the size of it, it's crazy. It can handle the bigger models and it has 256 gigabytes of RAM, a crazy powerful Intel CPU. I am running stuff that used to require a dedicated desktop computer on my laptop, which is pretty incredible. And now thanks to this computer, I've got all the images I need to make that little raccoon bachelor or break the raccoon ladies' hearts. Check out the link in our description if you want to spec out the ZBook Fury. And thanks again to HP and Intel for sponsoring AI for Humans. Well, as much as I love words from sponsors, Gavin, I love words from our dedicated followers. And you can leave them as a comment below. And if you don't want to say anything, I guess that's chill too. Just like and subscribe, leave a five-star review.
And if you want to back us on Patreon or buy us a coffee, you can do all that too, AIforhumans.show. That's our site. But sincerely, thank you to our sponsor and thank you to everybody who helps grow this operation each and every week. We appreciate your time. That's right. And last week, thank you to everybody who said Kevin is beautiful at the end of the show. I see you YouTube commenters. There were a lot of them, Kevin. You're very happy. Okay, let's talk more about 5.5 because there are some really cool examples I've seen already and I'm going to show off my 64 animal tournament game. First and foremost, Kev, there was a really interesting demo from Peter Gostev, which he made. He asked 5.5 to make a toy train set, and GPT 5.5 heavy kind of crushed it. What was really interesting here is seeing he compared it to what 5.4 did. And you can really get a sense of like, okay, these are the kind of different quality sets of the model. Like if you're not watching it, it's just very, very detailed. It's all being done like in a browser. He can kind of spin around it. And it's just a much less detailed version in the 5.4 version. And I don't know, it's one of those cool things that lets you see what the differences are a little bit. Yeah, I love these same prompt tests. And for those that are just getting the audio version, the 5.4 is cool, right? It's like a table with a model train set literally chugging along. And then you can jump into like the conductor seat and look first person through it. But it, you know, it looks a little primitive. It looks like an old Roblox type game. When you jump to the new 5.5 High, the town that the toy track is going around is fully fleshed out. There's buildings, there's trees, there's a little river with a boat going through it or whatever.
And when you jump to the first person mode, you have controls that make sense and they're labeled appropriately. And it's like, just staring at it and going like, oh, that's a cool prompt. I like that comparison. It makes my head spin about what this test is going to look like in a year from now, Gavin, or six months from now, right? Sure. Yeah, sure. But like the whole room is going to be modeled and you'll be able to go in and take full control and it will be multiplayer and it will run in browser. And it's just, I like, I I'm so excited for this near future. I know I had a moment of that this morning thinking about like a year, year and a half ago when you and I would be excited about what these new models would look like. And the fact that we can just spin up these things so much faster is crazy to me right now. Another cool thing from Sébastien Bubeck, who actually works at OpenAI, put a unicorn together with an SVG. And he said, basically, he says, GPT 5.5, not fully saturating the TikZ unicorn test yet, but getting awfully close. He says, this is actual TikZ code. I find it so unbelievable that I'm putting the code below for anyone to verify for themselves. So what you're seeing here is a code-generated unicorn that kind of looks like a My Little Pony, but it's definitely a few far steps from what we used to see with code-generated graphics. Like, even the unicorn looks a little demure. Like, it's kind of, like, sadly winking at us. Or maybe not winking. Maybe it's closing its eyes. It could be sleeping. I don't know what you think, Kevin. Is it winking? We don't see the other eye, so who knows? Gavin, I actually don't want to explore this. This is a weird unicorn Rorschach test for you. And you're like, move on to your thing. I actually love the way the unicorn is playing coy and it's subtly kind of just, it's a little wink towards me and letting me know, Gavin, everything you're doing is working these days in the gym and you're really looking.
I came across a UFO tank game by, in the world of AI. And this was like supposedly like a one shot and there is a 3D tank that you can drive around a map as little UFOs whiz about and shoot at you and you can shoot at them. And when you make a collision with a bullet, pew, pew, UFO go bye-bye. This is just like, again, like the Newgrounds of gaming. I'm sure there's a thousand startups that are going after it, but the games are going to start being good enough. They're going to actually want to participate in them and create them and remix them. Yeah, so let's talk about that. The project I gave 5.5 this morning was a classic project that I have given lots of times to AI models. Kevin will remember this well. You in the audience may be new. You may not. I had an idea forever ago. I think it was two and a half years ago, which was I wanted to make a March Madness tournament of the world's most dangerous animals. You take 64 of the world's most dangerous animals and you fight them one by one until there's a champion. The goal here is you as the player play one animal and then you go through this. And this morning, literally this is 45 minutes ago. I gave it two additional prompts for this. I said, go make this as a card battler. I gave it a pretty complicated prompt to start just so it had the information on it. But Kevin, the big difference here is I gave it the image gen tool in Codex. So what I said to it was like, hey, don't just give me, because often what happens with this when you try to get it to make a game, it'll give you like some sort of, almost looks like a website. I said, don't do that. Pull up images. So you're going to pull it up for the first time right now. I've pulled it up earlier. It's not great, but it's also like amazing that I made this in 45 minutes. Okay. So I'm at the dangerous animal madness site. I love that there's some particle effects going on in the background or whatever, right off the rip, Gav. Nice. Okay.
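The tournament structure described above (64 animals, fought down to a single champion) can be sketched as a single-elimination bracket. This is only an illustration of the bracket arithmetic, not the game Gavin built: the animal names are placeholders and the winner of each fight is picked at random, both of which are assumptions made for the example.

```python
import random


def rounds_needed(entrants: int) -> int:
    """Number of single-elimination rounds for a power-of-two field."""
    if entrants < 2 or entrants & (entrants - 1):
        raise ValueError("entrants must be a power of two, at least 2")
    return entrants.bit_length() - 1


def run_bracket(entrants, pick_winner):
    """Pair off neighbours each round until one entrant is left.

    Returns (champion, rounds_played).
    """
    field = list(entrants)
    rounds = 0
    while len(field) > 1:
        field = [pick_winner(field[i], field[i + 1]) for i in range(0, len(field), 2)]
        rounds += 1
    return field[0], rounds


# Placeholder roster; the real game uses named animals.
animals = [f"animal_{n}" for n in range(64)]
rng = random.Random(0)  # seeded so the sketch is repeatable
champion, rounds_played = run_bracket(animals, lambda a, b: rng.choice((a, b)))

print(f"{len(animals)} animals -> {rounds_played} rounds, champion: {champion}")
```

With 64 entrants the field halves six times (64, 32, 16, 8, 4, 2, 1), so a player who goes all the way fights six matches.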
Win six ridiculous fights with the animal the wheel gives you. I'm going to spin for my animal here. And, uh, oh, I got Chaos Intern, which is, uh, oh, I got, which is a chimpanzee. Parking Lot Menace is a goose. So again, play with yours and we'll keep you. I'm going to enter the bracket here. So I see the dangerous animal madness bracket. I'm using Chaos Intern versus the Buzzkill Committee, which is a tsetse fly swarm. So let's see if I can win. I'm going to zoom to the match here. Chaos Intern versus Buzzkill Committee. I'm entering the match. Opponent intent clamp down, attack eight, block nine. Let's go. Come on. Oh, I have to choose my hand, right? You have to choose your hand. Yes, you have to choose your hand. It kind of plays out like Slay the Spire or another game like that. Well, I guess I'll brace for weirdness, which is a defense move, and then I'm gonna do a wild swing. Okay. Yeah, yeah, take that, tsetse fly swarm. Okay, I guess I gotta end my turn now. All right, this is, um, this is actually too complex for me to just shoot from the hip and start clicking. Yeah. Like, dude, I don't want to actually lose here. No. Well, so here's an interesting thing about this. So basically, again, it's the first time I'm testing it or seeing it. What's very cool about this is it's the speed to demo, right? Like that's what we've been talking about here before, the idea that you can get from zero to like, this is probably, I'd say, maybe 25 to 50 percent through a game. But the idea that you can play it right away makes a huge difference. And, oh dude, I'm OP. Yeah, I'm OP. Sorry. Yeah, yeah, no, no, you go ahead, you go ahead. I'm just, I am crushing this tsetse fly. But you get the sense of what it means to like be able to demo something quickly in your brain and just drop it out. This was about a one paragraph prompt and I sent it away. It worked for about a half an hour for the first time. It came back and I said, do it a little bit better. Make sure you're using the image gen tool.
It worked then for 45 minutes and came back with this. Now, it's not perfect yet, clearly, but the speed to demo idea is pretty phenomenal. Chaos Intern survives. Choose one card. I can choose an evasive flop, panic geometry, or double tap dance. Woo, Gavin. So all of that was stuff that was just kind of prompted in. Now, again, there's going to be a lot of balancing in a game like this. I'm playing a lot of Slay the Spire 2 right now. That's part of where this inspiration came from. But like, you get the sense that like you, the person at home, I am dummy. I do not have coding abilities. But the fact that you could spin up a demo like this very quickly and actually get it playable and get it so, I mean, it's not pretty yet, but like it's not ugly, right? Like this idea that like it's not just like a prototype that looks like, you know, boxes knocking to each other, that sort of thing. The fact that, uh, there's any graphics on screen this early in, in what would be a development cycle is wild. The fact that that's deployed and playable and you can share it is also wild. And I'm assuming you just told it, hey, go put this website up on Vercel or whatever, and it deployed it for you. Yep, that's exactly right. So I even, while it was working, I said, hey, I steered it. You know, I said like, hey, throw this up on Vercel so I can share it with Kevin in the middle of this conversation. So again, speed to demo, capability, long form agents, like all of this stuff is finally coming together. Let's focus in on GPT Image 2 as well because it has only been out for a few days. I am amazed at how good it is at certain tasks to the point where like it has disrupted my usual workflow, which, you know, I'm working on a feature for telly right now. I typically make a PRD. I talk with our designer. I make some mock-ups, whatever.
But now, with the speed at which I am iterating, it's almost quicker and easier for me to make the full thing, have the designer annotate it, like make their adjustments, because they're better at design than me. But then I go and implement it as well, like that. And that just changed this week. I had a crazy moment. I'm consulting with a friend of mine on some stuff for him, and he had an idea. So I spun up, me, not a coder, I spun up the demo, I spun up the design. And one of the things you can do with Image 2 is so fascinating. It's like, hey, give me a website, what this might look like, right? So you get a file back. But the thing that I did, Kevin, which kind of blew me away: because when it tries to implement that file, sometimes it's better or worse at knowing all the different elements on the screen, you can ask GPT Image 2 to send you just the elements on the screen. So, like, in my thing, it had a really good logo and it had a couple other things that were cool. I said, give me all that stuff as individual elements. And then you put that in your file and you let it build. You can do it all. It's like a one-person shop. It really is shocking. So I had it do the mock-up of this, like a product that I'm building, basically. Uh, and then I said, oh, uh, go ahead and install Hyperframes or use Remotion. In fact, use both, and then make the mock-up move like this: I want the icons to come in, I want things to highlight, animated, and then give me, like, a 15-second video. It went off, this was 5.4, but it went off, uh, and did all of that using the GPT, uh, the Image 2, uh, image, and it looks great. It just looks like a fantastic little mock-up. And I mean, that's, like, okay, whatever, that's me being actually productive. Let's get to the Where's Waldo games. Yeah, well, there's a bunch of people making Where's Waldo versions with this, because one of the things it can do is very detailed, very specific, larger prompts.
There's a good example from Jeff Ladish, who made a University of Berkeley anti-AI Where's Waldo sort of thing where there's a bunch of jokes. And then I stole his prompt and used it to make a thing about the NFL draft. Today, if you're a football fan, you know the NFL draft happened. So I had it make one of these things. And what was interesting for me was going, like what we said last time on the show, the little jokes and little things that it adds are so interesting. And this NFL draft image I made is so complicated. There's so much stuff going on in there. And now, not all of it's perfect. There's a few things that are wrong, but it's making jokes like a Mad magazine sort of thing, right? Like, it almost feels like it's this giant thing that somebody drew and wrote a bunch of stuff on. So it is a shocking moment when it comes to what's possible with that. And then when you compare and contrast it with what you can do with the code, those two things together just, like, overpower a person, I feel like. Yeah, I saw your draft image and I zoomed around and was, like, looking at things I don't understand. Like, this is me looking at actual code. I don't understand half the references, but I can tell that every little frame is packed, like every little pixel is playing some sort of joke or being part of, uh, like, referential humor. I don't even know what some of these things are. Well, the funny thing about it is there's a couple things it gets wrong, like one of the teams, it gets the wrong team, stuff like that. But it goes through, there's 10 draft picks in the middle and it's the actual people. When I created the image, I said, like, go find who these draft picks are and put little jokes about each of them in. And some of them have very specific jokes, but then all around the edges, there are other jokes about what happens during the draft or things like that.
So anyway, this is a very fun prompt to try for yourself for whatever world you live in. Like, it's probably a good thing if you're a corporate person. You could do a thing where it's like, make it about my company. It probably knows a fair amount of stuff, you know, and you can make these little jokes. It's a very cool thing to show off. I do want to say one more thing, Kevin. I sent this image to my daughter last night, because my daughter was like, oh, OpenAI's new image model is interesting. And she made a picture of herself and did some stuff with it. When my daughter was a kid, hopefully they don't kill me for telling the story, there was a character that she created called Mr. Brewster, where she wore this kind of white wig and went around as, like, an old man character that she made. She was very embarrassed by that character. We loved it. My wife and I thought it was one of the funniest things in the world at the time. She was probably eight or nine. She's always had this thing of, like, oh, you guys thought Mr. Brewster was so funny. It was stupid, but I think it's funny. Anyway, I sent her back this image and I said, hey, you wouldn't believe what I saw at Whole Foods. And I made an image that was Mr. Brewster's Wonderful Concoction. Like, it was kombucha. And she's like, wait, what is that? And I was like, did somebody take our name? It looks like a real end cap with all the different, you know, different kombuchas available in a Whole Foods, branded appropriately. That's amazing. She actually thought that someone made Mr. Brewster's for a moment? Yeah, she thought so. Yeah, so my daughter said, I thought you saw this in the store. And my other daughter said, is this AI? Like, it's just an interesting thing at large. So this is where we're at right now, folks. All right, everybody, that is it for now. We will see you all next week. Thank you for joining us to play around with 5.5. Oh, I still don't have 5.5.
Kevin still doesn't have it. He'll have it soon. All right, bye, y'all. We'll see you next week.