
GPT 5.4 Just Changed Everything
The hosts analyze OpenAI's GPT 5.4 release, which combines coding and knowledge work capabilities in a single model, positioning it as direct competition to Claude. They demonstrate a sophisticated content creation workflow that scrapes competitor content and mines additional insights from Reddit, YouTube, and Twitter to create higher-quality SEO articles.
- GPT 5.4 represents OpenAI's strategic shift from targeting developers to knowledge workers, a much larger market
- AI companies are building user lock-in through tooling and workflows rather than just model superiority
- Creating quality content requires going beyond Google's top results to mine insights from platforms like Reddit and YouTube
- The AI subscription model works like gyms - profitable because most users don't maximize their usage limits
- Companies are shipping buggy features quickly and crowdsourcing QA to users for faster iteration
"AI is completely taking over work. So we tested it against Claude code for real world work use cases inside our online business."
"I see a lot of people who just don't see it right now who don't take this with gravity and don't realize that they're going to be useless if they don't do this."
"The models are getting significantly better. GPT 5.4 is a thorough model, much more thorough than Claude in many ways."
"It's not optional, but yeah, overall this is going forward. What's impressive is we're also not hitting a wall."
OpenAI stopped selling a chatbot. GPT 5.4 launched with native computer use, million-token context and agentic workflows baked in. The first heading of the announcement, though, was what caught my eye. It wasn't coding, but knowledge work. This is the hot topic of 2026 right now. AI is completely taking over work. So we tested it against Claude Code for real world work use cases inside our online business. We'll also go through a decades-old content quality problem and the AI skill that finally solved it. Plus Google's Nano Banana 2, while not perfect, rethinks AI image generation from scratch. My co-host today is Gael Breton, my co-founder at Authority Hacker, where we cut through all the hype and noise and share what business owners and knowledge workers actually need to know about AI. So Gael, welcome back. You were on vacation for three and a half days and you said to me that so much has happened and you don't know how people keep up.
0:00
I was no, a little bit more like four and a half days maybe. Yeah. I mean during the weekend, sure, yeah. It's just going fast. It's just, it's hard to test everything and I do this full time.
1:01
How the hell is everyone else supposed to manage to keep up when they have, you know, businesses to run and things like that?
1:13
Well, that's why we do accelerator, right? So that we can summarize it for people. But yeah, it's changing fast and the depth of everything is kind of going deeper. Like, you know, if you go into image generation, video generation, terminal stuff, coding, et cetera, each thing is kind of its own field developing deeper. And so it feels like when computers came to the market: you'd be an IT expert, and then the IT expert disappeared because everyone uses a computer for a million different things. It kind of feels like AI is going that way right now, where it's going to be very hard to be good at absolutely everything it does. And you probably will have to pick your fields. You can touch a little bit on everything. But yeah.
1:19
So let's talk about GPT 5.4 because this has been a change worth noting. Like it's changed how you work day to day basically. So OpenAI made this announcement: GPT 5.4 Thinking and 5.4 Pro models were being released. They claim it brings advanced reasoning, coding and agentic workflows into a single model. And they said something which caught my eye as well. They had GPT 5.3 Codex. Can you just tell us the difference between a Codex model and a non-Codex model?
1:56
Codex model is optimized for coding and it's not very good outside of coding. It's not a very good chatbot. It's like it's smart and everything, but it's kind of like they make the model smaller and they focus it on the coding training and it's missing like real world general knowledge, etc. Like it has some but has less. So it's cheaper to run.
2:31
My understanding is that developers were using this inside a tool like VS Code to code. But we've seen this kind of rise of knowledge workers using VS Code and Claude Code. And that's like the hot topic among certainly small business owners in basically every group and community I'm in at the moment. So my interpretation of this is it seems like they've realized there's a big market there and they want to kind of make their model more adapted towards that target group of people.
2:49
Actually GPT 5.4 is a Codex model and a normal model, and it seems to be a bigger model as well because the API costs are higher. They increased the price of the API. So I think it's 2.5 dollars per million tokens versus 1.5 or something. Not sure about that. And then it's $15 versus $12, versus GPT 5.2. So it's a noticeable increase and it's just a bigger model, you know. Yeah, Codex was this thing for devs, right? I would tell people, use Claude Code if you're a general worker and use Codex if you are a coder, and Codex is pretty good and the limits are higher. And now with this GPT 5.4, two things have changed. One, you can interchangeably use Claude Code and Codex for pretty much all the knowledge work stuff. Arguably Claude is still a better writer. So if you write copy, if you write emails, et cetera, I still think Claude is ahead. For everything else, not sure. I actually get better executions on some skills with GPT 5.4 now.
3:20
And can you explain what you mean by better executions there, to give people a taste of what you mean? Because everyone understands better copy or worse copy. But what's a better execution?
4:18
For example, I have a skill that generates LinkedIn carousels that I'm working on, and it's quite interesting.
4:28
Those are the kind of like generating a sequence of images. So you do those kind of mini slideshow type things.
4:33
Yeah, and it's quite hard to generate with image models like Nano Banana, because they're very copy heavy and it's very hard for the image model to keep the heading in the same place, et cetera. So when you slide, it's janky when you use image models. So my friend Laura gave me the solution she uses, which is very smart: she generated the slideshows in HTML and then had the model take a screenshot of them to make an image. And so that's the approach that I've been using. It's working pretty well in terms of layout shifts at least. But then in terms of using this tool, GPT 5.4 has been way better. Running the same prompt on 5.4 and Opus 4.6, I consistently have better results with 5.4 now. So that's an example of a process that I've built into one of these coding agents for marketing. I executed it on both models, and 5.4 comes out ahead almost every time, even though it doesn't write as well. Its understanding of the whole thing and the way it thinks of the hook, etc., and even the way it processes the technical side of things, is just better.
4:40
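The HTML-then-screenshot approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual skill: the 1080x1350 slide size, the CSS values and the function names `render_slide` and `render_carousel` are all assumptions. The point it shows is that every slide shares one stylesheet with absolutely positioned text, so the heading cannot drift between slides. The screenshot step is left out; it would need a headless browser.

```python
from html import escape
from typing import List, Tuple

# One shared stylesheet for every slide (all sizes are illustrative assumptions).
# Absolute positioning pins the heading and body to the same spot on each slide.
SLIDE_CSS = """
body { margin: 0; }
.slide { width: 1080px; height: 1350px; position: relative; font-family: sans-serif; }
.slide h1 { position: absolute; top: 120px; left: 80px; right: 80px; margin: 0; font-size: 72px; }
.slide p { position: absolute; top: 420px; left: 80px; right: 80px; margin: 0; font-size: 40px; }
"""


def render_slide(heading: str, body: str) -> str:
    """Return a standalone HTML page for one carousel slide."""
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><style>{SLIDE_CSS}</style></head>\n'
        f'<body><div class="slide"><h1>{escape(heading)}</h1>'
        f"<p>{escape(body)}</p></div></body></html>\n"
    )


def render_carousel(slides: List[Tuple[str, str]]) -> List[str]:
    """Render (heading, body) pairs into one HTML page per slide."""
    return [render_slide(heading, body) for heading, body in slides]


pages = render_carousel([
    ("The hook", "Why image models shift the layout between slides"),
    ("The fix", "Render HTML first, then screenshot each page"),
])
```

Each string in `pages` could then be saved as its own file and captured at a fixed viewport, which is where the consistent layout comes from.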
They've also said that it now has a 1 million token context size. Does that even matter for most knowledge workers these days?
5:48
Yeah, it can matter, but you get charged for it, right? It just uses up extra usage, which is better than Anthropic, because if you want to use the 1 million context window on Claude, you have to pay extra money on top of your plan regardless of how much of your usage limits you have left. Whereas OpenAI will use your usage limits.
5:57
I think it's worth talking about how people are actually paying for this because you talked about the input output API costs, but is this available to your $20 a month ChatGPT user, for example?
6:16
It is, but it's going to use your usage very fast. That's one thing as well. There's another story that I couldn't develop earlier, which is that the consensus is you get much higher limits with Codex than you get with Claude. However, since the release of GPT 5.4, the limits get used a lot faster on Codex than they did before. And so I think that story might be equalizing a little bit right now, and the usage limits tend to go a lot faster. I just have a $20 plan on ChatGPT because I use mostly Claude, and I used to be able to use Codex a lot, and I was like, I don't ever need to upgrade. Now I used 50% of my weekly limits in one or two sessions of Codex.
6:27
Why is that, though? Is that because you're just utilizing the model more? Is it because of context windows?
7:13
It's more expensive. It's a more expensive model if the API cost is up. They also reduced how many tokens you get for your $20 subscription, and it's quite a bit more expensive.
7:17
The subscriptions are basically like they're hiding the true cost behind it. They just say, you know, you pay this amount and you get like certain amount of usage for it.
7:28
Yeah, you get a certain amount of tokens, right? They're like, you get this many tokens that you can use across the week, and then per session you get this many tokens and just you get a discount, but you don't get to use it exactly how you want. You have all these session systems, et cetera, and then that's how it works.
7:37
I think most end users are happy with that, like fixed price, you know, it's predictable. There's not going to be any runaway.
7:51
It's so much cheaper as well. You know, if you pay for a $200 plan with Anthropic, I think it's $3,000 of API if you use the whole thing. So it's crazy subsidized.
7:57
So are we going to head to a world eventually where this is no longer subsidized and we are actually paying $3,000 a month for what we currently get for our couple hundred bucks in Claude?
8:10
Somewhere in between. Right. Because obviously there's this thing, it's kind of like gyms. Gyms make money from people who don't go to the gym, and these subscriptions make money from people who don't use them. And it's like, how often have you maxed out your weekly usage?
8:20
Never. But you have. Many times.
8:34
Yeah, but the point is, you know, for one user that maxes out, there's five that don't. It's kind of like an insurance system almost. And it works for most people. So I don't think you'll end up paying API price. Will you end up paying $1,000 for what you pay $200 today? Yeah, probably one day. You know, you still have a big discount, but is it going to be as big of a discount? Probably not, but they're hoping...
8:36
That's also assuming we don't get more efficient as well.
9:01
That's the thing. There's efficiency gains and then price drops overall. So it's like there's also kind of like the level of intelligence you have today, you'll probably never pay more, you'll probably pay less, but to get access to the best possible models, you will probably pay more. Both of these are possible at the same time.
9:03
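To make the subscription arithmetic from this exchange concrete, here is a small worked sketch. The $200 plan, the roughly $3,000 of API-equivalent usage, the possible $1,000 future price and the "one in six maxes out" ratio are the speakers' own rough figures. The 5% usage level for lighter users is a purely hypothetical number added for illustration, and API list price is not the provider's real cost, so none of this says anything about actual margins.

```python
PLAN_PRICE = 200.0         # monthly plan price quoted in the episode (USD)
API_VALUE_AT_MAX = 3000.0  # speaker's rough API-equivalent value of a maxed-out plan
FUTURE_PRICE = 1000.0      # price the speaker thinks the same plan might reach one day


def discount_factor(plan_price: float, api_value: float) -> float:
    """How many dollars of API-priced usage one subscription dollar buys at full use."""
    return api_value / plan_price


def average_api_value(heavy_share: float, light_usage_fraction: float,
                      api_value_at_max: float) -> float:
    """Average API-equivalent usage per subscriber.

    heavy_share: fraction of subscribers who max out their limits.
    light_usage_fraction: share of the limit everyone else uses (hypothetical input).
    """
    blended = heavy_share + (1.0 - heavy_share) * light_usage_fraction
    return api_value_at_max * blended


today = discount_factor(PLAN_PRICE, API_VALUE_AT_MAX)    # 15x at today's price
later = discount_factor(FUTURE_PRICE, API_VALUE_AT_MAX)  # 3x at a $1,000 price

# "For one user that maxes out, there's five that don't": heavy_share = 1/6.
# The 5% figure for the other five is an assumption, not something from the episode.
per_subscriber = average_api_value(1 / 6, 0.05, API_VALUE_AT_MAX)
```

The gym analogy lives in `average_api_value`: the lower the typical usage of the subscribers who never hit their limits, the lower the average usage the provider has to serve per plan sold.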
They made a big thing about this having native computer use abilities. Is this a direct kind of reaction to OpenClaw, Claude Code, that sort of thing, or what?
9:19
I mean, they bought it, right? They bought OpenClaw, so it's theirs right now, but I think it's more like multimodality in general. And the idea that the model does the work. The idea we're going for is, instead of getting text back from a model, it does the work now. It uses the computer, it does the thing. And ideally what I want is to walk in the forest and talk to my phone and it just does the work. That's the world where we're going, basically. And then I never have to sit in front of a big screen again. Or maybe I can sit an hour a day just reviewing what I did while I was talking to my phone. That's where this is going. For this to happen, the model needs to be good at using a computer, using a browser, using all of that. There's many interfaces that are not made for agents, where even if you were pulling the HTTP or downloading the code, it doesn't work. How do you filter on Amazon, or use interactive interfaces, et cetera?
9:29
I read something that previous OpenAI models that lacked this ability, and this was, I think quite speculative, that lacked this ability, had some other tool or agent that was doing the computer controlling. So it wasn't like baked into the core model. So it was essentially calling a service to do the interaction request.
10:21
Yeah, I mean, they always had these tools, etc. It's just that the more you reduce all the layers, the more token efficient it becomes and the faster it becomes. So one thing that's really good with GPT 5.4 is it's really fast at doing all these things and it costs less money because it uses fewer tokens to do the thing. So when you see it use a browser, for example, it's just a lot, a lot smoother. But overall, right now, with GPT 5.4, in my opinion, the main story is they're back on track. It's better than Claude on some things and it's cheaper in the API. It's quite token efficient compared to Claude models as well. It's cheaper than Sonnet on the API for the same tests. And we're back to a world where you can mix both, whereas the beginning of this year was very much Claude everything, because GPT was a bit shit as a chatbot if you're not coding. But now it's like, yeah, I like it a lot. I'm considering getting a $200 subscription on ChatGPT just for that, actually.
10:39
And if you're using VS Code to run Claude Code, it's pretty easy to switch over, right? What's the process there?
11:34
There's no process, because these things just read your local files, right? You can install the extension the same way. Most people that I know don't run it in the terminal, they run it in the extension. So it's like a plugin on WordPress, but for VS Code, and then you just connect it and it just reads all the files in the folder you have open in VS Code. Claude does it too. The only thing that you need to migrate is this: you have this CLAUDE.md and then you have this .claude folder, and you just need to rename or duplicate your CLAUDE.md and call it AGENTS.md. And it works the same way, basically. And then Codex, from my tests, just picks up the skills from the .claude folder anyway. So when I open Codex in VS Code, you use the dollar sign on Codex to call a skill. So it's not the slash command, it's a dollar sign. But you can just call the Claude skills and it just works for me.
11:42
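The one migration step mentioned here, duplicating CLAUDE.md as AGENTS.md, could be scripted along these lines. This is a minimal sketch under the assumption that both files sit in the project root; the function name `duplicate_agent_instructions` is made up for illustration. It copies rather than renames, so Claude Code keeps its own file.

```python
import shutil
from pathlib import Path


def duplicate_agent_instructions(project_dir: str) -> Path:
    """Copy CLAUDE.md to AGENTS.md in the project root and return the new path.

    Assumes both files belong in the project root. An existing AGENTS.md is
    left untouched so nothing written by hand gets overwritten.
    """
    root = Path(project_dir)
    source = root / "CLAUDE.md"
    target = root / "AGENTS.md"
    if not source.is_file():
        raise FileNotFoundError(f"No CLAUDE.md found in {root}")
    if not target.exists():
        shutil.copyfile(source, target)
    return target
```

Running `duplicate_agent_instructions(".")` from the folder you have open in VS Code would leave both files side by side, which fits the "use both interchangeably" setup discussed in the episode.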
Interesting. So if you're using VS Code, it really is no hassle at all to switch, and you can use both interchangeably when your limits run out and stuff.
12:25
OpenAI is rumored to launch a $100 plan soon. So I think $100 on Claude and $100 on Codex might actually be a very good setup for most people rather than just going all in on one or the other, because there's no extra features for going from the $100 to the $200 plan, at least on the Claude side. And it's exactly double. It's like you get four times the limits per session where you get double the weekly tokens. So there's no extra discount, and if you can get Codex on top, then it might be a good way to mix both, at least for now. Give it two months and the situation might be completely different. But you know who's left behind right now is Google.
12:35
Actually we have some interesting news about some potential releases this week coming up later in this episode, so we'll talk about that then. One question I had for you is when reading through the OpenAI announcement of this, they had their kind of intro spiel, some benchmarks to compare it to previous versions of I think GPT 5.2. The next big heading on the page was knowledge work. And this is the first time I've seen them really move in that direction. It's always been about coding or you know, just general purpose, like AI can do this or it can make images or whatever. This seems to me to be like the big shift we're seeing right now. That's who they're targeting with this update. All of the small business owners, professionals, knowledge workers that are chatting about using Claude Code in all of these communities and groups right now and how it's just transformed everyone's day to day work. Is this then playing catch up to those people specifically? Is that why this model's here?
13:11
Well, first of all, coding is figured out at this point. It's not finished, but this market is already moving towards maturity. It's accepted that coding will be done by AI at this point. And so it's still not enough for these labs and their valuation. They need to get more people to pay for them. And so that's the next frontier. Right. It's like anyone using a computer and so that's why they're going after.
14:14
This is a much bigger market as well.
14:37
Yeah, it is a much bigger. I mean it depends on how you. I mean it is a bigger market, but because the developers were the highest paid employees, like if you look at the actual worth of the developers, it was probably like quite significant still.
14:39
Sure. But there's, I don't know, tens of millions, maybe 100 million developers. I know, but like the multiple billion knowledge workers.
14:52
I understand, but the ability to penetrate that market is a lot more difficult, because devs are more tech forward. So if you take the value of the market against how easy it is to penetrate, it was actually better to go after devs, and now they kind of go down the ladder and go after these audiences, and they're also different. Right. It's like the marketing and the HR people, you talk to them differently.
15:01
And that, I think, is an interesting discussion in itself, because we are using VS Code, which is a development tool. It's been around for a while. Developers use it to write code. We don't really know how to write code, but AI does. So we use it to do that in there, and we use it to do our knowledge work. Surely if they want to get 3 billion knowledge workers, or however many there are out there, they need to have this figured out outside of scary looking apps like this. And it needs to be all on your phone, in the ChatGPT app, et cetera.
15:19
Cowork, I think. So Cowork is the kind of in-between of Claude Code and the chatbot for Claude. And actually Microsoft just licensed it and launched it in their Office 365, basically, with Anthropic, which is really interesting because they're heavily invested in OpenAI, but still Microsoft just launched Anthropic-powered Cowork inside their working suite. So that's the way this is going. I think Notion agents as well is the thing that's going after knowledge workers, where it's a little bit more user friendly and so on. So yeah, that's where we're going this year. These interfaces are moving to these other things now. It's never as flexible as the terminal. There's still value in learning VS Code because you can do anything you want and there's none of the limitations of that pre-made software. But I also know plenty of people who just will not do it regardless of how good this is. And these people, they will have these interfaces coming up. But honestly what I find most challenging is changing people's minds and getting them to think that they need this, that it is going to improve them. I see a lot of people who just don't see it right now who don't take this with gravity and don't realize that they're going to be useless if they don't do this.
15:54
But you know what, you can't really blame them, because for the last three years they've had it rammed down their throats by every influencer on the planet. Like, oh, this new AI thing is insane. It's a game changer.
17:04
So you're blaming Goldie, right?
17:17
People are numb to the hype. Yeah, people are completely numb to it. So when something comes along that legitimately is insane, it's understandable that people don't get excited. It's almost like they need to feel the pain or experience it firsthand, at least for a certain group of people. You're very much in the innovators, try-everything-new type crowd. I think I'm the next step back from that. But there's a whole, you know, 80% of people that, I don't know, haven't even heard of or paid attention to this stuff yet, that I think are about to get a rude awakening very soon.
17:19
Yeah, it's not optional, but yeah, overall this is going forward. What's impressive is we're also not hitting a wall. The models are getting significantly better. GPT 5.4 is a thorough model, much more thorough than Claude in many ways. So if you want thorough checking, lots of documents, et cetera, this is a step up. I've never seen this in a model before, this level of quality and thoroughness at the same time. Codex is also the app I'm using, the Codex app, but for knowledge work as well. It's something that people should probably try and, yeah, I'm quite excited. And I'm happy Anthropic has competition, because otherwise the thousand dollars per month plans are going to come faster than you think. And so it makes me very happy that another company is at least on their level, preventing them from going there too soon.
18:00
So to switch, you said you could basically just use it out of the box inside VS Code, or switch to the OpenAI app, the Codex app.
18:45
Codex app is good, actually. It's kind of like it's a simpler interface and then you get a UI, but it's very GitHub based. So if you don't know how to use GitHub, it's not for you. Okay.
18:54
Do you think these companies are trying to build moats around their users? Because of the fact that it's so easy to switch. You know, we had that big thing with the Department of War, as they're now called, with Anthropic. It was a couple weeks ago now, but ChatGPT kind of had a bit of a fall from grace. Anthropic became the most downloaded app in the App Store and they built a thing to port over your memories from ChatGPT.
19:03
Just a prompt. It's just a prompt. It's just a prompt you copy paste in ChatGPT that just makes it print.
19:27
This whole concept just seems like these companies have very little moat. You just move to whoever has the best model. If a better one comes along, you switch to it.
19:31
Anthropic is doing a good job at this. So with Anthropic, there's a lot of history. With Codex, you can connect it to pretty much any coding app. There's coding apps that will just use Codex behind the scenes, and will use your subscription, but not be Codex. And OpenAI is cool with that, but Anthropic has banned any app doing that, for example. So if you try to use your Claude Code subscription outside of the Claude app or an official Claude Code plugin, you will get banned. That's what happened to a lot of people who used it for Clawdbot, for example. So Anthropic is trying to get you addicted to their tooling. That's why there's so many Claude Code updates, right? We talked about that. They have scheduled tasks that came out, et cetera. And so this is not the model, this is the tooling. And scheduled tasks is one thing that will start anchoring you, right? Because I need to move all my scheduled tasks if I want to change the provider, especially if it's built into...
20:27
It's the hassle of it more than the capability.
20:27
And now you'll be willing to just deal with a model that's 5 or 10% worse just because you can't be bothered moving your shit, and you keep paying. And so I think Anthropic is doing a pretty good job at trying to get you sticky with that, whereas OpenAI is not doing a good job. Like ChatGPT, I feel like nothing has changed for ages. It's kind of like the normie chatbot and they're not really trying to get you to stick to it. And Codex is fully open source and you can connect it to anything. So they will report high usage, but many people use it outside of their products. And as soon as the model is 5% better, all the devs are swapping, as you're saying. So if I had to put money down in terms of stickiness, I think Anthropic is doing a good job and OpenAI not so much, and they need to do a better job. The Codex app is the start, but since there's many clones and I can connect my Codex subscription to them, I'm not necessarily attached to it the way I am to Claude Code and what they're doing with all the desktop app functions now.
20:30
Okay, let's talk about the scheduled recurring task thing in a second. But I first want to talk about a problem that we've been battling, I guess for almost two decades now, if you can believe how old we are. And that is high quality content, like creating it. So for those who don't know, in our previous several businesses we had an SEO agency for a number of years, and we had Authority Hacker. Before we started talking about AI, it used to focus on SEO training. And the biggest challenge really in that space is, how do you make good content that is better than everyone else's? Because, you know, not always, but usually that's what Google wants to surface. And so if you're ranked top of Google, you get all the traffic and, you know, you'll win. So we've spent literally two decades trying to make really good content and it's always difficult to do that. Then when AI came along there was this whole wave of SaaS tools, I guess, which are really just wrappers for models, which attempted to do that. You know, you had your one-click article generators and then you had versions that tried to improve on it and do better. But at the end of the day all they were really doing was just rewriting other people's articles on Google, maybe trying to take the best things from a few different articles out there and not really expanding on that much further. To me it felt like a huge waste of time. I know some people used it temporarily to have some success but yeah, I wasn't a big fan of that. You have since developed a skill which can be used in Claude Code and now also in Codex, which...
21:24
I also do it for n8n, actually. Initially that was an n8n workflow, and if I show you something, I'll show you the n8n version just because it's more visual. Skills are not very visual. But the point is that's the challenge with SEO content, right? There's this concept of information gain. But at the same time you still need to model after what's ranking, because Google shows you what's working. So I'll show you the n8n version, but this also works in Claude Code and I'll show you some examples after. And the idea is, how do we create content that's both good quality but will also rank? And it's actually three automations in a way. The first one is the competitor analysis. Competitor analysis is classic. It will scrape your competitors. I use a tool called Firecrawl in n8n, but Claude Code can do it on its own. And I have an agent that role plays as the user and thinks like, ah, imagine I googled these keywords and I had these questions. It brainstorms the questions, et cetera. I read these top articles. What am I still frustrated with? What am I not happy with? What's missing? Right.
23:10
I just want to call attention to what you said there because it's really interesting. You have a sub agent role playing as a user in this case. But that is a concept which you can apply to like a great number of workflows, skills, automations for great effect.
24:09
Yeah, it works well. So the idea is, like a user, it reads the article and it's like, am I happy with this, what's missing, et cetera. And so after that, what needs to happen is you need to go get information that is not in the top Google results to satisfy these needs, because obviously we've read the top articles and I'm still frustrated as a user. So the automation, if I go back to the graph here: I've created a sub agent here, and this agent is connected to a tool called Apify that has a bunch of scrapers. And so the idea is, well, if the information is not on Google, it's probably somewhere else on the Internet. And we now have a list of frustrations, so we know where to look. Right. So that sub agent is going on YouTube, it's going on Reddit and on Twitter. But you could put anything you wanted. If you wanted to scrape Instagram reels, for example, for newswalls and stuff, or TikToks or whatever, you could do that.
24:26
That's essentially mimicking the fact that there's a lot of regurgitated slop on Google. So if you're constantly just going there for information, you're not really getting that information gain. But the real person providing that one nugget of information is somewhere down a Reddit thread, basically.
25:14
I'm sure you do that as well, right? Let's say you have a problem. Probably quite often you go on Reddit or you go on YouTube or you'll go on Twitter or whatever and find information there. So it's mimicking the user journey on Google: I got frustrated, I still don't have everything I want, I go and find this missing information on these platforms. And Apify is very good at that. And I have the agent output basically a giant JSON file with all the findings related to the frustrations. And I pass that to a planner agent. That planner agent takes the content from the competitors so that we can mimic what's ranking. It takes the frustrations that were mined. It takes the output from the research agent that went on YouTube, on Reddit, etc. And it creates an outline. It's just the editor in chief saying, okay, we're going to make an article that talks about the things the competitors are talking about, so that we cover all the topics that are needed to rank, but that also goes further and uses all that extra information that we found. It outputs a big JSON file where each section has an outline. And so the point is you can break the outline into 8, 9, 10 sections. And the reason we do that shows up when we get into the writing, which is another sub workflow. By the way, this also works in Claude Code, right? It's just more visual to see it here. So if I go into the writing loop here, this is what this automation looks like. And the way it works is, let's say you wanted to write 5,000 words. With AI, quite often you can ask it to write 5,000 words and it will not. At least, maybe with Claude Code now you can. But the point is, in one LLM call, it's very difficult to get it to write a lot. So we write section by section. And it's pretty simple. It just reads the article so far. And then I have multiple agents. In this case, this is for list posts. So I have three types of sections.
I have intro, list item, guide, and FAQ. Basically four types, sorry. So I write the intro first, taking the part of the JSON that is the outline for the intro. It writes, it edits it. And then this little thing counts the words so that the editor knows if it needs to be longer or shorter, and it adds it to the Google Doc. Then it loops again to the next section. It reads the Google Doc so far. So now it has the intro, and it reads the outline for the first list item, writes it, checks the word count, the editor edits it, adds it to the Google Doc. Then it loops again: second list item. Now it can read both the intro and the first list item, and it keeps adding to the article one section at a time. And you get the full compute of one API call for just one section rather than the whole article. But it is aware of everything it wrote so far, so it doesn't repeat itself and so on. And the plan has been done by the previous agent. So the idea is the article writes itself slowly, section by section, and the model can just focus on the section that it's doing, kind of mimicking...
25:32
Again, what a human would do: you write it and you kind of take stock of what you've written before thinking about what to write in the next part?
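The section-by-section loop described here can be sketched in a few lines of Python. This is only an illustration under assumptions: `SectionPlan`, `write_section`, and `edit_section` are hypothetical names standing in for the planner's JSON outline and the writer and editor agents, and a plain string stands in for the Google Doc.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SectionPlan:
    kind: str          # e.g. "intro", "list_item", "guide", "faq"
    outline: str       # outline for this section, from the planner agent
    target_words: int  # desired length, used by the editor


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def write_article(
    plan: List[SectionPlan],
    write_section: Callable[[str, SectionPlan], str],
    edit_section: Callable[[str, SectionPlan, int], str],
) -> str:
    """Build the article one section at a time.

    write_section and edit_section are placeholders for one LLM call each
    (hypothetical). Each call sees the whole article so far but only has
    to produce a single section.
    """
    article_so_far = ""  # stands in for the Google Doc
    for section in plan:
        draft = write_section(article_so_far, section)
        # The editor receives the draft's word count so it can lengthen or trim.
        edited = edit_section(draft, section, word_count(draft))
        article_so_far = (article_so_far + "\n\n" + edited).strip()
    return article_so_far
```

Passing the accumulated text into every call is what lets each section avoid repeating earlier ones, while the per-call output stays short enough for the model to handle well.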
28:08
Pretty much. So basically that's how the article is being written. It's being added to a Google Doc. And then we operate this automation on n8n, and it actually runs off a Google Sheet. So you just put your keywords in a Google Sheet and it will just do it. You can see the starter is a Google Sheet, and after it's done writing, it will write the meta: your title tags, your descriptions, your social descriptions, that kind of stuff. Then it updates the spreadsheet and keeps rolling if you have multiple articles queued. That's kind of the idea. We have this on Claude Code now, because people don't like n8n since it's difficult to set up. But n8n is more economical, to be honest. It's cheaper to do on n8n.
28:15
Is that because it's like step one, two, three, it always follows that process, whereas the agent's kind of deciding which thing to do next. And the reasoning for that is consuming tokens.
28:50
There's that, and there's the fact that Claude models are expensive. And for some things you just don't need a model that expensive. For example, to write the metadata, you don't need to use Opus, but if you are running your main thread on Opus,
29:00
you can't have it switch models within.
29:13
You can within the Claude models. So you can have Haiku do it, for example, but you don't have to
29:17
do the whole thing.
29:22
But it's not so simple, because if you want to have Haiku do it on Claude Code, you need to actually start a subagent. But starting a subagent costs a lot of tokens because your main Opus thread needs to write the prompt for the subagent, so you're still consuming tokens. So overall the efficiency is a little bit lower. And so if you were writing 1,000 articles, I would use the n8n version. If you're writing like 10, 20 articles, sure, use the Claude Code version. The point is we have.
29:23
If you're doing the Claude version, you can kind of interact with it a bit more and tweak it as it's going, versus you just get one output and that's kind of it with n8n.
29:49
Yeah, there's pros and cons to both systems. But the point is this outputs pretty good articles. I have some examples, like this one. I mean, it's content, right? It's not super beautiful. So here's an example of one that we ran that is called eight steps to optimize your meta ads with AI. It's content, right? There's no beautiful images or anything. But first of all, the formatting is decent. And most importantly, how good is the quality of the stuff that's talked about? I mean, this one is quite technical, so I'm not going to take this one. But I liked this point, for example, which is to make a two-campaign loop. The idea is you make one campaign that is testing and one that scales. So you have one where you throw shit at the wall and see what sticks, and one where you take your winning creatives and put more money behind them. This is the kind of advice I get at conferences when I go to talk to people who scale ads with AI, et cetera. And the reason why is because it went to read not the shitty AI article that ranks on top of Google, but discussions on Reddit, what people talk about on X, and YouTube videos from people who actually have a lot of views and engagement and so on. And that gives you much higher quality content. And you can see the writing is quite readable and nice. Again, you can steal that. There's a section in the automation where you can give example articles and it will just mimic that. But the point is the quality of the information from this is actually surprisingly good.
29:58
Actually it's a real game changer. I was about to say talk about
31:16
people hyping up AI.
31:22
I know, and we are obviously biased because we create these things inside the AI Accelerator and it's our product. It's how people support us and how we make money. But we were determined not to become one of those memberships where, you know, people just spam out a thousand different prompts or agents or skills that AI created, that nobody tested and nobody uses. The stuff we build, people actually use. So, you know, if you're interested in this kind of stuff, you can head over to authorityhacker.com AI Accelerator and check it out there. We have a huge community in there with people building this stuff, using this stuff. We do Mastermind calls for some of the higher tier members as well every week. So yeah, if you're interested in that, go check it out. Let's talk though now, Gael, about scheduled tasks in Claude, because this was one of the new features that they dropped quickly and that we mentioned before. It's not what I thought it was initially, though.
31:23
What do you think it was?
32:22
Well, I had this idea of, let's take an email marketing tool. When you schedule an email for Friday at 3 o'clock, you turn off your computer, you go on vacation, and that email goes out on Friday at 3 o'clock, because everything is a server-side process. But that's not really possible with this.
32:23
Right?
32:41
Your computer has to be turned on for the process to work.
32:41
Yeah, because it works locally. It creates like a virtual machine. You have it both in Cowork and Claude Code now, by the way. So this is the Cowork version, and you go in Schedule here and you can create a task. I'm not using it, but you can create a new task, and then you can basically write your prompt, and that could call a skill or something. Then you can just set the frequency, like daily, hourly, weekly, etc. But your computer has to be on for it to work. And it's the same in Claude Code. You can do the same thing: you can go in Scheduled, create the task, and then just run that, basically.
32:45
And if it's not on when you turn it on, it'll just run everything from the backlog.
33:17
Yep, that's how it's going to work. And it's decent. It's kind of like a replacement for OpenClaw again. They're going after that. You know, it had one of the.
33:22
So everyone that bought like a stack of Mac minis is now going to get some value out of them.
33:31
I mean, not really, because as I told you, if you have a MacBook Pro or a MacBook, you can just plug it in and there's a setting for your battery that says don't sleep. It's not going to ruin your battery because the power goes directly to the processor and the components without touching the battery. So if you're smart with it, you don't need to buy a Mac Mini. Just plug in whatever MacBook you have and you can run this. But it's still decent. Again, I was talking about n8n and people not using it because it's too complicated. That's kind of the problem, right? It's a good tool, but it's difficult to use, and this is much easier. You can just write a prompt, you can add skills if you need, and it's really going to run whenever you need and do the thing that you want to do. I think it's already pretty good. I think eventually you will be able to run it in the cloud, like you can run Claude Code in the cloud, but having a computer plugged in I don't think is a very, very difficult thing to do. Sure.
33:35
But at the end of the day it's a glorified client side cron job.
34:25
That's exactly what it is. You write a prompt, that prompt runs at the time that you set up, but you can do pretty cool prompts, and the logic can be built in the prompt itself. It's like, oh, if this happens, then do this. If this doesn't happen, then do that. So you can build the whole automation side of things, and you can even say do nothing if this happens, for example, and then it stops and it will just be running, doing nothing. And then when there's a problem, do something. So you can do some pretty good stuff already. One thing as well is, even for the connectors and stuff, you can connect it with n8n, right? You could have n8n, for example, collect some data from a webhook or something like this that you could not connect directly, easily, with the desktop app. n8n just gets the data from the webhook and creates, let's say, a ticket on Notion or a ticket on ClickUp or whatever. And you have a task on your Claude desktop that says check the tickets every five minutes or every 10 minutes, and then it will just process them for you. And then you can connect even more things to it and have a two-step thing where n8n just does the collection and puts the thing wherever it needs to be, to be processed as tickets by your Claude desktop. So again, it's kind of like skills, right? Skills came out and for two months nobody talked about them, nobody figured it out. And now we have this kind of new primitive, which is this scheduled task, and I'm telling you, nobody's talking about it now. And then in two months everyone's going to go crazy when the use cases are figured out.
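To make the two-step idea concrete, here is the logic such a scheduled prompt would describe, written out as plain Python. It is a sketch under assumptions, not Claude's or n8n's API: `Ticket`, `fetch_tickets`, and `process` are hypothetical stand-ins for whatever the collector (for example an n8n workflow writing to Notion or ClickUp) and the scheduled task actually use.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Ticket:
    id: str
    status: str   # "open" or "done"
    payload: str  # data the collector stored from the webhook


def run_scheduled_check(
    fetch_tickets: Callable[[], Iterable[Ticket]],
    process: Callable[[Ticket], None],
) -> List[str]:
    """One scheduled run: handle open tickets, otherwise do nothing.

    Returns the ids of the tickets handled in this run.
    """
    handled: List[str] = []
    for ticket in fetch_tickets():
        if ticket.status != "open":
            continue  # the "do nothing if this happens" branch
        process(ticket)
        ticket.status = "done"  # mark it so the next run skips it
        handled.append(ticket.id)
    return handled
```

Running this every five or ten minutes gives the behavior described: most runs find nothing and exit, and a run only does work when the collector has left a new ticket.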
34:30
I'm trying to think of what's the first use case I'd want to do for something like this. And the fact that your computer has to be on, it's got to be something internal. So making some documents or reports or something for me that I'm going to use on a certain schedule for some kind of content creation thing for, you know, internal meetings or you know, like we have this Friday meeting, like preparing some stuff for that, like perfect.
35:54
Every, you could do like. Yeah, every night, read all my call transcripts on Google Drive and prepare some social post ideas on Notion based on them, for example. That could be a thing. And then in the morning you just go on Notion and there's a bunch of drafts there that you can approve or refuse, for example.
36:19
And just to check. So if I go on vacation for seven days or I lose my laptop and buy another one, or maybe that's not a good idea if I buy another one, but if I'm not using it for seven days and I turn it on, it doesn't run seven times, it just runs.
36:38
I have no idea.
36:52
To catch up.
36:53
I have no idea. You try, you tell me. But yeah, these are the types of
36:54
things that I think about what could possibly go wrong. Whereas you're just like, oh, let's just try it.
37:00
Well, they're vibe coded, all these things, right? There's no way Anthropic releases a feature at this point and it's not all vibe coded and not fully tested. So for example, they have this thing as well now, that I'm putting in a new vibe coding your site course, that allows you to visually edit your websites as you build them in the Claude Code app. Very cool. But initially it was very buggy. When they released it, on the first day it was horrible, and now it's much better. And so the point is they don't even test that heavily, they just kind of ship it. They wait for people to report all the issues, so they crowdsource the QA to users, and then they just have someone, or probably an agent, filtering which reports are relevant and then automatically applying the fixes. And now if I use it, it's good. So the way it behaves today is probably not great. And then there will be a bunch of complaints and then they will fix it.
37:05
That's the kind of new almost like release philosophy across a lot of businesses, not just AI companies obviously when you're dealing with like life threatening or mission critical applications or use cases like you gotta be a bit careful there. But you know, for a lot of the stuff we do, it's like it's marketing. It's not like, like if we get something a bit wrong, nobody dies.
37:51
Exactly. And it's like people, I mean look, Anthropic just enforced it. They're a big company and they're willing to ship some slop just to see what sticks. And I think maybe we should do that more, maybe. Obviously the market's okay with it. They're soaring. People love it. The reason why is because there is lesser feature. Like 90% of the people who glaze it and talk about it and share it and spread it don't use it.
38:13
It's the same way as talk about the feature that they released.
38:38
They've never used it. It's like the headlines on social media. People read the headline and make their opinion based on that. And it's like, same thing happens here. And so the sentiment is hugely positive because the way it's marketed, it's like they don't portray the bugs, obviously. And the small minority of users who are unhappy can have an easy report bug button, and then they just have Claude fix everything. And in the end, it's a net positive for them because the positive sentiment is massively overshadowing the few negative experiences of people who actually use it. Reminds me of that, I think as well.
38:41
There's this thing where people forget the negative. So as long as they fix it or they come back with something positive next, then they kind of just remember those things instead.
39:12
You know what it reminds me of? It's like a stupid story. I think I might have said it before. It's like story I heard when I was a kid. It's like there's a kid that's like, oh, buy a lottery ticket to win a horse, right? And then he goes around the village, the animal. Yeah, and then he goes around the village and he sells 100 tickets to win the horse, right? And then he draws one winner. The winner comes over, realizes the horse is dead, and then he's like, oh, I'm sorry, you can have your money back. And then that's basically what happens. It's like he got the money from 99 people who lost and thought they would get a horse that's alive. And the one unhappy person gets a refund and is not that mad. And in the end, the kid made $99 if he sold the tickets for a dollar. And so it feels like it works that way for features.
39:20
What's the moral of the story here?
40:12
The moral of the story is that the PR of the win-a-horse-for-a-dollar ticket was an overwhelmingly positive thing. The same way as Anthropic releases their skill, and the people who experience the negative side of it, which is it's buggy as hell, it doesn't work in its intro version, you can just make it up to them, like to the one person.
40:13
It's an intro version. Here's the fix, like two weeks later. And then it's like everyone else who got Something good or didn't test it, but thought there was something.
40:31
All the people who didn't try it and who didn't experience it, they just had an overwhelming positive experience. It's kind of the same thing. And so I think it works that way. And I think as a business, it's kind of like a thing that you need to leverage. It's like, it sounds tricky, but in a way it's like, it works for growth. It's kind of a growth hack, basically.
40:41
Like, people talk about vibe marketing as like, you know, you're vibing your marketing, but this is almost like the reverse of it. It's like vibe perception in a way. Like what? Well, I guess that that is like
40:58
what the world you're manipulating. You're manipulating people a little bit, but it just works. And let's be honest, look at the world in which we live. This is how all communication is done. That's kind of a higher level lesson. We're not really talking about AI here, but that's how Anthropic is now running their things. They're releasing things that don't work. The site editing thing did not work on day one. I tried it, and now it works. But they did exactly that, and everyone got crazy, and there were a million posts on social media from people who reposted the exact Anthropic video. They didn't try it. So yeah, that's how it works. But overall, scheduled tasks is okay. I think it will get better. I think it would be nicer if you could almost have a cron job powered by AI, where you just talk to it and give it a condition. You're like, oh, if the temperature is below 10 degrees where I live and it's raining, then run this operation. It just kind of checks with AI, you know what I mean, and decides whether to trigger.
41:08
It's like your home automation type thing.
42:03
Yeah. But you could just make natural language conditions for triggers. I think that would be a cool upgrade, which I'm sure they will do if they scrape this podcast and then run an idea generation on that.
42:05
There you go. You heard it here first. We haven't talked about Google in like forever because I don't know, it seems to have been pretty quiet this year. They haven't really.
42:17
They haven't released much on Gemini. No, they released 3.1 Flash-Lite, but it's a bad model. So it's their smallest model, but they tripled the price. So it's very expensive for what it is now.
42:40
Originally they just had Pro and Flash. Flash was a cheap model, but then it got more expensive, so they made Flash-Lite, which really was very cheap, but they've tripled the price. That's ridiculous.
42:50
So now on the API, it's like $1.50 per million output tokens, for example, and it's only half the price of Flash, which is way smarter. Before, you'd use this model to do basic categorization or things like that. It's still cheaper, but it's not that much cheaper anymore. And yeah, it just doesn't feel like I'm ever gonna use this model. So the 2.5 Flash-Lite was very good value. 3.1 Flash-Lite, because there was no 3, not so good. So yeah, disappointing releases so far from Google. The 3.1 Pro was not very good. 3.1 Flash-Lite also, it's okay, but it's expensive. It matches 2.5 Flash. Yeah, not that great so far.
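As a quick check on what that pricing means in practice, the arithmetic can be written out as below. The only inputs are the figures quoted in the conversation ($1.50 per million output tokens for Flash-Lite, said to be half the price of Flash); they are not verified against a price list, and the function name is made up for this example.

```python
def output_cost_usd(output_tokens: int, price_per_million_usd: float) -> float:
    """Cost of generating `output_tokens` tokens at a per-million-token price."""
    return output_tokens / 1_000_000 * price_per_million_usd


# Figures as quoted in the episode (assumptions, not an official price list).
FLASH_LITE_PRICE = 1.50              # USD per million output tokens
FLASH_PRICE = FLASH_LITE_PRICE * 2   # "only half the price of Flash"

# Example: a categorization job that emits 2 million output tokens.
lite_cost = output_cost_usd(2_000_000, FLASH_LITE_PRICE)
flash_cost = output_cost_usd(2_000_000, FLASH_PRICE)
```

On those numbers the smaller model only halves the bill, which is the speaker's point: the saving may no longer justify the drop in quality.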
42:50
There's a tweet this week from Logan Kilpatrick, who works on Gemini. I think he's their head of product. It says it's going to be a fun week of launches. Launches, plural, ahead. So that could be Gemini 3.1 Flash. I think 3.1 Flash.
43:32
Yeah, not Flash-Lite. Flash, the middle one. Okay, this one is not live yet, but 3.1 Flash Image, which is Nano Banana 2, is live and came out. If you check the API name, it's called 3.1 Flash Image. So there is a 3.1 Flash model, therefore it's very likely they would release it.
44:04
Okay, so sorry, just so I'm clear, 3.1 Flash Image is Nano Banana 2.
44:04
Yes.
44:09
Okay, cool. Why do they keep having this Nano Banana naming alongside a different model name?
44:18
It's super confusing because it was an internal name initially. They always have fun names when they test the models before they tell you which model it is. And this leaked, and people got crazy for Nano Banana, and they just kept it and that's it. But if you go on AI Studio now, let me show you, you'll see it's called Nano Banana 2, but if you look at the name under it, it says Gemini 3.1 Flash Image Preview.
44:43
So Nano Banana 2 is a Flash model as opposed to a Pro model. It's newer and it works in a fundamentally different way to the current edition of Nano Banana, or Nano Banana 1 or Nano Banana Pro, whatever you call it. And I had to research this this morning. It uses a fundamentally different approach to building the image than previous Nano Banana models, which worked on something called diffusion pattern generation, which is something I just researched today. But essentially it starts off with an image which looks like a blur or static, right? And then it gets your image by running hundreds of rounds of prompt refinements to say, hey, how can I make this look more like a cat surfing on Mars, or whatever you're trying to do. And it just gradually builds it to get better and better and better from there. And that's fundamentally why, if you ever try to edit an AI image or ask it to change it in some way, it's so terrible at it. It really feels like you're talking to someone that's never created an image before. It's like, you've just done this. Why can't you just move this fork over from the right to the left?
44:43
Because it's not so simple.
45:56
It's at it. Yeah, because like. And actually, something you said at the very start when Nano Banana came out is that if you don't get the thing on the first or second attempt, it's much better to just start again rather than try to fix it. So what they're claiming with Nano Banana 2 is that instead of this diffusion pattern generation approach, they're using a multimodal reasoning image model. So I was trying to kind of distill what this all means from a marketing versus real world perspective. But essentially it's working across text and image inputs. It's taking into account everything you've said in your conversation history with it. It's looking at consistency, it's looking at specific object relationships, what goes where and how it's related to the other. So this is kind of, and again, a lot of this is above my pay grade, but it's kind of baked into the model at a more fundamental level.
46:53
I'm sorry, but I think Nano Banana Pro did all of that already. I don't think it was different. There was also a reasoning model built on top of it.
46:53
This is what they're claiming.
47:02
Yeah, but it's like Apple, right. They just tell you again that there's a power button on the computer when they release it. But there was one on the previous one, you know.
47:04
Sure. And I will say as well, it's worse. So with Nano Banana 2, you cannot generate images as good in most cases, certainly for things like human faces and scenes. There's many, many, many Reddit threads comparing these, and almost universally people say Nano Banana 1 Pro is better than 2, but that's what to expect. It's a Flash model, right?
47:33
Well, except for text. First of all, it's better at text, and most importantly it's half the price. So if you generate a lot of images, you know what I did, I love these skills. I was spending not a ton, but 10, 20 dollars per day on API costs. And it's faster as well, so I could test faster. But for text it's actually better. So I made a skill in AI Accelerator that creates infographics, because now the text is good enough. You could do it with Pro, but for example, I ran it on an Ahrefs blog post and I did a mini infographic. I can do some that are twice the height. And there is one or two text errors, for example, you see, don't bury the lead inside the lede. But literally, I sent it to Tim from Ahrefs and he was like, yeah, that's pretty good, we'd probably use it. He didn't see this. I think he missed it. But the point is, yeah, I also made it do the social one, so if they wanted to post it on LinkedIn when they make a new post, for example. And you can lift the branding, you can do all of that. It follows instructions very well. You can see even the Ahrefs logo is done properly and so on. Overall, it's a good model and I would still pick it over Pro even if it's slightly, slightly worse.
48:44
We're expecting Nano Banana 2 Pro to come out soon.
48:44
Maybe this year, maybe this week.
48:48
Actually the next one.
48:49
Yeah, let's see. But yeah, it's good.
48:50
And despite us talking mostly about Claude and now OpenAI as the daily models we're using, we still use Nano Banana for all our image generation stuff, right?
49:03
Yeah, I mean, yeah, even for video stuff. I mean, now there's this Seedance 2 that is better than Veo. But for all the multimedia stuff, Google is still very strong. So yeah, I use Nano Banana for pretty much everything, and that's most of my use cases for Gemini right now. Although Gemini 3 Flash is still a good model for the price. I quite like it.
49:03
And if you want that infographic skill, it's also available in the AI accelerator. So head over to authorityhacker.com AI accelerator. Gael, any final words of wisdom?
49:23
Before we wrap up, look, GPT 5.4 is better than Claude in some aspects, but I think most people shouldn't care about it. Most people will make more progress by optimizing their setup and using these tools for more use cases than by switching every time there's a new model. Because for all I know, next week we get Sonnet 5 and it's better than GPT 5.4. So we're here to kind of report on that, but I want to kind of take it back a bit. I spend my days doing this and I use it a lot, so there's some benefit to doing that because I can handle it. For most people who listen, I think don't try to get the top 1% of everything, just try to get these workflows working, these agentic workflows. Try to build your first skills, try to build your first website with it, try to do all this stuff. And then once you're there and you hit the limit of the quality of the models, then maybe consider it. If you are an OpenAI user, good news: you now finally have a good model. You can finally do most things most people have been doing since at least the end of last year. And yeah, this changes all the time, so don't panic too much. It's interesting to follow. The progress is still extremely fast and impressive. GPT 5.4 is a good model, but yeah, that's why I would not want to trigger a mass exodus with this episode. I think you'll be fine with Claude.
49:34
Okay, well, make sure you subscribe to this podcast if you're watching on YouTube, because we're most likely going to be covering the Gemini or Google releases on the next episode, if they are actually relevant, which I suspect they will be. So yeah, we'll see you next week, hopefully for that one. And also, if you're listening on the audio version, please go to whichever player you're on and leave us a review there, because that really helps us get more visibility and reach more people. It's something we're trying to push right now with this new AI automation push that we're on. So thanks everyone, and see you next week for another episode.
50:56