Code AGI is Functional AGI (And It's Here)
The episode argues that 'Code AGI' represents functional AGI and is already here, based on recent breakthroughs in AI coding agents that can work autonomously for extended periods. The host contends that coding ability serves as a universal lever for most modern work, making code-capable AI effectively general intelligence for practical purposes.
- Coding serves as instrumental generality - if AI can program, it can create capabilities on demand across most domains
- The gap between idea and execution has collapsed for those using AI coding tools, fundamentally changing competitive dynamics
- Enterprise organizational models built around execution bottlenecks are becoming obsolete as AI removes technical constraints
- The frontier of AI capabilities and median enterprise deployment have decoupled, creating divergent tracks of development
- Competitive advantage is shifting from execution capability to speed of iteration and quality of ideas
"Code AGI will be achieved in 20% of the time of full AGI and capture 80% of the value of AGI"
"AGI is the ability to figure things out. That's it."
"AGI is achieved when it makes economic sense to keep your agent running continuously"
"I've done more personal coding projects over Christmas break than I have in the last 10 years it's crazy. I can sense the limitations, but I know nothing is going to be the same anymore."
"Your entire organizational model is built for a world where execution was the bottleneck and that world is over"
Today on the AI Daily Brief: why code AGI is functional AGI, and why functional AGI is here. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.

Alright friends, quick announcements before we dive in. First of all, thank you to today's sponsors: Zencoder, Robots and Pencils, Section, and Superintelligent. To get an ad-free version of the show, go to patreon.com/aidailybrief, and if you are interested in sponsoring the show, send us a note at sponsors@aidailybrief.ai.

So we are back now with another long read slash big think episode, and this week we're getting into a topic that I have been kind of obsessing about for the last several weeks. It feels to me quite clear that something dramatic has shifted. Obviously, I don't mean some new model that changes everything; it's more that it feels as though we've digested what the latest round of models is actually capable of, we've had enough time with them for them to start to shift our behaviors, and the implication of all of that is, fundamentally speaking, a new era in the story of AI and, more broadly, in the story of work. It is a shift which I am still trying to figure out how to put words around, but one that I am convinced has profound implications for how companies do what they do.

To some extent, the shift is starting to come home to roost in a concerted conversation around whether we are finally at AGI. I will argue that we are, with some nuance. But what I'm going to do first is read some excerpts from a recent piece by Sequoia's Pat Grady called "2026: This Is AGI," follow it up with a more skeptical piece by Every's Dan Shipper called "Toward a Definition of AGI," and then add my own thoughts, steelmanning both perspectives and trying to end with where I think is the most useful place to be.

Let's start with Pat's piece. It's actually by Pat Grady and Sonya Huang, and begins: "Years ago, some leading researchers told us that their objective was AGI. Eager to hear a coherent definition, we naively asked, how do you define AGI? They paused, looked at each other tentatively, and then offered up what's become something of a mantra in the field: well, we each kind of have our own definitions, but we'll know it when we see it." The vignette, they write, typifies our quest for a concrete definition of AGI. It has proven elusive. While the definition is elusive, the reality is not: AGI is here now. Coding agents are the first example. There are more on the way. Long-horizon agents are functionally AGI, and 2026 will be their year.

Now, in the next section, Pat and Sonya make sure to qualify that they do not have any sort of scientific authority to propose this definition. And yet, with that said, they offer what they call a functional definition of AGI. AGI, they write, is the ability to figure things out. That's it. A human who can figure things out has some baseline knowledge, the ability to reason over that knowledge, and the ability to iterate their way to the answer. An AI that can figure things out has some baseline knowledge (pre-training), the ability to reason over that knowledge (inference-time compute), and the ability to iterate its way to the answer (long-horizon agents). The first ingredient, knowledge and pre-training, is what fueled the original ChatGPT moment in 2022. The second, reasoning and inference-time compute, came with the release of o1 in late 2024.
The third, iteration and long-horizon agents, came in the last few weeks, with Claude Code and other coding agents crossing a capability threshold. Generally intelligent people can work autonomously for hours at a time, making and fixing their mistakes and figuring out what to do next without being told. Generally intelligent agents can do the same thing. This is new.

So what's an example of this new capability that they're talking about? They provide an example of a founder telling his agent that he needs a developer relations lead. He gives a set of qualifications, including the fact that this person needs to enjoy being on Twitter. The agent starts in an obvious place: LinkedIn searches for "developer advocate," for example. Unfortunately, it finds hundreds of examples, so it has to iterate. It pivots, they write, to signal over credentials. It searches YouTube for conference talks. From there it finds 50-plus speakers and filters for those with talks that have strong engagement. Next, because of that Twitter qualification, it cross-references those speakers with Twitter. The total number is now whittled down to a dozen with real followings and posting real opinions. Honing in even further on who's been most engaged in the last few months, that total list, which was hundreds, then fifty, then a dozen, is now down to three.

Now it can hone in on those three. One just announced a new role. One is the founder of a company that just raised funding. The third was a senior dev rel at a Series D company that just did layoffs in marketing. The agent, they write, drafts an email acknowledging her recent talk, the overlap with the startup's ICP, and a specific note about the creative freedom a smaller team offers. It suggests a casual conversation, not a pitch. Total time: 31 minutes. The founder has a shortlist of one instead of a JD posted to a job board. This, Pat and Sonya write, is what it means to figure things out: navigating ambiguity to accomplish a goal, forming hypotheses, testing them, hitting dead ends, and pivoting until something clicks. The agent didn't follow a script. It ran the same loop a great recruiter runs in their head, except it did it tirelessly, in 31 minutes, without being told how. To be clear, agents still fail. They hallucinate, lose context, and sometimes charge confidently down exactly the wrong path. But the trajectory is unmistakable, and the failures are increasingly fixable.

So what? Well, soon, they say, you'll be able to hire an agent, which, with a hat tip to Sarah Guo, they call one litmus test for AGI. You can hire GPT-5.2 or Claude or Grok or Gemini today. More examples are on the way. In medicine, OpenEvidence's Deep Consult functions as a specialist. In law, Harvey's agents function as an associate. They go through examples in cybersecurity, DevOps, go-to-market, recruiting, math, semiconductor design, and AI research. All of this, they say, has profound implications for founders. The AI applications of '23 and '24 were talkers. Some were very sophisticated conversationalists, but their impact was limited. The AI applications of '26 and '27 will be doers. They will feel like colleagues. Usage will go from a few times a day to all day every day, with multiple instances running in parallel. Users won't save a few hours here and there. They'll go from working as an IC to managing a team of agents. Remember all that talk of selling work? Now it's possible. What work can you accomplish?
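To make the shape of that loop concrete, here's a toy sketch of the search-filter-pivot cycle described above. To be clear, everything in it, the filter names, the thresholds, the data shape, is invented for illustration; a real agent would be calling search tools and a model, not canned functions.

```python
# A toy sketch of the hypothesize-test-pivot loop from the recruiting example.
# All names, filters, and thresholds here are hypothetical illustrations.

def conference_talk_filter(pool):
    # Pivot to signal over credentials: keep people who have given talks.
    return [c for c in pool if c["spoke_at_conference"]]

def twitter_filter(pool):
    # The "enjoys being on Twitter" qualification: real followings only.
    return [c for c in pool if c["twitter_followers"] > 1000]

def recency_filter(pool):
    # Hone in on who has been most engaged in the last few months.
    return [c for c in pool if c["active_recently"]]

def figure_it_out(pool, target_size=3):
    """Apply progressively stricter strategies, stopping once the list is actionable."""
    for strategy in (conference_talk_filter, twitter_filter, recency_filter):
        if len(pool) <= target_size:
            break  # narrow enough: move on to drafting the outreach
        pool = strategy(pool)  # a dead end just means trying the next strategy
    return pool

# Usage: a tiny stand-in for the hundreds of LinkedIn hits, whittled to a shortlist.
candidates = [
    {"name": "A", "spoke_at_conference": True, "twitter_followers": 5000, "active_recently": True},
    {"name": "B", "spoke_at_conference": True, "twitter_followers": 200, "active_recently": True},
    {"name": "C", "spoke_at_conference": False, "twitter_followers": 9000, "active_recently": False},
]
print([c["name"] for c in figure_it_out(candidates, target_size=1)])  # -> ['A']
```

The real thing generates new strategies when one stalls rather than walking a fixed list, but the loop structure, try, evaluate, pivot, stop when the result is actionable, is the same.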
The capabilities of a long-horizon agent are drastically different from those of a single forward pass of a model. What new capabilities do long-horizon agents unlock in your domain? What tasks require persistence, where sustained attention is the bottleneck? Saddle up, they say. It's time to ride the long-horizon agent exponential. Today, your agents can probably work reliably for around 30 minutes, but they'll be able to perform a day's worth of work very soon, and a century's worth of work eventually. Ultimately, they write, the ambitious version of your roadmap just became the realistic one.

Let's move over to Dan Shipper's "Toward a Definition of AGI." Dan writes: when an infant is born, they are completely dependent on their caregivers to survive. They can't eat, move, or play on their own. As they grow, they learn to tolerate increasingly longer separations. Gradually, the caregiver occasionally and intentionally fails to meet their needs. The baby cries in their crib at night, but the parent waits to see if they'll self-soothe. The toddler wants attention, but the parent is on the phone. These small, manageable disappointments, what the psychologist D.W. Winnicott called "good enough" parenting, teach the child that they can survive brief periods of independence. Over months and years, these periods extend from seconds to minutes to hours, until eventually the child is able to function independently.

AI is following the same pattern. Today, we treat AI like a static tool that we pick up when needed and set aside when done. We turn it on for specific tasks, writing an email, analyzing data, answering questions, then close the tab. But as these systems become more capable, we'll find ourselves returning to them more frequently, keeping sessions open longer, and trusting them with more continuous workflows. We already are. So here's my definition: artificial general intelligence is achieved when it makes economic sense to keep your agent running continuously. In other words, we'll have AGI when we have persistent agents that continue thinking, learning, and acting autonomously between your interactions with them, like a human being does.

I like this definition because it's empirically observable. Either people decide it's better to never turn off their agents or they don't. It avoids the philosophical rigmarole inherent in trying to define what true general intelligence is. And it avoids the problems of the Turing Test and OpenAI's definition of AGI. In the Turing Test, a system is AGI when it can fool a human judge into thinking it's human. The problem with the Turing Test is that it sets up movable goalposts. If I had interacted with GPT-4 10 years ago, I would have thought it was human. Today I'd simply ask it to build a website for me from scratch and I'd instantly know it was not human. OpenAI's definition of AGI, which is AI that can outperform humans at most economically valuable work, suffers from the same problem. What constitutes economically valuable work constantly changes. We will invent new economically valuable work that we can perform in conjunction with AI. These hybrid roles then become the new benchmark that AI will need to learn to do before it counts as AGI. So the definition is an ever-receding target. By contrast, the definition I propose, that AGI is achieved when it makes economic sense to keep your agent running continuously, is a binary, irreversible, and immovable threshold.
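Part of what makes Dan's threshold appealing is that you could actually compute it. Here's a toy, back-of-the-envelope version with entirely made-up numbers (none of these figures are from his piece): the moment an always-on agent becomes cheaper than cold-starting a fresh session for every task, the bar is met.

```python
# A toy break-even calculation for "economic sense to keep your agent running
# continuously." Every number below is hypothetical, purely for illustration.

ALWAYS_ON_COST_PER_DAY = 50.0  # assumed daily compute cost of a persistent agent
SESSION_COST = 2.0             # assumed cost of one on-demand session
COLD_START_OVERHEAD = 1.5      # assumed value lost per session rebuilding context

def on_demand_cost_per_day(tasks_per_day: int) -> float:
    """Total daily cost of spinning up a fresh session for every task."""
    return tasks_per_day * (SESSION_COST + COLD_START_OVERHEAD)

for tasks in (5, 10, 15, 20):
    winner = "always-on" if ALWAYS_ON_COST_PER_DAY < on_demand_cost_per_day(tasks) else "on-demand"
    print(f"{tasks:>2} tasks/day -> {winner} wins")
# With these made-up numbers, the threshold is crossed around 15 tasks per day.
```

The point isn't the numbers; it's that the definition reduces to an observable economic comparison rather than a philosophical one.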
I like this definition because in order to meet it, we will need to develop a lot of necessary but hard-to-define components:

1. Continuous learning: the agent must learn from experience without explicit user prompting.
2. Memory management: the agent needs sophisticated ways to store, retrieve, and forget information efficiently over extended periods.
3. Generating, exploring, and achieving goals: the agent requires the open-ended ability to define new useful goals and maintain them across days, weeks, or months while adapting to changing circumstances.
4. Proactive communication: the agent should reach out when it has updates, questions, or requires input, rather than only responding when summoned. It must also be able to be interrupted and redirected by the user.
5. Trust and reliability: the agent must be safe and reliable. Users will not keep agents running unless they are confident the system will not cause harm or make costly errors autonomously.

While I've described these capabilities, I'm deliberately avoiding the trap of trying to specify exact technical criteria for each one. What precisely constitutes continuous learning or trust is difficult to pin down. Instead, my AGI definition entails that all of these capabilities are present to some extent, and these capabilities already are present in limited ways. ChatGPT, for example, has rudimentary forms of memory and proactive communication. The length of time during which an AI can run on its own is increasing gradually and consistently. When GPT-3 first came out, the primary use case for AI was GitHub Copilot. The best it could do was complete the line of code you were already writing. ChatGPT lengthened the amount of time the AI could run, from the amount required for you to press tab to complete a line of code to the time required to deliver a full response in a chat conversation. Now, agentic tools like Claude Code, deep research, and Codex can run for between five and 20 minutes at a stretch. The trajectory is clear: from seconds to minutes to hours to days and beyond. Eventually, the cognitive and economic costs of starting fresh each time will outweigh the benefits of turning AI off.

If you're using AI to code, ask yourself: are you building software, or are you just playing prompt roulette? We know that unstructured prompting works at first, but eventually it leads to AI slop and technical debt. Enter Zenflow. Zenflow takes you from vibe coding to AI-first engineering. It's the first AI orchestration layer that brings discipline to the chaos. It transforms freeform prompting into spec-driven workflows and multi-agent verification, where agents actually cross-check each other to prevent drift. You can even command a fleet of parallel agents to implement features and fix bugs simultaneously. We've seen teams accelerate delivery 2x to 10x. Stop gambling with prompts. Start orchestrating your AI. Turn raw speed into reliable, production-grade output. Try Zenflow free today.

Today's episode is brought to you by Robots and Pencils, a company that is growing fast. Their work as a high-growth AWS and Databricks partner means that they're looking for elite talent ready to create real impact at velocity. Their teams are made up of AI-native engineers, strategists, and designers who love solving hard problems and pushing how AI shows up in real products. They move quickly using Roboworks, their agentic acceleration platform, so teams can deliver meaningful outcomes in weeks, not months.
They don't build big teams, they build high-impact, nimble ones. The people there are wicked smart, with patents, published research, and work that's helped shape entire categories. They work in Velocity pods and studios that stay focused and move with intent. If you're ready for career-defining work with peers who challenge you and have your back, Robots and Pencils is the place. Explore open roles at robotsandpencils.com/careers. That's robotsandpencils.com/careers.

Here's a harsh truth: your company is probably spending thousands or millions of dollars on AI tools that are being massively underutilized. Half of companies have AI tools, but only 12% use them for business value. Most employees are still using AI to summarize meeting notes. If you're the one responsible for AI adoption at your company, you need Section. Section is a platform that helps you manage AI transformation across your entire organization. It coaches employees on real use cases, tracks who's using AI for business impact, and shows you exactly where AI is and isn't creating value. The result? You go from rolling out tools to driving measurable AI value. Your employees move from meeting summaries to solving actual business problems, and you can prove the ROI. Stop guessing if your AI investment is working. Check out Section at sectionai.com. That's sectionai.com.

Today's episode is brought to you by Superintelligent. Superintelligent is a platform that, very simply put, is all about helping your company figure out how to use AI better. We deploy voice agents to interview people across your company, combine that with proprietary intelligence about what's working for other companies, and give you a set of recommendations around use cases and change management initiatives that add up to an AI roadmap that can help you get value out of AI for your company. But now we want to empower the folks inside your team who are responsible for that transformation with an even more direct platform. Our forthcoming AI Strategy Compass tool is ready to start being tested. This is a power tool for anyone who is responsible for AI adoption or AI transformation inside their companies. It's going to allow you to do a lot of the things that we do at Superintelligent, but in a much more automated, self-managed way and with a totally different cost structure. If you're interested in checking it out, go to aidailybrief.ai/compass, fill out the form, and we will be in touch soon.

So, both are good entries into the canon of what is AGI. But as I indicated at the beginning, what I think is actually most relevant about them is the fact that we are having this conversation right now. We are having this conversation because people have a sense that something big has shifted, but something big does not necessarily mean AGI. Indeed, one of the best steelman arguments against what we have now being AGI is the need to separate wow-level competence from general autonomy. The argument would go along the lines of: AGI isn't just about being able to generate impressive outputs across domains, it's about robust, self-directed competence under real-world constraints. AGI could be dropped into novel situations, define success criteria, manage long-horizon execution, and reliably converge without a human acting as the external executive function. And as much as things have changed, what both of the pieces we just read have in common is that, more than anything else, they're disagreeing about which point we're at on an agreed-upon trajectory.
The funny thing, in fact, about the "This Is AGI" piece is that when you actually read it closely, it's not so much saying that this is AGI. It's saying that we're really, really close; that what will be AGI, that is, these long-horizon agents, is available-ish now and just getting better, and that because we're now within, call it, months rather than years of AGI, you'd better start preparing. Dan isn't really disagreeing with that, although he doesn't get into timelines. Instead, he's pointing out all of these things that need to happen to get to a certain point of indispensability, which he is arguing is the key thing.

But what about what we've seen over the last couple of weeks? The sense among some of the most enfranchised and powerful users of AI that we really are in a fundamentally different moment? To take one example of a type of testimony we've seen lots of, Midjourney founder David Holz tweeted on January 3: "I've done more personal coding projects over Christmas break than I have in the last 10 years. It's crazy. I can sense the limitations, but I know nothing is going to be the same anymore."

And honestly, this brings up a more interesting and nuanced take on "it's not AGI yet." That argument would go something like: yes, Claude Code and similar tools have crossed the threshold for coding specifically, but generality is the whole point of general intelligence. There's still so much that current AI fails at, like novel reasoning, multi-step planning, and unfamiliar domains. These new big breakthroughs that everyone is sensing happened in a domain that's really well suited to LLMs: well documented, pattern rich, with verifiable outputs. That's not, the argument would go, evidence of general intelligence. It's evidence of domain fit. This would in some ways be an argument about the jaggedness of AI, the idea that it can be superhuman in one area and infantile in another. And indeed, it is the case that this sense of what has shifted is about AI's capacity to code.

But I keep coming back to this essay from Shawn Wang, aka swyx, from when he decided to join Cognition, and to this line, which absolutely wins the award for the couple of sentences that have lived most rent-free in my head since they were written. Shawn wrote: "The central realization I had was this: code AGI will be achieved in 20% of the time of full AGI and capture 80% of the value of AGI." Now, for him this was an argument to simply do code AI now rather than later. But what I would argue is that code AGI doesn't, quote unquote, capture 80% of the value of AGI. I think code AGI is more or less just functional AGI.

The argument here is that coding is effectively a universal lever in the modern world. Most economically valuable work, to reference OpenAI's terminology, has been computer-shaped for a long time. If your job touches a screen, an API, a database, a spreadsheet, a ticketing system, a CRM, a repo, a dashboard, or a docs tool, then in principle it's addressable by software. So if an AI can understand intent, translate intent into procedures, write and modify code, run tools, inspect outputs, and iterate until it meets acceptance criteria, then it has a meta-skill that can simulate competence in many domains by building the missing tool. And in that framing, coding isn't one domain; it is instead closer to instrumental generality. Want data analysis? Write SQL or Python notebooks, run them, interpret the results, generate charts, and build pipelines. Want operations? Automate workflows across systems: tickets, approvals, audits, alerts.
Want finance? Pull data, reconcile, generate variance analysis, draft narratives. Want product? Spin up prototypes, instrumentation, A/B analysis, telemetry pipelines. Basically, the idea is that if you can program, you can create capabilities. And if you can create capabilities on demand, you're not narrow; you're general in a way that matters for real work.

You could take this argument even further: coding doesn't just help you build general capabilities, being good at it is also, in some ways, a test of general reasoning. Non-trivial coding forces abstraction, decomposition, causal reasoning, adversarial thinking, and iterative debugging. Those are, ultimately, indicators of general intelligence. And I would argue that a lot of what feels different about building with AI coding tools now, as opposed to six months ago, is in that set of general reasoning capabilities, rather than just how good the tools are at knowing a bunch of different coding languages.

We recently had an issue come up where a company that we were producing an AI and agent readiness audit for found, contained in one of the recommendations, a tool or platform that was at odds with the tech stack that they currently have. Now, this is a problem that we are extremely conscious of. It would be very easy to recommend a bunch of platforms that an enterprise is never going to use. It's much more difficult and much more valuable to let them know how to work with the set of tools they have, or the things that they would consider, to actually solve the problems that are clear and present for them. And so we spent a lot of time making sure that that type of recommendation doesn't make it through. But it did. And so as the team was talking about new processes and procedures for making sure this didn't happen anymore, I was following the conversation on Slack from a haircut. It struck me that this might not actually be all that difficult, at least if you removed it from the domain of the human. And so as I was sitting there, I fired up the mobile web browser version of Lovable, and by the time the haircut was done, I had a checker to run final reports through that would compare the tech stack of the company to all the recommendations in the report and, in literally a matter of seconds, make sure before the final deliverable was sent that that sort of thing didn't happen.

Now, of course, there are even more sophisticated ways to do this with code. Basically, we could just build that capability that I built as a standalone into the overall processing pipeline. But the point that I'm trying to make is that this wasn't coding to solve a technical problem; it was coding to solve a business problem. Increasingly, for the people who are most adept at working with AI, that's what they're doing. The question is decreasingly "what's the best AI tool for this?" and increasingly "can I build some custom software to solve or enable this?"

So what's happening right now at all these different startups is not some cute little example of the non-technical folks being able to vibe code and prototype features. What's happening is a complete collapse in the distance between idea and execution for everyone. And that's amazing for those startups. We are going to see complete and utter re-evaluations of how to do everything, and startups and small companies are going to be the incubators of that change.
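For what it's worth, the core of a checker like that is genuinely small. Here's a minimal sketch of the idea; the stack list, the alias table, and the naive string matching are all hypothetical stand-ins, not the actual Lovable-built tool:

```python
# A minimal sketch of a tech-stack checker like the one described above.
# The client stack, alias table, and extraction logic are invented examples.

import re

CLIENT_STACK = {"salesforce", "snowflake", "slack", "microsoft 365"}

# Map known tool aliases to canonical names so "SFDC" matches "salesforce".
ALIASES = {"o365": "microsoft 365", "sfdc": "salesforce"}

KNOWN_TOOLS = CLIENT_STACK | {"hubspot", "databricks", "notion", "zendesk"}

def extract_tools(recommendation: str) -> set[str]:
    """Very naive tool extraction: look for known tool names in the text."""
    text = recommendation.lower()
    found = {tool for tool in KNOWN_TOOLS if tool in text}
    found |= {canonical for alias, canonical in ALIASES.items()
              if re.search(rf"\b{alias}\b", text)}
    return found

def flag_off_stack(recommendations: list[str]) -> list[tuple[str, set[str]]]:
    """Return recommendations that mention tools outside the client's stack."""
    flagged = []
    for rec in recommendations:
        off_stack = extract_tools(rec) - CLIENT_STACK
        if off_stack:
            flagged.append((rec, off_stack))
    return flagged

# Run before sending the final deliverable:
report = [
    "Consolidate sales data in Salesforce dashboards.",
    "Stand up a HubSpot workflow for lead scoring.",  # off-stack: gets flagged
]
for rec, tools in flag_off_stack(report):
    print(f"FLAG: {rec} (off-stack tools: {sorted(tools)})")
```

A production version would pull the stack from a system of record and use a model rather than substring matching, but even this naive form shows how little code sits between the business problem and a working guardrail.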
The thing that would worry me right now if I were an enterprise leader is that this Rubicon that we've crossed starts to feel more like a shift in kind than a shift in scale. What I mean by that is that for the three years since ChatGPT was launched, obviously, enterprises were behind more nimble companies and AI users, relatively speaking. There's all sorts of systems inertia, there's compliance issues, governance issues, et cetera. But the patterns of what they were doing were still similar to the patterns of what other people were doing, just maybe with a little bit slower adoption and a little bit more process along the way. Still, they were running on a parallel track. I now believe that the tracks have diverged. The frontier of what's possible and the median of what's deployed in enterprises have decoupled, in a way in which I believe they are increasingly pointed in different directions.

The standard enterprise invocations at this point, to audit and automate your workflows, to experiment with AI, will ultimately contain the possible transformation within existing power structures, more or less keeping the org chart intact. The reality is, in a world of code AGI, a world of functional AGI, the org chart is broken. Bottlenecks shift from who can code to who has good ideas. The role of management shifts from resource allocation to taste and judgment. Competitive advantage shifts from execution capability to speed of iteration. And the gap increasingly isn't linear; it's compounding. Every month someone is building in the new paradigm, they are getting comparatively farther ahead of those who aren't.

So is the answer as simple as letting everyone on your team vibe code? Honestly, I think you could do worse. I think we are at a moment where, increasingly, the modality by which things are produced in this world looks and is different from the way that it was just a few years ago, even just a few months ago. The message is less "upskill your workforce and audit your AI use cases" and much closer to "your entire organizational model is built for a world where execution was the bottleneck, and that world is over." I worry for enterprises because this new set of shifts involves accepting a loss of control, a restructuring of incentives, and a total transformation of process that is even harder than the AI transformation that has come so far. And to the extent that you are in one of those enterprises and looking for a bright spot in this, it's that at least most of you are in this together, and that it is unlikely that many enterprises are going to get comfortable fast with the types of change they really should be making. But the change, I believe, has happened, and I think that the rewards for the companies, not just startups but enterprises too, who can lean into this new capability set and can live on the other side of this inflection will be immense.

So, like I said at the beginning, I think code AGI is functional AGI, and I think it's here. This is something that I will be exploring a lot more in the weeks to come. For now, though, that is going to do it for today's AI Daily Brief. Appreciate you listening or watching as always, and until next time, peace.