Tech Brew Ride Home

All The Headlines, All The Model Drops...

24 min
Apr 24, 2026
Summary

DeepSeek released V4 Pro and V4 Flash models undercutting US competitors by 5x on pricing, while OpenAI launched GPT 5.5 to reclaim benchmark leadership. Meta announced 8,000 layoffs to offset AI spending, and Google committed up to $40 billion in investment to Anthropic.

Insights
  • Chinese AI models are forcing a pricing reset in the market, with DeepSeek V4 Pro at $1.74/$3.48 per million tokens versus GPT 5.5 at $5/$30, fundamentally challenging the unit economics of US AI companies
  • The AI capability race is shifting from raw performance to agentic autonomy and real-world task completion, with GPT 5.5 designed for multi-step workflows rather than single-prompt interactions
  • Major tech companies are making massive capital commitments to AI infrastructure (Google $40B to Anthropic, Meta record capex) while simultaneously cutting workforce costs through layoffs, indicating a shift toward AI-augmented operations
  • Benchmark performance has become a critical market signal tracked with 'Talmudic intensity' by investors and researchers, with single percentage point improvements driving competitive positioning and valuation
  • The frontier AI market is consolidating around three players (OpenAI, Anthropic, Google) with deep integration into cloud infrastructure and chip partnerships creating high barriers to entry
Trends
  • Cost-performance parity emerging between Chinese and US frontier models, forcing Western AI companies to compete on capability rather than price alone
  • Shift from conversational AI to agentic AI systems capable of autonomous multi-step task completion across codebases, documents, and operating systems
  • Strategic vertical integration of AI companies with cloud providers and chip manufacturers (Google-Anthropic, OpenAI-NVIDIA) to control inference costs and latency
  • Workforce optimization through AI adoption enabling companies to cut headcount while maintaining output (Meta's approach)
  • Benchmark gaming and metric obsession among investors and researchers as primary signal for AI progress and market valuation
  • Computing capacity becoming the primary constraint and competitive moat for frontier AI development, with gigawatt-scale infrastructure commitments
  • Specialized model variants (Pro/standard tiers) emerging to segment markets by use case (legal, scientific, general) rather than just capability level
  • Open-source models reaching parity with frontier models on specific tasks, creating pressure on proprietary model pricing
  • International AI competition intensifying with Chinese startups forcing innovation cycles and pricing pressure on US incumbents
  • Hardware-software co-design becoming critical for model efficiency, with custom algorithms optimizing GPU utilization for token generation
Companies
DeepSeek
Released V4 Pro and V4 Flash models with 5x lower pricing than US competitors, featuring hybrid attention and a 1M token context window
OpenAI
Launched GPT 5.5 and GPT 5.5 Pro to reclaim benchmark leadership from Anthropic, with focus on agentic capabilities across coding, computer use, and scientific research
Anthropic
Received $10B investment from Google with potential $30B more, valuing company at $350B; Claude Opus 4 recently surpassed rivals on third-party benchmark leaderboards
Google
Investing up to $40B in Anthropic and committing 5 gigawatts of computing capacity over five years to strengthen AI partnership
Meta
Announced 8,000 employee layoffs (10% of workforce) on May 20th to offset heavy AI infrastructure spending and improve efficiency
Alphabet
Parent company of Google, making strategic AI investments through subsidiary to compete in frontier model market
Tencent Holdings
In talks with DeepSeek for first external fundraising round as potential investor in Chinese AI startup
Alibaba
In talks with DeepSeek for first external fundraising round as potential investor in Chinese AI startup
NVIDIA
Providing GB200 and GB300 GPU systems for GPT 5.5 deployment; anonymous NVIDIA engineer praised model's capabilities
Huawei
Ascend 950 chips powering DeepSeek's future computing clusters expected to launch in H2 2026, reducing inference costs
Broadcom
Partner in deal with Anthropic and Google to expand computing capacity and infrastructure for AI development
Cohere
Canadian AI company merging with Germany's Aleph Alpha to create sovereign AI company valued at approximately $1 billion
Aleph Alpha
German AI company merging with Canada's Cohere to create sovereign AI company valued at approximately $1 billion
Magic Path
CEO Pietro Schirano tested GPT 5.5 for code merging tasks, reporting significant performance improvements in branch management
Axiom Bio
CEO Brandon White stated GPT 5.5 Pro could fundamentally change drug discovery foundations by end of year
Jackson Laboratory for Genomic Medicine
Professor Daria Unatamas used GPT 5.5 Pro to analyze 28,000 genes in minutes versus months for human team
People
Brian McCullough
Podcast host covering AI industry headlines and model releases
Amelia Mia Glace
Discussed GPT 5.5 capabilities in video call with journalists, emphasizing coding performance
Greg Brockman
Emphasized GPT 5.5's intuitive interface and ability to handle unclear problems with less guidance
Jacob Pachocki
Stated OpenAI still has headroom to train significantly smarter models beyond current scaling limits
Janelle Gale
Wrote memo announcing 8,000 employee layoffs and 6,000 unfilled positions to offset AI spending
Mark Zuckerberg
Spending aggressively on AI talent and infrastructure for LLMs and chatbots development
Dario Amodei
Former Google AI researcher who founded Anthropic with OpenAI alumni; company receiving $40B Google investment
Dan Shipper
Tested GPT 5.5 for debugging complex system failures, achieving autonomous fixes previously requiring human engineers
Pietro Schirano
Reported step change in performance merging hundreds of refactor changes in single 20-minute pass with GPT 5.5
Daria Unatamas
Used GPT 5.5 Pro to analyze 28,000 genes in minutes versus months for human team analysis
Brandon White
Stated OpenAI's pace could change drug discovery foundations by end of year with GPT 5.5 capabilities
Quotes
"It's definitely our strongest model yet on coding, both measured by benchmarks and based on the feedback that we've gotten from trusted partners as well as our own experience"
Amelia Mia Glace, VP of Research at OpenAI (GPT 5.5 announcement)
"What is really special about this model is how much more it can do with less guidance. It's way more intuitive to use. It can look at an unclear problem and figure out what needs to happen next."
Greg Brockman, OpenAI Co-founder and President (GPT 5.5 announcement)
"The first coding model I've used that has serious conceptual clarity"
Dan Shipper, CEO of Every (GPT 5.5 testing feedback)
"Losing access to GPT 5.5 feels like I've had a limb amputated"
Anonymous NVIDIA engineer (GPT 5.5 early access feedback)
"We actually still have headroom to train significantly smarter models than this"
Jacob Pachocki, OpenAI Chief Scientist (GPT 5.5 announcement)
Full Transcript
Welcome to the Tech Brew Ride Home for Friday, April 24th, 2026. I'm Brian McCullough. Today, DeepSeek dropped V4 Pro and V4 Flash, undercutting US labs on price by roughly 5x. OpenAI fired back with GPT 5.5, reclaiming benchmark crowns from Claude. Meta confirms 8,000 layoffs. Google plans to invest up to $40 billion in Anthropic. And of course, the weekend long read suggestions. Here's what you missed today in the world of tech. Companies need to protect themselves from increasingly sophisticated social engineering threats. Doppel's digital risk protection takes it one step further by keeping an eye on every channel to connect patterns and shut them down fast. From deepfakes to bad links to impersonation attempts, Doppel helps you stay ahead of these threats with their AI-native social engineering defense platform. Learn more at doppel.com. That's doppel.com. It's another one of those days where any one of like four or five or six headlines could be the big headline of the day, but there are at least four, so let's get cramming. DeepSeek released its new flagship models V4 Pro and V4 Flash in preview, saying V4 Pro trails the performance of state-of-the-art models by about three to six months, but it's still pretty good. It's also pretty cheap, quoting Bloomberg. The Chinese startup unveiled the V4 Flash and V4 Pro series, touting top-tier performance in coding benchmarks and big advancements in reasoning and agentic tasks. They come with architecture upgrades and optimization improvements, the startup said on Hugging Face. DeepSeek singled out a technique it dubbed hybrid attention architecture, which it said improves the ability of an AI platform to remember queries across long conversations. It also pushed the 1 million token context window, a leap that allows entire codebases or long documents to be sent as a single prompt.
The V4 arrives more than a year after the Chinese-based startup ignited a trillion-dollar stock market sell-off with the release of the R1, an open-source model that mimics the process of human reasoning. The R1 rivaled the performance of cutting-edge AI systems from companies like OpenAI, but was purportedly built for a fraction of the cost. Chinese chipmakers rallied Friday as investors bet the new model will support demand for local chips. In a post on WeChat, DeepSeek said the service capacity for the V4 Pro series is extremely limited due to a computing crunch. The startup, however, expects pricing for the model to drop significantly after computing clusters powered by Huawei's Ascend 950 chips launch in the second half of this year. DeepSeek is currently in talks with Tencent Holdings and Alibaba for its first fundraising from external investors. DeepSeek's trillion-parameter systems use the mixture-of-experts technique, selectively triggering only a small subset of experts and activating only up to 37 billion parameters per task to keep inference costs far lower than for similar frontier models. DeepSeek V4 Pro's usage costs are just a fraction of leading US labs'. For instance, input tokens, the prompt or text a user sends to the model, are $1.74 per 1 million tokens, while output tokens, the response generated by the model, cost $3.48 per million, the startup said. Anthropic's flagship Claude Sonnet 4 costs $3 per million input tokens and $15 per million output tokens. The architecture and techniques position DeepSeek squarely against Silicon Valley competitors OpenAI's, Google's, and Anthropic's latest models. On Friday, the startup touted superior performance to the likes of OpenAI's GPT 5.2 on standard benchmarks, but conceded the V4 trails state-of-the-art models by about three to six months. Still, DeepSeek emphasized that it's not pushing solely raw capability, but also fundamentally lowering costs.
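The mixture-of-experts routing described above, running only a handful of experts per token so that only a fraction of the total parameters are active, can be sketched in a few lines. This is a generic illustration of top-k MoE routing, not DeepSeek's actual implementation; the expert functions, router scores, and top-k of 2 are all invented for the demo.

```python
import math

def moe_forward(x, gate_scores, experts, top_k=2, ran=None):
    """Combine outputs of only the top_k highest-scoring experts.

    gate_scores plays the role of the router; experts that are not
    selected never execute, which is how MoE keeps per-token compute
    low even when total parameter count is huge.
    """
    top = sorted(range(len(gate_scores)), key=gate_scores.__getitem__)[-top_k:]
    weights = [math.exp(gate_scores[i]) for i in top]
    total = sum(weights)
    out = [0.0] * len(x)
    for i, w in zip(top, weights):
        if ran is not None:
            ran.append(i)                 # record which experts actually ran
        y = experts[i](x)                 # only selected experts compute
        for j, v in enumerate(y):
            out[j] += (w / total) * v     # softmax-weighted combination
    return out

# Toy demo: 8 experts that each just scale the input vector.
experts = [lambda v, s=s: [s * t for t in v] for s in range(8)]
scores = [0.1, 0.0, 0.2, 2.0, 0.3, 0.1, 1.5, 0.2]   # router prefers 3 and 6
ran = []
out = moe_forward([1.0, 2.0], scores, experts, top_k=2, ran=ran)
print(sorted(ran))   # only 2 of 8 experts did any work
```

The same idea scales up: a trillion-parameter model with, say, 37 billion active parameters per token only pays inference cost for the experts the router picks.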
The V4 is designed to be deployed on cheaper infrastructure. That could present a challenge for domestic AI companies, including Minimax Group, end quote. Well, right. Yesterday I focused on open source being good enough, and if you run it locally, you're paying nobody for tokens. But let me underline that pricing again, because this is still important. Remember, these things are measured in input per million tokens and output per million tokens, so hold that in your head. The newly released GPT 5.5 is priced at $5 and $30 respectively. More on GPT 5.5 in a second. Claude Opus, $5 and $25 respectively. DeepSeek V4 Pro, $1.74 and $3.48, so undercutting by roughly a fifth. And if you can get away with using DeepSeek V4 Flash, it's 14 cents and 28 cents, which is competitive with the likes of GPT 5.4 Nano or Gemini Flash Lite, but it still undercuts them. So again, if this is good enough, you know, why pay more for tokens that are only, I don't know, kind of sort of better? If your ears glazed over with those numbers, I've got a link in the show notes to a Simon Willison piece that has this in comparison shopping chart form. Well, as I alluded to, the other big announcement was GPT 5.5, which seems to have sort of taken the crown back from Claude in terms of being the best in all the land, quoting VentureBeat. It's definitely our strongest model yet on coding, both measured by benchmarks and based on the feedback that we've gotten from trusted partners as well as our own experience, explained Amelia Mia Glace, VP of Research at OpenAI, in a video call with journalists ahead of the launch earlier today. OpenAI positions GPT 5.5 as a fundamental redesign of how intelligence interacts with a computer's operating system and professional software stacks. What is really special about this model is how much more it can do with less guidance, said OpenAI co-founder and president Greg Brockman on the same call. It's way more intuitive to use.
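To make those per-million-token figures concrete, here is a quick back-of-the-envelope cost sketch using only the prices quoted in this episode. Real pricing changes often, so treat the table and the example request sizes as illustrative, not authoritative.

```python
# Per-million-token prices (input, output) in USD, as quoted in the episode.
PRICES = {
    "DeepSeek V4 Pro":   (1.74, 3.48),
    "DeepSeek V4 Flash": (0.14, 0.28),
    "GPT 5.5":           (5.00, 30.00),
    "Claude Opus":       (5.00, 25.00),
}

def request_cost(model, input_tokens, output_tokens):
    """Dollar cost of one request: tokens / 1M times the per-million price."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# Example: a 50k-token prompt (a decent chunk of a codebase) with a 5k-token answer.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 50_000, 5_000):.4f}")
```

Running this shows why the episode keeps hammering the point: the same request that costs $0.40 on GPT 5.5 comes to about $0.10 on V4 Pro and under a cent on V4 Flash.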
It can look at an unclear problem and figure out what needs to happen next. Brockman proceeded to emphasize the areas in which users can expect to see gains from using GPT 5.5 compared to OpenAI's prior state-of-the-art model, GPT 5.4, which remains available, for now, to users and enterprises at half the API cost of its new successor. It's extremely good at coding, Brockman said of GPT 5.5. It's also great at broader computer work, computer use, scientific research, and the kinds of applications where intelligence is the bottleneck. The model is available in two variants, GPT 5.5 and GPT 5.5 Pro, distinguished by the latter offering enhanced precision and specialized logic for handling the most rigorous cognitive demands. While the standard version serves as the versatile flagship for general intelligence tasks, the Pro model is architected specifically for high-stakes environments such as legal research, data science, and advanced business analytics where accuracy is paramount. This premium tier provides noticeably more comprehensive and better structured responses, supported by specialized latency optimizations that ensure high-quality performance during complex multi-step workflows. Unfortunately for third-party software developers, API access is not yet available for either GPT 5.5 or GPT 5.5 Pro, but will be coming very soon, according to the company's announcement blog post. At the core of GPT 5.5 is a focus on agentic performance, specifically in coding, computer use, and scientific research. Unlike its predecessors, which often required granular step-by-step prompting to avoid hallucinating a path forward, GPT 5.5 is designed to handle messy multi-part tasks autonomously. It excels at researching online, debugging complex codebases, and moving between documents and spreadsheets without human intervention. One of the most significant technical leaps is the model's efficiency.
While larger models typically suffer from increased latency, GPT 5.5 matches the per-token latency of the previous GPT 5.4 while delivering a higher level of intelligence. This was achieved through deep hardware-software co-design. OpenAI served GPT 5.5 on NVIDIA GB200 and GB300 NVL72 systems, utilizing custom heuristic algorithms written by the AI itself to partition and balance work across GPU cores. This optimization reportedly increased token generation speeds by over 20%. For high-stakes reasoning, the GPT 5.5 thinking mode in ChatGPT provides smarter, more concise answers by allowing the model more internal compute time to verify its own assumptions before responding. The market for leading US-made frontier models has become an increasingly tight race between OpenAI, Anthropic, and Google. Literally a week ago today, OpenAI rival Anthropic released Opus 4, its most powerful generally available model to the public, taking over the leaderboard in terms of the number of third-party benchmark tests in which it has the lead. Yet today GPT 5.5 has surpassed it, and even Anthropic's heavily restricted, more powerful model, the Claude Mythos preview, albeit only on one benchmark, Terminal Bench 2.0, which tests a model's ability to navigate and complete tasks in a sandboxed terminal environment. GPT 5.5 achieved 82.7% accuracy on Terminal Bench 2.0, easily surpassing Opus 4.7 at 69.4%, and narrowly beating the Mythos preview at 82%. The early feedback from power users and engineers suggests that GPT 5.5 has crossed a psychological threshold in AI utility. For developers, the model's ability to maintain conceptual clarity across massive codebases is its standout feature. The first coding model I've used that has serious conceptual clarity, said Dan Shipper, CEO of Every. Shipper tested the model by asking it to debug a complex system failure that had previously required a team of human engineers to rewrite. GPT 5.5 produced the same fix autonomously.
Similarly, Pietro Schirano, CEO of Magic Path, described a step change in performance when the model successfully merged a branch with hundreds of refactor changes into a main branch in a single 20-minute pass. Perhaps the most visceral reaction came from an anonymous engineer at NVIDIA who had early access to the model, saying, losing access to GPT 5.5 feels like I've had a limb amputated. This sentiment is echoed in the scientific community. Daria Unatamas, a professor at the Jackson Laboratory for Genomic Medicine, used GPT 5.5 Pro to analyze a dataset of 28,000 genes, producing a report in minutes that would have normally taken his team months. Brandon White, CEO of Axiom Bio, went further, stating that if OpenAI continues at this pace, quote, the foundations of drug discovery will change by the end of the year. GPT 5.5 is more than an incremental update. It is a tool designed for a world where humans delegate entire workflows rather than single prompts. While the costs are higher and the safety guardrails tighter, the performance gains in agentic work suggest that AI is finally moving from the chat box and into the operating system. Perhaps most astonishingly of all, it's perhaps not even nearing the end of the scaling limits, whereupon models are trained on more and more GPUs, according to researchers at the company. We actually still have headroom to train significantly smarter models than this, said OpenAI chief scientist Jacob Pachocki, end quote. Well, we knew this was coming, but you know, sword, meet Damocles, quoting Bloomberg. Meta Platforms plans to cut 10% of its workers, or roughly 8,000 employees, in an effort to boost efficiency and offset its heavy spending on artificial intelligence. The company disclosed the move in a memo sent to employees Thursday, saying the layoffs will come on May 20th. Meta also won't hire workers for 6,000 open roles that it had intended to fill.
The job cuts come as Chief Executive Officer Mark Zuckerberg is spending aggressively on the talent and infrastructure needed to develop state-of-the-art artificial intelligence products, including large language models and chatbots. Meta already projected record capital expenditures this year and has announced several multi-billion-dollar deals with AI partners over the past few months. Employees have been encouraged to use AI agents internally to help with writing code and other tasks. Meta alluded to its AI spending in the memo, which was written by Janelle Gale, chief people officer. We're doing this as part of our continued effort to run the company more efficiently and allow us to offset the other investments we're making, she wrote in the note, which was reviewed by Bloomberg. Meta employees have spent much of the year fretting about job cuts, which already hit the Reality Labs division and other teams. Gale said that the company was announcing the layoffs early since details of the plan had already leaked. Reuters first reported on Meta's planned workforce reductions earlier this month. I know this is unwelcome news and confirming this puts everyone in an uneasy state, but we feel this is the best path forward given the circumstances, Gale wrote. Meta had almost 79,000 employees at the start of the year. The company is scheduled to report first quarter earnings next week, end quote. And this just happened right at noon, which means I actually cut another headline I was going to do about Canada's Cohere and Germany's Aleph Alpha agreeing to a merger valuing the combined company at around $1 billion to work on sovereign AI. But quoting Bloomberg, Google will invest $10 billion in Anthropic with another $30 billion potentially to follow, strengthening the relationship between the two companies that are at once partners and rivals in the race to build artificial intelligence.
Anthropic said that Google is committing to invest $10 billion now in cash at a $350 billion valuation, the same amount it was valued at in a funding round in February, not including the recent money raised. The Alphabet-owned company will invest another $30 billion if Anthropic hits performance targets, the startup said Friday, and support a significant expansion of Anthropic's computing capacity. Anthropic is a major customer of Google's chips and cloud services, businesses that Google is striving to grow as its main moneymaker, search advertising, matures. Google Cloud will provide 5 gigawatts worth of computing capacity to Anthropic over the next five years, and several more gigawatts could follow. The deal marks an expansion of an agreement announced earlier this month between Anthropic, Google, and Broadcom. A single gigawatt is enough to power about 750,000 U.S. homes at any given time. Anthropic Chief Executive Officer Dario Amodei worked at Google as an AI researcher earlier in his career. The companies have maintained close ties since Amodei founded Anthropic with a group of former OpenAI employees in 2021. Last year, Google said it would provide up to 1 million of its TPU chips to Anthropic in a deal worth tens of billions of dollars. The search giant had already invested about $3 billion in the startup by that point, end quote. Hey, look, by the way, if you can invest in Anthropic at that old $350 billion valuation, you probably want to do it. That is increasingly looking like a real sweetheart deal. Time for the Weekend Long Read Suggestions. And first up, you know how I mention benchmarks all the time when new AI models come out, like today? Well, the New York Times took a look at the AI nonprofit METR, makers of maybe the most important of these benchmarks, as its time horizon metrics are used by AI researchers and Wall Street investors to track the development of AI systems.
Quoting the Times, this chart, often referred to as the METR time horizon chart, has become a discourse-dominating obsession among AI researchers, Wall Street investors, and industry watchers. They have studied it with Talmudic intensity, looking for signs that the AI boom is tapering off, or that it is accelerating, or merely that it confirms what they already believed was happening. But what is METR's chart measuring exactly? How nervous should it make us about what's happening in AI? And what would it mean if, like Moore's law, its curve kept climbing? To find out, I recently spent an afternoon at METR's office meeting its research leaders. They regaled me with dense technical explanations about their measurements and how they track the progress of AI systems. It was a bit like entering a den of NBA statisticians who track things like developer uplift and covert capabilities instead of assists and rebounds, and it left me with an uneasy sense that if their measurements are even close to correct, things are about to get very weird, end quote. And finally, from The Verge, not an endorsement in any way, it's not paid or otherwise, but I think I found my version of the Vespa. Me want, especially here in Brooklyn. Quoting The Verge: My life is filled with trips that are too long to walk, but too short to really need a car. It's a mile to the grocery store, a mile and a quarter to my kid's daycare, a mile and a half to CVS, three quarters of a mile to my favorite coffee shop, each one far enough that walking turns into more than a quick trip, but close enough that I often spend as much time looking for parking as I do driving. The Alto, with its spacious seat and twitchy throttle, is a more elegant take on this problem. A 20-minute walk is 3-4 minutes on the Alto. You can park it basically anywhere. You don't even have to lock it up, thanks to both its anti-theft automatic locking systems and its sheer size and weight.
You turn it on with an NFC-capable card or through the Infinite Machine app, and the app can be set to start the bike as soon as you get close to it. Altogether, it feels effortless, end quote. No bonus content for you this weekend, but I've got stuff brewing. I do have stuff brewing. We will get to my AI show-and-tell episodes as soon as I finish my own AI project so I can show and tell it off. I was up again until 8am last night, pitting Claude against Claude. Talk to you on Monday.