353: Don't Be Evil Unless the Government Asks Nicely

100 min

May 13, 2026

Summary

Episode 353 covers major cloud earnings from Microsoft, Amazon, and Google showing massive AI infrastructure spending and growth, discusses critical Linux vulnerabilities and GitHub's availability issues, and explores emerging challenges around AI agent identity, governance, and the shift from open-source to proprietary AI models.

Insights

AI infrastructure spending is creating a new constraint economy: companies are capital-constrained by chip availability and power capacity rather than demand, forcing difficult trade-offs between CapEx, emissions, and service reliability
The shift from flat-rate to usage-based AI pricing will force enterprises to implement FinOps discipline for AI workloads, similar to cloud cost management but with much higher volatility and less predictability
Agent identity and governance is becoming the critical unsolved problem across all cloud platforms—there's no standardized way to assign persistent, auditable identities to AI agents operating across multiple systems
Open-source AI models are being abandoned by their creators (Meta/Llama) in favor of proprietary cloud-only offerings, consolidating power back to major cloud providers and eliminating the open alternative
Physical infrastructure (undersea cables, power generation, data center location) is becoming a geopolitical and security concern as AI workloads concentrate compute requirements in ways that create single points of failure

Trends

AI-driven code generation is creating 30x capacity growth expectations, forcing GitHub and other platforms to completely rearchitect infrastructure in real time
Regulatory and compliance pressure is pushing AI governance into the platform layer (AWS Bedrock, Google Agent Gateway) rather than application code
Undersea cable cutting capabilities and geopolitical tensions are elevating physical infrastructure security to strategic importance for cloud providers
Enterprise AI adoption is shifting from experimentation to cost governance, with FinOps becoming a required discipline for AI workloads
Hardware-focused leadership transitions (Apple's John Ternus, Microsoft's Satya Nadella model) suggest renewed focus on device-level AI and local inference
Multi-cloud AI model access (OpenAI on AWS/GCP) is fragmenting the exclusive partnerships that defined early AI cloud strategy
Natural gas-backed data center power is offsetting stated carbon reduction commitments, creating a hidden emissions problem in AI infrastructure
Agent-to-agent and agent-to-tool communication protocols (MCP) are becoming critical infrastructure for enterprise AI governance
Legacy application automation via desktop AI agents (Amazon WorkSpaces, Microsoft ClawPilot) is opening new use cases for organizations with non-modernized systems
Security vendors are positioning themselves as middleware for AI governance, creating a new market layer between applications and cloud platforms

Topics

Cloud Earnings and AI Infrastructure Spending
Linux Kernel 7.0 and Rust Stability
GitHub Availability and Scaling Challenges
AI Agent Identity and Governance
Undersea Cable Security and Geopolitics
AI Model Pricing and Usage-Based Billing
FinOps for AI Workloads
Open-Source AI Model Consolidation
Data Center Emissions and Natural Gas Infrastructure
AI Safety and Government Regulation
Multi-Cloud AI Model Access
Legacy Application Automation with AI Agents
Enterprise AI Security and Compliance
Kubernetes Container Escape Vulnerabilities
AI Code Review and Quality Assurance

Companies

Microsoft

Q3 2026 earnings: $82.89B revenue (+18% YoY), Azure +40%, AI revenue $37B annualized, stock dipped 2% despite beat du...

Amazon Web Services

Q1 2026 earnings: $37.59B revenue (+28% YoY), highest growth in 3 years, CapEx $44.2B, free cash flow down 95% YoY to...

Google Cloud / Alphabet

Q1 2026 earnings: $20.02B revenue (+63% YoY), enterprise AI primary growth driver, $460B backlog, CapEx guidance rais...

OpenAI

Partnership amended with Microsoft: non-exclusive license allowing models on AWS/GCP, Azure remains primary through 2...

Anthropic

Claude Code degradation issues over 7 weeks (March-April 2026) caused user backlash and subscription cancellations, p...

Meta

Confirmed Llama is dead, replaced by proprietary MuseSpark cloud-only LLM with no open weights or self-hosting, 1.2B ...

GitHub

April 2026 availability as low as 85%, merge queue incidents, search cluster overload from botnet, scaling target inc...

Linux Foundation

Linux 7.0 released with stable Rust support after 5 years, improving kernel security and memory safety across x86, AR...

Apple

Tim Cook transitioning to board-only role September 2026, replaced by John Ternus (hardware background), potential sh...

Snowflake

Published governance framework for AI agents addressing persistent identity and audit trail challenges, internal AI a...

HashiCorp

CEO Mitchell Hashimoto retired, founder of Terraform, moved personal projects off GitHub due to reliability issues, mi...

Okta

Partner in Google Agent Gateway ecosystem for AI agent identity governance and access control

Palo Alto Networks

Prisma integrated with Google Agent Gateway for runtime AI protection and security governance

Cisco

AI Defense integrated with Google Agent Gateway for agent traffic security and governance

CrowdStrike

Partner in Google Agent Gateway ecosystem for AI agent security monitoring

Datadog

Speaker at We Are Developers World Congress September 2026 in San Jose

Netflix

Speaker at We Are Developers World Congress September 2026 in San Jose

Stripe

Speaker at We Are Developers World Congress September 2026 in San Jose

We Are Developers

Hosting World Congress in San Jose Sept 23-25, 2026 with 10K+ developers, 500+ speakers, 18 content tracks

Waymo

Alphabet subsidiary surpassed 500K fully autonomous rides per week, $16B fundraise at $226B valuation

People

Sundar Pichai

Noted Google Cloud is compute-constrained, stated revenue would be higher if supply met demand

Satya Nadella

Referenced as example of technical leader bringing renewal to organization, similar to John Ternus at Apple

Linus Torvalds

Credited AI for improving quality of bug reports reaching kernel team, noted as unexpected AI advocate

Greg Kroah-Hartman

Noted improvement in AI-generated bug reports quality reaching kernel team

Mitchell Hashimoto

Left GitHub due to reliability issues, moving Ghostty terminal emulator to multiple platforms

Tim Cook

Stepping down September 2026 after scaling Apple's services business and operations

John Ternus

Hardware background, led Apple Silicon transition, expected to focus on device-level innovation

Scott Hanselman

Building ClawPilot open-claw desktop assistant, announcing Windows node for OpenClaw at Build in June

Justin

Co-host discussing cloud earnings, infrastructure, and AI governance challenges

Jonathan

Co-host contributing to discussion on cloud platforms and AI infrastructure

Ryan

Co-host discussing AI governance, security, and enterprise implications of AI infrastructure

Matthew

Co-host contributing to discussion on cloud platforms and emerging technologies

Kelsey Hightower

Speaking at We Are Developers World Congress September 2026

Olivier Pomel

Speaking at We Are Developers World Congress September 2026

Christine Yen

Speaking at We Are Developers World Congress September 2026

Quotes

"They're not spending enough. It's bad news. They're not spending too much. It's bad news. I can't help them."

Ryan•~15 min - discussing Microsoft CapEx expectations

"It's crazy to me that Amazon's entire revenue is Microsoft's AI revenue."

Ryan•~25 min - comparing AWS total revenue to Microsoft AI revenue

"Layer one, physical, is always going to be important. It's one of, I feel like a lot of debugging that I always end up doing at least at home and side projects."

Ryan•~45 min - discussing undersea cable importance

"I would love to see OpenAI also release I think actually maybe have an open source model really?"

Host•~2:15 - discussing open-source AI model options

"We'll know it when we see it. Oh, good. I love that in legalese."

Host•~2:35 - discussing AGI clause ambiguity in OpenAI-Microsoft contract

Full Transcript

Welcome to the Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP and Azure. We're your hosts, Justin, Jonathan, Ryan and Matthew. Before we get into this week's news, we want to take a minute to tell you about We Are Developers World Congress, which is finally making its way to North America this September. If you've spent any time in the European tech scene, you probably already know the team behind it. They've been running World Congress in Berlin for over a decade, and it's a big deal over there, pulling in more than 15,000 developers every year. Our friend Coté from Software Defined Talk is actually speaking at the Berlin event this July, and from what we've seen, these are the people who know how to put on a good developer conference. This September 23rd through 25th, they're bringing it stateside to San Jose. Organizers are expecting more than 10,000 developers with over 500 speakers across 18 different content tracks covering the entire stack, including cloud, DevOps, AI, security, software architecture, data engineering, front-end, and developer experience. If you've got a team, everyone's going to find a full schedule. It's not just sit-and-listen sessions. There are keynotes, workshops, masterclasses, and hands-on labs, the kind of stuff you can take back home and work on on Monday. There's an impressive list of speakers including names from Datadog, Honeycomb, Sentry, Google, LinkedIn, Stack Overflow, Netflix, Microsoft and Stripe. Plus, Kelsey Hightower, Olivier Pomel, Christine Yen, Scott Hanselman and Angie Jones. Head over to wearedevelopers.us to grab your ticket and use code DEVPOD26 for 15% off. That stacks with their group rates if you're bringing four or more people. And honestly, at that price, you should probably bring the whole team.
Episode 353, recorded for May 5th, 2026: Don't Be Evil Unless the Government Asks You Nicely. Good evening, Ryan and Matt. How are you guys doing? Doing well. Good, how are you? I just got back from a lovely weekend away and came back to record with you guys, so you're welcome. Thank you. Hope you're rested and had fun. Yeah, rested, probably not. I mean, I just ran around the city for four days. Yes. Did a bunch of things and ate some good food and saw some shows, and it was a nice time. It was close enough to Matt that we probably should have gotten together, but we didn't coordinate far enough in advance for that to happen. I had a one-year birthday too, so it was a family weekend. Yeah, but my wife wants us to go back for weeks and get an Airbnb, and so if we do that, we'll definitely hook up when we do that. Anyways.
Well, it is a busy week once again in the cloud, and first up is once again earnings. So they were blessed enough to all announce earnings on the same day, other than Oracle, which I don't know when they announce earnings. It just happens. Aren't they the ones that do like six months off anyway? Yeah, they're on a weird calendar year. They have all kinds of stuff. But anyways, they all report on the same day, so I don't really have an order other than I'm just going to go through these.
So Microsoft posted Q3 2026 revenue of $82.89 billion, up 18% year over year, with Azure cloud services growing 40%, slightly ahead of analyst expectations of the 38% to 39% range. Capital expenditures came in at $31.9 billion, about $3 billion below what the analyst consensus thought it would be, which contributed to the stock dipping 2% despite the earnings beat, reflecting investor sensitivity around AI infrastructure spending levels. They're not spending enough. It's bad news. They're not spending too much. It's bad news. I can't help them.
Microsoft said annualized AI revenue now stands at $37 billion, up 123% year-over-year, spanning Azure-hosted AI services and Microsoft's own AI tools, though the metric excludes some infrastructure workloads, which is worth noting when comparing figures across the different providers. 365 Copilot commercial seat count grew from 15 million to over 20 million by end of March, indicating continued enterprise adoption of AI productivity add-ons at a pace worth tracking for cloud practitioners evaluating Microsoft's enterprise AI traction. Gross margin narrowed a bit to 67.6%, the lowest since 2022, as data center depreciation costs are increasing because of all those hot, hungry AI chips.
Yeah, it's interesting that they're spending less, because I figured everyone would be just spending more than they were originally predicted to, right? I'm not sure it's that they didn't want to spend the $3 billion. I think it's the issue that there wasn't supply available to buy, or delivered in time for the quarter. That's what I was thinking. They're trying to do it. They just can't. There's not enough RAM in the world. I think it's a RAM problem. I mean, I just saw today Apple stopped selling the 96-gigabyte option of one of their ultra-PC computer minis. So you used to be able to get a 512, and then you couldn't get that anymore. You can only get 256. Now the biggest you can get for a, what is the thing called? It's not the Mini. It's the Studio. Thank you. Yeah. Stupid naming. The Studio is basically now maxed out at 96 gigs, so they can't get memory for them, which is just crazy to me. So hopefully that comes back later, but 96 seems low for a Studio. But that's just me.
There was UniFi, I saw; whatever the latest Dream Machine was has a memory upcharge. Like, here's your item, and then on the lines for taxes and shipping, it's like: memory surcharge. So where it used to be a tariff, it's been replaced with a memory surcharge. Nice. Well, that's good. Well, congratulations, Microsoft.
Sorry your stock didn't appreciate quite as much as it should have, but that is the way of analysts and what they do in their business. I am looking at their stock in the last week, since earnings was last week. They were at $424 on average, and now they're averaging around $410, so about a $12, $13 drop. They dropped more than that. They've come back about half of what they dropped initially, because on earnings day they were at 423, and then they dropped the next morning to 400, and now they've come back to about 410 to 413. So anyways, you know, not so great for them.
Amazon: AWS revenue reached $37.59 billion in Q1 2026, growing 28% year-over-year, which is its fastest growth rate in over three years. As we've been tracking it, it had been slowly going down, which just shows you how much spend is happening in Amazon. It came in above analyst expectations of 26% growth. Amazon's capital expenditures hit $44.2 billion in Q1, along with a full-year projection of $200 billion, primarily driven, of course, by AI infrastructure. Free cash flow dropped 95% year-over-year to $1.2 billion over the trailing 12 months, a direct consequence of AI investment levels, raising questions about when that spending translates to direct returns. Amazon has also formalized AI partnerships with OpenAI, Anthropic, and Meta, which signals continued infrastructure demand growth and suggests AWS capacity expansion will need to accelerate further to continue to support these relationships. Q2 revenue guidance came in at $194 to $199 billion, well above analyst expectations of $188.9 billion. The wide operating income range of $20 to $24 billion reflects uncertainty likely tied to tariff impacts and the variable AI spending timelines.
It's crazy to me that Amazon's entire revenue is Microsoft's AI revenue. Right? Like, whoa. But yeah, that's what you get for Office 365 and its reach.
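The free cash flow figures quoted above are enough to back out the prior-year number. As a rough check, assuming the 95% drop and the $1.2 billion trailing-twelve-month figure are both exactly as quoted (the function name is purely illustrative):

```python
def prior_value(current: float, pct_drop: float) -> float:
    """Back out the earlier value from the current value and a percentage drop."""
    if not 0 <= pct_drop < 100:
        raise ValueError("pct_drop must be in the range [0, 100)")
    # current = prior * (1 - drop), so prior = current / (1 - drop)
    return current / (1 - pct_drop / 100)


# Figures as quoted in the episode, in billions of dollars (assumed exact).
fcf_now_billion = 1.2
drop_pct = 95

fcf_prior_billion = prior_value(fcf_now_billion, drop_pct)
print(f"Implied prior-year free cash flow: ${fcf_prior_billion:.1f}B")
```

Because both inputs are rounded, the result is only an approximation of the prior-year figure.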
I mean, the free cash flow for them is interesting. You know, I know they're investing, but that's a massive drop in cash flow year over year, down to $1.2 billion. I mean, still a ton of money to have in cash. I imagine it's a combination of them making these big investments into Anthropic and others as well as the AI capital investments they're trying to make. But yeah, it goes down to $1.2 billion; I mean, a 95% drop, whether or not it's the right thing to do, had to be somewhat shocking to Wall Street on that one. But let's look at the tape and see what the stock did after the earnings here. So that means their previous year's cash flow was $24 billion, if I can quickly have Claude do math for me. Yep. So basically at earnings last week, they were $263 a share, and today they are at $273. So they've gone up, and they have trended up since this announcement, even though corporate cash flow was down. I think that's mostly based on the fact that AWS growth is 28%. That's been dragging on their stock for a bit, that it wasn't going up, and it was like 20% or 23%. So beating the consensus is a big deal. Analysts are now thinking that's probably a good deal, which is my guess as to why their stock improved dramatically.
And then finally, to round it out, Google Cloud posted $20.02 billion in Q1 2026 revenue, a 63% year-over-year increase, with enterprise AI solutions cited as the primary growth driver for the first time. The unit now carries a $460 billion backlog, signaling sustained demand well into future quarters. Sundar Pichai noted that Alphabet is compute-constrained in the near term, stating cloud revenue would have been higher if supply could have met demand. It's a notable signal for cloud customers who may be experiencing capacity limitations on GCP. We've been experiencing that since before AI.
Alphabet raised its 2026 capital expenditure guidance to $180 to $190 billion, with the CFO indicating 2027 CapEx will increase further. The $35.7 billion spent in Q1 alone on servers, data centers, and infrastructure reflects the scale of investments required to support AI workloads. Gemini Enterprise paid monthly active users grew 40% quarter over quarter, suggesting enterprise adoption of AI tooling on Google's platform is also accelerating at a meaningful pace. And Waymo surpassed 500,000 fully autonomous rides per week and has expanded to additional U.S. cities, along with a recent $16 billion fundraise at a $226 billion valuation, which is important because Alphabet owns the majority of the stock.
Yeah, it's crazy numbers for the data centers, you know, which we knew. It is funny because, you know, is it part of the bubble? Is it going to pop and then it's not going to be here? I go through these waves of, like, it's all going to burn tomorrow, or now this is just our new normal. But it's crazy. I don't know. Yeah. Alphabet releases Sundar Pichai's remarks separately. So we use CNBC to track these stocks so that we have kind of consistent reporting on these. But basically, there were some interesting things in Sundar Pichai's notes. So basically, Google Cloud revenue hitting $20 billion, which is 63% growth year-over-year. Backlog, as mentioned. But the gen AI model-based products are growing nearly 800% year-over-year. So that's pretty huge. They talked about the TPU for AI inference with 80% better performance per dollar than the prior generation, which is a big deal. But that CapEx is going to go to TPUs, I'm sure. And then they also talked about some of the things they announced last week at Next in his comments. But overall, Google's stock did very well. Analysts were very happy with Google. So happy, in fact, that their stock, once I find Alphabet, it's a weird one in my list: GOOG. All right, there it is. Basically, on earnings day, they were at $347.68.
The next day, they jumped to $369, a $20 increase overnight. And now they continue to go up, and today they closed at $383. So their stock has basically gone up $40 since earnings, based on all of this positive news. So congratulations, Google. You won the earnings. All right, let's move on to other exciting topics.
We have a cable corner today. As everyone knows, we love cables, undersea cables that is, and so anytime we find a good undersea cable story, we talk about it. There are two articles, basically about cutting cables though, which is not the best thing for the cables that we love. So first up, a crucial Taiwan undersea cable was severed by an old shipwreck, and so basically Taiwan had to go back to backup microwave communications. Dongyin Island lost its undersea cable connection after a seafloor shipwreck shifted during bad weather, prompting activation of backup microwave communications for the island's 1,500 residents. The incident reinforces a known reality: physical undersea cables remain the primary backbone for reliable, high-bandwidth connectivity, while alternatives like microwave links and LEO satellites serve only as degraded fallbacks. Taiwan now monitors 24 undersea cable links around the main island and has blacklisted 96 vessels suspected of connections to China. Nations are treating cables as a critical security perimeter rather than purely a commercial asset. Interesting. Yeah.
And leading into that is that China apparently tested a deep-sea electro-hydrostatic actuator that can cut undersea cables at a depth of 3,500 meters. So, shipwreck or China, you answer the question. Yeah. They apparently successfully tested a deep-sea electro-hydrostatic actuator at a depth of 3,500 meters, or roughly 11,500 feet. It represents a notable extension of previous capabilities that topped out at around 2,000 feet.
The device combines hydraulics, an electric motor, and a control unit into a single compact system that eliminates the need for external oil piping, making it more practical for deep-sea deployments from research vessels. Practical efficiency gains are measurable: a 2022 pipeline cut took five hours for a single 18-inch pipe, while a 2023 remotely operated vessel could cut 38-inch pipes in 20 minutes, illustrating rapid operational improvements. Undersea fiber optic cables carry the majority of global internet traffic and financial data, meaning any credible threat to this infrastructure has direct implications for cloud connectivity. So yeah, so bad. So China is definitely preparing; if they ever go to war with us or anybody else, they're going to cut all the cables. That's what I hear. There are the jokes about the Great Firewall of China, right? This is hardcore. It's a physical firewall at this point. Yeah. I mean, the best way to get any real connectivity is always layer one. Check to see if the cable is there. It's one of, I feel like, a lot of debugging that I always end up doing, at least at home and on side projects. So here, layer one, physical, is always going to be important. It's always going to be the fastest and most reliable. We have the other stuff, but it's latency and other issues that are going to arise. Well, let's hope no cable cutting happens anytime soon, especially with the Iran war happening. And let's keep those cables flowing and keep the shipwrecks away.
Linux 7.0 is now available, and is available to you on some distributions. This is not a milestone release, similar to when Torvalds jumped from 3.x to 4.x in 2015 to avoid unwieldy version strings. The biggest thing for 7.0 is that Rust support is now officially stable in the kernel after five years of incremental work, with native build tooling supporting x86, x64, ARM, and RISC-V architectures, which has direct implications for system security and memory safety.
The revamped scheduler introduces lazy preemption by default and adaptive scheduling domains, which should improve throughput for containerized cloud workloads and reduce latency on hybrid CPU architectures like Intel Alder Lake. AI tooling is now a recognized part of the Linux development workflow, with Torvalds and stable kernel maintainer Greg Kroah-Hartman both noting a notable improvement in the quality of AI-generated bug reports reaching the kernel team directly. Cloud and enterprise users can test 7.0 today through rolling release distros like Arch Linux and openSUSE, with Ubuntu 26.04 LTS and Fedora 44 expected to ship it within a few weeks. So you now get Linux kernel 7. Fancy. Yeah. Rust. Yeah, the Rust is a big thing. Because now you get Rust-compiled code, not just C, in the core parts of the kernel. This should be a huge improvement to availability, reliability, and potentially security as well, as long as that was handled well.
I think it's interesting, because I wouldn't see Torvalds as a guy who loves AI. I mean, I don't know him personally, and I've never talked to him, but just from the few things I see him write that come to my attention from the newsgroups and stuff like that, he seems like an old curmudgeon who would hate AI, but apparently not. Did he really say he loved AI? It's like he just cared that the bugs he was getting are better written, which is... I mean, that's something. He's crediting AI for writing the better bug reports, so I guess that's a win. Just proves that humans can't actually do anything. We can't even write our own bugs very well. Especially kernel bugs. I've read a few, and it's fascinating. I don't understand. There's a defect in the memory alloc, 4775734652 register, and you're like, you lost me. I know kind of what that means, but I don't know how you fix that or what to do with this information. I think I conceptually understand what we're talking about, but I don't understand it really.
So in a story that's kind of scary for global warming, just 11 data center campuses in the U.S. are linked to natural gas projects permitted to emit up to 129 million metric tons of greenhouse gases per year, which exceeds the annual emissions of countries like Morocco or Norway, even at half capacity. So that's crazy. Behind-the-meter power, where data centers generate their own electricity rather than drawing from the grid, has grown from 4 gigawatts in early 2024 to nearly 100 gigawatts in the U.S. development pipeline by early 2026, driven largely by grid connection delays and utility cost concerns. Unlike traditional grid-connected power plants that cycle down based on demand, data center power plants run at near-constant load, meaning actual emissions are likely to be much closer to permitted maximums than the industry-standard two-thirds reduction estimate customers often cite. Major AI companies, including Meta, Microsoft, OpenAI, and xAI, have made public carbon reduction commitments. The scale of these gas projects could offset years of stated emissions progress, with Meta's Ohio projects alone potentially erasing over 10% of its claimed four-year emissions reduction. Air permits do not guarantee construction. Permit shortages are a real constraint, and several high-profile projects like Fermi face leadership and financial instability. So the full emissions scenario may not materialize, but the trend towards fossil fuel-backed AI infrastructure raises long-term questions for cloud providers with sustainability commitments.
Yeah, that could be bad. Yeah. That's a lot of tons of greenhouse gases. My AI, or sorry, my sci-fi-fueled narrative in my head is like, oh, this is how the world ends. So, cool. Huge advancement in AI. Yeah, my youngest son actually is very anti-AI because it destroys all the water, that's what he tells me.
I'm like, well, yes, water was definitely a thing in many data centers, but most data centers now recycle water. Most of the new data centers recycle water. The new ones, yeah. The old ones don't. But the new ones, at least, and I assume the new ones are being built to run most of these AI workloads because of the power density requirements, I assume they're using recycled water plants. So, you know, he can now come to me with this argument: all the CO2 for the AI from natural gas generators. And I know, like, I think it was xAI's data center somewhere in the South. It has, like, apparently 12 huge generator trucks just spewing emissions, and the neighbors all hate it. And basically, they're only approved to have six, but they're running more, and there's a whole bunch of community concerns around data centers in the world as a whole. But this is a good reason. This is one I can understand.
The light pollution one is one I also very much understand. Because the light pollution thing is ridiculous. You never think about data centers being heavy on light, since a lot of it is just a giant warehouse space, right? So where's the light? The problem is the light outside, because they have perimeter security they have to maintain. And to have perimeter security, you have to have lighting. And then they're all using really bright LED lights. So there's an article about some poor couple, I think in Virginia, who live down the street from an Amazon data center. And it basically lights up their entire yard, just how much light comes out of that place. And again, my initial thought was like, well, they don't need the light. Then they turn around and they're like, no, you need the light for security. So it's like, ugh. It's how you have to pass those SOC and ISO audits, and you have to prove your security. And then they were talking about the constant buzzing from all the AC units, and so it's just a constant noise.
The buzzing and the noise, as I've heard, is a big issue for a lot of people because it's just there, it's always on, and it drives people crazy, I've been told. That's what I understand too. In my town where I live, they're zoning this area and, you know, while there's no plan to build a data center there because there's no power in our area... Apparently they use natural gas generators, but I hadn't thought about that. But, you know, basically they're like, well, technically it's a mixed-use zone, and so one of the uses could be data centers, and the community is just losing their minds about it. Like protests and going to city council meetings, and like, we can't have data centers in our backyard, and I was like, I don't want them here either. But, yeah. But my next question is how many of the people who are saying all that also, on the same point, say, I want to use AI or use a SaaS application? You're using it, you just don't want it next to you. These days, you don't really have a choice but to use AI. Google search, you're using AI. In so many products, AI is built in. It's built into everything. That's why I'm a big fan of on-device AI models. I'm hoping Apple ships some SLMs on the iPhone that you can use for some basic use cases and things. There's a lot of stuff that doesn't require a lot of AI compute power but can benefit from it.
Well, if you are here on the show, you should raise your hand if you've been hurt by GitHub in the last month. I definitely have, and my hands are raised. My hands are raised, too. You might have noticed that GitHub has had some bad availability. Some reports are saying that in the month of April, their availability was potentially as low as 85%, depending on how you calculate it. Of course, GitHub doesn't say it was that bad.
But, you know, there's been a trend that things have been bad for GitHub. So originally people were saying, well, it's because of the Microsoft transition, or all the AI features they're building, or it's vibe coding causing all these problems for GitHub. And then someone did the math and was like, actually, no, their availability started suffering from the moment Microsoft bought them. So, just saying. Well, then they also did the big push and said they had to move into Azure this year. Yeah, I can't imagine that's helping them. The timing seems awfully suspect. Well, I'm sure they're going to spin that as, like, well, by moving to Azure we're going to improve stability, because the data centers that we're in right now, which are the GitHub data centers, are aging, because we haven't invested in them, to force this decision, I'm sure.
But basically it's forced GitHub to write a blog post about their availability. GitHub's CTO published a transparency post acknowledging two recent incidents and outlining a scaling plan that has grown from a 10x capacity target in October 2025 to a 30x target by February 2026, driven by rapid growth in agentic development workflows since late 2025. So basically they're saying a lot of their scaling problems are because the capacity growth they need is 30x what they thought, because of AI-generated code and the amount of AI-generated code they're getting, which, okay. Yeah, I could understand that. There was a merge queue incident causing incorrect merge commits for squash merges in groups of more than one pull request, affecting 650 repositories and 2,092 pull requests, with no data loss but incorrect default branch states that could not all be repaired automatically. And the April 27th incident involved an Elasticsearch cluster becoming overloaded, likely from a botnet attack, which disrupted search-backed UI experiences across pull requests, issues, and projects.
GitHub acknowledged that this system had not yet been fully isolated as part of its reliability prioritization. And this is the one I knew about, because literally I had a message every time I went to a PR for days that my PR may not be complete due to a search issue. And I was like, well, that's not good. GitHub says they're addressing scaling challenges through several technical approaches, including moving webhooks out of MySQL, redesigning session caching, migrating performance-sensitive code from the Ruby monolith to Go, isolating critical services like Git and Actions, and pursuing a multi-cloud strategy beyond their current Azure migration. GitHub is updating its status page to include availability metrics and has committed to reporting both large and small incidents, responding to developer feedback about needing better transparency during disruptions. This also resulted in an article that, thank you, Matt, basically Mitchell Hashimoto, who is, you know, from HashiCorp. He started Hashi, basically. He retired from Hashi, you know, and it got bought by IBM, and he's basically been working on a product called Ghostty, a terminal emulator, and he basically said, I'm out of GitHub. I'm moving. I can't. It's been too unreliable and too unavailable for me to continue to remain on GitHub, and so he is moving his workloads elsewhere. Wow. I mean, I've definitely been bit by some of these, especially the search one, which was, like, multiple days, and you couldn't find anything. You couldn't just load up pull requests, because anytime you press the pull request button, it was a search, technically, because it's like, is status open or not? So every feature was just hung for a couple days. I wish I would have thought to blame a botnet attack for all of my Elasticsearch cluster problems. I mean, you were basically being DDoSed by the publishers. 
You built your own DDoSing. Yeah. Yeah. It is interesting because, you know, I guess it's a whole bunch of searches that make sense directly. There's no circuit breaker there between the UI layer and the search backend. That sucks. Not an easy recovery either. Where does the, I mean, did he say in the article where he's going to go to? Is it GitLab? Or like back to SourceForge? Oh, he says multiple. He's not specific about where he's going. Okay. So he's going to spread his risk, basically. Has anyone used Ghostty? I know when he announced it, I was like, I'm sort of interested, but I use Hyper as my terminal most days. And I'm pretty happy with it. So I don't have a lot of desire to try a new terminal. Yeah, I don't. I've been using iTerm2 for a decade, and I'm not going to change. I do like Hyper because I could write some really simple JavaScript plugins on top of it, which were pretty nice. That was how I found it. I don't know how well that's getting supported these days. I feel like it doesn't get a lot of updates, but it's a terminal. How many updates does it need? Right. Okay. So, yeah, hopefully GitHub fixes their issues and can improve things dramatically because it's going pretty bad for them right now. They're being mocked pretty mercilessly on Twitter and other social media networks about how bad their availability has been. It's going to be 85%. If that number is true, factoring in all these things, that's bad. I imagine that's the worst way you can calculate it because it sounds a little bit crazy. I think there's someone counting all of the Elasticsearch outage, even though most people were not impacted at some point. There's a very small handful of people who were impacted long-term. That's the kind of thing. I mean, Anthropic uptime is not great either. No. There's definitely some challenges. I think Claude AI says their uptime was 98.73% over the last 90 days. 
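For a sense of scale, here is a rough sketch of what those percentages mean in hours. It assumes the figures are simple time-based availability over the whole window, which, as noted above, may not be how either number was actually calculated. The function name is illustrative.

```python
def downtime_hours(availability_pct: float, window_days: int) -> float:
    """Hours of downtime implied by a time-based availability percentage."""
    total_hours = window_days * 24
    return total_hours * (1 - availability_pct / 100)


# The 85% figure quoted for April (30 days), taken at face value.
github_april = downtime_hours(85.0, 30)

# The 98.73% figure quoted over 90 days.
claude_90d = downtime_hours(98.73, 90)

# A conventional "three nines" target over 30 days, for comparison.
three_nines = downtime_hours(99.9, 30)

print(f"85% over 30 days    -> {github_american if False else github_april:.1f} hours down")
print(f"98.73% over 90 days -> {claude_90d:.1f} hours down")
print(f"99.9% over 30 days  -> {three_nines:.2f} hours down")
```

Taken literally, 85% over a month is several days of downtime, which is why the hosts suspect it is a worst-case way of counting partial outages.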
I think if you shorten that window it's probably pretty bad too, and if you look at the incidents on their status page, there are a lot of issues. It's a scaling challenge for any of them, so I get why it's an issue. I just love how everyone uses Statuspage and you can always tell. You're like, okay, this is the Atlassian status page, got it. You don't even have to think about it. You know, I never want to build a status page ever again. And, you know, I love that there's just this thing that's so plug and play. Yeah. I mean, it's definitely toil that I never want to do. I mean, there are some cheap ones out there, but there's definitely Atlassian's, which is statuspage.io, and then there's a couple others. And they aren't that expensive. Some of them are more expensive than others, depending on what features you want. But if you just want a simple status page, you can get one for like $30 a month. It's not bad. But, you know, when I first started at a company, we had a hand-built status page, and I was on call for incident command. And they were like, here's how you update the status page. You take a jump box into the data center. You go to the specific jump box that has access to the specific SQL node. Then you have to update this table by hand. And then once you do that, you have to go run a Jenkins job that then does a compilation of the static page and then publishes it to the website, all from this. And I was like, at 3 in the morning, I'm not doing that, because my brain cannot function to think through that process. And so I forced us to change to a different tool. Yeah. And I thank you for that. Because it's... You're welcome. When they explained that whole process, it wasn't with any kind of tone that was like, we're sorry, or this isn't a good idea, but we just kind of got used to it. It was like, this is normal, perfectly normal, everything's fine. It was like, I don't like this. Yeah. 
This stuff might have been your life, but I refuse to allow you to live in this any longer. Yeah, exactly. I always tell people things should be simple enough and well-documented enough that at 3 a.m. when I'm drunk, I should be able to figure out how to fix it quickly. And I've actually written that in the documentation that I share with the team. That's exactly it: drunk at 3 a.m., that's the test. I can tell you that if this requires more than two brain cells at 2 in the morning to figure out, it's not going to go well. I'm not your guy. So we're in this era of AI finding bugs and lots of really bad vulnerabilities going on, and then this Linux issue here, CVE-2026-31431, dubbed Copy Fail, is probably the worst I've seen yet. It's a local privilege escalation vulnerability affecting virtually all Linux distributions, and I mean all, allowing unprivileged users to gain root access with a single Python script that requires no modification across the distros. This exploit is particularly relevant to cloud environments because it can be used to break out of Kubernetes containers, compromise multi-tenant systems, and inject malicious code through CI/CD pipelines. The kernel patch exists across multiple versions, including 6.12.85, 6.6.137, and 5.15.204, and hopefully in Linux 7. But most Linux distributions had not incorporated those fixes at the time the exploit code was publicly released, leaving a substantial window of exposure. Confirmed vulnerable distributions included Ubuntu 22.04, Amazon Linux 2023, SUSE 15.6, and Debian 12, meaning cloud workloads running on major providers are directly at risk until patches are applied. 
The five-week gap between private disclosure and public exploit release, combined with slow distribution-level patching, highlights an ongoing coordination challenge in the Linux security ecosystem that cloud operators need to account for in their patch management process. So yeah, you do need to patch this as quickly as possible. It is bad. And, you know, I feel like we're in this world where we're dealing with a lot of really bad patches and issues, and I'm hoping that this is because of the age of AI, and, you know, researchers having new tools are able to find things easier, and this will be a bad year and then we'll be more secure from then on. That's my hope. I mean, the article specifically mentioned that a security researcher found this with an AI tool, so that's absolutely true, right? That is helping surface some of these things. But yeah, this one's scary just because anything like GitHub Actions or Jenkins or any publicly hosted sort of execution engine is very vulnerable. Why is your Jenkins server publicly executable, Ryan? How do you allow that? It's not, right? It's more like if you have CircleCI or some of these others. Yeah, I know. Just making fun of you here. Yeah, yeah, yeah. There is no GitHub or Jenkins Cloud, right? Even CloudBees isn't foolish enough to do that. I don't think so. It would be a really bad choice on their part. It really would be. I'm going to go look right now, though. DevOps for the cloud. Yeah, for all we know, some of these terrible tools use Jenkins underneath. So it looks like the CloudBees Unified platform is a cloud version, but it doesn't look anything like the Jenkins that you know and love. So I'm hoping that it's something that's a rewrite. I don't think the word love is correct, but I'll let you have it. All right. AI is how ML makes money this week. GitHub is going to be making a lot of money because they're going to start charging Copilot users based on their actual AI usage. 
They're shifting to this usage-based billing model starting June 1st, so you have no time to fix this problem, replacing the current flat premium request model with AI credits that map one-to-one to monthly subscription costs, with overages billed by token consumption across input, output, and cached tokens. The pricing variation is substantial depending on model choice, with OpenAI GPT output tokens ranging from $4.50 to $30 per million tokens, meaning a developer using GPT 5.5 for basic tasks could see a meaningfully higher cost than when using lighter models for simple completions. Basic features like code completion and next edit suggestions remain outside the credit system entirely, but Copilot code reviews will now consume GitHub Actions minutes, adding another cost dimension for teams running automated review workflows. This shift reflects the broader cloud infrastructure reality. Multi-hour autonomous coding sessions consume substantially more compute than a single chat query, and flat-rate pricing becomes difficult to sustain as agentic AI workloads grow in frequency and complexity. For development teams, the practical implication is that AI spending will now require the same cost governance as other cloud services, with model selection and session length becoming factors in budget planning rather than just feature preferences. And I'm sure you'll hear all about this in a few weeks in June at FinOps X. Everybody's going to have FinOps visibility for AI workloads, because this is probably the biggest gap in most of the platforms we're seeing: cost visibility is very problematic, and who is using what, et cetera, is a big issue. And it's so extreme, right? It's not like EC2 optimization, which we chased down for years, right? That was optimizing pennies, but over those long-term things you could add up to real savings. This is, like, real quick you can have a bill. And I've been using Copilot largely because the premium request model allows much more freedom. 
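To make the billing shift concrete, here is a minimal sketch of how a per-token bill with included credits might be estimated. Everything in it is an assumption for illustration: the model names, the per-million prices, the session sizes, and the included-credit amount are placeholders, not GitHub's actual rate card or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical USD prices per one million tokens."""
    input_per_m: float
    cached_per_m: float
    output_per_m: float


# Placeholder price table; not real pricing.
PRICES = {
    "heavy-model": ModelPrice(input_per_m=5.00, cached_per_m=0.50, output_per_m=30.00),
    "light-model": ModelPrice(input_per_m=0.75, cached_per_m=0.075, output_per_m=4.50),
}


def session_cost(model: str, input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Cost of one session: each token class is billed at its own rate."""
    p = PRICES[model]
    total = (
        input_tokens * p.input_per_m
        + cached_tokens * p.cached_per_m
        + output_tokens * p.output_per_m
    )
    return total / 1_000_000


def overage(total_cost: float, included_credits: float) -> float:
    """Amount billed beyond the credits included with the subscription."""
    return max(0.0, total_cost - included_credits)


# One long agentic session, priced on each hypothetical model.
heavy = session_cost("heavy-model", 2_000_000, 1_000_000, 200_000)
light = session_cost("light-model", 2_000_000, 1_000_000, 200_000)

# Twenty such sessions in a month against a hypothetical $39 of included credits.
monthly_overage = overage(20 * heavy, included_credits=39.0)

print(f"heavy session: ${heavy:.2f}, light session: ${light:.2f}")
print(f"monthly overage on heavy model: ${monthly_overage:.2f}")
```

The point of the sketch is the shape of the calculation: the same session costs several times more on the heavier model, so model selection becomes a budgeting decision rather than a preference.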
And so like this will annoy, this is going to make me switch to another one, right? Like depending on, you know, what the, how it averages out. So I hope they actually build that appropriately. But I hate running out of quota. And not only that, but you have to work on governance models. You're like, who's going to be in charge of the quota management? who has approvals as a manager approval thing. It's becoming a very complex problem very quickly and I'm in the thick of it right now. It's nice though not to be the person who's in charge of spending all the money because when I own Cloud the CFO just yelled at me. Now I'm like, I don't own this. This is not me. And I've been very clear from day one. It didn't make the mistakes of Cloud. No, no. This specific person spent this money. Go talk to their boss. It's much more pleasurable than in the cloud where it's all a bit opaque. Like, oh, well, this user spun up an API then this thing that he spun up costs a lot of money. Now this is a direct thing that person did. A little easier, but it's not always easy going through. It is not. Like, Bedrock just added the visibility per user and you still don't get it at a lot of the other sort of model. Vertex still has a big gap there as well. Yeah, huge. Oh, sorry. Vertex is no more. Agent Platform. Oh, yeah. Can you change your nomenclature overnight? I know. It took me forever to start calling it Vertex. To the same name? SageMaker 2.0. The same name as every other Gemini product? They've got everyone hooked now. All these companies have everyone hooked. And they want to keep, you know, people are wanting to use it. Everyone, how many of our listeners, I'm sure, have been told, you have to use AI for X percentage of your job, et cetera, et cetera. 
And it's been so subsidized that in this case, it's not going to be subsidized anymore you're going to start to get bills you're going to have to teach your finance team or someone random hey, while you're using this for this, don't use Opus for everything go use Sonnet for these things go use Haiku for these simple tasks teaching is going to be a whole other learning adventure for your developers also, particularly here which I'm sure everyone's developers have been told AI, AI, AI you have to be using it, if you're not using it how are you getting anywhere now you're actually going to start paying that bill and I think you're going to see some sort of decline because Ryan said and I know I do I abuse that premium tier as much as I could right now and soon I will not and there's so many cases where you don't have really full control over the tokens going in or coming out like doing a code review on a code base like do a code review just on this portion of my code base Sure, but is that going to be as meaningful as something that can trace calls all the way through and realize that calling it this way is going to have an impact with the module elsewhere? So it's kind of annoying. Well, and it's expensive, too. I turned on Claude code reviews for my personal project, and I turned it off within a day because it burned every code. I think we talked about it on the show when they first announced it. Every code review is going to cost like $30. Now, the code review is super thorough. and if I was an enterprise I'd probably be interested in this but as my personal development I was not interested in what I was getting. But it's weird because on the CloudPod we have CloudCode for pull requests as well but it's the old version and the old version is good enough for what I need for the CloudPod I'm like how do I get that old version though on my other project? 
I can't figure out how to do that so I've actually been playing with other tools like CodeRabbit which are like 30 bucks a month for CodeRabbit and I'm like well that's worth it to me that's just as good as what I see with CloudCode and it does some very similar things. But again, it's the full system context we don't all have, which is what you really need. It is funny to me how many dumb bugs that AI makes. Yes. AI then catches itself making. Even as the same model, you're like, huh, that's so weird. Okay. It goes back to our prior conversation too, which is like, okay, we're finding a lot of bugs because we're leveraging AI now that you're not. And we're producing so much more code. If you look at that GitHub chart that they had of like the number of code lines committed and everything else for producing more code than ever before and people continue on that pace but we're not reviewing it and we're not getting that second I'm going to say quote unquote eyes on it even if it's another AI bot reviewing it are we just going to be adding more bugs or is AI going to be producing better code but based on what I've seen no you know and kind of you'll see that drop off so I'm curious to see where that on the spectrum everything falls over the next couple of years. I assume if costs go up, they'll kind of drop proportionally to one another just because you'll generate less code and therefore you'll review less code. But we'll see. Depends if you have your security department budget to go use the expensive AI tools. Yeah. That's, yeah. I mean, it's true. I mean, when I look at my GitHub profile, like year over year since AI's release, like the amount of code lines that I've contributed is hundreds of thousands more year over year than previous. And I'm not a developer, right? This is sort of tangential to my day job. So it's kind of crazy how much it empowers, but it's expensive. And as we talked about earlier, it's killing the environment. 
So it's like we got to balance all that and figure out what it is. Yeah. But until then, we're going to all keep losing our jobs to AI, allegedly. Right. And not the slowing economy due to the Iran war, tariffs and everything else, but we're going to continue to blame AI. Well, yeah, and we'll use AI and then we'll forget how to do our jobs, which is fine. And then so that's, you know, it's all downhill. Yeah. There are a lot of cloud cost management tools out there, but only Archera provides insured commitments. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one or three year AWS savings plan with a commitment as short as 30 days. If you do not use all the cloud resources you've committed to, Archera will literally cover the difference. Other cost management tools may say they offer insured commitments. But remember to ask, will you actually give me my rebate? Archera will. Check out thecloudpod.net slash Archera to schedule a demo today. One of the core issues that we haven't talked about, that's depressing about AI, is of course identity, and AI agent identity. There was a good article here from Snowflake, and, you know, they have a solution, but I think it's more interesting to talk about AI governance in general. The core issue Snowflake raises is that AI agents lack persistent, verifiable identities, meaning when an agent queries data, initiates a transaction, or produces a derived insight, there's often no audit trail linking the action to a defined authorization or scope. Snowflake argues governance must be embedded at agent creation, not added later, with explicit permissions, expiration windows, and scoped access that does not simply inherit from the invoking user's credentials. A notable technical concern is the derived insight problem, where an agent authorized to access HR data and financial data separately may not be authorized to combine them, and current access controls on source data alone do not address this boundary. 
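As a rough illustration of the ideas in that article, here is a minimal sketch of an agent identity that carries its own scopes, an expiration, and a rule against combining certain sources, with every decision written to an audit log. This is not Snowflake's implementation; the class names, scope strings, and the `forbidden_combinations` field are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical identity assigned to an agent at creation time."""
    agent_id: str
    scopes: frozenset                 # data sources the agent may read on its own
    expires_at: datetime              # identity stops working after this instant
    forbidden_combinations: frozenset = frozenset()  # sets of scopes that may not be joined


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)


def authorize(agent: AgentIdentity, requested, now: datetime, log: AuditLog) -> bool:
    """Decide whether one action touching `requested` scopes is allowed, and record why."""
    requested = frozenset(requested)
    if now >= agent.expires_at:
        reason = "expired"
    elif not requested <= agent.scopes:
        reason = "out_of_scope"
    elif any(combo <= requested for combo in agent.forbidden_combinations):
        # The derived insight problem: each source is allowed alone, but not together.
        reason = "forbidden_combination"
    else:
        reason = "ok"
    log.entries.append((now.isoformat(), agent.agent_id, tuple(sorted(requested)), reason))
    return reason == "ok"


created = datetime(2026, 5, 1, tzinfo=timezone.utc)
agent = AgentIdentity(
    agent_id="gtm-assistant-001",
    scopes=frozenset({"hr", "finance"}),
    expires_at=created + timedelta(days=30),
    forbidden_combinations=frozenset({frozenset({"hr", "finance"})}),
)
log = AuditLog()

hr_only = authorize(agent, {"hr"}, created, log)
combined = authorize(agent, {"hr", "finance"}, created, log)
after_expiry = authorize(agent, {"hr"}, created + timedelta(days=31), log)
```

The design choice worth noticing is that the permissions live on the agent's own identity rather than being inherited from whoever invoked it, and that denials are logged with a reason, which is the audit trail the article says is usually missing.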
Snowflake's internal go-to-market AI assistant serves as a practical reference point, using role-based access, certified queries, and defined scope at creation to support over 6,000 employees asking 35,000 questions per week with full auditability. For enterprises in regulated industries like financial services or healthcare, the absence of agent identity infrastructure creates concrete compliance exposure. So everyone's trying to solve this. Okta has a solution for this. I see some of the other players, like Delinea, trying to do things in this space as well. So there's lots of people trying to solve this, and there's really no right answer that seems perfect for all use cases. You can't implement a solution without having a specific set of technologies that you're combining together as a platform, or like a big purpose-built tool that you're spending all the money on to leverage it. It's really tricky to do. Like, how do you gate all the agents, from someone running Claude on their desktop, to the application that's running in the cloud, to the chatbot that's on a website? It's sort of tricky to gate all of those things and put protections on all of those things. And, you know, I'm waiting for the next, you know, like MCP or agent-to-agent protocol, but for auth, so that it can all be a common element and you can leverage centralized tooling to sort of govern identities for AI agents in multiple places. Like, this solution works great for Snowflake. Gemini has released Agent Identity that works great on Gemini Enterprise and within Azure, but can I manage those the same way together? You know, not really. That's tricky. Well, we'll keep an eye on this, and when Ryan solves this problem for all of us, let us know what it is. We'll come back and talk about it some more. Yeah. 
For those of you who are big Claude Code users, and you follow any of the subreddits for Claude Code, you'll know that there's a lot of people who always complain about basically Claude Code getting dumb over time. And this is typically being caused by the massive amount of change that Anthropic is making to Claude Code. And so they basically have kind of killed their own credibility. In fact, they've seen Codex downloads increase like 4x in the last two weeks just due to some of the Claude Code things. And so, it's interesting because... I'm one of them. Anthropic confirmed three product-level issues degraded Claude Code performance over seven weeks starting March 4th, including a reasoning effort downgrade from high to medium, a bug discarding reasoning history mid-session, and a system prompt capping responses to 25 words between tool calls. That's an issue. The issues were fixed as of April 20th and Anthropic published a post-mortem, but the seven-week gap between the first issue shipping and any public information led to significant user backlash, subscription cancellations, and speculation across GitHub, Reddit, and X. A notable analysis by AMD's Senior Director of AI examined 6,852 Claude Code session files and 234,760 tool calls, concluding that Claude shifted from a context-gathering approach to a faster edit-first style that increased error rates on complex engineering tasks. This highlights a practical risk for teams building workflows on top of AI coding tools: undocumented behavioral changes cascade into downstream systems and delivery commitments, and erode trust before any official acknowledgement arrives. And that's why I don't update Claude Code automatically. Like, I wait and go read what people are saying about it before I do any of those upgrades, just because it has had an impact and has had problems in the past. 
So, we are hoping to see Claude and Anthropik basically become more transparent about these issues and hopefully address them quicker waiting seven weeks because it really hurt them in the eye of public markets. I also call BS. I've had issues much later than April 20th and it always seems to come up right around the time when they're releasing a new model. No, that's their whole infrastructure crashing down the scenes. But it's them tuning it to deal with the load for sure. And so there are things where you're requesting a certain level of reasoning and they're like, no, I'm going to shift you through two days or two layers down. There's nothing you can do about it. It's completely behind the scenes, except for you get this answer that doesn't make any sense with all kinds of hallucinations. I mean, I think part of it is the, you know, you had to send out these new models, you know, demands are really high. So you start kind of chipping away capacity from your other ones to reallocate them because there's only so much finite capacity out there. But I think it's interesting they haven't released a new Haiku model since October of last year. They've released a new Sonnet and a new Opus. Two Opuses since then, right? Yeah, two Opuses. That's true. And I think it's just because they're like, well, if we give another Haiku model, then it's even more capacity we have to deal with. And so I imagine it's a challenge of how do you scale something this large at this frequency and not piss people off in the process? But I think transparency goes a long way in this. And I think if they were more transparent, of course they're not going to tell you, hey, we're going to release a new model next week, so we're pulling capacity. But other issues like bugs or acknowledging bugs would be a big part of it. And so that would be helpful. Yeah. This is interesting because SaaS companies, as the world took on more SaaS, you did get a lot of that transparency. 
And you got companies like Amazon talking about how they deal with big spikes in traffic and a lot of transparency. And it seems like we're going away from that. And I wonder if that's just, because it's not really a SaaS business. I mean, Anthropic did have a public postmortem about this, but again... It was like three weeks later. Yeah, it was like three weeks afterwards. And it didn't really say... And it's gaslighting me. Like, it's not true. Right. Yeah. Like, there's no way you can tell me that, oh, no, we fixed it all on the 20th. No. No, you're not. I feel like a lot of the large companies and the massive companies do public RCA's and postmortems in that way. A lot of the medium to small companies dole. because I've had vendors of mine that just go down and you're like, what happened? And they're like, I'm like, well, I need an RCA because you went down this the third time in two weeks and they're like, it was a bug. We fixed it. I'm like, that's not a post-mortem. Give me any real information. They're like, no, we don't do that. I'm like, okay. I mean, I guess Anthropics is not that big, but it's kind of crazy. They're bigger than you think they are. They have 5,000 employees, allegedly. Okay. I was thinking of the cloud flares, those level, but I guess it would be a doubt. Even Amazon has gone away from the really good post-mortems they used to do. I feel like the reason ones have been light on details and a little bit more, they blame the user, which was something they would not have done in the past. Well, the divorce is official, official, I think, finally. OpenAI and Microsoft have amended their partnership agreement to make Microsoft's license to OpenAI's IP and models non-exclusive, allowing OpenAI to offer its models for major cloud providers beyond Azure. 
Azure retains the designation of primary cloud partner through 2032, but that status is conditional on Microsoft's ability to continue honoring the arrangement, which introduces some ambiguity worth watching. The revenue share structure changes notably. OpenAI will continue paying Microsoft 20% of revenue, but that obligation is now capped at an unspecified amount and only guaranteed through 2030 rather than running indefinitely. The removal of the AGI clause is a meaningful structural change, as the revenue share is now independent of OpenAI's technology progress, eliminating the previously contentious trigger that could have ended exclusivity based on a hard-to-define benchmark. For developers and businesses, this opens the door to accessing OpenAI models through providers like AWS or Google Cloud, which we'll talk about shortly, which could affect pricing, latency options, and procurement decisions depending on where the workloads already live. I feel like this contract was either written so long ago that the concepts we're running into didn't exist, or it was a really bad job negotiating it. Contracts should have details and metrics and very defined things, but maybe it wasn't possible back then. I know. How do you even have this conversation? I'm like, okay, Ryan, I have a new technology that you have never seen before that I think is going to revolutionize the world. And I'm going to need hundreds of millions of dollars of compute capacity from you that I cannot pay for. Would you invest? No. Exactly. Yeah. So Microsoft had all the cards. And so they were able to find, you know, Microsoft was willing to try this experiment with the right to get digital revenue things. Now, this is all pre-ChatGPT. This is all back when they were doing GPT-1 and GPT-2 and no one saw those products, and they were going out trying to find this. And so then finally, ChatGPT becomes the secret app that unlocks the potential of all of this stuff. 
And now all of a sudden, they're making money hand over fist and all of a sudden the deal doesn't look so great. But now also, you can't meet my demands either, Microsoft. So now we have a different problem. Not only has the business changed, we now know it's profitable, and we know we have something, and you've screwed us over, but now you can't actually produce what we need. That's the problem. That's the change. So all the leverage shifted and that's why they're able to get this done. Yep. And that's why it's been slow, right? They had to unwind it. Right. But, like, I feel like there's so much ambiguity in there. Maybe it's just me. I mean, the AGI thing was super ambiguous. At least they got rid of that. Because it was like, what's AGI? I don't know. We'll know it when we see it. Oh, good. I love that in legalese. We'll know it when we see it. Yeah. When the AI gets smart enough to nuke us from orbit, but then we won't really be worried about that clause. Yeah, we won't care anymore. I think we talked briefly about Meta releasing MuseSpark, which is their new proprietary cloud-only LLM built from scratch on new infrastructure and architecture. And when we talked about it, we didn't really know what the future of Llama was. But apparently now, Meta has confirmed Llama is dead. MuseSpark offers no downloadable weights, no self-hosted capability, and is currently limited to private API preview access. Existing Llama models will remain available on major cloud providers but are expected to receive only incremental maintenance updates with no continued frontier-level investment. That affects a substantial user base: Meta reported 1.2 billion Llama downloads before the pivot. There is no migration path from Llama to Muse due to fundamentally different deployment models, and switching to alternative providers requires rewriting vendor-specific APIs. Developers looking to stay in an open-weight ecosystem have three practical options. 
Continue using existing Llama models, knowing that they will fall behind frontier competitors. Switch to alternatives like Mistral, DeepSeek, or Alibaba's Qwen. Or migrate to proprietary APIs from OpenAI, Google, Anthropic, or Meta. Yeah, this long ago seemed to fill a large gap, right? And so, I mean, Qwen I see a lot of, but then, you know, I don't see Mistral very much. And so it's kind of nuts for local stuff. You know, if you don't want to pay huge amounts of money and you want something that's a little bit more open source, it sucks if there's not a real option that really can replicate what you're experiencing with, like, a commercial-grade one. Yeah, I haven't really heard much of Mistral at all. Cohere is kind of out there too as another option, potentially. But DeepSeek and Alibaba and Kimi, all Chinese, are all highly popular and very successful and way cheaper than any of the OpenAI APIs. It's one of these where I'm sure Meta felt they were under pressure that they weren't monetizing Llama. And so because they weren't monetizing it, they were getting punished on their stock price. But the fact of the matter is, if they had built a better ecosystem around Llama that people could take advantage of to customize and tune Llama into different things, they could have built probably a pretty successful business around their open-weight ecosystem. But Zuckerberg's just not that guy. It's not his style. And so I think it was always sort of a weird choice that they went open first. And, you know, I hope maybe someday they'll come back and rethink this, but I doubt it. And without any of the traditional stuff, right? Like, you know, there's plenty of companies like Red Hat that have made money on the open-source models of things, right? 
But so, nope. Yep. So rest in peace, Llama, we barely knew you. You know, hopefully Gemma 4 and some of the others, those models get some love from Google, etc. I would love to see OpenAI also release... I think they actually maybe have an open-source model. Really? I think. Really? I don't know if it's open source, open code, you know, they probably define these a little weird. They have an open model by OpenAI, the GPT OSS model. I have seen that around. Yeah, the O4. The OSS 20B GPT stuff. So they exist, but again, I don't know how open those are versus how Llama was open. I don't think I can take a GPT OSS model and go create a new model out of it like you could with Llama. And then speaking of OpenAI, they released GPT-5.5 Instant as the new default model for all ChatGPT users, replacing GPT-5.3 Instant, and it is also available in the API as chat-latest. Paid users retain access to GPT-5.3 Instant for three months before it's retired. The hallucination reduction numbers are worth noting, with GPT-5.5 Instant producing 52.5% fewer hallucinated claims than GPT-5.3 Instant on high-stakes prompts in medicine, law, and finance, and reducing inaccurate claims by 37.3% on conversations flagged for factual errors. The model includes improvements in visual reasoning, math, STEM questions, and smarter decisions about when to invoke web searches, making it more capable across the kind of tasks everyday users actually run into. Personalization gets a notable upgrade with faster retrieval from past chats, uploaded files, and connected Gmail, using a new memory source feature that shows users exactly what context shaped their response and lets them delete or correct it. For developers and businesses, the API availability as chat-latest means these factuality and personalization improvements roll in automatically, though teams relying on consistent behavior may want to pin to a specific model version, given the default is now changing.
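To make the pin-versus-latest trade-off concrete, here is a minimal sketch of how a team might keep that choice in configuration rather than hard-coded. The model identifiers and environment variable names below are hypothetical placeholders, not confirmed API names; the point is only that the floating alias is an explicit opt-in and the pinned version is the default.

```python
import os

# Hypothetical identifiers for illustration only; check the provider's
# model list for the real names before using anything like this.
FLOATING_ALIAS = "chat-latest"        # follows whatever the current default is
PINNED_SNAPSHOT = "gpt-5.3-instant"   # a specific version you have QA-tested


def choose_model(env: dict) -> str:
    """Return the pinned model unless the deployment explicitly opts into the alias."""
    opt_in = env.get("USE_FLOATING_MODEL", "").lower() in {"1", "true", "yes"}
    if opt_in:
        return FLOATING_ALIAS
    # Allow overriding the pin without a code change, e.g. when migrating.
    return env.get("PINNED_MODEL", PINNED_SNAPSHOT)


if __name__ == "__main__":
    print(choose_model(dict(os.environ)))
```

With this shape, moving to a newer version is a one-line config change you can roll out after testing, and the retirement window on the old version becomes a deadline for that testing rather than a surprise.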
Although it will go away in three months, so pin, who cares? Yeah, that's difficult to adjust to. I don't know. At most businesses I've worked at, three months isn't a whole lot of time to pivot to something new. I don't know what kind of changes it would take to go from 5.3 to 5.5. I almost wonder if they're trying to make it just like, you know, oh, I can't think of a package. You know, whatever open-source Linux package, 1.1 to 1.2 to 1.3, just a standard software development library that you're upgrading. And 99% of the time you can just upgrade to whatever the default is; as long as your application code seems to work, you don't care. I wonder if that's where OpenAI is trying to take it by just changing the default over to this thing. You're just always using latest, was it SCA or SAST, whatever, and your dependency analysis just automatically updates to the latest version of it as it goes. Yeah, I mean, that part makes sense. But I can't remember which one, if it was 4.0 to 4.1, but it was one of them where it changed the entire interaction and people were really upset, like, my friend that I've been chatting to for six months just went away, you know, became lobotomized. I don't remember which versions, right? But it's one of those things where, for companies who are using these for customer service or interaction like that, I can see the need to want to control that a little bit more. But I don't know what it takes to update all your prompts to, like, be less of a jerk, you know, be less friendly, tuning all that stuff when you're embedding that into your own products.
Yeah, I think it's less of that and more just QA testing it, you know, and making sure that you've worked it out and adjusted the prompts accordingly. Because a lot of companies, you know, from doing security questionnaires and whatnot that touch that world of AI, they go, how are you testing? How are you validating the results are accurate? How are you working through the full system of it? And that's what takes the time. Yeah. And everyone's making up answers there because there's not a real good way. Yeah. We asked the same question multiple times. And so we said the same thing. So we won't. You know, like. Well, let's move on to Amazon, as we're already an hour into this. Jeebus. I believe there's not a ton of articles here, but last week we talked about Quick and we were very confused. It just happened that we had gotten some Quick articles on Monday, and then Tuesday, after recording cutoff, is really when all the news dropped on this. So basically they took a product called Quick that they used to have, and they killed that product, and they created a new product called Quick. What we were very confused about was that there was still part of QuickSight in it, which is the authentication layer, which is still the case, and which I think is basically Amazon's public Cognito instance through QuickSight. It's the no-AWS-account sign-up model. Yeah. And so basically, Amazon Quick is an AI system that connects you to your apps, tools, and data to answer questions and take actions on your behalf, including scheduling meetings, sending emails, and following up on tasks, with role-specific workflows for sales, marketing, finance, and operations. The new free plan lets users sign up in minutes using a personal email or using Google, Apple, GitHub, or Amazon credentials with no AWS account required, lowering the barrier to entry compared to most AWS services.
The personal knowledge graph feature is notable because it learns individual user priorities and preferences over time, grounding responses in real business data rather than generic AI outputs. Pricing tiers include Free, Plus, Professional, and Enterprise plans, with higher tiers adding agentic and business intelligence capabilities, enterprise governance, and unlimited user support. Pricing details are available over at aws.amazon.com/quick/pricing. The no-AWS-account signup, which is the QuickSight thing, positions Quick as a standalone SaaS product rather than a traditional AWS service, which is a meaningful shift in how AWS is packaging and distributing AI tooling for business users. On pricing, the free tier is $0, which allows you to chat about any topic or task, research anything in depth, automate repetitive stuff, turn ideas into real apps, turn conversations into deliverables, and connect to tools like Slack, Microsoft, Google Workspace, QuickBooks, and more. For $20 a month on an annual contract or $25 per month billed monthly, you get everything in Free, plus Quick on your desktop, which is a proactive AI across email, messaging, and local files; shared spaces for your team with knowledge, agents, and automations that compound across people; Quick where you work, with browser and Microsoft 365 extensions; and scale when you're ready, with user management, centralized billing, and up to 300 users. And then the Professional level is $20 per user per month plus a $250 infrastructure fee per organization, which adds growth without limits, enterprise governance, RBAC, SSO, data sovereignty, and admin controls, dashboards and data visualizations that surface what matters, automation of complex processes, support when you need it, and 25 gigabytes of pooled storage per user. And then for $40 per user per month, you get all that, plus author dashboards your way, certify and publish assets, and 50 gigs of pooled storage per user.
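Since the Professional tier mixes a per-user price with a flat per-organization fee, a quick back-of-the-envelope calculation helps show where the fee matters. This is a rough sketch using only the figures quoted above; it assumes the $250 fee is charged monthly and leaves any organization fee on the $40 tier out, since neither point was stated.

```python
def monthly_cost(users: int, per_user_usd: int, org_fee_usd: int = 0) -> int:
    """Monthly bill in whole dollars: per-seat price times seats, plus any flat org fee."""
    if users < 0:
        raise ValueError("users must be non-negative")
    return users * per_user_usd + org_fee_usd


# Figures as quoted in the episode; the monthly cadence of the fee is an assumption.
PROFESSIONAL_PER_USER = 20
PROFESSIONAL_ORG_FEE = 250
TOP_TIER_PER_USER = 40  # any org fee on this tier was not stated, so none is added

if __name__ == "__main__":
    for seats in (10, 50, 200):
        pro = monthly_cost(seats, PROFESSIONAL_PER_USER, PROFESSIONAL_ORG_FEE)
        top = monthly_cost(seats, TOP_TIER_PER_USER)
        print(f"{seats} seats: Professional ${pro}/mo, $40 tier ${top}/mo before any org fee")
```

At 10 seats the flat fee alone adds $25 per seat on Professional, which is why small teams feel it far more than a 200-seat organization does.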
I was just doing a query on Nova about, you know, what is the difference between Amazon Nova and Quick, just because I wanted to get it. And it failed, like you'd expect. So, yeah. It talks about, like, the Nova Launcher on Android phones. I don't think... Well, ironically, Amazon Quick is powered by Claude. Because if you ask Quick, it says, I am powered by Claude, made by Anthropic. So it does not use Nova. What? I wonder... That has to be some sort of limitation they built into Nova accidentally, right? Because... I think Nova needs a major update. So yeah, there's an Amazon Quick desktop application as part of the paid tiers. Right now it's available for free because it's in preview. So play with it now to see if you want it later. They do support documents and visual creations in chat. So it can create Word documents, PDFs, PowerPoint presentations, and Excel spreadsheets. I have not tried any of those features yet, but I am sort of intrigued to try them. And it also integrates into Google Workspace, Zoom, Airtable, and many other SaaS applications through their connectors that are out there. So there you go. That's what Quick is. So now we understand it. So it's just not that. It's just not that. It was kind of what I said last week. I was just confused with the QuickSight part of it. I was like, I don't know exactly how it ties into QuickSight, but it's a desktop app. It looks like ChatGPT. And I installed it last week, launched it for the first time today, and it had basically 30 releases since last week when I opened this product. Wow. It's rapidly being evolved as we speak. They have their CI/CD workflow down properly. Yeah. Amazon announced a couple of new things for Connect. Connect, of course, is their call center software.
First up is Connect Decisions, which is now generally available as an AI-driven supply chain planning tool, combining demand forecasting, constraint-aware supply planning, and automated exception triage into a single solution targeting retail, CPG, automotive, and industrial manufacturing sectors. The service positions itself as an overlay on existing systems rather than a replacement, which lowers the adoption barrier for enterprises that have already invested heavily in ERP or legacy supply chain infrastructure. That's nice, I guess, if you're into that. The one that's more interesting to me is Amazon Connect Talent, which extends the existing Connect contact center platform into the hiring space, using AI agents to conduct structured voice interviews and score candidates consistently, which reduces recruiter workload during high-volume hiring periods. The system draws on Amazon's internal hiring practices to power adaptive questioning and science-backed assessments, aiming to bring more consistency to candidate evaluation compared to traditional recruiter-led screening calls. Preview capabilities include ATS integrations, a mobile-first candidate portal, and the ability to evaluate hundreds of candidates simultaneously, making it relevant for organizations that experience seasonal or surge-based hiring needs like retail, logistics, or call centers. It's available only in US East (N. Virginia) and US West, with no public pricing announced yet for the preview period. Organizations interested in cost modeling will need to request access through the Amazon Connect Talent page to get details. One practical consideration is the regulatory and bias-risk landscape around AI-led hiring tools, and the fact that if you make me do an AI hiring tool, I probably will not continue on in the interview process. It sounds terrible. It's already so bad. I've done a few of these. Yeah. Have you done the AI interview?
Because I've just seen a lot of, like, AI evaluation of resumes and seen the output of that. They're not good. No. Okay. So in the past, I actually had a client I used to work with that did this. It's one of those things where I sort of understood what they did in 2018, and now I fully understand what a multi-modal model is, which I did not fully understand back then. They did kind of games, and their target was more like UPS drivers, FedEx drivers. You know, you're looking at pools of people, and then you were scored based on how your results in those games compared with other people in that same position. So you had to have a certain number of people in the positions and whatnot. But I've done a few of these recently as, you know, one of those things you always do in life is you should always be looking at new jobs and everything else. And as I looked at this, it was interesting to do one. I did one voice call mainly out of curiosity about how not fun it is. And it's like, tell me your name, and you tell it, and there's just this pause. The real-timeness of it isn't quite there yet. So it's almost awkward still, because it's not in real time and it's laggy and it's clunky. So maybe Connect now has it, because that was probably about six months ago that I did that. But it was still an interesting thing to do. I think if you ever actually make me do it as an initial phone call, I might just say, thank you, I'll try again later. My interview is from someone in space. What's going on? Yeah. All right, next up is OpenAI models, Codex, and managed agents have come to AWS. They are basically expanding their partnership to bring OpenAI models, including GPT 4.5, to Amazon Bedrock in limited preview, giving enterprises a path to use OpenAI capabilities within existing AWS security controls, identity systems, and procurement workflows.
Codex, OpenAI's coding agent used by over 4 million people weekly, can now be configured to run on Amazon Bedrock as the model provider, meaning usage counts towards AWS cloud spending commitments and customer data stays within Bedrock infrastructure. Initial integrations include the Codex CLI, the desktop app, and the VS Code extension. Amazon Bedrock Managed Agents powered by OpenAI is a new offering that handles orchestration, tool use, and governance for multi-step agentic workflows, reducing the infrastructure work required to move agents from prototype to production. All the capabilities are launching today in limited preview, so availability is not yet general, and pricing details have not yet been publicly disclosed beyond the note that Codex usage can apply towards existing AWS cloud commitments. So, if you have a burn-down you need to do on your cloud bill, this might be a great way to do that. That won't take a lot. It won't take long, I'm sure. Yeah. I mean, I'm starting to like this model more and more just because it's, you know, something that a lot of enterprises already have, which is a cloud ecosystem. And especially with Amazon and Bedrock, them releasing the visualization of the IAM identities behind some of the usage on Bedrock is super powerful. So I kind of like it. This one sounds like it's a little bit more full-featured than what I've seen on similar things from Vertex AI, with managed agents and being able to orchestrate multiple Codex things. So it's kind of neat. AWS is adding a visual configuration editor for the CloudWatch agent directly in the EC2 console, letting users set up metrics, log sources, and deployment targets without manually editing JSON configuration files. The feature supports tag-based policies for automated fleet-wide management, meaning new instances launched via auto scaling automatically receive the correct monitoring configuration without manual intervention.
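For anyone who has not had to hand-write one, this is roughly the kind of CloudWatch agent JSON the visual editor is meant to spare you from. It is a minimal sketch: the top-level `agent`, `logs`, and `metrics` sections follow the agent's documented configuration layout, while the log path, log group name, and helper function are made-up placeholders, and nothing here is claimed to be what the console itself generates.

```python
import json


def build_agent_config(log_path: str, log_group: str) -> dict:
    """Build a small CloudWatch agent config: one custom log file plus memory and disk metrics."""
    return {
        "agent": {"metrics_collection_interval": 60},
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": log_path,
                            "log_group_name": log_group,
                            # {instance_id} is a placeholder the agent substitutes at runtime
                            "log_stream_name": "{instance_id}",
                        }
                    ]
                }
            }
        },
        "metrics": {
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
                "disk": {"measurement": ["used_percent"], "resources": ["/"]},
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical application log location and log group name.
    config = build_agent_config("/var/log/myapp/app.log", "myapp-logs")
    print(json.dumps(config, indent=2))
```

Generating the file from code like this, or storing it once in Parameter Store, is the usual way to avoid retyping the nested structure for every custom log location.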
From the instance detail page, operators can view agent status, update configurations, and troubleshoot agent health in one place, consolidating observability management that previously required separate tooling or CLI work. The visual editor is available in all AWS commercial regions at no additional cost for the management experience itself, but standard CloudWatch pricing still applies for the metrics, logs, and traces the agent collects. And having troubleshot CLI-level CloudWatch stuff many times: thank God. Thank you. I mean, I store my configuration, once I get it right, in Parameter Store just so I don't ever have to do that wizard on the client ever again, because it's so painful. But being able to quickly see JSON configurations for log files in a GUI would be great. So thank you for that. I have not played with this yet. I meant to do that before the show, but this potentially is a huge quality-of-life improvement for me. Yeah, especially if you're doing, like, a custom log location and want to tweak it. Mm-hmm. If I didn't run so many containerized workloads, I probably would care a lot more, because these container logs are always centralized. Yeah. It's really the stuff that's not in the container that I would need this for. For the things hosting the containers, right? Yeah, the ECS logs now, you know, typically my configuration is CloudWatch Logs.
That CLI-based setup I can still see in my head, and I don't think I've done it in about five or seven years, and I'm very happy they made this. Yeah, that was truly, truly terrible. Well, then you also had the multiple agents for a while, because you had the CloudWatch agent, you had the SSM agent. I think there were two CloudWatch agents just by themselves. Yeah. And then you had the SSM agent, and I know they merged them all, I think, at one point. Yeah, they did merge them all finally. No, they did successfully do that. Yeah, they did it, but it was painful. Everything about it was painful. I'm going live, real time here, so I have not installed the cloud on either of my machines, so this real-time feedback is going to be more interesting. Yeah, so I'll be back next week with a follow-up. Justin does the thing. Yeah. All right. Amazon is apparently trying to turn its massive shipping operation into another AWS. Amazon Supply Chain Services, or ASCS, opens Amazon's fulfillment network to outside businesses across automotive, healthcare, electronics, apparel, and food industries, directly competing with DHL, UPS, and FedEx. Companies can store inventory in Amazon fulfillment centers globally and access its fleet of trucks, aircraft, and delivery vehicles. The service expands on the Supply Chain by Amazon offering launched in 2023, which initially focused on shipping products directly from factories. ASCS broadens this to include freight distribution, fulfillment, and parcel shipping for businesses of all sizes. Early adopters include Procter & Gamble, 3M, Lands' End, and American Eagle Outfitters, suggesting the service is targeting established enterprises rather than just small sellers. Pricing details have not been publicly disclosed at launch. The parallel to AWS is worth noting for cloud practitioners. Amazon built internal infrastructure at scale, then monetized it as a third-party service.
The same model was used when it opened its web infrastructure to outside customers in 2006. ASCS follows that same pattern with physical logistics. So, yeah, this could be really cool. I mean, they've always had some capabilities around this. Like, if you sell on the Amazon marketplace, you can ship your product to the Amazon warehouse and they'll take care of fulfillment for you. This is basically saying, look, you don't have to have any of your stuff going through the Amazon website. We'll just sell you the logistics network directly. And so if you want to ship your packages, I'm sure all of the tools like Shippo and others will add Amazon Supply Chain Services as one of those options. And if it's cheaper to ship through Amazon than it is to ship through DHL or UPS or FedEx, it'll tell you that and you can make that choice. Didn't Toys R Us move to Fulfilled by Amazon in, like, 1999 or something really early? So they originally had a partnership where Toys R Us went to Amazon.com, and that was a bad choice, because that basically turned all those Toys R Us customers directly into Amazon customers and led to part of the deterioration of the Toys R Us brand. So that was not a great move early on. But that was early dot-com. No one knew; everyone thought it was a fad. But yeah, I mean, there have definitely been things like that. But it's interesting to me too because a lot of Amazon's fulfillment still comes through UPS and FedEx. A lot of the last-mile delivery is USPS or FedEx or these things. And so if you're using ASCS, is your stuff still delivered by UPS anyway in some circumstances? So in some ways, you know, are UPS and FedEx a partner or are they a competitor? Kind of both. Kind of both in some ways. So I'm curious to see how this shakes out over the next year, really, probably before we really see the impacts of it. But definitely on the news, FedEx and UPS stock were down.
Well, isn't there also, I remember there was a negotiation a couple of months ago with USPS and Amazon trying to finalize their multi-billion-dollar deal for that too. So you're kind of looping in and taking all these different shipping vendors along with you, but also tearing them down as you go. It's going to be interesting to see where all this falls. Yep. I'm very curious. I'm really curious about the pricing of it. That's going to be the biggest part of it. I don't think you'll ever really find out. Can I make CloudPod t-shirts and basically send them to the warehouse and sell them through our website and then have them get shipped by Amazon? That'd be awesome. I would love it. I don't want to sell the t-shirts through the Amazon website. That's silly. There are like a thousand TikToks about side hustles and how people do dropshipping. Yeah. Now you have AI handle the front-end stuff. You really don't have to touch anything, right? Yeah. Potentially, this is going to be really cool. And then, yeah, we'll see how it works out. AWS is launching AgentCore Optimization in preview, adding automated recommendations, batch evaluation, and A/B testing to close the observe, evaluate, and improve loop for AI agents running on Amazon Bedrock AgentCore. Previously, developers had to manually read traces and guess at prompt fixes without systematic, data-backed evidence. The recommendations feature analyzes production traces from CloudWatch log groups and proposes changes to system prompts or tool descriptions based on a specified evaluator, without touching underlying tool implementations. So this is a good feature. Yeah, this is what we were just talking about with the ChatGPT model. So that's pretty sweet. It's like, if we had the foresight to read all the show notes ahead, we could have linked those two together a little bit better. I mean, I read it. We read them. I knew. I forgot. I'm not going to lie.
For the dozens of us who are very excited about Ruby 4.0, it's now available on Lambda. I'm not even sure there are dozens of you anymore. I don't know. I can tell you that I haven't even written anything on Ruby 4.0, so I have no idea if this is good or bad. I have, unfortunately, moved on to Python and to Go for most of the things I code these days, and TypeScript a little bit as well. Unfortunately, JavaScript. Yeah. But it exists. It's a thing. And it's front-end, and you have to. Yeah, you have to do it sometimes. But anyways, yeah, so this is great if you are into Ruby. If I thought I wanted to saddle myself with a dead language, I would be really excited about this. I'm happy. At least it's available if I ever need it. And full disclosure, Justin did try to kill this story. Yeah, you said I had to keep it. I had to keep it just to make fun of it. And Peter's not here anymore. AWS IAM is now providing higher maximum quotas for roles, role trust policies, instance profiles, managed policies, and identity providers. Some of these are increasing from 5,000 to 10,000 per account, and OpenID Connect providers from 100 to 700 per account. The role trust policy length increased from 4,096 to 8,092 characters, particularly useful for organizations with complex cross-account or federated access patterns. These are higher maximums for adjustable limits, not automatic increases, meaning customers still need to request increases via the Service Quotas console. Boo. There's no additional cost associated with these quota increases, as IAM itself remains free. I mean, the only one I actually understand is the role trust policy length, because again, with cross-account and federated access, it makes a lot of sense to me that that's much more complicated. But anything other than that, I hope you have automation for. Because 5,000 to 10,000 instance profiles, bleh. That would suck. This is the agent identity problem, right? I think they're getting ahead of it.
Especially the OIDC provider limit, I think. You're going to have a whole bunch of agent apps that are handling that OIDC flow or authenticating into Amazon using OIDC. So this is going to be something that you'll see more of. And hitting limits is probably going to be pretty common given just how much spread there is with agent identities and how we don't even really know how to assign an agent identity to a workload. We don't. It's just me. Yeah, right. Until it's not. Until it's not. Well, that's good. If you want a terrible way to run AI agents in Amazon, you can now run them on WorkSpaces, as they now support AI agents operating virtual desktops in public preview, allowing agents to interact with legacy desktop applications through mouse clicks, keyboard input, and screenshots without requiring any API integration or application modernization. This feature addresses a real enterprise problem. According to a 2024 Gartner report, 75% of organizations run legacy apps without modern APIs, meaning AI agents previously had no practical way to automate workflows in those environments. So I do know that there is a Claude Code plugin, or you can run Claude Code on Amazon Lighthouse. What is it? Lightsail? Lightsail, thank you. But I feel like this is maybe them planting some seeds that we might get an OpenClaw implementation at re:Invent. That is interesting. I was immediately thinking about, like, you know, the old Mechanical Turk type things, where all this is going to be used for is sending me spam emails and texts and, yes, terribleness. But it is interesting to have AI on virtual desktops, on Amazon WorkSpaces. Because, you know, I don't want to use Amazon WorkSpaces, but an agent doesn't have any choice. So yeah. And they can't complain, right? Because they're too happy. And can I make it run in a Linux one too? That way they have to suffer those outdated Linux packages. No, you're going to run that Linux WorkSpace and you're going to like it.
I'm just picturing somebody spawning every agent into its own WorkSpace and having hundreds of WorkSpaces scale up and down every second because each agent gets its own WorkSpace, and it sounds painful. It's definitely an interesting choice. Again, I assume this is for OpenClaw. You need this. It basically is interesting. It authenticates through IAM, with full audit trails via CloudTrail, which of course you need. It does the invocation using the MCP standard. So the feature works with popular agent frameworks like LangChain, CrewAI, and Strands to manage the MCP endpoint exposed to the WorkSpaces stack. So there are definitely intriguing opportunities for what you might be able to do with this. So we'll see. AWS WAF now includes an AI traffic analysis dashboard that tracks over 650 unique bots and agents, giving organizations visibility into which AI companies are accessing their content, what those bots are doing, and which endpoints they target most frequently. Thank you. Because you had bots, you told me that it was an AI bot, but you didn't tell me what they were doing. So I just have this big number. I'm like, I have no idea what that's doing to my site right now. Yeah. And then you had to go look at IIS logs or Apache logs. And then you're just having a bad day and no one's happy and I'm cranky. And so thank you. Thank you for properly doing this. And he's the executive. He needs it in picture form. Yeah. Yeah. Well, the WAF dashboard is very pictory. It is very pictory. But I mean, compared to, you know, when I started using it, where you didn't have any logs, you were just like, it's working, I promise. I also found lots of fun ways to make WAF really expensive. There are all kinds of really complicated rules you can turn on. Why did that bill go up so much this month? Oh, no. And they have different weights, right? Depending on which one you use.
Which one's being triggered and which one's being hit first, and the order of operations matters, and all kinds of things. I also found that out the hard way. Still stuck on WAF 1.0, and I'm like, you guys, 2.0 has been out for so many years. Yes, 2.0 is... Well, I think it's even like 2.5 now. I'm sure it is. It's just been that long since I've really gotten into that detail. I mean, I use it for side projects and things like that. It just runs, but I never really touch it. I just use it to protect the CloudPod website, because it gets a lot of... because it has to be open to the world, because, of course, podcast listeners are global. They're everywhere. And so, you know, you have to leave Russia open and China open and all these places. And so there are a lot of script kiddies who like to hit the site all the time. So, yeah, I protect it with WAF and then also a firewall on WordPress and all kinds of craziness, because there has to be. Defense in depth, yo. Yep, exactly. Because that is the only way to secure that thing. It's WordPress. It's never secure. It's never truly secure. I know. All right. Google has signed a classified deal with the U.S. Department of Defense allowing use of its AI models for any lawful government purpose. If you remember right, this is what Anthropic got in trouble for. So apparently the no-longer-don't-be-evil Google is no longer applying that to military use cases. The agreement includes non-binding language stating Google AI should not be used for domestic mass surveillance or autonomous weapons without human oversight. But the contract explicitly states Google has no right to veto or control lawful government operational decisions. So we told you not to, but if you do, I can't stop you. The deal also requires Google to assist in adjusting its AI safety settings and filters at the government's request, which raises the question of how its standard model guardrails will be maintained across commercial and government deployments.
And for GCP enterprise customers, this is framed as an amendment to an existing government agreement rather than a new standalone contract, suggesting Google is expanding its existing cloud and AI footprint within federal agencies. Yeah, the AI safety settings is the part that really bothers me, because it's going to be the government saying, don't provide details about this thing that we're doing. Or if someone asks about finding out our dark ops something, something, something, send us an email. It's just kind of gross. Use the tool, I get that. I know why people don't like that, but this amount of interaction makes me feel real gross about it. You can now generate files in Gemini, which was something I thought you could always do, but apparently that was because I used Gemini Enterprise. But Gemini itself can now generate downloadable files directly from chat prompts, supporting a broad range of formats including PDF, DOCX, XLSX, CSV, Google Docs, blah, blah, blah, blah. The feature is available to all Gemini app users globally at no additional cost beyond existing Gemini access, with outputs downloadable to local devices or exportable directly to Google Drive. Thank you. Yeah, really handy. No longer have to copy-paste everything. Yep. Introducing Agent Gateway's ISV ecosystem for security and governance. This provides a programmable data plane that sits in the request path for all agent traffic, covering user-to-agent, agent-to-agent, and agent-to-tool interactions, including MCP calls. Google announced a partner ecosystem of 14 security vendors integrated with Agent Gateway, covering identity governance from Okta, Ping, Saviynt, and Silverfort, DLP solutions from Symantec and Netskope, and runtime AI protection from Palo Alto Prisma, Cisco AI Defense, CrowdStrike, Zscaler, F5, Exabeam, and Thales.
A key design principle across most integrations is that security controls inject into the existing request path without requiring application code changes, which lowers the barrier for enterprises to add governance to existing agentic workloads. Identity-focused integrations address a specific challenge with non-human identities, where tools like Silverfort automatically discover agents, map them to human owners, and flag over-privileged or stale credentials at runtime rather than relying on static credentials. Pricing details were not disclosed in the announcement. Availability varies by partner, with some integrations, like Imperva for Google Cloud, noted as currently in preview. Details on partner-specific integrations should come from the Agent Gateway partnership team directly.
This is one of the things I really focused on when I was at Google Next, just because I think we're going to see this pattern grow, because I can't imagine anything else that's going to work, right? Like I said before, it's really difficult to control where your agents are being executed from. And every solution up until now has really been, well, you have to modify the application code so that before your prompt gets analyzed, you send it out to the service. And so now being able to sort of plug this in and have the visibility, it's something. It's not foolproof, and you still have to work with the rest of your business to make sure that these things have their proper guardrails. But I'm happy to see tools like this and I want to play around. I've asked for demos. We'll be back to your demos too. Well, most of them will probably be behind the house. Close yourself. Eventually. Okay, fair. All right.
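To make the "controls in the request path, no application code changes" idea from the Agent Gateway story a bit more concrete, here is a minimal Python sketch. It is not Google's API or any vendor's integration; every name in it (`AgentRequest`, `gateway`, the 30-day credential age limit) is a made-up assumption for illustration. It only shows the shape of the pattern: the agent's call goes through a gateway function that runs identity checks before forwarding it upstream.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional, Tuple

# Assumed policy value for this sketch, not taken from the announcement.
MAX_CREDENTIAL_AGE = timedelta(days=30)


@dataclass(frozen=True)
class AgentRequest:
    agent_id: str
    owner: Optional[str]          # human owner; None means the agent is unmapped
    credential_issued: datetime   # when the agent's credential was minted
    kind: str                     # "user_to_agent", "agent_to_agent", "agent_to_tool"
    target: str                   # tool or agent being called


# A check returns a denial reason, or None if the request passes.
Check = Callable[[AgentRequest, datetime], Optional[str]]


def require_human_owner(req: AgentRequest, now: datetime) -> Optional[str]:
    return None if req.owner else "agent has no human owner"


def reject_stale_credential(req: AgentRequest, now: datetime) -> Optional[str]:
    age = now - req.credential_issued
    return "stale credential" if age > MAX_CREDENTIAL_AGE else None


DEFAULT_CHECKS: Tuple[Check, ...] = (require_human_owner, reject_stale_credential)


def gateway(req: AgentRequest,
            upstream: Callable[[AgentRequest], str],
            now: datetime,
            checks: Tuple[Check, ...] = DEFAULT_CHECKS) -> dict:
    """Run every check in the request path, then forward to the real target."""
    for check in checks:
        reason = check(req, now)
        if reason is not None:
            return {"allowed": False, "reason": reason}
    return {"allowed": True, "response": upstream(req)}


if __name__ == "__main__":
    now = datetime(2026, 5, 13, tzinfo=timezone.utc)
    req = AgentRequest("agent-1", "alice", now - timedelta(days=1),
                       "agent_to_tool", "search")
    print(gateway(req, lambda r: "tool result", now))
```

The agent and the tool (`upstream`) are untouched here; adding or removing a governance rule means editing the list of checks in the gateway, which is the property the hosts were pointing at.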
Google Cloud is running a series of hands-on developer workshops across North America focused on building agentic AI applications, targeting platform engineers, security engineers, and data practitioners who want practical production experience rather than theoretical overviews. These are available all over the place: Sunnyvale, New York, Seattle, Austin, Texas, Toronto, etc. And Chicago. So if you're interested in this, definitely check out this training, as it's free. And Google's training that's free is typically pretty decent. Yeah, definitely. They do such a good job at offering training. Yes, they do.
Which is a great transition to Azure, who's also offering free training. I don't know if it's good or not, but it's free. With the Microsoft Azure Infra Summit 26, it's a free virtual event running May 19th to 21st, starting at 8 a.m. Pacific each day, targeting IT pros, platform engineers, SREs, and infra teams with level 300 to 400 technical content, allegedly. The three-day agenda is organized around build, operate, and optimize pillars, covering topics like AKS operations, IaC, storage, networking, backup, and DR. Now, no AI here, so go to the Google one for AI. If you care about cloud things that aren't AI, go to the Azure one. How's that? This would be refreshing to actually go to. I'm kind of thinking about it. To deal with servers and just to optimize that kind of thing? That'd be pretty sweet. I mean, I guarantee you they're going to talk about AI, especially when they hit the SRE stuff and things like that. If it's 300, 400 level, there's no way they're not. But also, as soon as anything says "no marketing slides," I'm like, oh God, there's definitely going to be a sales pitch. It immediately caused me to have the opposite reaction.
Next up for Microsoft, in public preview: memory in Foundry Agent Service. So basically they're getting memory just like everyone else is. The memory feature integrates natively with Microsoft Agent Framework and LangGraph, meaning teams already building on those frameworks can adopt persistent memory without significant architectural changes. I feel so weird that all these companies are just now getting memory. I'm like, it's been in Claude, it's been in OpenAI ChatGPT for a while. Apparently only on the desktop side and in the consumer space, not in the enterprise tools. So it is kind of... That's right. Yeah, it seems like there are just tools being launched so you can put it into your app, and I think that's mostly the newness. Who knows what the people that were doing this before were cobbling together. But yeah, I mean, I think you're right. I think it's building into every app, so therefore you can kind of have your memory, which I think is also some of that grounding and whatnot that people have done. So they use a standard memory and leverage it across all the sessions. So every session starts with the same memory. Yeah.
Microsoft Agent Framework has reached version 1.0 for both .NET and Python, bringing stable APIs and a long-term support commitment, which gives enterprise developers a reliable foundation for building production AI agent applications. The framework supports multi-agent orchestration and multi-provider model support, meaning developers can coordinate multiple AI agents and swap between different AI models without being locked into a single provider. I mean, it feels like things are changing so fast right now that standardizing and long-term support feels sort of weird. But I appreciate that they're trying something. I mean, it's always been in name only anyway, right? Yeah. There is a belief that if you do 1.0, that you at least have to keep supporting.
You might have a 1.1 or 1.2 that's much better, but you can still force a lot of people to get through this path, I think is how I would see it. Tell HashiCorp that with Terraform. 10 years. So all bets are off. All bets are off. Tell back-in-the-day Terraform and HashiCorp, how about that?
Next up, Microsoft is open sourcing their integrated HSM. Basically, it's embedded in every new Azure server, designed to meet FIPS 140-3 Level 3 certification, and keeps encryption keys within hardened hardware at all times, meaning keys never appear in host or guest memory, even during active crypto operations. So Microsoft announced at the OCP EMEA Summit that the HSM firmware, driver, and software stack will be open sourced on GitHub at GitHub.com, with an OCP workgroup launched for ongoing development. The integrated HSM complements existing services like Azure Key Vault and Azure Managed HSM by adding server-local cryptographic protection, addressing the shared blast radius and network maintenance limitations of centralized HSM models. So, I mean, this is nothing you're going to typically do anything with, so I don't know who the people are who are going to be contributing to this open source project, but I'm glad they did it, I guess. Yeah, I guess it's open just so that people can test it. Yeah, I feel like it's trying to build that level of trust. They're like, look, we trust our software, here it is, you go look at it to validate that we are doing it right. But the part that I find interesting: it's in the Azure V7 virtual machines, and getting capacity is going to be its own beast. Just to get that, you know, moving all your virtual machines up a tier and everything else, that's going to take some time and effort. So while it's there in V7, I think there was one region where I was trying to get V6 and I couldn't get V6 at scale yet. I wonder if this is like the Azure equivalent of Nitro? Yeah. It kind of sounds like it. I think it is.
But it's a piece of Nitro, I feel like. Right. Yeah, yeah, one component of Nitro. I mean, I guess from a trust perspective, it allows companies to evaluate it and make sure they're comfortable, and maybe people will actually contribute stuff to it. But it's still just weird to me, so I don't know. Nitro Enclaves specifically. Certainly.
Microsoft's internal Project Lobster team is building ClawPilot, an OpenClaw-based desktop environment that functions as a 24/7 autonomous personal assistant within Microsoft 365, growing from 100 to over 3,000 daily internal users in a single week as of May 1st. It is designed around a multi-agent architecture, including a chief of staff agent, an executive assistant agent, and specialist agents, each with their own Entra ID, Exchange mailbox, and Teams presence for governance and identity isolation within Microsoft Graph. Security remains a central challenge, as Microsoft's own Defender team explicitly states OpenClaw should not run on standard enterprise workstations due to risks including persistent credentials, untrusted input ingestion, and vulnerability to prompt injection attacks that turn into action injection attacks. The project differs from existing Copilot offerings like Copilot Tasks and Copilot Cowork in that it targets a full-life context for knowledge workers, handling tasks like DoorDash orders or rescheduling personal calls without requiring constant user prompting. Microsoft VP Scott Hanselman has built a Windows node for OpenClaw that may surface at Microsoft Build in June, so a near-term developer-facing announcement around Windows as an enterprise-ready agentic runtime environment may be coming soon. No pricing or GA timeline has been disclosed.
So this is either going to be amazing and exactly what everyone wants, which is a desktop app that does all the cool stuff but is backed by Entra and all the security stuff that your IT org is already running, or it's going to be so nerfed and not able to do anything, because it's backed by Entra, which your IT is managing, and they don't give it any permissions. I think we're going to see a lot of OpenClaw across all the enterprise tools. Get ready, Ryan. Secure the enterprise. Super excited. Are you? No, no, I'm not. I didn't pick up the excitement. Yeah. We'll see how things continue to evolve here, but it definitely feels like a lot more automation and agentic is coming across the board for everybody. I mean, it's something I want. The functionality is definitely something that we need to provide, because it's a huge enabler, but also we don't need to throw away all of our security controls with the bathwater, to mix metaphors, or half a metaphor. Agreed.
And then finally, our last Azure story. Microsoft Foundry Model Router consolidates multi-model dispatch into a single endpoint that routes across up to 18 underlying LLMs, shifting the routing logic from application code to the platform layer. This matters for cloud architects who currently manage bespoke routing logic across model fleets. The model subset feature is the most governance-relevant control, letting teams define which vendors and regions their prompts can touch, set an effective context window ceiling, and bound worst-case per-call costs. New models added in future router versions are not auto-included, which is a deliberate compliance guardrail worth noting. I guess I was sort of making the assumption that the previous LLM router was a central endpoint, but it seems like you had to have a lot more logic at the app where you use it. Yeah, I think there was a few different pieces you had to tie together to make it work, and this is just giving you a single place. Yeah, I think so.
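For anyone trying to picture what the model subset control buys you, here is a small Python sketch. To be clear, this is not the Foundry Model Router API; the `Model` fields, the catalog entries, and the prices are all invented for illustration, and the two derived numbers rest on an assumption: if the router may pick any model in the subset, the safe context ceiling is the smallest window in the subset, and the worst-case cost is the most expensive model at the maximum token count.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class Model:
    name: str
    vendor: str
    region: str
    context_window: int       # tokens
    usd_per_1k_tokens: float  # made-up blended price for the sketch


def select_subset(catalog: Iterable[Model],
                  vendors: Set[str],
                  regions: Set[str]) -> List[Model]:
    """Keep only models whose vendor and region are both allowed."""
    return [m for m in catalog if m.vendor in vendors and m.region in regions]


def context_ceiling(subset: List[Model]) -> int:
    """Largest prompt guaranteed to fit whichever model gets picked."""
    return min(m.context_window for m in subset)


def worst_case_cost(subset: List[Model], max_tokens: int) -> float:
    """Upper bound on one call if the priciest model handles max_tokens."""
    return max(m.usd_per_1k_tokens * max_tokens / 1000 for m in subset)


def pin(subset: List[Model]) -> FrozenSet[str]:
    """Freeze the subset by name so later catalog additions are not auto-included."""
    return frozenset(m.name for m in subset)


def resolve_pinned(catalog: Iterable[Model], pinned: FrozenSet[str]) -> List[Model]:
    return [m for m in catalog if m.name in pinned]


if __name__ == "__main__":
    catalog = [
        Model("model-a", "vendor-1", "eu", 128_000, 0.01),
        Model("model-b", "vendor-2", "eu", 32_000, 0.002),
        Model("model-c", "vendor-3", "us", 200_000, 0.05),
    ]
    subset = select_subset(catalog, {"vendor-1", "vendor-2"}, {"eu"})
    print([m.name for m in subset])
    print(context_ceiling(subset))
    print(worst_case_cost(subset, 10_000))
```

Pinning by name is the part that mirrors the "new models are not auto-included" behavior: a model added to the catalog later, even one from an allowed vendor and region, stays out until someone adds it on purpose.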
That's it, guys. It was a long road to get here, but it was a long show, but a good conversation. Yeah. All right, gentlemen, we'll see you next week here in the cloud. Bye, everybody. See you.
Another week of cloud news wrapped up. Bolt will collect the news. Justin will get the notes. Jonathan will write some code. Ryan will watch the perimeter. And Matt will reluctantly watch Azure. Till next week for AI, Amazon, Google Cloud, and Azure. And hey, maybe even Oracle. Who knows? Check out thecloudpod.net for our newsletter. Join our Slack, message us on socials, or leave a review.
Well, I have an after show today. Think anyone's still listening? I mean, maybe. Marathon. Marathon. Marathon episode, yeah. But basically, you know, something happened. Tim Cook, currently the CEO of Apple, has announced that he is transitioning to board member only as of September 1st. And he's being replaced by John Ternus, who, for those of you who know, comes from a hardware background, which may signal a significant or continued investment in Apple Silicon and device-level computing. But he basically has run the hardware vision for many years now and did lead the Apple Silicon transition. So Tim Cook, of course, replaced Steve Jobs, the late Steve Jobs, after he passed, or actually before he passed, through the final stages of his life. And then he basically took what Steve Jobs had built with the iPhone and iPad and turned it into the behemoth that is now Apple. People have sort of hit-or-miss opinions of Tim Cook's tenure at Apple. I have relatively positive feelings about it, other than some of the things he's done recently with the political side of things, trying to make sure that he doesn't get tariffs on his iPhones. He's done a lot of sucking up, which I think is not a great look for him. But he'll still be around on the board to hopefully keep doing those things and keep John Ternus clean. That'd be nice, right?
The biggest thing that's interesting to me about this, and you know, we don't talk about Apple too much unless they're doing something in AI that's interesting in our space, or we're getting a new Mac version on Amazon, but, you know, I'm kind of excited about this, honestly, having a hardware person come in who's more technical. I mean, Tim Cook is a very smart guy, but he's a logistics dude. Like, logistics all day long. He can take any product and make China sing and basically build that product at massive scale. That's what he's good at, that's what he always has been good at, and that's where he really has always helped scale Apple's business the way he did. You know, he also helped transition them into more of a services business, so you have Apple TV Plus and you've got other subscription services that didn't used to exist. That's been part of Tim Cook's tenure there. But bringing in a hardware person sort of reminds me a little bit of bringing in someone like Satya Nadella at Microsoft, who, you know, abandoned a lot of things that Ballmer did. I wouldn't call Tim Cook a Ballmer. I think he's better than a Ballmer. He's not just a pure sales guy. He does operationally understand how the business works, where I don't know if Ballmer ever did. But I think bringing a technical person back into the role at this time, when Apple isn't doing much in AI, you know, they haven't released a lot of new features. The Apple Vision Pro has not been well-received or highly adopted, mostly because it costs a small fortune. And so I think it might be a really cool time for Apple, and maybe they get their groove back. I don't know. I agree. I think, you know, the only thing Tim Cook didn't do is invent the next iPod, right? Like something that was splashy and big that we kind of got used to with Apple, because we're spoiled. But yeah, it was more operationally focused.
And so, yeah, a hardware guy sort of gives you that renewed faith that maybe, you know, maybe the new Apple car or whatever is going to come out and beat us. I thought the car was silly. I know, I know. I feel like with the MacBook Neo that came out, it was like, what is the big deal? From what I was reading, they can't keep them on shelves at stores, because it's a lower price point. So it's enabling a lot more people to get into the Apple ecosystem. Well, I mean, the Mac Neo undercut the Surface, and it's not that much more than, or very similarly priced to, the Chromebook. And so it's a very viable entrant in the lower-end market, and it's a pretty serviceable machine, because Apple Silicon runs so well. And most other low-end laptops of that size, not very many of them are, I mean, the Surface was ARM, but a lot of other Windows computers are really neutered Intel chips. So, you know, it's a very powerful machine. And they're difficult to use because they're slow. Yeah. So, I mean, the MacBook Neo is not a computer I would ever buy. I could see it for a college kid who's not going into computer science, who just needs a workhorse, you know, for college. It's a great system. It starts at $600 for the base entry config. So again, I think it makes sense to me. It's just a matter of what makes sense for your use case. But, you know, low cost is not a bad thing. No, it definitely isn't. And that's largely why you see a lot of Windows machines in corporate environments and stuff: that cost. Yeah, I mean, my one concern would be, because I think 512 gigs for disk is pretty tight. OSes are like 100 gigabytes between upgrades. Yeah, I mean, I haven't had a Mac with less than 2 terabytes in years. Now I want to check what mine is. I think mine's one of the lower end ones. I got a lower end storage one too, just because I tend to use a lot of external storage for stuff. I mean, I do too.
I mean, I synchronize all the stuff with my Synology, and I've got all that stuff too. I could fit on 1 terabyte, but I don't want the stress. Especially now with all the models and stuff I download locally and run locally to play with. I definitely use a lot more space in the last year or two than I did before. I'd have a problem, but I'm a big RAM guy too. I typically buy at least 32 or 64 gigs of RAM on my laptop. It's definitely not my config. But again, 512 I think is probably the size I would go with, which is $100 more. That's probably the only thing I would tell someone who's looking to buy something like this. Get the 512. Please, just trust me. But I imagine that this might also push Mac, because right now they're still carrying support for x86-based macOS. And so if they start cutting that stuff down to just support ARM, which I think they'll be able to do finally this year, with the Mac Pro finally falling out of support with the next version of macOS, maybe they can reclaim a bunch of that space on the disk. Because that would be my one concern: the operating system is pretty big on the drive. I thought they already did cut Intel support. It's rolling through. You can still install Rosetta, and Rosetta will emulate the Intel stuff, so that takes quite a bit of stuff in there. And there's still apps that require Rosetta, unfortunately. I'm sorry, I was thinking the other way around. They have about 18 months before they need it. I think this version of macOS still supports Intel Macs, so this is the last version of macOS that will. I know I had a friend that had a really old Mac that was definitely Intel. Well, they've been dropping support. So technically the Mac probably would have run the current operating system, but I think the Intel Mac Pro is the only Mac that's still certified for the operating system. Yeah, they won't release newer OS versions for older hardware because it can't run it. And so, like, your old Mac laptop is topped out at it. Yeah.
Well, good. I'm excited. We'll keep an eye on it. Maybe Ternus will be a big cloud guy. Maybe he'll get more cloud into Google and Azure with Macs. I don't want them to. And maybe he'll fix the pricing problem, because the pricing problem is horrendous. Not with today's RAM and CPU prices. So the problem I have with the Mac Mini on Amazon is that you have to pay for a day. Yeah. If they would fix that problem, I'd be super happy. It makes you feel like there's someone logging into it to wipe it, right? It's probably that level of load. Yeah, who knows? All right, gentlemen. Well, have a good one. See you later. All right. Bye. Bye.