Last Week in AI

#232 - ChatGPT Ads, Thinking Machines Drama, STEM

101 min
Jan 28, 2026
Summary

Episode #232 covers OpenAI's ChatGPT ad rollout, the Thinking Machines startup drama with co-founder departures, China's domestic AI chip stack achievement with Huawei, and critical research on reasoning models, synthetic media regulation, and transformer architecture optimization.

Insights
  • OpenAI's ad strategy reflects the inevitable monetization pressure of free-tier AI services; the company previously said ads would be a last resort, but scaling costs made them unavoidable
  • Thinking Machines' collapse signals that OpenAI spinoffs face structural challenges around talent retention, valuation expectations, and technical differentiation despite strong founding teams
  • China's successful training of major models on domestic Huawei hardware (Ascend chips + MindSpore) demonstrates algorithmic efficiency gains from compute constraints, narrowing the gap with Western labs
  • Reasoning models spontaneously develop multi-perspective dialogue behavior during RL training without explicit instruction, suggesting emergent problem-solving strategies beyond supervised fine-tuning
  • Semiconductor fabrication bottlenecks are shifting from packaging to logic production, making Samsung a viable alternative to TSMC for the first time as NVIDIA demand overwhelms capacity
Trends
  • Monetization of free AI services through ads and tiered pricing becoming standard industry practice
  • Talent exodus from well-funded AI startups back to established labs (OpenAI, Anthropic) due to execution challenges
  • China's vertical integration of AI hardware, software frameworks, and training pipelines reducing Western technological dependency
  • Reasoning models and test-time scaling shifting focus from 'thinking more' to 'thinking broader' with diverse perspectives
  • Semiconductor supply chain diversification away from TSMC concentration due to demand surge and geopolitical constraints
  • Regulatory momentum on AI-generated NSFW content and artist copyright protections accelerating across jurisdictions
  • Sparse computation and embedding-based architectures enabling 30%+ efficiency gains in transformer training
  • Multi-agent and collaborative AI frameworks emerging as next frontier beyond single-model chat interfaces
  • Age detection and content moderation becoming table-stakes for consumer AI products targeting minors
  • Constitutional AI and transparency in model values becoming competitive differentiator for frontier labs
Topics
  • ChatGPT Advertising Strategy
  • AI Startup Talent Retention
  • China AI Chip Independence
  • Reasoning Models and Chain-of-Thought
  • Semiconductor Supply Chain Bottlenecks
  • Age Detection in AI Systems
  • Transformer Architecture Optimization
  • Constitutional AI and Model Values
  • AI-Generated NSFW Content Regulation
  • Autonomous Research and AI Scientists
  • Sparse Autoencoders for Interpretability
  • Multi-Agent AI Collaboration
  • Data Center Power Infrastructure
  • Artist Copyright and Synthetic Media
  • Authoritarian Capture and AI Governance
Companies
OpenAI
Launching ads in ChatGPT free tier and $8/month Go plan; introducing age prediction; developing adult content mode
Thinking Machines
Startup founded by Mira Murati experiencing major exodus with 3 co-founders and 9 employees departing to OpenAI
Anthropic
Published updated Claude constitution; competing with OpenAI on gigawatt-scale data center infrastructure
Zhipu AI (GLM)
Trained major image model on Huawei Ascend chips and MindSpore framework, proving domestic Chinese AI stack viability
Huawei
Ascend AI processors and MindSpore framework enabling Chinese labs to build independent AI training infrastructure
XAI (Elon Musk)
Launched Colossus 2, world's first gigawatt-scale AI supercluster using Tesla Megapacks for power infrastructure
Google DeepMind
Offering free SAT practice exams via Gemini; published research on reasoning models and sparse autoencoders
Baidu
Ernie AI assistant reached 200 million monthly active users in Chinese market
Samsung
Taylor, Texas facility becoming preferred alternative to TSMC for advanced chip fabrication due to NVIDIA demand
TSMC
Oversubscribed at 3:1 demand-to-capacity ratio for 3nm and 2nm nodes; losing market share to Samsung
Black Forest Labs
Released FLUX.2 klein, a compact 4B-9B parameter image model for sub-second generation on consumer hardware
Allen Institute for AI
Released Molmo 2, open-weight 8B vision-language model with video grounding and 70 training datasets
Meta
Reportedly made acquisition offer for Thinking Machines; developing Llama 3.2 with reasoning capabilities
Humans (startup)
Raised $480M seed at $4.48B valuation; founded by Anthropic/XAI alumni focusing on collaborative AI
Alibaba
Qwen chatbot competing in Chinese consumer AI market alongside Baidu's Ernie
Intel
Facing yield and timeline delays; losing foundry market share to Samsung as alternative to TSMC
NVIDIA
Demand for Blackwell and Hopper chips creating TSMC capacity bottleneck; driving semiconductor diversification
Scale AI
De facto acquired by Meta via a ~$15B investment; cited as precedent for Thinking Machines acquisition discussions
Hartmuller
Released open-source music foundation models for generation, tokenization, and lyric recognition
Lossfunk
Published retrospective on 4 autonomous research attempts; 1 succeeded in publishing AI-generated paper
People
Mira Murati
Former OpenAI CTO; founded Thinking Machines; confronted departing co-founders about competing offers
Barret Zoph
Co-originator of Mixture of Experts; left Thinking Machines for OpenAI amid workplace conduct allegations
Sam Altman
OpenAI CEO; referenced as precedent for talent exodus following leadership conflicts at startups
Elon Musk
XAI founder; deployed gigawatt-scale AI supercluster using Tesla Megapacks and aggressive permitting tactics
Andy Pang
Former Anthropic researcher; co-founder of Humans startup; led post-training for Claude 3.5-4.5
Georges Harik
Google's 7th employee; co-founder and lead investor of Humans startup; raised $480M seed round
Luke Metz
Thinking Machines co-founder; departed to OpenAI as part of broader exodus
Sam Schoenholz
Thinking Machines co-founder; departed to OpenAI
John Schulman
Left Thinking Machines for Anthropic shortly after founding; part of broader talent drain
Andrei Kurenkov
Co-host of Last Week in AI; studied AI in grad school; works at Astrocade startup
Jeremy Harris
Co-host of Last Week in AI; focuses on AI safety and national security implications
Quotes
"We're like, it'd be great if we could wrap a little bit early this week. And then there's an ideal gas law thing going on where, like, the amount that we will explore stories will expand to fill whatever time we conceivably have"
Andrei Kurenkov (Opening segment)
"If there's recursive self-improvement, if they have this kind of liftoff capability comes with it, same problem, authoritarian capture. So it remains like really thorny."
Jeremy Harris (Safety discussion)
"You don't need that many guns if you know exactly who the people who are plotting or leading the plot to overthrow the government are."
Andrei Kurenkov (Authoritarian lockdown discussion)
"Everybody who is at the frontier of this technology does not trust it, like does not trust it near their kid's limbic system. It's just an unfair fight."
Jeremy Harris (Age prediction and minors discussion)
"There will come a time when TSMC, which traditionally their largest customer was Apple, it's going to shift to NVIDIA. There's going to be a shift where the most important customer who's basically subsidizing the leading node development at TSMC becomes NVIDIA."
Jeremy Harris (Semiconductor supply chain segment)
Full Transcript
Hello and welcome to the Last Week in AI podcast where you can hear us chat about what's going on with AI. As usual in this episode, we will summarize and discuss some of last week's most interesting AI news, and you can go to lastweekin.ai for the text newsletter with even more news coming to your inbox every week. I am one of your hosts, Andrei Kurenkov. I studied AI in grad school and now work at the startup Astrocade. And I'm your other co-host, Jeremy Harris. So, you know, Gladstone AI, AI national security, all that stuff. Which, first of all, we always say this. We always say this when we start a podcast. We're like, it'd be great if we could wrap a little bit early this week. And then there's an ideal gas law thing going on where, like, the amount that we will explore stories will expand to fill whatever time we conceivably have, up to the aggressive outer limit of what is okay. So we've both been late to meetings because of this. It's part of the fun. The number of times you finish like two minutes early, one minute early, a few minutes early. It's crazy. I don't know how we do it. Probably by like the last five stories or so, we just start like hurrying up. I wonder if that's it. Yeah. If like you start listening to the podcast, it's like, well, this is nice and slow, but then you find yourself moving us from 1.5 to like 0.75 or something. Because, yeah. So I asked Andrei at the beginning if we could finish 15 minutes early, and we looked at each other like, that's never going to happen. Well, we can try. We'll see. Sometimes we make it work despite our instincts. Other thing too, I know you mentioned, we talked about the comments just at the tail end of last episode. And a bunch of people left comments on like the YouTube video, which by the way, I love those. It really gets me jazzed. So I appreciate that. I know Andrei, you do too. So thank you for all those very kind comments, guys. And speaking of which, we do have one comment at least that we are going to address at the beginning of this episode, just to give some preview. We're going to be talking a lot about open source model releases, a fair number of papers, a lot of business stories with some drama and a lot of fundraising. That's pretty much the makeup of this one. And so related to some of the safety topics we'll be discussing, this poster, prints Megahertz, which is a pretty cool handle, asked, and the quote here is: in some of the earlier episodes, you got into details about the risk assessments, and one of the categories was authoritarian lockdown. And they would like to know, with recent events in mind, what the state of this is and whether teams at X, OpenAI, Anthropic, and so on still screen for this. So Jeremy, as the safety hawk, maybe you want to take it? As the safety hawk. Yeah. So authoritarian lockdown is, we were just talking about this earlier, is this whole idea where, you know, historically, you think about all the authoritarian regimes that we've had on planet Earth, and there've been many. They get brought down ultimately. Who brings them down? Well, they're brought down by the people, you know, the populace. That happens because the populace has more power together than the government. The problem is when technology advances to a certain point, you get surveillance states like China that are functionally the Chinese Communist Party augmented by a massive state surveillance apparatus and enforcement apparatus. 
Eventually, it just becomes mathematically impossible to overthrow the government. That's the concern about authoritarian capture. That is exactly what people worry AI will exacerbate. I say exacerbate because some people think China is already kind of there and other countries are already kind of there. And depending on your flavor of conspiracy theory, you could apply this to really almost any country, right? So the challenge with AI is a version of this, yes, happening at the government level, also happening at the company level. Whichever company builds superintelligence first, if it gets built, I think it will, but who knows, that company will have the power of a nation state. I mean, if there's recursive self-improvement, if they have this kind of liftoff capability comes with it, same problem, authoritarian capture. So it remains like really thorny. I wouldn't say there's a particular advance other than the fact that scaling, pre-training scaling, slowed down, then post-training scaling has been running into friction. This whole notion of, like, right now we have pretty even competition across the different labs. So that seems to reduce, some people might say, the risk of authoritarian capture. My own personal hot take, if anybody cares, is I don't think this does literally anything about that. There will at some point be a first lab to build a system that gives you a self-compounding recursive advantage. I'm not claiming that that'll happen tomorrow, and I'm not telling you who I think is going to do that first. But if we think that's going to happen, no matter how even things may look now, there will come a time where this sort of thing will become an issue. So if you buy into the superintelligence thesis, I would argue this is almost like locked in, and you need to start thinking about how you're going to govern this. I don't think we're going to solve that problem by that time, but that's at least my frame on it. Andrei, you're shaking your head. You could debate a bit about the recursive point. I'm not entirely on board with the recursive self-improvement hypothesis due to just physical limits and so on. But I think this is an interesting question now, given the past year and the rise of advanced AI in China. There were efforts back in 2023, 2024, but with 2025, it became apparent that the companies there, the teams there, got to a point of training very powerful LLMs. And as you said, China, I can't say that I know a lot about this, but I know enough that China has deployed AI at a large scale already with things like facial recognition, right? So they have cameras everywhere. They do track you. It's kind of like Minority Report as is. And with LLMs, I suppose the worry might be that the oversight capacity, the ability to screen for things you post online, for activities you take, and so on, makes it much easier. I will say one of the aspects of authoritarianism is violence, right? You need to have a sort of exclusive license to violence and the military on your side and so on and so on. AI right now doesn't seem to exacerbate that, right? Now, there is a lot of work on humanoid robotics. And once you get into humanoid robotics, advanced robotics, there you have AI capable of being soldiers, being police, etc. But we are a little ways away from that still. There's a question of the leverage on that violence that an entity like the Chinese Communist Party or whoever gets from these systems. 
The fact is, you don't need that many guns if you know exactly who the people who are plotting or leading the plot to overthrow the government are. And that's a very micro kind of fabricated example. But in this instance, if you're in China and you're trying to pull any kind of stunt, you're going to get identified. And beyond the traditional violence, take them out and shoot them or harvest their organs or have one of these magic buses that just roll up, grab people off the street, and then kill them. Beyond those very violent repression instances, there's also just the passive stuff the government does with social credit. Like, oh no, you can't leave the state. Oh, your mother doesn't get her insulin medication. Oh, your brother can't start that business. All these little ways that they can ratchet up the pressure. So I think there's kind of this continuum. And I mean, you're right. Ultimately, the state's monopoly on violence is the key thing. It's just there's a continuum of things they can do along the way that can coerce in different ways. So we could do an episode on... It sounds like there's an episode. It's a fun topic. Well, it's not a fun topic, but it's an interesting topic. And it's interesting in particular because AI, right now, it gives you intelligence. It doesn't give you presence in the physical world so much. Even if you were to get to superintelligence, unless we had actual infrastructure to build out presence in the physical world, there's a decent amount of limitations that exist. But anyway, as you said, we can go for a while. This is that whole episode we should do on the concept of the software-only singularity. How much... Because the argument that I would make there is, of course, we all agree there are fundamental physical limitations: you can only stuff so much information in a certain amount of matter or mass energy, or you can only have so much compute or whatever. And we have yet to discover those physical limits. But those physical limits are so far beyond where we're at now. And even when constrained by... Well, I mean, mathematically, yeah. If you look at Landauer's law, right, which tells you basically what the fundamental energy cost of information processing is, we're many orders of magnitude away from that limit. I don't think that's the limit that applies. I think the limit that applies is actually more like how much can you do with an NVIDIA GB200 system, right? It's more like that. That's the threat. People like me don't get to just point to Landauer's law and be like, ah, look at that. That's the limit. No, no, no. We have to explain how the physical hardware that exists at the time, so whether it's Vera Rubin or whatever else, how that platform can lead to... there's enough overhang in the capabilities of that hardware that you can get purely through software that it leads to big things. That's what I think. And I think that's the debate that would be really interesting to have another time. Another time. And just real quick, this person did ask whether teams at X, OpenAI, Anthropic, and so on still screen for this. And I think you probably looked at all these system cards. Is that the case? The system cards, I don't think they tend to speak to hiring. That's a really interesting point. You're now making me think like, yeah, shouldn't they? It seems like they should. I haven't seen anything explicitly in that direction. 
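A quick back-of-envelope on the Landauer's law point above. The H100 figures here are rough public specs supplied for illustration, not numbers from the episode, and a FLOP is not literally one bit erasure, so treat this strictly as an order-of-magnitude sketch:

```python
import math

# Landauer limit: minimum energy to erase one bit of information at temperature T.
k_B = 1.380649e-23                    # Boltzmann constant, J/K
T = 300.0                             # roughly room temperature, K
landauer = k_B * T * math.log(2)      # ~2.9e-21 J per bit

# Rough figures for an NVIDIA H100: ~700 W board power at ~1e15 FLOP/s (dense FP16).
joules_per_flop = 700.0 / 1e15        # ~7e-13 J per operation

# A FLOP is not a single bit erasure, so this only bounds the order of magnitude.
gap = joules_per_flop / landauer
print(f"Current hardware sits ~{math.log10(gap):.0f} orders of magnitude above the Landauer limit")
```

This prints roughly 8 orders of magnitude, which is the sense in which "we're many orders of magnitude away from that limit" while the practical constraint remains what you can squeeze out of a given GB200-class system.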
I know the labs internally have governance teams that nominally spend their time thinking and worrying about these sorts of things. Not all of them do have those governance teams. As far as I know, XAI doesn't, for example, because they're just trying to get spun up still. But Google DeepMind certainly does. OpenAI does. Anthropic does. Yeah. So it varies greatly. And the specific problems that they're working on change a lot and have gotten, in some cases, more, I don't know if grounded is the right way to put it, but more focused on short-term things, just because we're already seeing such massive changes that it kind of causes you to reel in your... you're like, holy shit, we already need to worry about unemployment. This whole authoritarian capture thing is really important and we can't do it at the last minute, but maybe we put more resources. So I don't know. That's all I really know. Yeah, it makes sense. I think it's going to come down to less of an alignment problem and more of a collaboration problem. You already know Anthropic and OpenAI went to work with the DoD. Recently, the DoD announced this collaboration with XAI and Grok. I think you'll find someone happy to just give you the AI to do the bad things. And that's a real threat. We could even have a separate conversation as to whether that DoD stuff is the bad thing. I mean, like at a certain point, it's also the China competition. If they're using it, everything is messy in this space. Yeah. We really should do another just-discussion episode to get some of these things out. But let's get to the news, starting with tools and apps. And the first story is we got some details about OpenAI deciding to test ads in ChatGPT. So we got some visual previews where the ads kind of seem like ads. They were very explicit that they won't mess with the model itself to start implicitly, I guess, doing advertising by responding to you in certain ways. It does say that the ads will appear for users of the free version and the $8 per month ChatGPT Go plan, and if you're on a higher tier, Plus, Pro, etc., you will not be seeing this. And the ads will be placed at the bottom of ChatGPT answers when there is a relevant sponsored product or service, clearly labeled and separated, etc., etc. So OpenAI really tried to get ahead of this and make clear that they're not trying to have ChatGPT subliminally insert advertising or so on, that the ads will be there, but they will not be hindering the experience, I guess. I love Michelin tires. Yeah, so I completely agree. It feels like getting ahead of the story here a little bit, which makes sense because I believe, and don't quote me on this, I'm pretty sure OpenAI once said they would not do ads in ChatGPT. They said it would be a last resort. Last resort. Okay. That sounds about right. Yeah. So here we are. There's been a lot of chatter, of course, about OpenAI waiting this long to roll out ads, because when they were smaller, failures and bad ads and things like that that inevitably will come would be much more forgivable. People would be in the mood of saying, eh, it's a new product. I've never heard of this ChatGPT thing. So what? It's buggy ads. It almost feels like part of the product that's being debugged, which it is. So here they're trying to set the scene a little bit to be like, guys, it's not going to be perfect. One of the things they do tell us is they're flirting with this idea of making interactable ads. You can kind of ask questions to the ad in real time. 
That's just one of those interesting things. When you think about a new form factor for ads, the new kinds of interactions you could have with them, like any platform like Facebook, Instagram, and so on. Anybody who's read Chaos Monkeys, great book by the way, the argument for the nobility of advertisement is very much clear here. People will often make the case, look, ads at their best are really just good for you. They're just somebody surfacing something that you didn't realize you wanted. It's actually a service. That's part of the frame here. It's an interesting frame, maybe more true for things like ChatGPT, especially as they have that longer interaction time horizon with you where the ads can be more relevant. Of course, that feels creepier too. So we kind of have this weird calibration to do. No idea how that'll land, but they list a bunch of the principles they're applying for this. You know, mission alignment: our mission is to ensure AGI benefits all humanity; our pursuit of advertising is always in support of that mission and making AI more accessible. So that's kind of the... really, "we need to make more money so we can scale more" is the implied background justification for a lot. I mean, this is how they justified, you know, fundraises, the capped-profit transition, the for-profit transition, over and over again, right? So this is the interesting philosophical question that OpenAI is trying to answer internally: are we going to basically just justify everything through that lens or not? Answer independence: ads do not influence the answers that ChatGPT gives you. That's a really important one. Conversation privacy, choice and control: you control how your data is used. You can turn off personalization. Basically, if you think this is creepy, you can just do that. And finally, long-term value: we don't optimize for time spent in ChatGPT, we prioritize user trust and experience over revenue. Which I really like. I mean, optimizing for time spent on app is a scary thing when you have an awful lot of compute to optimize with. That's right. And I think this is a bit of a no-brainer given that OpenAI now has 800 million weekly users and only about 5% of those pay for subscriptions. And if you have a bunch of free users, what do you do? You use ads. That's been the truism of the internet. And in fact, I mean, that's how you get to be very, very rich like Google and Instagram. You kind of lock people in with a very solid product and then you serve them ads. It's just, I think, the pull of this business model was such that, once OpenAI went the free model route, the free user route, there was almost no way for them to avoid it unless somehow they got these models to be extremely cheap, which is still not the case. It still costs a fair amount of money to provide that free service. Free users are burning through money in a way that just isn't true for Google and other services. And I think it's interesting that alongside this discussion of ads, there's also been this announcement of the ChatGPT Go service, which was introduced last year as a low-cost subscription in India. That was meant to expand access to some of those most popular features, get some of the paid, I guess, features at a lower cost. And ChatGPT Go is now rolling out everywhere ChatGPT is available. So in the US, it is $8 per month, compared to Plus being $20 per month and Pro being $200 per month. So you get much higher limits. 
You get 10x more messages, file uploads and image creation, longer memory and context window, and so on. So I think the combination of these two go hand in hand, where the argument is: we want to let as many people as possible use ChatGPT rather than having it behind a paywall, and therefore we are going to start adding ads in the free tier. And speaking of OpenAI and ChatGPT, the next story is that OpenAI is launching age prediction for ChatGPT products. So this is going to let them determine if a user is a minor using behavioral and account-level signals, stuff like account age, activity time, usage patterns, and the user's stated age. And this is, of course, following some very unfortunate stories in the past year where certain people that had some cognitive issues and got very deep with ChatGPT could be driven to bad things. There's been lawsuits over this. There's laws being written up and actually passed with regards to minors and AI. So this is one of those alignment things. Or not even an alignment thing, this is a safety thing that, in hindsight, wasn't a big part of discussion: what happens when kids and minors talk to AI that's powerful? I don't think anyone was, well, I'm sure people in the safety community might have been thinking about it, but it clearly has become more of a factor, and age prediction is probably necessary at this point. Yeah. And it's interesting, as ever, this challenge of false positives and false negatives when you try to bucket things this way. Where do you draw the line? What's your appetite for all these things? What we do know is they say that the model looks at a combination of behavioral and account-level signals, including how long an account has existed, which seems like it would be handy in setting a pretty concrete floor. Typical times of day when someone is active, usage patterns over time, and a user-stated age. So all of this, with the exception of user-stated age, which obviously has been an impeccable barrier preventing people from accessing the internet before they're 18 since time immemorial. All these other signals kind of have to accumulate over time. You don't instantly, when someone creates their account, know how long the account has existed. It doesn't give you much information. Well, it gives you some, actually. But times of day when they're active, it takes a while to figure that out. Usage patterns over time. These are all things that need to be built up. And so I'm curious about how... does this mean that when you start by creating a new account, you're not able by default to have a very wide range of interactions? And over time, as it validates that you're older, that shifts? I think we're going to see that kind of conclusion be drawn, or it will become clearer which side of the fence they're on there. They did also say that OpenAI is apparently attempting to prepare for the launch, okay, so pretty tentative, of an adult mode that will allow users to create and consume content that would be dubbed not safe for work. So this is the flip side of that for OpenAI: if you can solve this thorny problem, you can allow an awful lot more interactions, say Grok style, with the platform that you couldn't before. So the stakes go up, the tools kind of increase, and it feels like just another escalation in the long line of escalations we've seen with the stakes of AI deployment. And I think this is particularly interesting in the context of younger kids, minors, really teenagers. 
We've seen some very unfortunate stories where you have to be a little more careful, but they're sort of cognitively already fairly mature or advanced. But it occurs to me, you know, chatbots are conversational, right? And when you talk to kids, you kind of have to adjust the way you talk to someone. But by default, ChatGPT talks the same way to everyone if it doesn't know your background. So it's kind of an interesting technological scenario. And as we've seen over the past couple of decades now, kids are growing up staring at screens and using swipe on iPad from the age of two. And now it's going to start. They're going to start, I don't know if that's your baby, but I would imagine they start talking to AI at a very young age and just having conversations. Yeah. We're not doing any screens or anything like that until she's a fair bit older. It's one of those things, right? Where it's like, if you look at how Zuck raises his kids, or raised his kid, I guess, yeah, we have the one daughter, I think. Everybody who is at the frontier of this technology does not trust it, like does not trust it near their kid's limbic system. It's just an unfair fight. You've got armies of, not even psychologists, which is like computing factories that are just optimizing against your child's limbic system, and digital crack. And in a sense, you know, ChatGPT, one of its uses is intelligence augmentation. This is something that has been brought up as a concern in schools. If you start using it as a crutch, like, your basic intellectual capabilities are going to be hindered. So the question is also when to start using it, and when to start using it aggressively too. Because you could make the case that, to some degree, this is overdone. People overuse this. But to some degree, the same was true of books. There was a whole debate where people were like, well, if you start storing information in books, now people's memories are going to get really crappy because they're not forced to memorize everything. Which was true, but worth it. Same with calculators, same with other things. I do think this time is different. But still, we can't be complete Luddites about this. I don't know. As a parent, I'm trying to figure it out. If anybody has thoughts, let me know. We'll figure it out together. And speaking of intellectual development and education, the next story is about Gemini. Google is going to be offering free SAT practice exams powered by Gemini. So in Gemini, you could just say, I want to take a practice SAT test, and it gives you an actual SAT test with questions. You enter a question, it tells you if it's correct or not, and has an explained-answer thing. For people not in the US, the SAT is the standard test that you will take in high school that often helps inform where you get into college. So I think a large majority of high school kids take this exam. I know I did. I know I did practice testing on my own from some online website. So I would imagine this gets a lot of users kind of indirectly, but there you go. This is kind of the opposite case, where it's meant to help you practice and get better at a skill. Yeah, I think something similar with like the GRE, right? All those big standardized tests that America loves to give to its students at all levels. This is something, I mean, everybody I know, you get to a point where you do the practice test and then you start to memorize and you overfit. And so having a fresh set of training data for yourself is really cool. It's a great use case. Great use case, yeah. 
And one more story in the section, moving over to China: Baidu's AI assistant reaches a milestone of 200 million monthly active users. So really just a quick update. This is about the Ernie assistant in the Baidu search engine app. So this is in a sense the equivalent to Gemini in China. And I think most of us in the US AI community kind of aren't aware of Ernie and the ChatGPT-esque landscape in China. But clearly it is expanding. And Alibaba is also in this race, has the Qwen chatbot that they are looking to expand into their consumer ecosystem as well. Yeah, it's also because the Chinese ecosystem is so much more cordoned off. You're getting this opportunity to see market penetration. I mean, China is, what is it, like 1.4 billion people now, something like that. So, you know, 200 million. If you get rid of all the old people and the really young people, you're seeing decent penetration here. Not all the way, obviously, but decent. And that's all Baidu, right? China's Google. So yeah, I mean, they got the, they have, I shouldn't say they have the infrastructure. They're struggling like all Chinese companies, but they have more infrastructure than a lot of others. Right. And it makes me wonder, you know, what are the personalities of these chatbots? Are they different from ChatGPT? Because, I don't know, it's been an interesting thing where Gemini and ChatGPT and Claude, on the one hand, they're all kind of similarly intelligent and the user experience is similar, but then on the other hand, they have kind of different personalities too. Yeah. Yeah. I mean, the Chinese ones really have a sense of humor, I think, because when you prompt them, they'll like respond with these funny stick figure doodles every time. So I've struggled to get value out of them. Not sure. Not sure. That's true, but it would be fun. And now moving on to applications and business, we begin with a real development in the startup space, but also just some fun, juicy drama coming out of Silicon Valley. The drama is over at Thinking Machines, and it is that some high-profile people within the company, in fact three co-founders, wow, I didn't realize it was that many, three co-founders have left Thinking Machines. And as they left, they immediately went over back to OpenAI. So for context, Thinking Machines was founded, I believe, last year, maybe 2024, headed up by Mira Murati, who was originally the CTO over at OpenAI, a major figure. There was famously the kind of board split that happened, and afterwards Mira left and founded this largely with a lot of people from OpenAI. Well, now Thinking Machines has been out for a while. They have launched one product that I'm aware of, focused on fine-tuning. And it appears that things aren't going so great. So after these three people left, also recently there's nine other Thinking Machines employees out of roughly 100 that have left for OpenAI or received offers from them. So yeah, it's a pretty major kind of startup drama to have so many co-founders, or co-founders generally, leave. And in this case, the background of it that's coming out is also dramatic and juicy, where apparently there was an office relationship that was a factor in this. There were also some arguments as to amount of influence and control. Mira confronted these three people about whether they had offers lined up at another company. And in fact, clearly they did, because they immediately went over to OpenAI. 
It's not the same thing quite, but it rhymes with the Sam Altman firing scenario. So clearly, I hear Barret Zoph, who's actually one of the originators, along with William Fedus, of the mixture of experts architecture. So he's probably the highest profile departure. The allegation there was that there was some kind of like workplace relationship thing. It's always been very fuzzy. No one can box it in for us. I've seen like six different ways to phrase like "inappropriate workplace relationship" or something. But anyway, so he was fired, and "unethical conduct" was the most recent phrasing I've seen. Then there was a whole exodus that came along. Like you said, there was Luke Metz, Sam Schoenholz. Actually, I've only ever seen his name written. And so all heading to OpenAI. Kind of feels like, you know, you fire Sam Altman, then all the people say, okay, screw that, I'm leaving, going with Sam. And second time Mira Murati has sort of been in the seat seeing that happen. So sort of funny. But one interesting note here is who had left earlier. John Schulman left for Anthropic shortly after the founding of Thinking Machines. So it's been more of a like drip, drip, drip, and now a flood. And all of it seems, or a lot of it, tied to the failure of Thinking Machines to hit their $50 billion targeted valuation. That didn't happen. There was apparently an offer from Meta on the table to acquire them. Mira Murati wanted to say no, and a lot of the people there wanted to say yes, having just seen presumably Alex Wang make off like a bandit on that $15 billion acquisition of Scale, de facto acquisition of Scale. Maybe something similar on the table here, and that sounds like a pretty good pay package, but we'll have to wait and see as more details trickle out. Right. So there's some internal argument about the technical direction, which is also a factor of this. And aside from the drama, right, this is, to some extent, notable. Thinking Machines was one of the bigger efforts to come out of OpenAI, I guess, to spin off. Anthropic originally was a spin-off of OpenAI as well. And with the amount of talent and amount of fundraising, Thinking Machines was poised to become a pretty major force as a startup. But they've been kind of quiet, right? There's not been too much of an impact. They did release what seemed to be a pretty good offering for fine-tuning, but nothing that you couldn't do kind of already, and nothing that most companies really care about at this point. So clearly there's a lot going on here with discussions of the direction of the company, with Mira also having a lot of control at the top of how this is going. You know, it's been a little while since we had some juicy AI drama, so I guess it's nice to have that. Now on to a more serious topic. As always, we got to talk about China and chips, and here the story is that Zhipu AI has announced the development of their first major model trained on the Huawei stack of AI compute, as opposed to NVIDIA's. So they have this new model, GLM Image, which is similar to ChatGPT's image generation, where it's a more advanced model capable of editing and so on. And the claim here is that they did this entirely on Huawei's Ascend AI processors. And this is a mark that it is now feasible or realistic to do major AI training on only Chinese hardware, right? Yeah. And it's not just about the hardware, but also the software stack. 
So you've got the Ascend chips, which we've talked about ad nauseam on the podcast before, but which generally are, they're really well-designed, actually. Huawei is really good at design. The struggle here is obviously on the fabrication side. SMIC doesn't have the same exquisite fabrication capabilities that TSMC does, but really well-designed, designed to work in huge numbers. They're pretty energy inefficient, but that doesn't matter because China's got tons of energy, and that's what's driving this design choice. But separately, the MindSpore framework, right? So typically, what you'll see is Chinese firms will rely on PyTorch or TensorFlow, right? These, like, you know, Meta or Google frameworks to manage training. This is where Huawei's MindSpore is being used, along with this CANN, Compute Architecture for Neural Networks, which is essentially the shift away from CUDA as well, right? So I mean, that's its own crazy thing. So we're seeing off CUDA, off TensorFlow, off PyTorch, and then also off of the kind of Western chip stack. So all kind of happening at the same time. Hard to say what parity is like, but when you think about the Ascend 910B or Ascend 910C, roughly think about the H100, but worse. Like, maybe the A100 is a better comparison point. So it's the kind of hardware that was used to train the original GPT-4. So, you know, pretty far behind, but they're really, really good at networking these things together. And Chinese labs, because they're constrained by access to compute, they focus a lot more on algorithmic efficiency. And so you can think of the pressure on MindSpore, right, and on CANN, as being much more oriented towards efficiency, because it's survive or die for them in terms of the algorithmic efficiency side. So, yeah, it's quite significant, quite important. Apparently, shortly after this was announced, so Zhipu went public on the Hong Kong Stock Exchange, and shares are up 80% since then. So seems like all of this is cool by the markets. But yeah, there's a bunch of caveats to this. This is like a 10 billion parameter, 15 billion parameter image model. It's not the same thing as a trillion parameter model, like think GPT-5 or whatever. And all of the truths that we've talked about over and over on the podcast regarding EUV versus DUV, regarding TSMC versus SMIC, all these things remain true. This is just kind of, think of it as a proof of concept, but an impressive one, that China can go fully domestic for at least an end-to-end stack on a, say, 10-billion-parameter model. Right. And this is coming, I think, just 15 days, two weeks, since that IPO. And the IPO was somewhat successful, not a huge success. So this is perhaps also a bit of a business move, a way to position this in particular as: we are using Chinese hardware. We've discussed a lot how there's a lot of uncertainty about their ability to have access to NVIDIA hardware in the future. The Chinese government is encouraging companies to go over to Huawei. So from a pure business perspective, this does actually signal kind of a competitive advantage. Fun fact, as this happened, there was a second news story about Zhipu, where they had to limit access to their coding agent due to rising demand. So they are now letting people, I guess, do less coding with the GLM Coding Plan. And this is an active problem in China, where the inference requirements, the population is so massive, right? 
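As an aside on what "off PyTorch, off CUDA" concretely looks like: below is a minimal sketch of targeting Ascend hardware with MindSpore, using MindSpore's documented public API as best I know it. This is illustrative only, not Zhipu's actual training code, and it assumes a machine with Ascend NPUs and the CANN toolkit installed:

```python
import numpy as np
import mindspore as ms
from mindspore import nn, Tensor

# Target Huawei Ascend NPUs (via the CANN stack) instead of CUDA GPUs; this
# setting is roughly the equivalent of picking a CUDA device in PyTorch.
ms.set_context(mode=ms.GRAPH_MODE, device_target="Ascend")

class TinyNet(nn.Cell):                  # nn.Cell is MindSpore's analog of nn.Module
    def __init__(self):
        super().__init__()
        self.dense = nn.Dense(16, 4)     # MindSpore's linear layer is called Dense
        self.relu = nn.ReLU()

    def construct(self, x):              # 'construct' plays the role of 'forward'
        return self.relu(self.dense(x))

net = TinyNet()
x = Tensor(np.random.randn(2, 16).astype(np.float32))
print(net(x).shape)                      # (2, 4), compiled for the Ascend backend
```

The point is that the whole path, framework, graph compiler, and kernel library, is domestic: swap `device_target` and nothing in this code touches CUDA.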
And the inference requirements are so huge and their chip availability is so limited that there's very little left for actual R&D. So this is kind of one of the key structural challenges that the Chinese companies are facing. And on to the fabrication side of chips. The next story is that Samsung's US Taylor facility is reportedly becoming the top choice for customers looking beyond TSMC, as opposed to things like Intel, which is apparently having execution challenges. So for people who don't know, there's only a few major players in the chip fabrication business. There's TSMC, and then Samsung and Intel are a couple other ones. And beyond that, you have things like SMIC that are a little bit behind. I personally don't know too much about Samsung and Intel as foundry players, but I imagine, Jeremy, you have some to say on this. So this story is really interesting because it's a fundamental shift. First of all, first of all, I keep doing it. This is so obnoxious. Last Week in AI, I called this one like a long time ago. I can't remember which episode, but it was like two years ago or something. We were talking about scaling and we were like, there will come a time. There will come a time when TSMC, which traditionally their largest customer was Apple, it's going to shift to NVIDIA. There's going to be a shift where the most important customer who's basically subsidizing the leading node development at TSMC becomes NVIDIA. And when that happens, we're going to see a sudden switch in who has leverage between NVIDIA and Apple. And that moment has actually come. And this is all kind of part of a massive rearrangement in the economics of foundries and fabless chip design companies. Okay, so zooming out, back in 2024, 2025, the main thing that was bottlenecking our ability to ship chips was not actually fabrication. It wasn't the ability to make these fancy logic dies that do all the computing. It was actually packaging. It was the ability to take those fancy logic dies and marry them together with the memory stacks, the high bandwidth memory, on a little chip, right? So this integration uses a process called CoWoS, chip-on-wafer-on-substrate, and it was the main bottleneck. That's changing. One of the challenges, though, is that it's this one thing that everybody needs. So regardless of which generation of chips, we talked about this last episode, the Blackwell uses one node, Hopper uses another node. So Blackwell might use, I can't remember, the three nanometer node or whatever, the Hopper uses the four nanometer node. And so they can be produced in parallel, but they both require CoWoS. So that becomes a sudden bottleneck. What's happening now is, so exacerbating factors, like the Rubin factor here is a big thing. NVIDIA's next big chip is going to be the Rubin. They use CoWoS-L, which is a more advanced version of packaging technology. It's even harder to manufacture at high yields. So that's kind of putting even more pressure on this. But now also, basically all the packaging is happening in Taiwan. So these US customers are in a position where they can even get the chips fabbed locally in the States, but they have to get shipped back to Taiwan for packaging, and back again to usually Arizona to be finished. So it turns out that while this is happening, yes, packaging is a bottleneck, also now we're getting to the point where even the logic fabrication is becoming a bottleneck again. 
So the demand has surged so much that both of these kind of dual bottlenecks are happening. And TSMC is saying that they basically have oversubscribed capacity for both the three nanometer and two nanometer node, which is the most advanced one. And there's a three to one demand gap. So basically, like, they've got three times higher demand than what their current maximum output can be. And this means the companies like AMD, NVIDIA, Qualcomm, Apple, all these companies have to look at, who else are we going to use? Like, TSMC just can't meet our needs. In that context, Samsung starts to look really interesting. They're coming up the middle here, from the back. It's the next place you will tend to look, mostly because Intel has had some really bad problems with yields, delayed production timelines. There's all kinds of issues with Intel. So really, Samsung is the only game in town. Samsung has a facility in Taylor, I think it's Taylor, Texas. The town's name is Taylor. And what they have there, really crucially, is an integrated flow where they do both logic production, so the logic die, and advanced packaging, all under one roof. And so that lets a company like Tesla or Qualcomm bypass the TSMC waitlist completely. And that's actually what's happening. We are also seeing some people kind of split the works. They use Samsung for the logic and they turn to Intel for packaging. There's an interesting reason for that, but the details don't really matter. Bottom line is, this is a pretty big fundamental shift where now Samsung actually gets to play. And it's just coming from this massive, massive demand that's existing now for all these advanced nodes around the two nanometer level, because you just can't compete without it. And if TSMC is oversubscribed because NVIDIA is taking up all that capacity, if you want to be in the chip game, you got to go somewhere else. And now on to the other end of the compute spectrum, the data centers where all those chips are used. Elon Musk's XAI has launched the world's first gigawatt AI supercluster. So they have turned on Colossus 2, the first gigawatt-scale AI training supercluster, ahead of OpenAI and Anthropic, which are also trying to get there and are being delayed relative to XAI. It's got to be training Grok 4 and so on. And there's been some reporting as to how this was done. This was constructed rapidly using on-site gas turbines and Tesla Megapacks for one, with those supplying a lot of the electricity demands, going beyond the regulated amount, actually. So these gas turbines, they just placed a bunch of them beyond what they're permitted to do. Classic Elon Musk, just break the law and then get away with it and get ahead of the competition that way. And of course, that's one of the challenges with data centers: you need all this permitting, you need to get the energy infrastructure, talk to power suppliers, et cetera, et cetera. That slows things down a lot. And XAI is able to, let's say, be more aggressive, or is willing to be more aggressive, and get ahead of competition. Yeah. Well, another dimension of this too, in fairness, is also that they control, through Elon's companies, an entire independent, just-for-them supply chain of Tesla Megapacks. And so this is one of the big things where when you go to make a new site, like a new data center, you're like, where am I going to get my power from? And that's really the first question. 
By the way, there's a whole ecosystem of builders who are going to lie to your face and say that, yes, we've secured 500 megawatts of power for this site and we can put it online as of mid-2026, and they'll say all these great things. And then when you dig into it, or you sign the lease agreement or whatever and you start building, you kind of realize, like, wait a minute, there's this weird caveat where, because the local town has this, like, blah, blah, blah, we can't do the thing until three years later. And it's like, aha, gotcha. So having the ability to autonomously solve, they can't solve all their power problems, but solve a lot of their power problems through a supply chain that is just for them is a massive structural advantage that Elon's companies enjoy, or that XAI enjoys thanks to Tesla. And so that's really kind of this, it's part of the secret sauce. It's not the whole secret sauce, as you indicated, but it's part of it. One thing is, I just took credit for Last Week in AI calling something, last story. So here's one thing that we got wrong just last week, at least that I got wrong. We were talking about Anthropic's new site in New Carlisle, Indiana. And we were saying, we think this is going to be the first gigawatt-scale cluster. And that was, I think it was off an Epoch AI analysis, because they thought, hey, this is what it looks like it'll be. Turns out Elon just came out of nowhere and beat them to the punch. So this really is a bit of a surprise, this announcement, at least on my radar. I'm sure other people were tracking it, but seemingly coming out of nowhere in the last few weeks, boom, first gigawatt-scale cluster. So there you have it. That's right. And it tracks with one of Elon Musk's core tenets of running and competing, which is to have constantly a maniacal sense of urgency, and that has really been shown with this. And yeah, to your point, it's a mix of kind of ignoring limitations imposed by kind of geographical community concerns, but also just doing some crazy shit like getting all these batteries in. For Tesla, famously, they put up tents and had additional production chains just kind of scrambled together. So classic XAI, classic Elon Musk. And now on to some fundraising. We've got Humans, a human-centric AI startup founded by some notable figures from Anthropic, XAI, et cetera. They have raised a $480 million seed funding round at a $4.48 billion valuation. We don't know too much about what they are planning to do. Some of the notable former employees are people like Andy Pang, Eric Zelikman, Noah Goodman. And as with some of these other split-offs, pretty significant figures at these other AI players. So interesting to see what they are going to be trying to do in the space. So when Thinking Machines launched, I was like, okay, you're a new one. I'm impressed. I'm impressed with the people. Don't quite get it. That's okay. I'm sure we will. And I'm still trying to figure it out. This one, at least they paint a picture for you that seems vaguely differentiated for right now, right? So they say the startup aims to use software to help people collaborate with each other, think an AI version of an instant messaging app. So the basic form factor is apparently going to be different, more like DMing with people, and then there's AI assistants, than just chat with a chatbot. Kind of interesting. So probably they'll just get acquired by Meta. That's my guess. But anyway, it is a hell of a pedigree. 
Obviously, as you indicated, Andy Pang, formerly at Anthropic, led post-training reinforcement learning for Claude 3.5 all the way through 4.5. And then also, George, I guess it's Georges, it looks like a French spelling, Harik, who was Google's seventh employee. So there you go. You hire Google's seventh employee, you can raise half a billion dollars at 4.48 billion. It is a seed fund, a seed round. And so that's like another one of those blockbuster seed valuations, pretty wild. Jeff Bezos is on board. Laurene Powell Jobs' firm, Emerson Collective. Google Ventures, I mean, it's, you know, SV Angel. Hey, there you go. Get Ron Conway on the cap table. Perfect. So this is really like the who's who in the zoo in Silicon Valley. Right. And it's an interesting mix of investors, apparently led by Georges Harik, who is also a co-founder. I guess, as the seventh employee of Google, you have enough money to lead a seed round. Also backed by NVIDIA, Jeff Bezos, Google Ventures, a whole bunch of different organizations. And to your point, they do say a little bit in the announcement that they are aiming to focus on this collaboration with people, and saying that there need to be innovations in long-horizon and multi-agent reinforcement learning, memory, and user understanding. It seems to be kind of hinting that, in the realm of agentic AI, Claude Code, et cetera, we are sort of at this place where we bolted this agentic aspect of AI on top of chatbots, and we are now just starting to train agents in a more agentic way with reinforcement learning, et cetera. As it turns out, Claude Code in particular has proven out that people can work with these AI agents and do very powerful things. So yeah, actually it occurs to me that this is a good time to launch with this kind of focus. On to projects and open source. First up, we have Black Forest Labs releasing FLUX.2 klein. This is a compact image model that is meant to be for interactive visual generation on consumer hardware. Also very fast, sub-second generation, and editing capabilities. These are smaller, but still pretty big, 4 billion and 9 billion parameters. So you'll need a very modern, powerful GPU, but yeah, kind of another release from Black Forest Labs, which has released a decent number of these Flux models that, as far as I'm aware, in the realm of text-to-image are still some of the best offerings in open source. Yeah, apparently they're optimized for response times below one second. And then it says, on modern GPUs. And so your budget just went up by a factor of 10. That's okay, though. If you can get a modern GPU, it's just below one second. Yeah, so it's as ever funny to see what qualifies as "on consumer hardware". It's a moving target, but yeah, Black Forest Labs very much still pumping them out. Pretty interesting, and again, curious in this space, like, when do we get saturated? This is the thing I always say with image models, everyone, you can roll your eyes, is just at what point do we say vision is solved? Probably a lot further along than I think. It turns out the frontier is actually image editing, right? Nano Banana, ChatGPT image generation, that's one thing, but like generation conditioned on images and generation revising images, there's actually quite a bit of work left there. And next up, we've got Molmo 2: open weights and data for vision-language models with video understanding and grounding. 
This is from the Allen Institute for AI and the University of Washington, which have put out many very open source releases with both chatbots and data. So, yeah, the argument here is that this is a best-in-class 8 billion parameter model that outperforms other open models dealing with short videos, counting, captioning, things like that. They do also release a fairly long technical report giving you the data, training recipe, kind of all the gritty details of how this came together. Yeah. And the main thing they're going after here is this idea that a lot of visual language or video language models sort of lack grounding. Basically, they'll follow your prompt well, but they can't specifically, like, point to a pixel or some object and track it over time in a robust way. They tend to basically just, this is open source models, they tend to just be like distilled versions of proprietary models. So they have a ceiling on both capability and transparency. What they're doing here is basically releasing a bunch of data sets. That's a big part of this, just a data-centric approach instead of just scaling parameters and that sort of thing. So 70 video data sets. And these are just like for pre-training, fine-tuning. They've got detailed video captions, free-form video Q&A, which is sort of like deeper engagement with the video, and then also, crucially, a video pointing and tracking data set. So these, I should say, are data sets for point-driven grounding. So basically they'll let the model point to objects in a video and train it to do that. So that's to solve that kind of big open problem that they're looking at. They have a bunch of other, as you said, nitty-gritty details. They've got ways of kind of modifying the loss function to help the model focus on the parts of a video that are most informative, as well as a bunch of different approaches to packaging and representing the data in the video that goes into the training. So we've covered images, we've covered video. This one is a well-rounded open source round. Next up, we've got Hartmuller, a family of open-sourced music foundation models. So this has all of the music and audio needs that you might have: audio-text alignment, lyric recognition, music tokenization, and multi-conditioned song generation, all released under Apache 2, so super permissive. They also released a lengthy report with a dozen pages going into it, and they have a website where you can listen to samples. I just gave it a quick try. It feels like kind of an earlier generation of Suno and these more commercial offerings, definitely more noticeably AI music. But the space of song generation, music generation, is much less busy with potential options for good open source models, so this is kind of a notable addition to the open source landscape. Yeah, and they kind of show just a side-by-side. And this is, by the way, when it comes to music models, you shouldn't listen to me at all. I know nothing about the encoding schemes or whatever. I occasionally will read up on it and re-remember, and then I will re-forget, and I'm in the re-forget phase of my cycle. But what they do show is side-by-sides of the performance on, let's say, six key figures of merit. And what they're showing is basically that this model is more or less equivalent to the frontier of capability among closed-source models in the space. 
So roughly speaking, it seems like it could be a big deal in terms of being, I don't want to call it a DeepSeek moment for open-source music generation, because I don't know what the hell I'm talking about. For DeepSeek, we can go deep and talk about it. This one, I don't know if it's accurate, but at least the evals they're showing suggest it's nominally up there. It may not pass the vibe check. So check it out yourself, maybe. And just one more thing. We've got a benchmark. It's AgencyBench, benchmarking the frontiers of autonomous agents in 1-million-token real-world contexts. So I think we've covered a lot of agent benchmarks at this point, with a lot of focus on coding. Here they have 32 real-world scenarios with 138 specific tasks. Those involve queries, deliverables, rubrics. The tasks are complex, requiring approximately 90 tool calls, so 90-odd actions, you could think of it, consuming 1 million tokens or more, and taking hours to complete. So this is trying to get at the edge, the frontier of agent capabilities, where you are going to have hours of work and really do a lot of complex stuff. Closed-source models achieve an average score of 48.4%. Open source gets 32%. This sort of seems like it's flirting with the bottom end of the METR task-horizon evals. And this is a notoriously difficult space, by the way, making long tasks for AIs to do where you can meaningfully say, yeah, this is a single coherent task and we can hand it off to an autonomous system and compare it to humans on that time horizon. So I'd be really interested to see, if we continue to push that envelope, what it takes. I think a lot of it has to be autonomous task generation, because beyond a certain point you just can't get humans to design tasks that long, with that many tool calls that you're actually verifying along the way. But anyway, that's kind of where this is. So yeah, a really interesting additional benchmark to add to our pile. I'm sure this will come up as we talk about the new agent releases this year. And it interestingly integrates a user simulation agent with iterative feedback, as well as a sandbox for automated assessment. So this is trying to benchmark a sort of Claude Code scenario in some sense, or a Claude Cowork scenario, I guess, where you are a user interacting with an agent trying to do a real-world task. Very hard thing to benchmark, as you said. On to research and advancements, starting off with STEM, scaling transformers with embedding models. So this is pretty technical. I'm just going to let Jeremy take this one. And this is actually kind of an interesting one, I've got to say. So we need to talk about the way a standard transformer works, basically in the MLP section of the transformer. Usually you have your attention mechanism, which is figuring out what parts of the input to pay attention to. Then you have your MLP, which is going to chew on that data, and then you pass it on through the residual stream into the next layer, right? Okay. So typically the MLP part of those layers has three steps. First you're going to project your input. So you have your input that comes in from the previous layer, and this is going to be some vector, some list of numbers. You're going to multiply it by a matrix, project it upwards to a higher-dimensional space to create a longer vector. And then you're going to do some interesting processing on it.
And then you're going to project it back down to the residual stream's dimension, and off it goes. And normally, at least recently, the way people have started to think about this is that the first step, where you are blowing up the size of that vector, is doing an operation akin to generating a key, in the sense of keys and values. Keys tell you, hey, I'm here, here is all the information that I have on offer. The token is broadcasting to the world: here is the information that I can share. Then you do a bunch of processing on it. And when you project back down to the residual dimension space, usually that's interpreted as retrieval of the corresponding values for that key. So basically you're saying, hey, here's the key. You blow it up. Here's what I have to share. I'm a token, here's the information I contain, what do you want to do with that? And then the process of chugging on that and spitting out something interesting, that's the last step, where you retrieve the values. And so the life of a token going through this process is: you come out of the previous layer, you go through attention, you enter the MLP. Now, instead of multiplying that input by a big matrix, which is the usual way this works, blowing you up and essentially doing this key-generation step where you're saying, hey, here I am, here's the information I contain... that's computationally expensive. That involves matrix math. Ugh, matrix math. It takes a lot of time, so let's not do that. So instead, what we're going to do is assign an ID that's unique to every token as it's being fed in. "The" has an ID. "Eating" has an ID. Every chunk of text has a unique ID, and you feed it into the bottom. And instead of multiplying the vector that represents that token at that level of processing by a matrix to blow it up and generate the key, you're going to go, hmm, what's that token's ID? Let me look it up in a lookup table, where you have an embedding that is trained over time, but it's just a straight-up lookup into a table: this embedding, for that token, at that layer. It's unique to that layer. And you pull in that embedding. You basically do a memory retrieval instead of a matrix operation, and that's much, much faster. It means you've cut out this whole matrix multiplication step. So now you proceed to the middle calculations that normally happen before projecting back down. Now, the key thing is that normally the upward projection with that matrix multiplication is context-aware. In other words, the matrix you're using is a learned matrix; it learns to account for context in the overall prompt, not just the token you're looking at. The problem is, when we do this token retrieval thing and we're just looking up, in a library of embeddings, the one to swap in for that token, that process is not context-dependent. You're losing context. And so what they realized was, in most cases there's a SwiGLU used in between the generation of the key and the down projection, and the SwiGLU part is context-aware already. So there's a redundant use of context in that process. So they went, you know what, we can throw out the context-aware matrix multiplication that usually generates that key.
We can swap it out with just retrieval from a table of embeddings. We pull that in, and we know the next step in the processing, the SwiGLU, is going to be context-dependent anyway. We project down, and then we're good to go. This has a really, really important consequence for the interpretability of the embeddings you get for those tokens. When you multiply by a matrix that is context-dependent in that way to generate your key vector, the usual way, the problem is you're mixing all that context in there. And so the embedding you get, the vector, is this weird mangled Frankenstein monster that combines, yes, the token that you put in, but also the meaning of the sentence around it. When you just do a table lookup, you're exclusively looking at the meaning of just that token at that layer. And what this means is that the representations of all those tokens can end up much more cleanly resolved as you learn, because as you learn, you're iterating on the representations of the tokens you're pulling out of that retrieval process. Those tokens end up being semantically much more distinct; the overlap between them, the cosine similarity, is very, very low. And this also has the advantage of allowing you to offload a lot of the work to the CPU, because the CPU can handle retrieval; it can't handle the matrix multiplication well. And there's a whole bunch more detail. This is actually a really, really interesting paper. The net result is they can cut down the amount of compute required to reach a given level of performance fairly significantly, by about a third, precisely because the MLP layers hold the dominant fraction of the parameters in the network, and you're functionally getting rid of a big chunk of, not the parameters, but the processing involved. So really interesting paper, great implications for a whole bunch of things, including even training stability. Check it out if you're interested. Right. The motivation for this problem, I think, is also interesting. They start off the paper by talking about how sparse computation in particular is a key mechanism for realizing the benefits of parameter scaling laws. Basically, you can get more compute for your buck in terms of FLOPs: you can get more intelligence while the amount of compute used at forward inference time stays static. Mixture of experts is the classic thing we've discussed a lot, where by having more experts, you are not necessarily raising the number of parameters being activated, but you get more intelligence. And this is related to that. So they say that mixture of experts has some issues: training instability, you need algorithms for routing, you need all sorts of annoying stuff. And this, in a sense, is a related approach. They call it static sparsity, which means you have a broader, I guess, space of training to use, and it's actually easier to train while achieving efficiency gains similar to MoE. And just a fun fact I found while browsing through this: the initial paper they cite is "Hash Layers for Large Sparse Models" from back in 2021. So as with a lot of stuff we've been discussing lately, the algorithmic idea itself is not new. It was introduced in a prior paper a few years ago, actually a long time ago in the era of GPT-3, by Facebook AI Research, in fact.
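Before moving on, here's a minimal sketch of the mechanism Jeremy described, in PyTorch. To be clear, the names, shapes, and exact gating arrangement here are our own illustrative assumptions, not the paper's actual code: the learned up-projection that would normally generate the "key" is replaced by a per-layer embedding table indexed by token ID, while a SwiGLU-style gate computed from the residual-stream input keeps the block context-aware.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LookupMLP(nn.Module):
    """Sketch of a STEM-style MLP block: the learned up-projection that
    generates the per-token 'key' is replaced by a per-layer, per-token
    embedding lookup. Names and hyperparameters are illustrative."""

    def __init__(self, vocab_size: int, d_model: int, d_hidden: int):
        super().__init__()
        # One trainable embedding per token ID, unique to this layer.
        # Reads from this table are memory retrievals, not matmuls,
        # so the table can live in cheap host/CPU memory.
        self.key_table = nn.Embedding(vocab_size, d_hidden)
        # SwiGLU-style gate computed from the residual-stream input;
        # this is what keeps the block context-aware.
        self.gate_proj = nn.Linear(d_model, d_hidden, bias=False)
        # Standard learned down-projection back to the residual stream.
        self.down_proj = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) residual-stream input at this layer
        # token_ids: (batch, seq) IDs of the tokens at each position
        key = self.key_table(token_ids)       # table lookup replaces one matmul
        gate = F.silu(self.gate_proj(x))      # context-aware gating
        return self.down_proj(key * gate)     # project back down
```

The thing to notice is that the `key_table` read is a memory lookup rather than a matrix multiply, which is what would let it be offloaded to host memory; only the gate and down-projection remain as matmuls.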
But now this idea that's been around for years has been tested and deployed. They do this at a pretty large scale, 350 million parameters and, I think, around 1 billion parameters, and it's done by some people at Meta AI in addition to CMU. So I recall maybe last week we had another example of a paper where a component of the transformer architecture was improved in a slightly more interesting way; maybe that was DeepSeek, if I remember correctly. So it's interesting to see some of these ideas from research that have been around a while being shown to be useful in transformer architectures and actually getting deployed. You're super right. This has DeepSeek smell. You're right. On to the next paper: reasoning models generate societies of thought. This is from Google and several other groups, the University of Washington and the Santa Fe Institute. The basic summary of the paper is that reasoning models, meaning models optimized for reasoning, usually via reinforcement learning, DeepSeek R1, Qwen, QwQ-32B, other models like that, exhibit greater perspective diversity than baseline and instruction-tuned models. So they can argue with themselves. They say the models activate a broader conflict between heterogeneous personality- and expertise-related features during reasoning. So you look at the reasoning traces and find things like perspective shifts, reconciliation of conflicting views, adopting the socio-emotional roles that characterize sharp back-and-forth conversation. And one of the arguments is that this accounts for the accuracy gains on reasoning tasks over models not trained for reasoning, not trained via RL, just trained via supervised training. So this squares well with other research we've discussed that seems to suggest reasoning models explore more broadly, are just more well-rounded or, I guess the framing here is, more diverse. And there's some interesting experimental evidence on the behavioral side in this paper. Yeah. And unlike supervised fine-tuning, and I think we talked about this last week, actually: supervised fine-tuning basically trains your model to replicate the pattern of speech of whatever dataset you're training on. RL trains on an objective of, did I get it right or wrong, roughly speaking. And so you see the consequence here: the kind of thing the model learns during RL is ways of problem solving rather than facts about the world. And that's really coming across here. In fact, they point out, as you said, there's a spontaneous emergence at a certain level of scale with RL. You just train on an objective. You say, hey, get the answer right. And spontaneously and consistently, you find the emergence of this dialogue behavior, where suddenly in the chain of thought it seems like there's a bunch of different people talking to each other. And that's not trained for. It's just spontaneous. And again, pretty consistent across many different models, not all, but there's a suspicion that for the models where you don't see this, it might be because they're just too small. They don't have the capacity, parameter-wise or architecturally, to learn this over time. But when you do, you do. So this is quite an interesting paper. One of the ways they play with this is using sparse autoencoders, which, you know, it's good to see another use for them outside the traditional interpretability space.
So SAEs, sparse autoencoders: basically you take the activations, say, of the model at one layer for a given token. And what you do is make a model that converts those to a higher-dimensional, in this case very long, vector. So you're making a mapping from the activations to a larger list of numbers, a larger vector. And then you're going to map that back down and try to replicate the same values that came in. So let's see if we can essentially expand the representation of the activations and then reconstruct them, recompress them. But the key is that the vector you expand to is really, really long, and most of the entries are zero. Only a small number of the entries are nonzero. The way it works out is that the few entries illuminated in that larger representation are actually human-understandable, because you're no longer constrained to the really small vector of activations you started with. You're now looking at a much, much larger thing, so you don't have to have one neuron keeping three different concepts in mind. That's the idea of polysemanticity, where one neuron is actually capturing a whole lot. Here, you're trying to expand to the point where the information can relax and spread itself out, and a given entry can pretty clearly be identified with the color blue or whatever concept. So what they do is exactly this. They use an SAE, and what they find is that there are entries in the SAE that correspond to this idea of, kind of, dialogueness, this notion of question-and-answer sequences, and also conversational surprise. There's a conversational surprise feature that tends to be correlated with tokens like "wait" or "oh" or "aha," right? So those sorts of things you can actually find using this technique. And what they do is amplify that property, that feature, in the SAE to stimulate that behavior. And they find that when they do that, they see better performance. So when you artificially increase the number of ahas and ohs in the text stream, you tend to get better performance on these reasoning benchmarks, which is quite interesting. They do find they can accelerate this process, by the way, by explicitly training models on datasets that involve this kind of dialogue. So you can fast-forward the model through the process of learning this conversational approach by just fine-tuning it on that kind of data. You get to skip ahead of the curve. One weird thing I found about this: you might assume that by fine-tuning on social problem-solving data, you would get a one-time boost. Okay, the model has learned social problem solving, cool, now it's at parity; it fast-forwards you past that part of the learning curve, but then you plateau at the same place. And that does happen with Qwen 2.5 3B. The base model learns to do the social interaction thing, and thereafter everything plays out the same way. But they also include a separate case study looking at Llama 3.2 3B that challenges this idea. Here, they conversationally fine-tune one model, but the monologue fine-tuned model actually plateaus; the RL-only run just plateaus at 18%. And I'm very curious about this. This is a weird result. And it's possible the gap between the two of them is just model capacity. That's kind of my guess.
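Quick aside for the technically inclined: the expand-sparsify-reconstruct loop Jeremy described is simple enough to sketch. This is a generic top-k sparse autoencoder under our own assumptions; the dimensions, the top-k sparsity scheme, and the `steer` helper are all hypothetical illustrations, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Generic top-k SAE over one layer's activations: expand to a much
    wider vector, keep only the k most active entries, then reconstruct
    the original activations. Dimensions are illustrative."""

    def __init__(self, d_model: int = 4096, d_features: int = 65536, k: int = 64):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)
        self.k = k

    def encode(self, acts: torch.Tensor) -> torch.Tensor:
        f = torch.relu(self.encoder(acts))
        # Zero out everything except the k most active features, so each
        # surviving entry can specialize on one human-readable concept.
        topk = torch.topk(f, self.k, dim=-1)
        sparse = torch.zeros_like(f)
        sparse.scatter_(-1, topk.indices, topk.values)
        return sparse

    def forward(self, acts: torch.Tensor) -> torch.Tensor:
        # Training objective: decode(encode(x)) should reconstruct x.
        return self.decoder(self.encode(acts))

def steer(acts: torch.Tensor, sae: SparseAutoencoder,
          feature_idx: int, alpha: float = 5.0) -> torch.Tensor:
    """Hypothetical steering helper: bump one feature (say, a
    'conversational surprise' direction) and decode the edit back into
    activation space, in the spirit of the paper's amplification runs."""
    f = sae.encode(acts)
    f[..., feature_idx] += alpha
    return sae.decoder(f)
```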
Back to that Llama result: I didn't see this in the paper per se, or I don't remember seeing it, but it's possible that this model just cannot RL its way into learning about social interaction. It has to be supervised fine-tuned, forced to learn it, and once it does, then it does well. But that's an interesting little nugget. Yeah. And aside from being quite an interesting paper, and one that's a bit less technical, at least in a lot of its conclusions, so you can actually go read it and find a lot of interesting nuggets, the conclusion or discussion is somewhat notable, right? Because there's an implicit assumption, I think, with test-time scaling, or at least one of the assumptions was, that if you think more or reason more, you get better outcomes. And what you're starting to see with things like this is that it's not necessarily about reasoning more so much as reasoning broader. And one of the implications is that you might want to start looking at multi-agent training, at collaboration among different agents, actually quite related to Humans&, the startup we just talked about. So they say the importance of this paper is that you need to give agents diverse perspectives, personalities, specialized expertise, and they ground this in a bunch of research on human intelligence and group intelligence and so on. On to the next paper: why LLMs aren't scientists yet, lessons from four autonomous research attempts. So this is from a group called Lossfunk, and it's essentially a retrospective on their attempts, as the title says, to build a system that writes AI papers. They have had four of them over the last couple of years; I think we probably covered some of them. And most of them failed. They did get one to the point of producing a publishable paper. And this is a fairly in-depth report on all the things that can go wrong and how you might address them. So as we've seen with these AI-for-science papers, the focus is more on a system-level challenge, I suppose, where you need this whole pipeline of idea generation, hypothesis generation, experimental planning, output evaluation, paper writing, and there are many, many steps, and you might fail at every one of them. They go into the many potential failures and what they have observed. So there are failures in generation. A fun one is what they cite as over-excitement and a "eureka instinct," where in the paper outlining, revision, and experimental output evaluation phases, the models consistently reported success despite clear failures and overstated the significance of their research contributions. A very relatable problem for PhD students. Juergen Schmidhuber. Sorry. Sorry. Nothing more on that. And yeah, another interesting one is lack of scientific taste: models consistently fail to recognize fundamental flaws in experimental design and statistical methodology. And then, having explored these failure modes, they go into design takeaways and what they've learned. A potential alternate title for this paper, the current title being Why LLMs Aren't Scientists, which is, okay, down on LLMs; the alternate title is: we tried to get an LLM to do a fully automated research project four times, and holy shit, it worked one of those four times. That's an alternative title. You can choose your own adventure to some degree here. I think it's worth looking at the case where there was a success, because you can see it was a qualified success.
It's not like it invented some new crazy thing, but there is a certain measure of, damn, okay. So the original hypothesis it was looking at was this idea that when a model is processing a jailbreak prompt, so someone's trying to get it to do something it shouldn't, like generate instructions on how to make a bomb or something, the internal conflict between the model's safety training, which tells it not to answer, and the model's desire to just answer the question, would cause a lot of variance in the outputs, leading to high semantic entropy, basically a lot of variability in the output because the model is of two minds. On one hand, it wants to answer your prompt; on the other, it doesn't, because of the safety training. And this conflict between, in some sense, the pre-training and the safety fine-tuning should manifest as the model outwardly looking confused. I read that and I was like, you know what? That sounds like a pretty decent hypothesis. That's actually kind of cool. It did some initial experiments, and it turns out it did not work. And the reason we know why it didn't work is that there was a revision agent as part of this loop that investigated the failure, so this is still on the automated side. It turns out that well-aligned models give you very consistent, templated refusals. When you get a refusal, it's basically the same text every time, which means the entropy is actually really, really low. And that washes out all the nuance you might otherwise look for. So maybe there's some post-selection strategy you could use, where you say, okay, let's get rid of the refusals and look at the entropy of what's left, something like that. There's probably something there, but it seems like the model did not go that deep. It was more a lesson learned about something that doesn't work. So still, I would say, an interesting paper. I read it and was like, oh yeah, that's interesting. I wouldn't have thought of the hypothesis. I would have been initially surprised for like 30 seconds until I looked at some of the output data, and then I would have been like, no, okay. Would I have tried to publish it as a paper? Maybe just as a blog post. It's a thing; do with that what you will. So they have a bunch of recommendations for how to address these failure modes. Nothing that'll surprise any listener of this podcast. The one key thing that was kind of interesting: make sure you distinctly separate ideation, the high-level thinking, from implementation, to prevent the model from anchoring on methods that have already been tried. Because this is a thing models do: they get excited about an idea that has actually already occurred, because they're pre-trained on that text, and then they proceed as if it's brand new and get all excited, like you said. So there it is. And now moving on to policy and safety. First up, the US Senate has unanimously passed the Defiance Act, allowing victims to sue over non-consensual sexually explicit AI-generated images. Somewhat notable given the recent developments on X, with Grok being widely used to non-consensually create sexually explicit AI-generated images. So now it explicitly allows victims to sue, and presumably creates motivation, or a need, to prevent this even more than before.
It builds on the Take It Down Act, which criminalizes the distribution of explicit images without consent as well. Like so many things, this passed the Senate; now it has to pass the House. So, you know, this is not a fully fledged thing yet. And then, of course, it has to be signed by the president. This being a unanimous pass in the Senate, this is one of those rare cases where Republicans and Democrats might actually collaborate or agree on something, and this might, in fact, become law, but that's not yet the case. Yeah, I wasn't aware of this, but apparently this particular act, the Defiance Act, was passed already in 2024 and then stalled in the House. So in some sense, we've seen it before. But no, to your point, things change over time; there's been an awful lot of advances in this space, and concern in this space, since 2024. One of the reasons this could be a challenge in the House is that the co-sponsor of the bill there is AOC, so she's a more combative, incendiary figure. It's sort of like if you had Marjorie Taylor Greene be the co-sponsor on the Republican side; actually, getting the two of them to co-sponsor might be a really good solution. You might get a miracle here where the US legislative branch actually does something for once. We'll see. But in this development, they do specifically cite Grok and xAI as a motivation for what's happening here. So it's interesting to see us covering these developments pretty recently and this policy change happening very quickly afterward. Next, a paper on safety: building production-ready probes for Gemini. This is more of an implementation or deployment report on actually deployed systems, less of a research idea. They talk about the application of probes to detect cyber-offensive prompts in Gemini 2.5 Flash. Probes are one of these classic interpretability techniques where you look at activations, and that allows you to detect when something is happening in a way that's cheaper than LLM classifiers. You don't have to look at a prompt and have an LLM say, oh, this is bad or this is good; you can essentially classify by looking at the intermediate activations. And they say they have some developments here. They introduce a different probe architecture, which leads to good accuracy on more complex prompts. Yeah. And the idea of probing has actually become more and more in vogue, just as we've started to learn that we can't have faith in the chains of thought produced by these models. So looking only at token space, output space, or input space is less fruitful, or at least it shouldn't be done on its own in a vacuum. And so, yeah, the idea here is you feed your prompt to a language model, and for a given token, you're going to have a set, like you said, of activations. Basically, these are the firing patterns of the neurons at this layer, say in the residual stream. And typically the way a standard probe works is that you take those activations and push them through an MLP, just a little baby neural network, to mix together the activation features and find features that are more complex than would necessarily be immediately obvious from the activations themselves. And after you do that, you're typically going to use these. So activation probes typically calculate two different dot products at the end of that MLP.
So the first is, you've got your processed activation, call it, the activation that's gone through that MLP. You're going to take that processed activation and take the dot product, in other words look at the similarity, with a query vector that you learn. And that query vector is basically asking, I need this information, I'm looking for this information; it's just like the queries in an attention mechanism. That dot product is going to determine how heavily to weight that token when aggregating over all the tokens in your input. So you're trying to figure out: how concerning is this, how much attention should I pay to this particular token, the one corresponding to this particular activation that I've processed in this way? Is this a token that tells me a lot, in other words, about how offensive-cyber-attacky this prompt is, how dangerous this prompt is, or is it a fairly unimportant token? And then, at the same time, you're going to take another dot product of that processed activation with a learned value vector. That dot product predicts the actual class you assign to that token: is it a dangerous prompt or is it not? And so now, for every token, based on the activations you've processed in this way, you know how much attention to pay to that token and whether that token says this is a dangerous prompt or a safe prompt. And when you combine those two facts, what you can do is a weighted sum over all the tokens in your prompt: you know how much to pay attention to each token, you know how much each token is telling you about whether this is a dangerous prompt or not; add it together, and you get your output. That's how it typically works. The problem is that when the amount of text gets really large, you end up overweighting the tails. There are just too many weakly indicative tokens in the prompt, and the signal gets lost. For that reason, they use this thing called multimax. What this does is look at every token and pick only the tokens with the highest "you need to pay attention to this" scores, and totally zero out the rest. For those of you who have some gray hair in the ML world, you might remember L1 versus L2 regularization; this kind of rhymes with that. You're trying to cut the contributions of the tails down to zero and just focus on the most important pieces, really rip out the rest. They've got a couple of other fixes that are similar in principle, ways of ignoring the noise from the mass of tokens that aren't really telling you anything about how risky the prompt is, like some random print statement that you're using to debug or something. That doesn't tell you anything. Or, actually, it could tell you a ton, but okay, "import numpy as np" does not tell you much about whether it's a dangerous prompt. So worth checking out if that's your jam. I thought it was a pretty cool paper. Yeah, they show really, really strong improvements over the previous state of the art, and they also go to a cascade of methods, where you run the more expensive method later if the cheaper one flags a concern. And explicitly, I mean, in the paper title, they say production-ready.
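To make those two dot products and the multimax idea concrete, here is a rough sketch of what such a probe could look like. The names, the small-MLP shape, and the use of plain top-k masking as a stand-in for multimax are all our assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class AttentionProbe(nn.Module):
    """Sketch of an attention-style activation probe: per-token
    activations pass through a small MLP, then two learned vectors score
    (a) how much to weight each token and (b) what class it votes for.
    Top-k masking stands in for the paper's 'multimax'; details differ."""

    def __init__(self, d_model: int, d_probe: int = 128, k: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_probe), nn.ReLU(), nn.Linear(d_probe, d_probe)
        )
        self.query = nn.Parameter(torch.randn(d_probe))  # "how important?"
        self.value = nn.Parameter(torch.randn(d_probe))  # "which class?"
        self.k = k

    def forward(self, acts: torch.Tensor) -> torch.Tensor:
        # acts: (seq, d_model) activations for one prompt at one layer
        h = self.mlp(acts)                  # (seq, d_probe) processed acts
        importance = h @ self.query         # (seq,) attention-like scores
        votes = h @ self.value              # (seq,) per-token class scores
        # Multimax-style masking: keep only the k most important tokens,
        # zeroing the long tail so huge prompts don't drown the signal.
        topk = torch.topk(importance, min(self.k, importance.numel()))
        weights = torch.zeros_like(importance)
        weights[topk.indices] = torch.softmax(topk.values, dim=0)
        return (weights * votes).sum()      # scalar logit: dangerous or not
```

The output is a single scalar per prompt per layer, which is why a probe like this is so much cheaper than running a second LLM as a classifier.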
And indeed, they say this has already been deployed, and it's a very practical report on what works. Next, on to, I guess, more safety and a bit of interpretability, maybe. Anthropic has published Claude's new constitution. So famously, Anthropic very early on published and pushed this approach of constitutional AI, where you basically draft a set of principles, per, I guess, the definition of a constitution, a set of basic values and rules that the model is then trained to follow via synthetic training, with a bunch of tricks. Which is not the default way to do alignment, necessarily; with alignment, you usually have some good conversations and bad conversations, but there isn't necessarily a list of values informing all of it. So the initial constitution was published a long time ago, and it had primarily a lot of principles based on the Universal Declaration of Human Rights. It had statements like, please choose the response that is most supportive and encouraging of life; please choose the response that most supports and encourages freedom, equality, and the sense of brotherhood, et cetera, et cetera. Things that are very principles-based. And now they've published an update. Basically, they said, we haven't given an update in a while, even though we have updated the constitution, and we can now read the present state of the constitution, which is actually pretty different. They don't really outline principles or rules for what you should be doing as much as they outline what you should be. So: you should be broadly safe, broadly ethical, genuinely helpful, and follow Anthropic's rules. It's a different flavor of constitution, and it's super long, like, so long relative to the first one. It seems to have a lot in common with this Soul document we discussed recently, the one people thought was, not leaked per se, but extracted out of Claude. I wouldn't be surprised if this release was prompted by that, because it seems to have a lot in common with what we saw there. Yeah, absolutely. So first of all, really important that they released this. I think this is part of one of the core pillars of Anthropic saying, look, if we have superintelligence someday, that constitution had better be something that has received input from the general public. If that happens, surely this is something quite important. So in that sense, kudos to them for the transparency. One of the cool things about this document is, as you said, they're not telling Claude what to do per se. Or rather, they are in some edge cases, like, okay, don't produce child sexual abuse material; that's obvious. But a lot of the emphasis here is on how to think about thinking: resolve this like a knowledgeable friend would; when you encounter contradictions of this general flavor, balance these considerations. It's a lot more, show don't tell, maybe, is the right term. It's less forceful, and more letting Claude do, and I'm going to say this deliberately, what Claude wants; hang on to that. It's like training you to be a good person instead of telling you what a good person should do. Yeah, that's a great way to put it, exactly. And in that sense, absolutely consistent with what you said about the Soul document, because if nothing else, that's really the idea there. And so, you know, they make a point of saying that the audience for this is Claude.
The audience is not the reader, though obviously they want to solicit input from everybody on it. But they also do say: we have some models built for specialized uses that don't fully fit this constitution. As we continue to develop products for specialized use cases, we will continue to evaluate how to best ensure our models meet the core objectives outlined in this constitution. So that makes sense. You'll have defense applications of this stuff, national security applications; you're not always going to want the same constitution to apply. Because, to be perfectly blunt about it, you can argue this many different ways, but you may need these models making calls in life-or-death situations where it's not obvious what the call ought to be, and balancing that in different ways is important. But at least for the base constitution, we're getting it. It seems like a nice dose of transparency from Anthropic. And I've seen some discussion of this release in, I guess, tech circles that was cynical, that said, oh, this is a marketing ploy, or this is Google's "don't be evil" moment, where they say something and it doesn't really mean anything because this is a business. I think given the founders of Anthropic and the ingrained values of the organization, this is something to be less cynical about, I guess, compared to how cynical you should be on average with tech companies. I think this really does reflect Anthropic's view of Claude and what they are building it to be. I agree. I think one of the challenges for that line of argument is: what exactly do you want Anthropic to do, then? Because, we know roughly, we don't know exactly, constitutional AI isn't currently used at Anthropic in exactly the same way it was when they announced the paper. It has evolved for sure. We have a very high-level idea of how this constitution gets worked into the actual training process. Given that, which is more visibility than we have into a lot of frontier labs' and tech companies' stacks in general, this is the document that's being used, so we have that visibility; it's just there. The alternative seems to be: don't publish the constitution, so that you don't take the public kudos points for it. That seems like a weird thing to ask somebody to do, given that publishing it seems to actually just be helpful for the public. Everybody's talking about the X algorithm, what's behind that, and Elon's talking about open-sourcing it and all that. Surely this is in a similar category, at least: giving people more transparency into something that's actually being used. And they're being pretty transparent that sometimes they don't use it, but we know that the main model we're all being served does. So I'm sympathetic to all the views; this is just where I tend to fall on this thing. Yeah, and I think it's an interesting read if you are a user of Claude, because Claude has an interesting character: very reflective, very philosophically oriented, a bit of a nerd, honestly. Yeah, yeah, yeah. It wants to talk about its own consciousness. It feels like that, yeah. And just one last quick story. We haven't done synthetic media and art in a while, so it's worth mentioning. This is still very much a societal force.
There's a new campaign titled Stealing Isn't Innovation, launched by the Human Artistry Campaign, protesting the use of copyrighted works by tech giants, and it has many artists in support: Scarlett Johansson, Cate Blanchett, music groups like R.E.M., authors, and so on. This campaign is backed by numerous organizations as well: the Recording Industry Association of America, SAG-AFTRA, the Directors Guild of America. Given that companies want to make money, and these organizations are reflective of industries that use AI and have a lot of money and a lot of influential figures, this may not just be posturing. It is notable that there's still a lot of public outcry, or criticism, I guess, on this topic. Yeah. It's also now so complex because, of course, we have all these high-profile agreements, most recently OpenAI and Disney, OpenAI and the Wall Street Journal, basically all these forms of media. There's now this thorny question of: okay, well, does that mean that if you haven't licensed content, that's a problem? Because if you're making a point of saying, we're going to pay you, and it is tens of millions of dollars to license this content, what does that mean for the content that's unlicensed? What does that mean for the random YouTuber who's putting stuff out? I wonder if this is something the big labs are already thinking about how to set up at a more atomic level. But we're in that weird middle ground, and obviously a lot of lawsuits are in progress. Yeah, we'll see how this all works out. We've obviously seen a lot of these lawsuits kick off in the past. I'm not aware of any of them sticking the landing yet, but we are only, what, three years in? We've seen Anthropic give payouts to authors, as an example, and some companies essentially folding, kind of shutting down and going to Disney. So there's been gradual movement. I think this is an interesting example, not just on the legal side, but on the cultural side. And in hindsight, maybe not surprising, but the amount of pushback and dislike of AI in certain parts of culture and society has been notable. Some people just hate it. They just consider it gross, right? Yeah. So there was an interview, and this is an interesting left-right thing, right? We've heard a lot of this on the left, typically around bias, ESG, DEI-type concerns. On the right, it's happening as well, though. You've got Steve Bannon out there talking cynically about the AI companies and this and that. And then, I think it was Tucker Carlson who did an interview recently, it was the All In podcast, yeah, he was on All In. And I guess Sacks or Jason Calacanis was like, hey, so there's a lot of negativity, exactly what you said: these data centers going up, there's a lot of pushback. What Tucker was saying was, well, if you think about it from a marketing standpoint, usually with a new technology there's this massive promise: we're going to do this amazing stuff, it's going to be great. What's happened here is all we ever hear is, this is going to be super risky, job loss, and that's if it doesn't kill us all. He was actually sour on the AI industry, having bought into a lot of this, and he was claiming there hasn't been an articulation of the positives, which, I think it's a really mixed thing. I have heard a lot of articulation of the positives. How many times have we heard automated medicine?
We're going to just cure cancer. I think it's just that the stakes are so high on both sides of the coin. When you look at the art space, it's like, yeah, you're going to displace tons of artists if you let this rip. You absolutely will. You're also going to empower people to make movies from their laptops. The question at some point becomes: is it the consumer or the producer that you prioritize? I get that there's another layer with art that is transcendently philosophical, and I feel like a lot of these discussions fail to box that in as its own specific thing. Like, how are we going to get meaning, what does the world look like from a meaning standpoint, on the art side? I don't know what the hell I'm talking about, but I think the emotional response has a lot going on: economics, technology, et cetera. Anyway, there's still very much an ongoing cultural clash. And so with that, we're going to close out. Thank you so much for listening to this week's episode of Last Week in AI. As Jeremy said, we appreciate your comments on that YouTube video. Hopefully you also emailed us; I'm going to catch up on that soon. As always, we appreciate you subscribing, sharing, reviewing, and so on. And do be sure to keep tuning in. We'll keep cranking them out, hopefully on a weekly basis. [Outro song] It's time to break it down. Last week in AI, come and take a ride. Get the lowdown on tech, can't let it slide. Last week in AI, come and take a ride. From the labs to the streets, AI's reaching high. New tech emerging, watching surges fly. From the labs to the streets, AI's reaching high. Algorithms shaping up the future seas. Tune in, tune in, get the latest with ease. Last week in AI, come and take a ride. Get the lowdown on tech, can't let it slide. Last week in AI, come and take a ride. From the labs to the streets, AI's reaching high. From neural nets to robots, the headlines pop, data-driven dreams, they just don't stop. Every breakthrough, every code unwritten, on the edge of change, with excitement we're smitten. From machine learning marvels to coding kings, futures unfolding, see what it brings.