Radiolab

The Alien in the Room

61 min
Dec 12, 2025
Summary

Radiolab explores the nature of artificial intelligence by tracing its evolution from early neural networks to modern large language models, using the metaphor of an 'alien' intelligence fundamentally different from human cognition. The episode breaks down how AI learns through mathematical optimization, examines its capabilities and limitations compared to biological intelligence, and features a professional Go player's encounter with AlphaGo to illustrate what it means when AI surpasses human expertise.

Insights
  • AI systems possess a fundamentally alien form of intelligence that doesn't map onto biological cognition—they excel at tasks humans find hard (math, pattern recognition) while struggling with tasks any animal can do (physical reasoning, common sense)
  • Modern AI's core mechanism is statistical prediction: generating the next word, pixel, or move by identifying patterns in training data through mathematical optimization, not through understanding or reasoning
  • The shift from rule-based programming to machine learning required a paradigm change from hard-coding logic to letting systems learn by adjusting connection weights through calculus-based feedback loops
  • GPU hardware and transformer architecture enabled the scaling that made large language models possible—processing entire datasets in parallel rather than sequentially unlocked emergent capabilities
  • Encountering AI superiority in domains of human mastery creates psychological and philosophical challenges distinct from technological disruption—it forces recalibration of identity and meaning
Trends
  • Generative AI proliferation across multiple modalities (text, image, video, audio) driven by the same underlying architecture and scaling principles
  • Growing recognition that AI interpretability remains a critical unsolved problem: even creators don't fully understand what intermediate layers learn
  • Shift in AI discourse from anthropomorphic framing ('thinking,' 'understanding') toward mathematical precision ('prediction,' 'pattern matching')
  • Hardware (GPU/parallel processing) as primary constraint and enabler of AI capability advancement, not algorithmic innovation alone
  • Professional and creative fields confronting displacement by AI systems that match or exceed human performance in specialized domains
  • Temperature/randomness as controllable parameter enabling creative variation in deterministic mathematical systems
  • Emergence of AI-generated content in commercial markets (music charts, potential film) raising questions about authenticity and human creative labor
  • Octopus metaphor gaining traction as framework for understanding radically different forms of intelligence without anthropomorphizing
  • Philosophical reframing: AI advancement as opportunity to clarify what makes human experience unique rather than existential threat
  • Scaling laws: sufficiently large models trained on sufficiently large datasets exhibit unexpected capabilities not present in smaller versions
Topics
  • Neural Network Architecture and Learning Mechanisms
  • Machine Learning vs. Rule-Based Programming
  • Large Language Models and Transformer Architecture
  • GPU Hardware and Parallel Processing in AI
  • AI Interpretability and the Black Box Problem
  • Generative AI Across Multiple Modalities
  • AI Capability Benchmarking and Testing
  • Comparative Intelligence: Biological vs. Artificial
  • AI in Creative and Professional Domains
  • Human Identity and Meaning in the Age of AI
  • Statistical Prediction as Core AI Mechanism
  • Training Data Scale and Emergent Capabilities
  • Temperature and Randomness in AI Output
  • AI Safety and Alignment Considerations
  • Historical Evolution of AI Technology
Companies
Google
Developed AlphaGo system that defeated professional Go player Fan Hui; created transformer architecture and GPT research
OpenAI
Created ChatGPT, the large language model that exemplifies modern generative AI capabilities discussed throughout epi...
IBM
Pioneered early chatbot development in the 1980s using neural networks to predict next words in conversations
NVIDIA
Manufactures GPUs (graphical processing units) that enable parallel processing critical to modern AI training and inf...
Salk Institute for Biological Studies
Institution where Terry Sejnowski conducted early neural network research that led to machine learning breakthroughs
University of Cambridge
Home to the Leverhulme Centre for the Future of Intelligence led by Stephen Cave, which runs the Animal AI Olympics testing
People
Terry Sejnowski
Neuroscientist at Salk Institute who pioneered neural network learning in 1980s, trained systems to recognize speech ...
Geoffrey Hinton
Collaborator with Sejnowski on early machine learning research that established foundation for modern deep learning
Stephen Cave
Director of the Leverhulme Centre for the Future of Intelligence at Cambridge; designed the Animal AI Olympics to test AI cognition
Grant Sanderson
YouTuber (3Blue1Brown) who explained neural network mathematics and transformer architecture to the Radiolab team
Yann LeCun
Deep learning pioneer best known for convolutional neural networks; Turing Award recipient and chief AI scientist at Meta
Fan Hui
Professional Go player (3x European champion) who faced AlphaGo in 2015 match, lost 0-5, reflecting on AI's alien int...
Steven Levy
Technology journalist who published 'Artificial Life' in 1992; discussed early chatbot limitations with longer text s...
Tom Mullaney
Stanford professor of modern Chinese history; provided philosophical perspective on AI's impact on human identity and...
Quotes
"They have a completely different profile, capabilities and skills than any animal. They are not like us."
Stephen Cave, early discussion of AI testing results
"Easy things are hard and hard things are easy."
Latif Nasser, describing Moravec's paradox of AI capabilities
"It's like the computer simultaneously lives in the multiverse of that sentence where each word in that sentence is the most important."
Grant Sanderson, explaining the transformer attention mechanism
"Go for me, it's like a mirror. Because when you play, you can see your mind on the board."
Fan Hui, describing the philosophical nature of Go before facing AlphaGo
"I see myself. So it's like AlphaGo teach me that our life, where we'll always lost, lost, lost, lost. This is important for us. I think this is human."
Fan Hui, reflecting on losing to AlphaGo and what it revealed about human nature
Full Transcript
Hey, I'm Molly Webster. Hey, I'm Mono Montgomery. Mono and I just made a snail episode. It's called Snail Sex Tape. And we have not stopped talking about snails for like months. We've become deeply obsessed with snails. I think we should all get snail tattoos. Ooh, snail tattoo could be cute. But you know what you can get instead of a snail tattoo? What? You can get an enamel snail pin in honor of our Snail Sex Tape episode. I've never been more honored in my life. I know. It is based on a real medieval snail miniature. I will be rocking it on my jean jacket all spring long. So to get one of these pins, you have to join The Lab. And when you join The Lab, in addition to helping fund our show, you get access to sponsor-free podcasts, plus monthly bonus content, plus invitations to events with the team. Including an AMA that we're going to be doing next month, you and me, about the behind the scenes of making Snail Sex Tape. Behind the shell, BTS. All you have to do is go to radiolab.org/join. And if you use the code word snail, you get two months off the first year of an annual membership. Get your pin. And we can't wait to see you guys next month. Thanks, everyone. Oh, wait, you're listening. OK. You're listening to Radiolab. Radiolab. From WNYC. Yeah. OK, after all of that, it is time to finally discuss the question. Yeah, the topic, OK, the theme of the moment. Perhaps climate change? No, nobody cares about climate change, man. Come on, Simon. Hey, I'm Latif Nasser. This is Radiolab, where despite what reporter-producer Simon Adler just said, we here at the show, including Simon, do care about climate change. But we're here today to talk about a different, huge, overwhelming thing that we're all in the middle of. I mean, I don't want to put words in your mouth, but what I have been feeling is a general sense of frustration. Yeah, yeah. Something that everybody is talking about, but nobody seems to actually understand.
You and I have even done interviews together with people on this stuff, right? Which is, of course, artificial intelligence. So much of the coverage about this stuff right now is this running debate, right, where you've got people on one side saying, these AI, they think, they are intelligent, and eventually they'll outsmart and destroy us all. Right. And then on the other side, you've got people being like, no, they aren't actually intelligent. They're just mimicking us, and it's not as big a deal as everyone says. Right. And I don't actually know who to believe. Yeah. And I think it's because I don't know what AI is. Like, I don't know how it does what it does under the hood. Yeah. Because we don't know, right? This is one of the most extraordinary things about, you know, machine learning AI, is that we don't really know what they are. But after reading countless articles, talking to tech people and scientists, I finally felt like I was getting at that question when I talked to this guy. Stephen Cave, I'm the director of the Leverhulme Centre for the Future of Intelligence. He leads this sort of think tank at the University of Cambridge, and there's about 50 of us now trying to understand these systems using a really wide range of methods, including tests taken from animal psychology, tests designed to measure how well a mouse can problem-solve, and applying them to AI agents in order to understand, well, where are we in the kind of evolutionary cognitive tree of life of AI? And they've actually turned these tests into a sort of competition that they call the Animal AI Olympics. Yes, indeed. Okay. Oh, that just sounds fun. Right. Yeah, exactly. Yeah. So to do this, they've created a slightly lower-resolution, Toy Story-looking digital world. Okay. Or maybe even more accurately, like, if you know the game Minecraft. Oh, yeah, yeah, sure, sure. It looks like that. It's this three-dimensional space filled with all these different bright, primary-colored objects. Okay.
And then they take these AI, which are running on basically the same kind of engine that powered ChatGPT, and they give these things a little avatar, like a hedgehog or a pig or a panda, and then they just sort of place them in this 3D world and say, there is food in here. Find it. So it has to, like, navigate the digital world to find, I mean, I assume it's not really food. It's this green orb that they're looking for. Okay. And I mean, there are walls that they have to, like, figure out how to get around. There are transparent walls, but it's like physical-world problem-solving. Absolutely. And I mean, well, this is the sort of test that mice or pigeons can pull off pretty easily. For these AI agents, things like manipulating objects and understanding gravity, it's like a real... Challenge. Like, they struggle to press a lever or perceive an edge. Which any animal can do? Or at least, you know, any mammal, say. And so effectively, these systems don't have the common sense of a mouse. Whereas higher reasoning, maths and so on, they can do a lot better than humans can. That's Moravec's paradox, right? Like, it's like, easy things are hard and hard things are easy. Exactly. Yeah. And like we've known this for a long time, and it's pretty obvious at this point, but after running all of these AIs through this thing, dozens, hundreds of times, what Stephen has seen over and over is that: They have a completely different profile, capabilities and skills than any animal. They are not like us. No, I mean, one of its capabilities might be convincing us it's human-like, but it isn't. Well, okay. So then what is it like? I mean, is the AI a little tadpole, or what is it? Well, there is one metaphor that some people like to use, and that's the octopus. You know, what's wonderful about the octopus is they are phenomenally smart. They can use tools, for example, without being taught. They develop sophisticated tactics of all kinds.
There are lots of wonderful octopus escape stories. Well, wait, because that doesn't sound like AI at all. No. Then why this metaphor? Well, it's helpful not because AI is like them, but because in a way it really shows how different intelligence can be. Okay. I mean, octopuses, their intelligence is distributed through their tentacles. He says, you know, we and all mammals have this one central brain, but octopuses, they have nine little brains, one in the center, and then one in each limb. So the tentacles can function much more independently, which is how they manage to have eight of them all doing clever things all at once. And, you know, this kind of intelligence is fundamentally alien to us. And that's a good way of looking at AI. Alien. Profoundly alien. Which on the one hand makes this thing feel sort of unknowable, impossible to understand. But then on the other, well, it is alien, but it did not evolve in some far-off galaxy, or even the depths of the ocean. Right. Like, this is an alien we created, year by year, transistor by transistor. And so this is what we're doing today. We are going to trace the evolution of this alien in our midst, this alien that we designed, in the hopes, at least, of, like, coming to some deeper understanding of what it actually is today. And then maybe, if we're lucky, that will give us some insight into this thing we are all almost certainly going to have to face off with at some point or another. So. This is great. Like, I feel like we all need this, we all need this explainer. Great. Uh, fill your glass, because here we go. Hey, you guys can hear me? Yes, I can hear you, Simon. Hello, Terry. How are you? Very good. Thank you. Sorry for the slight delayed start here. Some classic technical difficulties, you know? So there are a lot of different first contacts, yeah, we could point to with this alien species. But the most fun place to start that I found is with this guy, Terry Sejnowski.
Professor at the Salk Institute for Biological Studies, who is, uh, yeah, sort of like the midwife of AI. Is that the helpful way to think of you, or no? Yes, yes, actually. Uh, well, it's obviously, uh, more complicated than that. But that's, that's not a bad analogy. Terry trained as a neurobiologist. He came up poking probes in monkeys' heads to try to understand how the brain works. But then in the mid '80s, he teamed up with some computer scientists, trying to make computers do animal brain-like things, like hear and recognize sounds or visuals. But, but, but it was going nowhere. Okay. Because everything was based on rules at the time. Like all computer programming at this point, it was this incredibly complicated set of, like, if-this-then-that statements. So if you see this, and you see that, but you don't see that, then that means this. This sort of web of logic, right? Which, when it comes to recognizing sounds or pictures, was a problem. Because for each rule, there are, you know, tens of thousands, a hundred thousand exceptions. Just too many nuances in the rules to hard-code in. And so it was clear that this approach, this way of doing it through rules, was really hopeless. And so together with my friend and collaborator, Geoffrey Hinton, he started to wonder if there was a different way to tackle this. Learning. And so with a small group, with computers that were puny by today's standards, they set out to build a machine that could learn. And one of the first things they tried to teach it was how to pronounce English. You know, text-to-speech, in computer science. And amazingly ("Demonstration of network learning by Terry Sejnowski and Charles Rosenberg"), they have recordings from these early training sessions. Now, if you want to learn from experience, you have to have lots of data. And so, you ready? Ooh, ah, bad. Okay. Sometimes you want to look at me.
They took a transcript of a kid talking, a transcript I had my friend and neighbor, Levon, reenact. What we want from school, I go to my grandmother's house, because he just sounds candy. Nice, that's perfect. Ready for the next one? And then what Terry did was give the computer this text, and then also gave it the exact phonemes, like the symbols for the proper pronunciation for those words. No rules, just actual pronunciations. And then said to the computer, quiz yourself. Like, go ahead and try, and then compare what you tried to the correct pronunciation. First recording, day one, no learning. And here it is. Wow. Wow. Wow. Right. So it has no idea what it's doing. Doesn't sound like a baby. Like, that just sounds glitched out. Wow. Wow. Wow. Wow. Wow. Slowly, we could actually hear the learning. You could hear it figuring out the difference between vowels and consonants. And then it would start pronouncing small words, you know. And we would go to my cocoon, gate to one mile, no yo me, I buy and run, no come, down to it, we sleep. And, you know, it only took a couple of days. When we walk home from school, I liked to go to my grandma's house, we'll take her, she gave us candy. And it was icing it. And we eat their thin pies. And we eat there sometimes. Sometimes we sleep over night there. Sometimes we sleep over night there. Sometimes when I go to go to my cousins, I get up late, bang let them all back. But the really astonishing thing is that when they gave the program new words and new sentences that it had never seen before, it pronounced those too. And when he eats them, when a run gets tired, when he goes to bed, when he finally gets to sleep, it was phenomenal. Sometimes I get a goat of it at 12.30. Sometimes but most of our times I don't. What we didn't appreciate back then was that NETtalk was a little bit of 21st century AI in the 20th century. That this process of learning was the future. Are we done? We're done. Thank you so much.
Well, okay, but like, what actually happened there? What is it doing? How do you get a machine to learn that? Well, take a baby human. You know, it's born with this clump of gray stuff in its head, which is really a bunch of neurons that are all connected in, like, a random, messy way. Oh, they are connected? I just imagined a baby brain was like, nothing was connected. It was a blank slate. No, when the baby emerges, the neurons are all connected. They're just not connected in ways that make sense in terms of the world they've just popped into. But then when it gets some input, like it touches something hot, gets yelled at, gets cuddled, it starts to strengthen some of these connections and prune others back, until you have this just unbelievably complicated network of connections that can recognize patterns in the world around it. And, you know, know that this is a square, or if you poke a cat, you get scratched. That's right. In the brain, you adapt to the world that you happen to be in by changing the strengths of connections between neurons. So basically, Terry and others wanted to create some version of that in a machine. Yeah, you had it. The models we were developing, these neural network models, were based on very simplified versions of brain circuits. Okay, but how did you, how did you do that? Like, what is, what is going on under the hood here that allows it to do this? Well, we understand mathematically how they work. And we're making progress now trying to translate the mathematics into something that humans understand. And so, here is my best attempt to translate this for us humans. Okay. I mean, so setting aside all of the technical setup on, like, how it's even interpreting the data or what you're inputting it with, with the help of this guy, Grant Sanderson. Yeah, I run a YouTube channel that's named 3Blue1Brown. I often talk about math, but math-adjacent things as well. Great.
We're just going to draw, like, a mental image of what one of these networks looks like. Okay. Let's go. Now, as we all know, these neural nets can do crazily complex things, but for now, we are going to give one a very simple problem. I'm going to draw a couple of shapes. What shape is this? A circle. Nice. Can we get a computer to see a circle? How about this? A circle. A very childlike task. Yeah, sure. First things first, to get an image into the computer, we're going to chop it up into a bunch of pixels, like a 10 by 10 grid of them. Okay. And we're going to imagine those pixels as 100 light bulbs, one light bulb for every pixel, light bulbs that will be on if their corresponding pixel is filled in with ink, and off if their pixel is empty. Okay. We've got this circle of illuminated bulbs in this grid of bulbs that are off. Okay. I can see it. From there, for reasons that'll make sense in a minute, below that, we're going to add a smaller grid of 10 light bulbs, and then below that, just one bottom bulb. At the top, 100 light bulbs, and then another layer, 10 light bulbs, another layer, one bulb. Exactly. So, that is just the answer. The output that, when it turns on, says yes. Circle. Circle, that's right. There's a circle here. Okay. But this last bulb, it's a little bit special. It's not like the other bulbs, in that it's actually on a dimmer. So it can also answer, like, maybe a circle, if it's kind of bright, or I'm pretty sure, if it's pretty bright, or if it's all the way on, that means this is definitely a circle. As a side note, yeah, this feels like quite the challenge, where we're torturing the poor audience members here, probably on their drive and not able to allocate their visual cortex to try to visualize all this, but setting aside all of the technical terminology. There's one last thing to do.
We have to wire all of these bulbs together, so that electricity can flow from that top grid through that middle grid down to that last bulb, which will hopefully turn it on. So, we call up an electrician, we tell him, go and connect every bulb in the top 100 to every bulb in the middle 10, and then go and connect every bulb in the middle 10 to that final bulb. So literally, every bulb is connected to every other bulb, basically. Exactly. So that electricity can flow down from any bulb that's lit up and kind of cascade through all of them. Got it. And so, the electrician starts pulling wires, soldering, and they say, I'm done. But the thing about this electrician is, they're shit. Like, they just do a terrible job. Some of the wires that they put in are, like, strong copper. Others are just twine, so they can't even carry electricity. And so, when this is all said and done, this network we get is kind of like a fresh baby brain with just random neurons clumped together. Got it. And so, when we do send an image of a circle into it, into the machine. Hey, why do I need a microphone? To record your voice. Lighting up some of the bulbs in that top grid. What shape is this? The electricity passes down through these random connections from the top to the middle, down to the bottom, and in all likelihood, I don't know, it's completely wrong on this. That final bulb might be a little lit up or half lit up or just completely off. Okay. Now, when a child gets something wrong, and, like, a parent scolds them, that is altering the connections between the neurons in the brain. Strengthening some, pruning others back, right? Right. And that is what we want to do with this machine. We want to mess with those wires, the strengths of those connections between the bulbs. Right. Right. Now, we could just go in there and rewire this thing by hand. Yeah.
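Grant's bulbs-and-wires picture maps almost line for line onto ordinary code. Here is a minimal sketch of that first, badly wired pass; the grid sizes (100, 10, 1) come from the episode, but the blob of ink, the random weight scale, and every variable name are illustrative stand-ins:

```python
import math
import random

random.seed(0)

# The "bulbs": a 10x10 drawing flattened to 100 on/off values (1.0 = ink).
image = [[0.0] * 10 for _ in range(10)]
for r, c in [(2, 4), (2, 5), (3, 3), (3, 6), (4, 4), (4, 5)]:
    image[r][c] = 1.0
top = [v for row in image for v in row]          # top grid: 100 bulbs

# The "wires": random strengths, like the sloppy electrician's job.
w1 = [[random.gauss(0, 0.1) for _ in range(10)] for _ in range(100)]
w2 = [random.gauss(0, 0.1) for _ in range(10)]

def dimmer(z):
    # The "dimmer": squashes any number into the 0..1 range. (The episode
    # only mentions the final bulb's dimmer; real networks put one on
    # every bulb, as an activation function, so we do that here too.)
    return 1.0 / (1.0 + math.exp(-z))

# Electricity cascades: top grid -> middle 10 bulbs -> one final bulb.
middle = [dimmer(sum(top[i] * w1[i][j] for i in range(100))) for j in range(10)]
answer = dimmer(sum(middle[j] * w2[j] for j in range(10)))

# With random wires, the final bulb is neither confidently on nor off.
print(round(answer, 3))
```

With untrained random wires, the final bulb hovers around half brightness: the network is guessing.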
We could pick out the important bulbs, because we know which ones are lit up for a circle, and direct their current through the middle bulbs to that final bulb. But you know, that would take just as long as hard-coding it. Right. And so instead, we're going to give this thing the chance to learn all this, to learn what the connections should be. So when it gives us that first random wrong answer, we're going to say, bad robot. There is absolutely a circle in this image. Try again. Okay. I will try again. But then after that first try, instead of us standing there saying yes or no, we are going to set it up to learn all on its own. We're going to step away and let math be its babysitter, be its teacher. And so this is the moment where we have to dive into the math a bit. Uh, okay. It's not that complicated. It's mostly multiplication. All right. Okay. Well, let's go. First of all, these bulbs in the computer, they're really just numbers. One, two, three, four, five. And the wires, you can really just think of them as variables that multiply these numbers (X times two, Y times point three) as they pass through them. A good wire multiplies the electricity by five or whatever. A bad one divides it in half or even zeros it out. And that means we can just take this entire array of bulbs and wires and turn it into a giant equation. You know, A times B plus C times D plus E times F. There's some other math strewn in there very artfully and deliberately. But the key here is, with a bit of mathematical trickery, this equation can represent the difference between the output it is giving ("There is a 0.2 percent likelihood there's a circle") and the output we want it to give ("There is a 100 percent likelihood there's a circle"). And if we think, hey, I've got this function and I want to find a minimum of that. Like, minimize the difference between your output and the output we want. There's a whole field of math that is just built, ready, to do exactly this kind of thing.
This is what calculus is all about. Like, Newton, if he was rising from the grave, would just be, like, shooting fireworks right now, saying, hey, I got this. I know how to do this one. This is, somehow the calculus tells you, in math-equation form, if you're getting closer to the right answer. Yeah, and don't worry, we're not going to go into the calculus, other than to say we walk away and the calculus becomes the teacher. Okay, so. Point two percent likely circle. After the first wrong answer, the equation says no, machine tries again. 25 percent likely circle. And the equation says closer, and the machine tries again. And each time it tries, it messes with the wiring, the weight of the connections between the bulbs, getting it closer and closer to right. Exactly. And what happens over time is that that middle grid of 10 bulbs, their connections back to that top grid are getting tweaked in such a way that it's like they're starting to pick up clues. Like, maybe it's getting stronger signals from bulbs that are part of a curve. Or maybe it figures out that the corner bulb can't be on for it to be a circle. And like, the thing is, we actually don't know. I mean, when people talk about these things being a black box, this is what they mean. It's this middle grid. It's all automated by math. It's picking up something. And we might not know what the clues are. We just know that they're right, that the clues work. It's finding some signal that tells it there is a circle-ish thing here. And as it keeps giving answers and the equation keeps telling it whether it's right or not, or closer or further away, eventually each of those middle bulbs is receiving the right electricity from the right top bulbs to know if these characteristics of a circle are there. And if they are, they pass that along to the final light bulb, which will light up if enough of those characteristics are present. And at that point, yeah, our little network here has learned to recognize this circle.
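"The calculus becomes the teacher" can also be written out. Below, the same toy network is trained on a single drawing by gradient descent: the chain rule computes how much each wire contributed to the error, and every wire gets nudged accordingly. The drawing, learning rate, and step count are invented for illustration, and real systems use automatic differentiation rather than these hand-written derivatives:

```python
import math
import random

random.seed(1)

def dimmer(z):
    return 1.0 / (1.0 + math.exp(-z))

# One flattened 100-pixel drawing with a ring of inked pixels.
x = [0.0] * 100
for idx in [14, 15, 23, 26, 32, 37, 42, 47, 53, 56, 64, 65]:
    x[idx] = 1.0
target = 1.0   # "there is absolutely a circle in this image"

w1 = [[random.gauss(0, 0.1) for _ in range(10)] for _ in range(100)]
w2 = [random.gauss(0, 0.1) for _ in range(10)]

lr = 0.5       # how hard each wrong answer tugs on the wires
losses = []
for step in range(300):
    # Forward pass: the machine's guess, given the current wiring.
    middle = [dimmer(sum(x[i] * w1[i][j] for i in range(100)))
              for j in range(10)]
    y = dimmer(sum(middle[j] * w2[j] for j in range(10)))
    losses.append((y - target) ** 2)   # difference from the answer we want

    # Backward pass: the chain rule says how to nudge every wire.
    dy = 2 * (y - target) * y * (1 - y)
    dmid = [w2[j] * dy * middle[j] * (1 - middle[j]) for j in range(10)]
    for j in range(10):
        w2[j] -= lr * middle[j] * dy
        for i in range(100):
            w1[i][j] -= lr * x[i] * dmid[j]

print(f"loss went from {losses[0]:.3f} to {losses[-1]:.3f}")
```

After a few hundred rounds of "try, get corrected, nudge the wires," the error has collapsed toward zero: the network now lights the final bulb for this drawing.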
That's actually kind of astonishing. That's pretty amazing. It is. But it's only this one circle. And so the important thing is that you do this process not with just this one circle, circle, circle, but with 10, 100, thousands of examples. Circle, circle, circle. You know, big circles, little circles, messy circles, circles drawn by you and me. And you have the machine tweak all those different wires for all those different examples. You can then take all of that and do one final, actually very simple bit of math. Just average it all together. All of the wire strengths you got from all the examples for wire one get averaged down to one value. All the wire strengths that you got for wire two get averaged down to one value. And if you've done this right, you can then send in any of the drawings it's seen before, or new drawings it's never seen. Be it a circle drawn by a two-year-old or a picture of an orange. And it will say, yes, there is a circle there. Holy cow. Now, that process we just went through can recognize way more sophisticated things than just a shape, like cats or dogs. And I mean, the only real difference in the model is, instead of these three grids we just used, these three layers, you know, an input, a middle, and an output, you just add more layers of bulbs in the middle. These multiple middle layers allow the computer to recognize progressively more complicated components of the picture. So, like, the first layer might just find the edges. The second might find textures. The third, forms. The fourth, maybe eyeballs. Because it's like, everything is made up of building blocks of the layer before it? Yes. And crucially, without anyone labeling any of those intermediate layers. Like, it's figuring that out itself. Exactly. And then, using the same mathematical reinforcement, it can tune and tweak to get shit right. Okay. Wow. Well, I need a drink after all of this, to sort of let all this settle in. Okay. Okay.
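One caveat worth flagging: in standard practice it is not the finished wire strengths that get averaged across examples; it is each round's per-example corrections (the gradients) that are averaged into one shared update, round after round. A sketch of that averaging, shrunk to a single layer of four wires so it is easy to see; the tiny "drawings," labels, and learning rate are all invented:

```python
import math
import random

random.seed(2)

def dimmer(z):
    return 1.0 / (1.0 + math.exp(-z))

# Four tiny 4-pixel "drawings" with labels: 1.0 = circle-ish, 0.0 = not.
examples = [([1.0, 1.0, 1.0, 1.0], 1.0),
            ([1.0, 0.0, 0.0, 1.0], 0.0),
            ([1.0, 1.0, 1.0, 0.0], 1.0),
            ([0.0, 0.0, 1.0, 0.0], 0.0)]

w = [random.gauss(0, 0.1) for _ in range(4)]   # one shared layer of wires

for step in range(5000):
    # Per-example nudges, averaged into ONE shared update per round.
    avg = [0.0] * 4
    for x, label in examples:
        y = dimmer(sum(xi * wi for xi, wi in zip(x, w)))
        dy = 2 * (y - label) * y * (1 - y)      # this example's error signal
        for i in range(4):
            avg[i] += dy * x[i] / len(examples)
    for i in range(4):
        w[i] -= 1.0 * avg[i]

# After training, the same shared wires handle every example at once.
for x, label in examples:
    y = dimmer(sum(xi * wi for xi, wi in zip(x, w)))
    print(label, round(y, 2))
```

The end result matches the episode's point: one set of wire strengths that works across all the examples, including patterns no single example could have taught it.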
Like this, I'm, I wish my kids could learn like this. Like, the way they learn is so physical, so emotional. It matters who's saying it. It matters how they're saying it. It matters the tone. It matters, all these different things. Like, this is so clean, and, like, crazy fast. I mean, what just took us 10, 15 minutes to explain, that all happens in seconds. So it can learn the circle thing at, at basically lightning speed. But like, a circle, recognizing a circle is one thing. And like, now we're talking, like, actually making, like making a, you know, a, a sonnet as if Shakespeare wrote it. That seems like a, that seems like a very wide gulf. It seems like there's still a lot of places to go. For sure. And our little alien is going to have to evolve here. Yeah. But in terms of its, its architecture of how it does this, it's basically the exact same. The, the only real difference is we're shifting its, its focus from recognizing to a slightly different skill. And we're going to get to that. You want to predict what I'm going to say next? Right after a quick break. Exactly. Right after a quick break. I'm Andy, and I'm Melissa, and this is Moms and Mysteries. We're two Florida moms obsessed with true crime. From infamous cases like Ellen Greenberg to shocking Florida stories like the Dan Markel killing. With 55 million downloads, listen to Moms and Mysteries on Apple Podcasts, Spotify, or wherever you get your podcasts. Latif. Simon. Radiolab. So you, you asked this question to me before the break. Like, how did this thing evolve from being able to recognize shapes to, to generate stuff? Yeah. And I posed that very question to Grant Sanderson. Okay? Yeah. Okay. So I would say there's, there's many different ideas at play here. Who, again, as a YouTuber, has thought a hell of a lot about this stuff. And he says the important next step is to realize that, yes, you could think of what we did with those circles as having the machine recognize them.
Or you could say we were asking the machine to predict the answer we wanted. Like, with the circle example, there's two things that it could predict. Circle or not. Okay. So that's, so it's not anything meaningfully different. It's just like, let's just call everything a prediction. Right. But it becomes important when we're talking about generative stuff. Okay. Like, in the case of language, predict what word comes next. So to explain, going all the way back to the '80s, IBM began playing around with these chatbots that you could type to and it would respond. Hello there. How are you today? And the way it would do what it was doing was, it would take every word that you typed in as your question, turn those words into numbers. We're not going to go into how, because that would take an hour in and of itself. But turn those words into numbers, send it through this multi-layered set of bulbs. But in this case, those, those bulbs, those layers it's passing through, they haven't been trained to categorize a sentence. Like, we don't want it to say, that was a question. Instead, it has been trained to spit out the word that is most likely to come next, to predict the most likely next word. Just one word. Just one. (It's not even a word, also; there's a nuance here between the notion of words and tokens, but that's excessive nuance.) Yeah, but it's like, what is it even basing that on? Like, how is it predicting that? With a circle, you know, it's a circle. We know the right answer. We're giving it the right answer. It's calculating back to that right answer. Right. But like, in a sentence that could go any million number of ways, how can it ever have a right answer to train back to? Well, so what IBM was doing was giving it a bunch of texts, books, transcripts, conversations, feeding that into this machine. And so then the right answer was the most likely word to follow the preceding words. So it's like, it's just like, here's a giant stack of human talking.
And in this giant stack, what's the most likely thing that would have been said next in this exact scenario? Exactly. That's right. And just one brief aside, because it's sort of fun. I think I have this right: that a word is a big long list of like 13,000 numbers. What? The computer has to turn a word, one word, just like one word, into 13,000 numbers. Yeah. And so, like, in the way that a pixel value in the circle example was basically a zero or a one, it's like every word is this list of 13,000 numbers. It's so weird that that's the simpler version for it. Right. Like, now let me turn it into this, like, phone book of numbers. Which, again, points to how these things are so not us. Yeah, they're really not. Not at all. Wow. But they're using us, though, right? Like, it's our talk that's getting turned into numbers. And it literally does it one word at a time. So after it's written the first word of its response, it just does the whole process over again. It takes all the words in your question plus the first word it predicted, sends all that through the network again. And then it just predicts the next word after that. Sends that through those bulbs again. And then the next word after that. Does the whole thing again, and plays the same game over and over and over. And one of the words in its vocabulary is the, like, end-of-the-conversation token. So it has some notion of when to stop, but the act of stopping is itself just one more prediction. It's one more probability in that big list of things that should happen next. And as I said, this is how they were doing it all the way back in the 80s. And I mean, if you interacted with a chatbot even in, like, the twenty-teens, this is the way they were doing it as well. Really? Do you have any recollection of when you first came in contact with one? Oh, God. Um, I feel like it must have been one of those, like, customer service bots on a website kind of thing.
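The one-word-at-a-time loop described here, including stopping as just one more prediction, looks roughly like this. (A toy sketch: the predictor is a scripted stand-in, and the function and token names are invented for illustration.)

```python
def generate(predict_next_word, prompt_words, max_len=20, stop_token="<END>"):
    """Autoregressive loop: each new word is predicted from the prompt
    plus everything generated so far, until the model predicts the
    stop token, which is itself just one more 'word' it can emit."""
    out = list(prompt_words)
    for _ in range(max_len):
        nxt = predict_next_word(out)   # the whole context goes back through the network
        if nxt == stop_token:
            break                      # stopping is just another prediction
        out.append(nxt)
    return out[len(prompt_words):]

# A fake 'model' that always answers the same way, just to show the loop:
script = iter(["hello", "there", "<END>"])
reply = generate(lambda ctx: next(script), ["how", "are", "you"])
print(reply)  # ['hello', 'there']
```

In a real system, `predict_next_word` would be the full network pass the episode describes; everything else about the loop is the same.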
And I'm sure, not just because it's a customer service experience, but because it was an early chatbot experience, it wasn't very good. No, no, no, terrible. No, terrible. A big part of why they were bad was they had difficulty dealing with longer stretches of text. This is Steven Levy at Wired. He's been covering this stuff for... Yeah, yeah, I mean, a long time. I published a book in 1992 called Artificial Life. I was two years old, by the way. Thanks for that. Sorry. Yeah, yeah, thanks. And he says because it predicted words one at a time and one after the other, the longer the question or the longer the answer, the more likely it was to miss or lose the larger meaning. And so eventually predict a word that just doesn't make sense or is out of place. Exactly. Huh. And so, just to give one very concrete example to illustrate it: the sentence, what sound does my dog make when I slam the door? It's like, that's so freeform. I can see why that would be confusing. Right. And to somehow know that in that sentence, dog is really the operative word right here. The important noun. It's not I or door. Right. Right. Right. And so in 2017, this guy, Jakob Uszkoreit, who worked at Google, set out to solve this dog-door problem. He thought that the thing should be able to figure out, oh, this is the most important part of the sentence. This is what I should pay attention to. And now the question becomes, like, how the heck does one go about doing that? And what they figured out was: the problem here is we're giving it one word at a time and we're having it predict one word at a time. And what we need to be able to do instead is have it somehow process the sentence as a whole, so that, you know, something at the end of the sentence can sort of feed back on the weight or meaning it gives to something at the beginning of the sentence.
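That whole-sentence feedback idea became the attention mechanism at the heart of the transformer. Here is a minimal numpy sketch of scaled dot-product attention: every word scores its relevance to every other word at once, and those scores become weights. (Sizes and values are invented; in a real model the query, key, and value vectors are learned, not random.)

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each word weighs every word
    in the sentence simultaneously, rather than one at a time."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # relevance of every word to every other word
    # softmax each row so the weights for one word sum to 1:
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights     # blend the words by their weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 4))   # 6 words ("what sound does my dog make"), toy 4-dim vectors
K = rng.normal(size=(6, 4))
V = rng.normal(size=(6, 4))
out, w = attention(Q, K, V)
print(w.shape)  # (6, 6): each of the 6 words attends to all 6 words at once
```

With learned vectors, the row for "sound" could put most of its weight on "dog" rather than "door", which is exactly the dog-door problem being solved.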
And one way that you can just imagine it doing this is that instead of just making a prediction and giving an answer, you need to take in all the information, make a prediction, but then just, like, set that aside, because you're going to take in all that information again. And then we're going to send it through again and again and again, each time focusing on a different word in the sentence, generating a different possible prediction, before landing on some final prediction, which, God willing, would be bark. It's like the computer simultaneously lives in the multiverse of that sentence, where each word in that sentence is the most important. Yeah. And, like, I've looked at this stuff for months and I still don't totally understand exactly how a machine does this. But, I mean, something like that. And also, you know, you can tell me if I'm in the realm of sense. Yeah, that's the idea. Like, the complexity here, you can see it's going through the roof, like, where you're like, oh God, this is so much more computing you need to do. Totally. And this was a big barrier for a long time. I mean, that's why these chatbots were almost as bad in the early 2000s as they were in the 1980s. And this is where we get to the next step in the evolution of our little alien friend here, which, as many evolutionary leaps are, was mostly a hardware upgrade. I mean, if you have been following the news about AI at all, you've probably heard this term GPU. GPUs, components that go into data centers. Or the company Nvidia, the most valuable company in history, that makes these things. Its story, of course, wrapped up in the frenzy around the future of artificial intelligence. These things and this company have been at the center of the conflict between China and the US when it comes to export controls. The idea here is for the US to kind of limit the ability for China to catch up when it comes to AI.
And interestingly, what these GPUs, these graphics processing units, were originally designed for was computer games, video games, things like that. And what they're really good at is just doing a bunch of different math problems all at once. Exactly. It's just all about multiplying and adding numbers as fast as you can. There's some other things, but by and large, like, just do those two things and we're off to the races. And doing these math problems all at once, which is called parallel processing, that's exactly what these learning machines needed to do. Some version of that super complicated multiverse prediction thing we discussed. Sure. Sure. So with these GPUs and this new parallelized architecture that Google named a transformer, all of a sudden they could get a machine to parse those longer sentences and give at least reasonable answers to more complicated questions. All right. But what really sent these AI chatbots into the stratosphere was a kind of knock-on effect of this parallel processing. Because when you can process everything at the same time, in parallel, you can actually train on a lot more material in the same amount of time. And so eventually they just gave it basically the entire internet. Almost everything we humans have ever said on the internet as its training material. And started sending that through this network of light bulbs and wires that was just unimaginably big. Like, to get a sense: in our smaller example with the circle, there's something like a thousand and some odd parameters, right? A thousand or so of those wires. GPT-3, which is kind of dated by today's standards, when it came out had 175 billion parameters. 175 billion things that could be tweaked. Yeah. And the ones that we have now, they're trillions of parameters. And, as I said, they fed basically all the things we humans have ever said on the internet into this thing.
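The "multiplying and adding numbers" point can be made concrete: one layer of the network is just a big batch of independent multiply-and-add problems, which can be done one wire at a time or all at once with identical results. That independence is what GPU-style parallel hardware exploits. (A toy sketch with invented sizes, using numpy's batched math to stand in for the parallel version.)

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=256)            # input activations (the "bulbs")
W = rng.normal(size=(128, 256))     # connection weights (the "wires")

# One at a time, like a single core walking wire by wire:
serial = np.array([sum(W[i, j] * x[j] for j in range(256)) for i in range(128)])

# All at once, the way parallel hardware treats the same layer:
parallel = W @ x

print(np.allclose(serial, parallel))  # True: same math, just batched
```

Each of the 128 output bulbs needed only multiplies and adds, and none of them depended on the others, so all 128 can run simultaneously.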
Throwing way more training examples and way more compute at it than anyone would reasonably think to do. Slowly, they started to notice that with a sufficiently large amount of data, on a sufficiently large model, run with sufficiently many cycles of training, these new computers do seemingly intelligent things. Now, a lot of what I just described was written up in a paper called Attention Is All You Need. And these findings are really what unlocked these large language models like ChatGPT. And that's all it was really intended for. But there was a passage in there saying, we think this can work for images and video. And indeed, that turns out to work. That same basic model of massive parallel processing with tons of input, that could predict the next part of an image or sound. The moment civilization was transformed. And that moment, that realization, is really what triggered the explosive proliferation of artificial intelligence, different kinds, practically different species of AIs, that we are living amongst today. New artificial intelligence systems that can teach themselves superhuman skills. ChatGPT. GPT-3. GPT-4. Apple Intelligence. DALL-E. An app called Lensa. Bard. It's called Midjourney. Text-to-video. Art generated by AI. It's crazy. Look at this. So I don't know what AI it is they're using. Yes, it feels like an episode of Black Mirror. So it's like all of these different apps doing all of these different things, in all these different mediums. They're taking in a huge amount of examples. And then they're using fancy math to basically predict the next word, the next pixel, the next note. And from that, it's like generating this whole huge diversity of new stuff. Yeah, basically. And I mean, it's also, just as we described, doing something that I don't totally understand that's more holistic than just looking at the thing that happens next.
But it is drawing on the examples it's been given to decide what should happen next. Which suddenly sounds not so simple. And it does send you into a spiral, because it's like, is what I do any different from that? Just spewing out some iteration of everything else I've seen before this. Yeah, but first of all, you're not pulling from the whole internet, right? Like, you have to depend on just the limited things you've experienced or can even maybe remember. That's fair. And your, like, math is also just way sloppier. It's not as accurate. Yeah. So at that point, and maybe we shouldn't even go here, but there's this one other thing that you can control in these models, which is called the temperature. Which is, like, this final knob you get to tweak on the thing. And so, I think it's: if you have the temperature all the way down, it will give you the most likely thing to come next. If you turn the temperature up a little bit, though, it then is going to pick, like, the second or third most likely thing to come next. So you can control, like, how precise you want the math. Like, you can say, I want it a little stanky. Yeah, like, there's a little bit of randomness in it then that it's then acting upon in what it does next. So maybe you just want the temperature turned up on, like, every third word. So that there's this almost spontaneous-feeling, serendipitous creation, active creation, that comes out of this rigid math. Like, it's like something startlingly creative might just be a less right answer. A less right answer. Wow. Yep. And just by doing that, it's going to keep doing stuff that we are going to get increasingly uncomfortable with. Yeah. Like, right now there is an AI-generated song on the Billboard country charts. Really? I didn't hear about that. Like, if that's the case, I see no way that eventually a fully AI-generated film won't hit the box office. Like, that's just going to happen.
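The temperature knob as described can be sketched directly: the model's scores get divided by the temperature before picking a word, so low temperature always takes the single most likely word, and higher temperature lets the second- and third-most-likely words through. (The three scores below are invented; a real model emits one score per word in its whole vocabulary.)

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    """Pick a word index. Temperature 0: always the most likely word.
    Higher temperature: flatter odds, so less likely words get chosen too."""
    if temperature == 0:
        return int(np.argmax(logits))
    z = np.asarray(logits, dtype=float) / temperature
    p = np.exp(z - z.max())
    p /= p.sum()                          # softmax over the rescaled scores
    return int(rng.choice(len(p), p=p))

logits = [3.0, 2.5, 0.5]   # made-up scores for three candidate words
rng = np.random.default_rng(0)
print(sample_with_temperature(logits, 0, rng))  # 0: the single "most right" answer
picks = {sample_with_temperature(logits, 2.0, rng) for _ in range(200)}
print(picks)  # with the heat turned up, the runner-up words appear too
```

The "a little stanky" setting is just a temperature somewhere above zero: still mostly the likely words, with occasional less-right ones mixed in.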
But when it happens, it will be only because of all of this math. To me, I think the thing it makes me realize is, when you see under the hood, what you see is less like something spooky and ethereal. Yeah. Like, there are times when it gets spooky. Like, there will be a time, like, I'll be listening to, like, an AI-generated podcast, and then one of the hosts breathes. And I'm like, wait, that's so weird. Like, it doesn't even need oxygen. Why is it breathing? And now it's like, oh, because, you know, that's just the next statistical thing that would come in that sentence: a breath. That, to me, is much less eerie, because you can see where it got it from. Right. But, well, okay, I do have one bit more for you. Because, I don't know, I still found myself wondering how it will feel as these things get better and better. And in particular, what it'll feel like in the moments we sit across from it and it is better than us at something we have spent our lives working on. That it is better than us at something we truly love. Yeah, many, many people, or also all my friends, tell me, like, wow, you are the first professional Go player to be famous because you lost the game. No. Right. Yeah, it's me. And so I got in touch with this guy, Fan Hui. And I'm a professional Go player. So, three times European champion. So real quick: Go. It is an ancient Chinese game, considered probably the most complicated board game in the world to teach a computer to play, because of just how open-ended it is. All you really need to know is: you are trying to control as much of the board as possible. You go back and forth with your opponent, placing one stone at a time, and you control portions of the board, or territory, by either, like, cordoning off sections of it or encircling your opponent's stones. It's a very simple idea, but it's difficult because, with such simple rules, there are just this crazy number of ways the game can play out.
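That "crazy number of ways" can be sanity-checked with quick integer arithmetic: a 19-by-19 board where each point is empty, black, or white gives an upper bound on configurations. (Most such arrangements are illegal; the exact legal count, computed by Tromp and Farnebäck, is about 2×10^170, which still dwarfs the roughly 10^80 atoms in the observable universe.)

```python
# Rough upper bound on Go board configurations:
# 361 points, each empty, black, or white.
positions = 3 ** (19 * 19)
atoms = 10 ** 80                 # rough atom count of the observable universe

print(len(str(positions)))       # 173 digits, i.e. about 10^172
print(positions > atoms ** 2)    # True: more than atoms squared
```

So even before counting move sequences rather than board states, the game space outruns any brute-force enumeration.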
In fact, folks like to say that there are more possible ways for a Go game to go than there are atoms in the known universe. Yes. Anyhow, back to Fan. I remember I discovered Go aged like six, in my school in Xi'an. And I feel something. Oh, this game I can play. And I progress very quick. One year after I learned Go, for my school, I'm number one. Three years after that, I'm in the best team in the province. And not long after that, I stopped my school. I only learn Go game. I mean, for years. Every day, only thing you do is just play Go game. 12 hours. 12 hours of play. Yeah, it's no joke. Around age 15, he went pro. And somewhere along the way, he says, he noticed this almost magical quality of the game. Go for me, it's like mirror. A mirror. Mirror? Yeah. Because when you play, you can see your mind on the board. He says all the choices you make, whether you're aggressive and attack, or are patient and waiting, you know, in a sense, how you think stares back up at you. Exactly. Print. Mind print. And your opponent's mind, he says, it's printed there too. So I play with someone. I don't know him. I never talk with him. I play one game, I know him. This is magical. But this mirror of his, well, it was about to get shattered. 2015, this Mr. Hassabis, researcher at Google. Some email, like, we have some very exciting Go project. Can you come to our office, visit? We will show you our project. I tell you, yes, okay, why not? And what they showed him was this thing called AlphaGo. It was a computer that had learned how to play the game, and they asked him, like, will you play against this? So I tell him, okay, we can play the game, because I will win. It's just a program. What can you do? You can win with me? Never. It's like zero percent chance to win this. Zero percent. And why were you so confident? Because I know the best program this moment, I can give six stones handicap. Handicap. Got it.
So how can you possibly make the technique make the huge difference in just months? It's impossible. A month later, in this windowless office room, Fan faced off with this computer, and its human stone-placing helper, in a best-of-five-game match. That first game, all the game, I feel good. I think I will win. But at the end of the game, with just a few stones left to play... I make some mistake and I lost my first game, okay? You know, he's thinking, I was sort of arrogant going into this. I was overly confident. So next game, I will be careful. I will play more seriously. I will win the game. So the next day, next game, he sits down at the board, starts carefully placing his stones, and it's looking good on the board. But inside his head... I feel something really difficult. Very difficult. Very difficult. I like fight, but AlphaGo don't fight with me. And if I want to take something, AlphaGo gave me very easily. Looking down at the board, he was not able to see his opponent's mind in the way he always had. No. There was no bravery. There was no subterfuge that he could sense. I see AlphaGo want to this. AlphaGo want to that. But why he want to this? You cannot find it. You can't. And so he didn't know how to respond to it. His mind started to race. Good move, bad move. Good move, what mean? Bad move, what mean? Good move, what says my teacher? Good move, what says my student? Everybody, all my friends. And he realized that, with all these emotional pushes and pulls, that eventually... I will make mistake. But AlphaGo, no. Never. When you think about this, the confidence is crash. It's crash, all crash. And I lost again, very, very badly. And I lost again, for third, fourth, and the last one. Yeah, damn. But you know, this experiment is really good for me. This is the moment I really see myself. Really, you think AlphaGo taught you to be more... Myself, yeah. I think this is AlphaGo teaching me about that. And why? Because I see myself.
So it's like AlphaGo teach me that, our life, we will always lost, lost, lost, lost. Sorry, it's real life. It's our life. I think this is a human. This is important for us. I think what he saw in that game as he was losing was kind of what you were saying about seeing under the hood, making AI less spooky. Like, you could see it wasn't magic. It was math with no mistakes. Right. And when he saw himself, you know, like, not being the perfect Go player in any given moment, or in every given moment. Like, that's what makes him a person. A person who could love something but still lose at it. Maybe feel bad about that. And then use that feeling to figure out what to do next. Today I teach the Go in China, still, with all that. Right, why are you teaching Go when the computer will always win? And yes, yes, but be careful. Because I think that all you experience, to learn, is still useful. So don't worry. It will be coming. You can do nothing. Accept it and just learn. Yeah, I get that. Before we wrap this thing up, I wanted to put all of this in front of someone. And not an AI person, but somebody with a really wide scope on technology and history. And so I went to this guy. Tom Mullaney, a professor of modern Chinese history at Stanford University. I worked with him years back on a story about typing in Chinese. And he's just one of the most thoughtful and informed people I know. Yeah, that means a lot to me. So how would you respond? Well, I mean, everyday life is, at its core, a study of this awful, amazing, horrifying, never-ending surprise of what it means to be born and live and die as a human. And even if, at the end of the day, an AI is orders of magnitude smarter, AI, just by my definition, cannot suffer and rejoice and live and die in quite the same way that humans can. In the same way that we cannot live and die and suffer and comprehend and feel the way an octopus can.
I mean, the only thing an AGI will be able to do is contemplate, my goodness, what does it mean to be an AI? And so I am not worried at all about what AI means with regard to meaning, human identity, what it means to be human, or any of that. Well, that was very beautiful. And while I love that, I'm still like, but this is going to mess everything up so badly. I don't know if this is going to get weird down to the fabric. But fast-forward this, you know, 20, 30 years, if we're still around, at a sort of climate-change level. And another future human is sitting in this fabric-altered world. It will still be a group of humans rejoicing, suffering. It will still be that condition. And so it's kind of, for me, it's a little bit of a liberatory time. It's, like, great, maybe we get to free up a little bit more space to get back to work thinking about how to be human. Because we have not, we have not even come close to solving that issue. Special thanks to Stephanie Yinn and the New York Institute of Go for teaching us the game; to Mark, Daria, and Leavon; to Barbara Svenich. And of course, thank you to Grant Sanderson for his unending patience explaining the math of neural nets to us. Grant is kind of like your favorite math nerd's favorite math nerd. His YouTube channel is 3Blue1Brown. Check it out.
This story was reported and produced by Simon Adler, with original scoring and sound design by Simon Adler. Which brings me to the last unsavory thing I have to say, which is goodbye to Simon Adler, who happens to be one of our best reporter-producers here at the show, and also a friend. He's going off to, among other things, pursue his music career, and this was his last episode on staff with us. Chances are, if you list out your favorite episodes from the last 11 years at the show, more than a couple will be his. Could be some of the tech stories he did. He did stories about drones in Ukraine, about content moderation on Facebook. Could be some of the international stories he did. He reported about the hunt for an endangered rhino in Namibia. He did a story about a species of raccoon on the Caribbean island of Guadeloupe. He did a lot of stories about democracy as well. Covered a town, Seneca, Nebraska, that voted itself out of existence. He did a story back in 2017 about a New York City Council race where the campaign manager was a little-known guy named Zohran Mamdani. Besides being a killer reporter, not to mention composer and interviewer, Simon has also spent so many hours coaching an entire generation of staffers and interns. He's so generous with his expertise and his time. Really someone who makes everyone around him better. Anyway, we have been so lucky to have him as part of our nerdy band for 11 years. Check out his band, Windstar Enterprises, on Instagram. That's Simon and another fellow former Radiolabber, Alex Overington. We already miss you, Simon, and good luck out there. Oh, you want me to say this? That's fun. Hi, I'm Cordelia, and I'm from New York City, and here are the staff credits. Radiolab is hosted by Lulu Miller and Latif Nasser. Soren Wheeler is our executive editor. Sarah Sandbach is our executive director. Our managing editor is Pat Walters. Dylan Keefe is our director of sound design. Our staff includes Simon Adler, Jeremy Bloom, W.
Harry Fortuna, David Gebel, Maria Paz Gutierrez, Sindhu Gnanasambandan, Matt Kielty, Mona Madgavkar, Annie McEwan, Alex Neason, Sarah Qari, Anisa Vizza, Ariane Wack, Molly Webster, and Jessica Young. With help from Rebecca Rand. Our fact-checkers are Diane Kelly, Emily Krieger, Anna Pujol-Mazzini, and Natalie Middleton. Leadership support for Radiolab's science programming is provided by the Simons Foundation and the John Templeton Foundation. Foundational support for Radiolab was provided by the Alfred P. Sloan Foundation. Every day, WNYC Studios is working to get closer to New York and to New Yorkers. The underwriting we get from businesses helps power our independence. Learn how your organization can join in at sponsorship.wnyc.org.