What's Your Problem?

The AI Pioneer Developing New Kinds of Medicine

47 min
Jul 17, 2025
Summary

Jakob Uszkoreit, co-inventor of the Transformer model that powers ChatGPT and modern AI, discusses how he developed the breakthrough 'attention is all you need' concept at Google in 2017. He now runs Inceptive, a company using AI to design new RNA-based medicines by creating training data and applying machine learning to molecular design.

Insights
  • The Transformer breakthrough came from reimagining language processing - instead of reading sentences word by word sequentially, the model reads everything simultaneously and refines understanding through multiple passes
  • Google's decision to openly publish the Transformer research rather than keep it proprietary was based on a belief that advancing AI would 'lift all boats' and benefit everyone
  • The lack of training data is the biggest bottleneck for applying AI to drug discovery - while there are 200,000 known protein structures, there are only ~1,200 known RNA structures
  • AI-designed therapeutics are moving from theory to practice, with the first fully AI-designed molecules entering clinical trials within the next five years
  • Biology's solutions are often more robust and resilient than human-designed alternatives, suggesting AI may be better suited than traditional approaches for understanding biological complexity
Trends
  • AI-designed therapeutics moving from research to clinical trials
  • Shift from sequential to parallel processing in AI model architectures
  • Growing focus on RNA-based medicines following mRNA vaccine success
  • Need for cryptographic certification of human-generated content
  • AI democratization of personalized education
  • Foundation models being adapted for biological applications
  • Generative AI expanding beyond language to images, proteins, and molecules
  • Increased concern about AI manipulation versus existential risk
  • Data generation becoming as important as algorithm development
  • Open research culture shifting toward more proprietary approaches
Companies
Google
Where Uszkoreit developed Transformers; made strategic decision to publish research openly
Inceptive
Uszkoreit's AI drug discovery company focused on designing RNA-based medicines
OpenAI
Built ChatGPT using Transformer architecture; shifted from open to closed research model
Nvidia
Manufactures GPUs that enable parallel processing crucial for Transformer models
DeepMind
Developed AlphaFold2 that solved protein folding problem using machine learning
Khan Academy
Example of impressive AI applications in democratizing education
Pushkin
Podcast network producing the show
iHeart
Podcast platform and advertising network mentioned in show sponsorships
People
Jakob Uszkoreit
Co-inventor of Transformers, CEO of Inceptive, main guest discussing AI and drug discovery
Jacob Goldstein
Host of What's Your Problem podcast interviewing Uszkoreit
Illia Polosukhin
Co-author on the Transformer paper who was leaving Google, enabling risk-taking on the project
Neal Stephenson
Author of 'The Diamond Age' featuring AI tutor concept that inspired Uszkoreit
Quotes
"If I were gonna pick one paper from the past decade that had the biggest impact on the world, I would choose one called Attention Is All You Need, published in 2017."
Jacob Goldstein
"What if instead of reading the sentence one word at a time, from left to right, we read the whole thing all at once?"
Jakob Uszkoreit
"We had the best number and we also at that point were able to establish that we've gotten there with about 10 times less energy or training compute spend."
Jakob Uszkoreit
"DNA is merely the place where life takes its notes. Maybe the hard drive and the memory."
Jakob Uszkoreit
"We still haven't built machines that can fix themselves, which is fundamentally the miracle of being a human being."
Jakob Uszkoreit
Full Transcript
4 Speakers
Speaker A

This is an iHeart podcast.

0:00

Speaker D

Pushkin. If I were gonna pick one paper from the past decade that had the biggest impact on the world, I would choose one called Attention Is All You Need, published in 2017. That paper basically invented Transformer models. You've almost certainly used a Transformer model if you have used ChatGPT or Gemini or Claude or DeepSeek. In fact, the T in ChatGPT stands for Transformer. And Transformer models have turned out to be wildly useful, not just at generating language, but also at everything from generating images to predicting what proteins will look like. In fact, Transformers are so ubiquitous and so powerful that it's easy to forget that some guy just thought them up. But in fact, some guy did just think up Transformers. And I'm talking to him today on the show. I'm Jacob Goldstein and this is What's Your Problem?, the show where I talk to people who are trying to make technological progress. My guest today is Jakob Uszkoreit. And just to be clear, Jakob was one of several co-authors on that Transformer paper. And on top of that, lots of other researchers were working on related things at the same time. So a lot of people were working on this. But the key idea did seem to come from Jakob. Today, Jakob is the CEO of Inceptive. That's a company that he co-founded to use AI to develop new kinds of medicine. The company is particularly focused on RNA. We talked about his work at Inceptive in the second part of our conversation. In the first part, we talked about his work on Transformer models. At the time he started working on the idea for Transformers, around a decade ago now, there were a couple of big problems with existing language models. For one thing, they were slow. They were in fact so slow that they could not even keep up with all the new training data that was becoming available. A second problem: they struggled with what are called long-range dependencies.
Basically, in language, that's relationships between words that are far apart from each other in a sentence. So to start, I asked Jakob for an example we could use to discuss these problems, and also how he came up with his big idea for how to solve them. So pick a sentence that's going to be a good object lesson for us.

1:11

Speaker C

Okay, so we could have the frog didn't cross the road because it was too tired.

3:40

Speaker D

Okay, so we got our sentence.

3:45

Speaker C

Yep.

3:47

Speaker D

How would the sort of big powerful but slow to train algorithm in 2015 have processed that sentence?

3:48

Speaker C

So basically it would have walked through that sentence word by word. And so it would walk through the sentence left to right. The frog did not cross the road because it was too tired.

3:56

Speaker D

Which is logical, which is how I would think a system would work. It's more or less how we read, right?

4:11

Speaker C

It's how we read, but it's not necessarily how we understand. That was actually integral to how we then went about trying to speed this all up.

4:17

Speaker D

I love that I want you to say more about it when you say it's not how we understand. What do you mean?

4:28

Speaker C

So on one hand, right, linearity of time forces us to almost always feel that we're communicating language in order, just linearly. It actually turns out that that's not really how we read. Not even in terms of our saccades, in terms of our eye movements. We actually do jump back and forth quite a bit while reading. And if you look at conversations, you also have highly non-linear elements where there's repetition, there's reference, there's basically different flavors of interruption. But sure, by and large we would say we certainly write them left to right. You write a proper text, you don't write it as you would read it, and you also don't write it as you would talk about it. You do write it in one linear order. Now, as we read this and as we understand this, we actually form groups of words that then form meaning. So an example of that is adjective plus noun, or, say, in this case, article plus noun. It's not a frog, it's the frog. Right. We could have also said it's the green frog or the lazy frog.

4:34

Speaker B

Right?

5:46

Speaker D

Language has a structure, right? And things can modify other things and things can modify the modifiers.

5:46

Speaker C

Exactly, exactly. But the interesting thing now is that structure as a tree structured, clean hierarchy only tells you half the story. There's so many exceptions where statistical dependencies, where modification actually happens at a distance.

5:53

Speaker D

So, okay, so just to bring this back to your sample sentence, the frog didn't cross the road because it was too tired. That word, it is actually quite far from the word frog. And if you're an AI going from left to right, you may well get confused there. Right? You may think it refers to road instead of to frog. So this is one of the problems you were trying to solve, and then the other one you were mentioning before, which is these models were just slow because after each word, the model just recalculates what everything means. And that just takes a long time.

6:10
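The bottleneck described here, recalculating after every word, can be sketched in a few lines. This is a toy illustration, not the actual 2015 models: the embedding scheme and the recurrent update rule are invented for demonstration, and the point is only that each step depends on the previous one, so the steps cannot run in parallel.

```python
import math

def embed(word):
    # Hypothetical stand-in for a learned word embedding (4 dimensions).
    return [((sum(map(ord, word)) * (i + 3)) % 97) / 97.0 for i in range(4)]

def rnn_step(state, x):
    # One simplified recurrent update mixing the previous state with the new word.
    return [math.tanh(0.5 * s + 0.5 * xi) for s, xi in zip(state, x)]

def encode_sequentially(sentence):
    state = [0.0] * 4
    steps = 0
    for word in sentence.split():
        state = rnn_step(state, embed(word))  # step t needs step t-1: no parallelism
        steps += 1
    return state, steps

final_state, steps = encode_sequentially(
    "the frog didn't cross the road because it was too tired")
print(steps)  # 11 strictly ordered steps, one per word
```

Because the loop carries `state` from one word to the next, an 11-word sentence forces 11 dependent steps, which is exactly what plays badly to parallel hardware.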

Speaker C

They can't go fast enough. Exactly. It takes a long time. And it doesn't play to the strengths of the computers of the accelerators that we're using there.

6:44

Speaker D

And when you say accelerators, I know Google has their own chips, but basically we mean GPUs now, right? We mean the chips that Nvidia sells. What is the nature of those particular chips?

6:53

Speaker C

Exactly. So the nature of those particular chips is that instead of doing a broad variety of complex computations in sequence, they are incredibly good, they excel, at performing many, many simple computations in parallel. And so what this hierarchical or semi-hierarchical nature of language enables you to do is, instead of having, so to speak, one place where you read the current word, you could now imagine you actually look at everything at the same time, and you apply many simple operations at the same time to each position in your sentence.

7:05

Speaker D

So this is the big idea. I just want to. This is the big idea because this is it, right? This is the breakthrough happening.

7:52

Speaker C

Yes.

7:57

Speaker D

It's basically what if instead of reading the sentence one word at a time, from left to right, we read the whole thing all at once?

7:58

Speaker C

All at once. And now the problem is clearly something's got to give, right? There's no free lunch in that sense. You have to simplify what you can do at every position when you do this all in parallel. But you can now afford to do this a bunch of times, one after another, and revise it over these steps. So instead of walking through the sentence from beginning to end, an average sentence in prose has like 20 words or so, instead of walking those 20 positions, what you're doing is you're looking at every word at the same time, but in a simpler way. But now you can do that maybe five or six times, revising your understanding. And that, it turns out, is way faster on GPUs. And because of this hierarchical nature of language, it's also better.

8:06
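The idea of looking at everything at once, a handful of times, can be sketched as a stack of simple attention passes. This is a toy illustration with invented embeddings and the simplest possible attention rule, not a trained Transformer; what it shows is that within one pass every position's output is independent of the others, so a GPU can compute them all simultaneously.

```python
import math

def embed(word):
    # Invented deterministic "embedding"; a real model learns these vectors.
    return [((sum(map(ord, word)) * (i + 3)) % 97) / 97.0 for i in range(4)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pass(vectors):
    # Each position compares itself against all positions at once; within a
    # pass, every output can be computed independently (hence in parallel).
    out = []
    for q in vectors:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in vectors]
        weights = softmax(scores)  # how much each other word informs this one
        out.append([sum(w * v[d] for w, v in zip(weights, vectors))
                    for d in range(len(q))])
    return out

words = "the frog didn't cross the road because it was too tired".split()
vectors = [embed(w) for w in words]
for _ in range(6):  # "maybe five or six times, revising your understanding"
    vectors = attention_pass(vectors)

print(len(vectors), len(vectors[0]))  # 11 positions, 4 dimensions each
```

Six passes over 11 positions replace 11 strictly ordered steps with 6 rounds of fully parallel work, which is the trade described above.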

Speaker D

So you have this idea, and as I read the little note on the paper, it was in fact your idea. I know you were working with the team, but the paper credits you with the idea. So let's take this idea, this basic idea of look at the whole input sentence all at once, a few times, and apply it to our frog sentence. Give me that frog sentence again.

8:59

Speaker C

The frog did not cross the road because it was too tired. Good.

9:19

Speaker D

Tired is good because that's unambiguous. Hot could be either one. It could be the road or the frog.

9:24

Speaker B

Right?

9:28

Speaker C

Hot could be either one. Exactly. And in fact, hot could also be non-referential: because it was too hot outside.

9:28

Speaker D

Outside, it could be any of three things. The weather or the frog or the road.

9:37

Speaker C

Exactly.

9:41

Speaker D

I love that. Tired solves the problem. So your model, this new way of doing things, how does it parse that sentence? What does it do?

9:42

Speaker C

So basically, let's look at the word it, and look at it in every single step of this, say, handful-of-times repeated operation. Imagine you're looking at this word it. That's the one that you are now trying to understand better. And you now compare it to every other word in the sentence. So you compare it to the, to frog, to did, not, cross, the, road, because, too, tired. And already in the first pass, a very simple insight the model can fairly easily learn is that it could be strongly informed by frog, by road, by nothing, but not so much by to, or by the, or maybe only to a certain extent by was. But if you want to know more about what it denotes, then it could be informed by all of these.

9:54

Speaker D

And just to be clear, that sort of understanding arises because it has trained in this way on lots of data. It's encountering a new sentence after reading lots of other sentences with lots of pronouns with different possible antecedents.

11:00

Speaker C

Yeah, exactly, exactly. So now the interesting thing is that which of the two it actually refers to doesn't depend only on what those other two words are. And this is why you need these subsequent steps, because let's start with the first step. So what now happens is that, say, the model identifies frog and Road could have a lot to do with the word it. So now you basically copy some information from both frog and road over to it. And you don't just copy it, you kind of transform it also on the way, but you refine your understanding of it. And this is all learned. This is not given by rules or in any way pre specified. Right.

11:15

Speaker D

Just by training on logical.

12:03

Speaker C

Just by training, this emerges.

12:05

Speaker D

And so sort of the meaning of it after this first step is kind of influenced by both frog and road.

12:07

Speaker C

Yes, both frog and road. Okay, so now we repeat this operation again, and we now know that it is unsure. Or the model basically now has this kind of superposition, right? It could be road, it could be frog. But now in the next step, it also looks at tired. And somehow the model has learned that when it means something inanimate, tired is not the thing. And so maybe in the context of tired, it is more likely to refer to frog. And maybe the model has figured that out already, or maybe it needs a few more iterations, but it is most likely to refer to frog because of the presence of tired.

12:14
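That two-step story, "it" undecided at first and then tilted toward "frog" by "tired", can be made concrete with hand-picked numbers. Everything here is invented for illustration: the two-dimensional vectors (animate, inanimate) and the mixing weight are chosen by hand, whereas a real model learns them from data.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

frog  = [1.0, 0.0]   # animate
road  = [0.0, 1.0]   # inanimate
tired = [1.0, 0.0]   # a property of animate things

# Pass 1: "it" starts with no preference, so frog and road score equally.
it = [0.5, 0.5]
w_frog, w_road = softmax([dot(it, frog), dot(it, road)])

# Pass 2: "it" has absorbed some of "tired", tilting it toward animate.
it = [it[0] + 0.5 * tired[0], it[1] + 0.5 * tired[1]]
w_frog2, w_road2 = softmax([dot(it, frog), dot(it, road)])

print(w_frog == w_road, w_frog2 > w_road2)  # True True
```

After the second pass the attention weight on frog rises above the weight on road, which is the "superposition collapsing" behavior described in the conversation.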

Speaker D

So it has solved the problem.

13:00

Speaker C

But it has solved the problem.

13:02

Speaker D

So you have this idea, you try it out. There's a detail that you mentioned that's kind of fun, and we kind of skipped it. You mentioned that another one of the co-authors, who has also gone on to do very big things, was about to leave Google when you sort of wanted to test this idea. And the fact that he was about to leave Google was actually important to the history of this idea. Tell me about that.

13:04

Speaker C

That was important. So this was Illia Polosukhin. At the time this started to gain any kind of speed, Illia was managing a good chunk of my organization. And the moment he really made the decision to leave the company, he ultimately had to wait for his co-founder, and for them to then actually get going together in earnest. And so he had a few months where he knew, and I also knew, that he was about to leave, and where the right thing would of course be to transition his team to another manager, which we did immediately, but where he then suddenly was in a position of having nothing to lose and yet quite some time left to play with Google's resources and do cool stuff with interesting people. And so that's one of those moments where suddenly your appetite for risk as a researcher just spikes, right? Because for a few more months you have these resources at your disposal. You've transitioned your responsibilities. At that stage, you're just like, okay, let's try this crazy shit. And that literally, in so many ways, was one of the integral catalysts, because it also enabled this kind of mindset of, we're going for this now. Whatever the reason, it still affects other people. And so there were others who joined that collaboration really, really early on who I feel were much more excited and, as a result, much more likely to really work on this and to really give it their all, because of his nothing-left-to-lose, I'm-going-to-go-for-this attitude at this point.

13:27

Speaker D

Was there a moment when you realized it worked?

15:17

Speaker C

There were actually a few moments. And it's interesting because on one hand, it's a very gradual thing initially. It took us many months to get to the point where we saw significant first signs of life, of this not just being a curiosity, but really being something that would end up being competitive. So there certainly was a moment when that started. There was another moment when we for the first time had one machine translation challenge, one language pair of the WMT task, as it's called, where our score, our model, performed better than any other single model. The point in time when I think all of us realized this is special was when we not only had the best one in one of these tasks, but in multiple. And we didn't just have the best number. We also at that point were able to establish that we'd gotten there with about 10 times less energy, or training compute spend.

15:19

Speaker D

Wow. So you do 1/10 the work and you get a better result.

16:23

Speaker C

1/10 the work and you get a better result. Not just across one specific challenge, but across multiple, including the hardest, or one of the harder ones. And at that stage, we were still improving rapidly. And then you realize, okay, this is for real. Because it wasn't that we had to squeeze those last little bits and pieces of gain out of it. It was still improving fairly rapidly, to the point where, by the time we actually published the paper, we again reduced the compute requirements, not quite by an entire order of magnitude, but almost. It still was getting faster and better at a pretty rapid rate. So in the paper, we had some results that were those roughly 10x faster on 8 GPUs, and what we demonstrated in terms of quality on those eight GPUs, by the time we actually published the paper properly, we were able to do with one GPU.

16:27

Speaker D

One GPU, meaning one chip of the kind that people buy 100,000 of now to build a data center.

17:25

Speaker C

Exactly.

17:32

Speaker D

So the paper actually at the end mentions other possible uses beyond language for this technology. It mentions images, audio and video, I think explicitly. How much were you thinking about that at the time? Was that just like an afterthought, or were you like, hey, wait a minute, it's not just language?

17:32

Speaker C

By the time it was actually published at a conference, not just the preprint, by December, we had initial models on other modalities, on generating images. At that time they were not performing that well yet, but they were rapidly getting better. We had the first prototypes of models working on genomic data, working on protein structure.

17:53

Speaker D

That's good foreshadowing.

18:16

Speaker C

Good foreshadowing as well. Exactly. But then we ended up, for a variety of reasons, we ended up at first focusing on applications in computer vision.

18:17

Speaker D

The paper comes out, you're working on these other applications, you're presenting the paper, it's published in various forms. What's the response like?

18:26

Speaker C

It was interesting because the response built in deep learning AI circles, basically between the preprint that I think came out in, I want to say June 2017 and then the actual publication, to the extent that by the time the poster session happened at the conference, there was quite a crowd at the poster. So we had to be shoved out of the hall in which the poster session happened by the security and had very hoarse voices by the end of the evening.

18:35

Speaker D

You guys were like the Beatles of the AI conference.

19:12

Speaker C

I wouldn't say that. We weren't the Beatles, because it was really still very specific.

19:16

Speaker D

You were more of the cool hipster band. You were the IT hipster band.

19:22

Speaker C

Certainly more of the cool hipster band. But it was an interesting experience because there were some folks, including some greats in the field who came by and said, wow, this is cool.

19:26

Speaker D

What has happened since has been wild.

19:36

Speaker C

It seems wild to say the least. Yes.

19:40

Speaker D

Is it surprising to you?

19:43

Speaker C

Of course. Many aspects are surprising, for sure. We definitely saw pretty early on, already back in 2018, 2019, that something really exciting was happening here. But I'm still surprised that, with the advent of ChatGPT, something that didn't go way beyond those language models we had already seen a few years before was suddenly the world's fastest growing consumer product ever, right?

19:46

Speaker D

I think. Ever.

20:20

Speaker C

Ever. Yes.

20:21

Speaker D

And by the way, GPT stands for Generative Pre-trained Transformer, right? Transformer is your word.

20:22

Speaker C

That's right.

20:29

Speaker D

So there's an interesting, I don't know, business side to this, right? Which is, you were working for Google when you came up with this. Google presumably owned the idea.

20:30

Speaker C

Yep.

20:41

Speaker D

Had intellectual property around the idea.

20:41

Speaker C

Has filed many a patent.

20:44

Speaker D

Was it just a choice Google made to let everybody use it? Like, when you see the fastest growing consumer product in the history of the world, not only built on this idea but using the name, and it's a different company. That was five years later. Five years later, but a patent's good for more than five years. Is that a choice? Is that a strategic choice? What's going on there?

20:45

Speaker C

So the choice to do it in the first place, to publish it in the first place, is really based on and rooted in a deep conviction at Google at the time, and I'm actually pretty sure it still is the case, that these developments are the tide that lifts all boats.

21:07

Speaker D

Like a belief in progress.

21:31

Speaker C

A belief in progress, exactly.

21:33

Speaker D

It's a good old-fashioned belief in progress.

21:34

Speaker C

It's also the case that at the time, organizationally, that specific research arm was unusually separated from the product organizations. And the reason why Brain, or in general the deep learning groups were more separated was in part historical. Namely that when they started out there were no applications and the technology was not ready for being applied. And so it's completely understandable and just a consequence of organic developments that when this technology suddenly is on the cusp of being incredibly impactful, you're probably still underutilizing it internally and potentially also not yet treating it in the same way as you would have maybe otherwise treated previous trade secrets, for example, because it.

21:36

Speaker D

Feels like this out-there research project, not like what's going to be this consumer product.

22:31

Speaker C

Exactly, exactly. And to be fair, it took OpenAI in this case a fair amount of time to then turn this into this product. And most of that time, also from their vantage point, it wasn't a product. Right. So up until all the way through ChatGPT, OpenAI published all of their GPT developments. Maybe not all, but a very large fraction of their work on this.

22:39

Speaker D

Yeah, their early models, the whole models were open.

23:07

Speaker C

Exactly. They were more true to their name, really also believing in the same thing. And it was only really after ChatGPT, and after this success that to a certain extent surprised them too, that they started to become more closed as well when it comes to scientific developments in this space.

23:09

Speaker D

We'll be back in just a minute.

23:33


Speaker D

Let's talk about your company. When did you decide to start Inceptive?

26:22

Speaker C

The decision took a while and was influenced by events that happened over the course of about two to three months in late 2020, starting with the birth of my first child. So when Amre was born, two things happened. Number one, witnessing a pregnancy and a birth during a pandemic, where there's a pathogen that's rapidly spreading. All of that was a pretty daunting experience, and everything went great. But having this new human in my arms also really made me question if I couldn't more directly affect people's lives positively with my work. I was at the time quite confident that indirectly it would have an effect on things like medicine, biology, et cetera. But I was wondering, couldn't this happen more directly if I focused more on it? The next thing that happened was that the AlphaFold2 results at CASP14 were published. CASP14 is this biennial challenge for protein structure prediction and some other related problems.

26:26

Speaker D

This is the protein folding problem and.

27:38

Speaker C

This is the protein folding problem. Exactly.

27:40

Speaker D

Machine learning solving the protein folding problem, which had been a problem for decades: given a chain of amino acids, predict the 3D structure of a protein.

27:42

Speaker C

Precisely.

27:49

Speaker D

And humans failed and machine learning succeeded. Just amazing.

27:50

Speaker C

Yes, it's a great example. Humans failed despite the fact that we actually understand the physics fundamentally. But we still couldn't create models that were good enough using our conceptual understanding of the processes involved.

27:54

Speaker D

You would think an algorithm would work on that one. Right? You would just think an old school set of rules, like we know what the molecules look like, we know the laws of physics. It's amazing that we couldn't predict it that way. Right. All you want to know is what shape is the protein going to be. You know, all of the constituent parts, you know, every atom in it. And you still couldn't predict it with a set of rules. But AI machine learning could. Amazing.

28:08

Speaker C

Yes. And it is amazing, actually, when you put it like this. It's important to point out that when we say we understand it, we make massive oversimplifying assumptions, because we ignore all the other players that are present when a protein folds. We ignore a lot of the kinetics of it, because we say we know the structure, but the truth is we don't know all the wiggling and all the shenanigans that happen on the way there. Right. And we don't know about, you know, chaperone proteins that are there to influence the folding. We don't know about all sorts of other.

28:33

Speaker D

I'm doing the physics one. I'm doing the "assume a frictionless plane" version of protein folding.

29:04

Speaker C

Precisely.

29:08

Speaker D

Which is why it did work.

29:09

Speaker C

Precisely. And the beauty is that deep learning doesn't need to make this assumption. AI doesn't need to make this assumption. AI just looks at data, and it can look at more data than any human, or even humanity eventually, could look at together. It's such a good example problem to demonstrate that these models are ready for primetime in this field and ready for lots of applications, not just one or two, but many.

29:10

Speaker D

Sold.

29:32

Speaker C

And so that happened. Sold. Exactly. And then the third thing was that these Covid mRNA vaccines came out with astonishing 90-plus percent efficacy right out of the gate. And how fast they came out.

29:33

Speaker D

Still so underrated. At the beginning of the pandemic, people were like, it'll be two or three years, and if they're 60% effective, that'll be great.

29:49

Speaker C

Exactly, exactly. And so everybody forgets. Everybody forgets it. And when you look at it, this is a molecule family that, for most of the time that we've known about it, since the '60s I suppose, we've treated like a neglected stepchild of molecular biology.

29:56

Speaker D

Because you're talking about RNA in general.

30:12

Speaker C

RNA in general.

30:15

Speaker D

Yeah. Everybody loves DNA, right? DNA is.

30:16

Speaker C

Everybody loves DNA.

30:19

Speaker D

Movie star.

30:19

Speaker C

Yeah, exactly, exactly. Even though now looking back, DNA is merely the place where life takes its notes. Maybe the hard drive and the memory.

30:20

Speaker D

It's the book, right?

30:30

Speaker C

It's the book. But at the end of the day, it was this molecule family that was about to save, depending on the estimate, tens of millions of lives, and in rapid time. So all these things held, but we had no training data to apply anything like AlphaFold to this specific molecule family. No training data to speak of. We had 200,000 known protein structures at the time; optimistically, we had maybe 1,200 known RNA structures. And on top of that, it was also fairly clear that for RNA, going directly to function would be much, much more important, because it's, in a certain sense, a weaker, less strongly structured molecule, and other aspects of the molecule might play a bigger role. And then on top of that, the attention that generative AI was receiving overall, now also in the field of pharma or of medicine, was building. And so I ended up finding myself in a conversation where a very wise longtime mentor of mine pointed out that, you know, maybe 10 years from now, somebody could tell my daughter that there was this perfect storm: this macromolecule with no training data was about to save the world and could do so much more in the direction of positively impacting people's lives; we didn't have training data, and it would be very expensive to create it, but we had the technologies that I'd been working on for the last 10-plus years and, because of the attention people were now giving to AI in this field, the ability to raise quite a bit of money; and that I, in that position, chose to stay at my cushy dream job in big tech and not take this opportunity to really positively impact people's lives. And that idea was not one I was willing to entertain.

30:31

Speaker D

You couldn't just coast it out at Google and let somebody else go figure out RNA.

32:27

Speaker C

Yeah. And it's not just RNA; I think RNA is a great starting point, at the end of the day. But building models that learn, first of all from all the publicly available data that we can possibly get our hands on, but also from data that we can reasonably effectively create in our own lab, how to design molecules for specific functions: that is now within reach, and it will, in the next years and in the years to come, have a completely transformational impact on how we even think about what medicines are. So any opportunity to speed this up, to make this happen even just a day sooner than it otherwise could have, is incredibly valuable, in my opinion.

32:32

Speaker D

As you're talking about this, the absence of training data seems to be at the center of it. Right? It seems to be the core problem. Which makes sense. Right? Like, the reason language works so well is basically because of the Internet. I know now we're going beyond it, but it just happened that there was this incredibly giant set of natural language that became available. We don't have anything like that for RNA. So are you... I mean, is it kind of step one at Inceptive, creating the data? Is that kind of what's happening?

33:17

Speaker C

So step one at Inceptive is, or was, learning to use all the data; I think we've made a lot of progress in that direction. Learning to use all the data that is already available, identify what other data we're missing, see how far we can get with just the publicly available data, and at the same time scale up generating our own data. And it turns out that, because of the nature of evolution, because evolution isn't actually incentivized to really explore the entire space of possibilities, it is almost always a given that if you are trying to design exceptional molecules, especially ones that are not, say, natural formats, you are basically guaranteed to need novel training data.

33:51

Speaker D

Yeah. Basically you're saying you build RNAs that don't exist in the world, that have therapeutic uses, and there's, kind of definitionally, no training data for that, because they don't exist.

34:40

Speaker C

The funny thing is, we have a few of them, and so we have existence proofs: RNA molecules, for example RNA viruses, that actually exhibit incredibly complex functions in our cells, that do all sorts of things that we don't usually like. But if we could use those for good, if we could use those in ways that would actually be aimed at fighting disease rather than causing it, those kinds of functions, even just a small subset of them, would really transform medicine already. And so we know it's possible.

34:49

Speaker D

What are you dreaming of when you say that? What are you thinking of specifically?

35:26

Speaker C

Okay, so, for example, one estimate is that in order for Covid to infect you, you would need potentially as few as five Covid genomes inside your organism. That's it: five.

35:29

Speaker D

Five viral particles.

35:44

Speaker C

Five viral particles?

35:45

Speaker D

Yeah.

35:46

Speaker C

You inhale those. You wouldn't have to inject them, you wouldn't even have to swallow them. You'd inhale them.

35:47

Speaker D

"What if we could have a medicine that worked as well as a disease?" is a version of your dream?

35:54

Speaker C

Exactly, exactly. So at the end of the day, right, this medicine is able to spread in your body only into certain types of organs and tissues and cells. It does certain things there that are really quite complex, changing the cell's behavior. Again, not usually, in this case, in favorable ways, but still in ways that wouldn't have to be modified that much in order to potentially be exactly what you would need for complex, multifactorial medicine. And if you could make all of that happen by just inhaling five of those molecules, then again, that would completely change how you think about medicine. But you also have viruses that aren't immediately active, that are inactive for long periods of time in your organism, and only under certain conditions, say under certain immune conditions, really start being reactivated. Why can't we have medicines that work in a similar way, not only in a vaccination sense, but where you take a medicine for a genetic predisposition to a certain disease, a medicine that waits until the disease actually starts to develop, and only then, and only where that disease starts to develop, becomes active and actually fights it, and potentially also alerts the doctor through a blood.

35:58

Speaker D

Test, like for cancer cells or something. So you have some kind of prophylactic medicine in your body, and it is encoded in such a way that it just hangs out there, like herpes, to take a pathological example. And only in certain settings does it do anything. And those settings are: if you see a cancer cell, destroy it; otherwise, just sit there.

37:13

Speaker C

Precisely. And if you can design those also in ways where you can just make them all go away when you take, say, a completely harmless small molecule. And that's, again, entirely feasible.

37:35

Speaker D

Sure. So, I mean, you're dreaming big. These are wonderful, big, science-fictiony dreams, and I hope you figure them out. On a practical level, what's happening at the company right now? How many people work there, what are they doing, and what have they figured out so far?

37:46

Speaker C

We're around 40. What we're doing is really exactly what we just talked about: we're scaling data-generation experiments in our lab that allow us to assess a variety of functions of different molecules, mostly RNA, actually mostly mRNA at the moment, that are relevant to a pretty broad variety of diseases. This ranges from infectious-disease vaccines to cell therapies that can be applied in oncology or against autoimmune disease. We have mRNAs that we hope will eventually be effective as enzyme-replacement therapies for a large family of rare diseases. And the list goes on. And so we're creating, or growing, this training data set that eventually, on top of foundation models that we pretrained on all publicly available data, allows us to tune those foundation models toward designing exceptional molecules for exactly those applications, and many more sharing similar properties.

38:00

Speaker D

So you basically build new mRNA molecules and test them, and then you give that data to your model, and presumably it tells you what to build next, or it helps you figure out what to build next. It's sort of a loop in that way.

39:14

Speaker C

The models are definitely one interesting source of proposals, if you wish, for what to synthesize and test next. They're not the only such source; we also explore in maybe less guided, or heuristically guided, ways. But exactly: in some of the cases it's really quite iterative. And for some of those functions, and for some of those modalities and diseases or disease targets, we're actually already at a point where our models can spit out entirely novel molecules, really unlike anything they've ever seen or we've ever seen in nature, that very consistently perform quite favorably compared to pretty strong baselines by incumbents in the field.
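The propose-test-retrain loop described in this exchange can be sketched in a few lines. This is a hypothetical toy, not Inceptive's actual system: `propose_candidates`, `lab_assay`, and `fine_tune` are invented stand-ins, and the "assay" is just a fake GC-content score used as a placeholder for a real wet-lab measurement.

```python
import random

BASES = "ACGU"

def propose_candidates(model_state, n=4, length=12):
    # stand-in for sampling novel sequences from a trained generative model
    rng = random.Random(model_state["seed"])
    return ["".join(rng.choice(BASES) for _ in range(length)) for _ in range(n)]

def lab_assay(seq):
    # stand-in for a wet-lab measurement (e.g. expression level);
    # here a toy score: GC fraction as a fake proxy for stability
    return (seq.count("G") + seq.count("C")) / len(seq)

def fine_tune(model_state, measurements):
    # stand-in for updating the model on newly generated training data
    model_state["data"].extend(measurements)
    model_state["seed"] += 1  # pretend the update changes what we sample next
    return model_state

model = {"seed": 0, "data": []}
for round_ in range(3):  # each round: propose -> synthesize/test -> retrain
    candidates = propose_candidates(model)
    measurements = [(seq, lab_assay(seq)) for seq in candidates]
    model = fine_tune(model, measurements)

print(len(model["data"]))  # 12 measurements accumulated over 3 rounds
```

The point of the sketch is the closed loop: the model is one source of proposals, the lab produces the labels, and each round's measurements become the next round's training data.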

39:28

Speaker D

When you say perform quite favorably compared to baselines by incumbents in the field, I mean, does that on some level mean better than what experts would think.

40:13

Speaker C

Up, better than what experts can think up, and also better than what more traditional machine-learning tools can easily produce.

40:22

Speaker D

It's like that famous moment in the Go match when AlphaGo made some move that no human being would ever have thought of.

40:30

Speaker C

Move 37. Yes. So I would say we've long passed move 37, in the sense that our understanding of the underlying biological phenomena is so incomplete that, for most of the things we're able to design for, we don't really understand why they happen.

40:37

Speaker D

When you say "we," do you mean at Inceptive, or do you mean just medicine in general?

40:58

Speaker C

I would say just medicine in general.

41:02

Speaker D

Okay. So Inceptive is doing this very kind of high-level work, right? I mean, building what will hopefully be the foundation. What's the right amount of time in the future to ask about, when will we know if it works, do you think? Five years?

41:04

Speaker C

So, the general idea of using generative AI and similar techniques to generate therapeutics: there are some things in clinical trials that were largely designed with AI, as far as I know. But we're only now seeing the first trials starting for molecules that were truly entirely designed by AI.

41:19

Speaker D

As opposed to sort of selected from a library.

41:46

Speaker C

Selected, influenced, exactly, selected, adjusted, tuned, tweaked, et cetera. Right. So that's really still only happening just now.

41:49

Speaker D

Okay.

41:58

Speaker C

But we will see, I believe, the first success or a first success of such molecules certainly within the next five years.

41:58

Speaker D

What about, more narrowly, the project at Inceptive?

42:07

Speaker C

It's a similar time frame. We should be able to get molecules into the clinic in the next few years, certainly in the next handful of years. Now, these will not be molecules where the objective that we used in their design is even remotely as complex; the different functions that we're designing for are not going to be even remotely as diverse as, say, what you would find in, because we used this example earlier, an RNA virus. These will really be simpler. Those will be molecules that don't do things that we couldn't possibly have done before, but that do them much better, in ways that are more accessible, in ways that come with fewer side effects.

42:10

Speaker D

What biotech largely is, is making protein drugs. And so if you could make an mRNA drug, where you put the mRNA into the body and the body makes the protein, it wouldn't be some crazy sleeper cell that sits in your body for 20 years or whatever, but it might be a more practical alternative to today's biotech drugs.

42:59

Speaker C

Absolutely.

43:18

Speaker D

So you've had a kind of crash course in biology in the last few years.

43:19

Speaker C

Yes.

43:23

Speaker D

And I'm curious, like, what is, what is something that has been particularly compelling or surprising or interesting to you that you have learned about biology?

43:23

Speaker C

There are countless things. The biggest one, or the red thread across many of them, is really just how effective life is at finding solutions to problems, solutions that on one hand are incredibly robust, surprisingly robust, and on the other hand are so different from how we would design solutions to similar problems. It really comes back to this idea that we might just not be particularly well equipped, in terms of cognitive capabilities, to understand biology: basically, we would never think to do it this way, and how we think to do it is oftentimes much more brittle.

43:32

Speaker D

Brittle is an interesting word. Less resilient, less able to persist under different conditions.

44:28

Speaker C

Exactly, exactly. I mean, we still haven't built machines that can fix themselves, for one, which.

44:33

Speaker D

Is fundamentally the miracle of being a human being.

44:38

Speaker C

Just fundamentally the miracle, after going through all this. Exactly, exactly, exactly. And of course, this is true across the scales, from single cells all the way to complex organisms like ourselves. And really, just how many very different kinds of solutions life has found, and constantly is finding. You see this all over the place. And it's daunting, humbling, but also incredibly inspiring when it comes to applying AI in this area. Because, again, I think that at least so far it's the best tool, and maybe actually the only tool we have so far, in the face of this kind of complexity, to really design interventions, medicines, that go way beyond what we were able to do, or are able to do, just based on our own conceptual understanding.

44:41

Speaker D

We'll be back in a minute with the lightning round.

47:36

Speaker D

Let's finish with the Lightning Round. As an inventor of the transformer model, are there particular possible uses of it that worry you? Make you sad?

47:45

Speaker C

I am quite concerned about the p(doom) doomerism, whatever you want to call it, the existential-fear-instilling rhetoric that is in some cases actually promoted by people, by entities, in this space.

47:56

Speaker D

So just to be clear, you're not worried about the existential risk, you're worried about people talking?

48:14

Speaker C

I'm worried about the existential risk being inflated or the perception being inflated to the extent that we actually don't look enough at some of the much more concrete and much more immediate risks. I'm not going to say that the existential risk is zero. That would be silly.

48:19

Speaker D

What is a concrete and immediate risk that is, you think under discussed?

48:42

Speaker C

These large-scale models are such effective tools for manipulating people in large numbers, already today, and it's happening everywhere, for many, many different purposes, by in some cases benevolent and in many cases malevolent actors, that I really firmly believe we need to look much more at things like enabling cryptographic certification of human-generated content. Because doing that with machine-generated content is not going to work, but we definitely can cryptographically certify human-generated content as such.
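The certification idea being discussed can be illustrated with a toy sketch. A real scheme would use public-key signatures (e.g. Ed25519), so anyone can verify an attestation without holding the signing key; the HMAC below is a simplified symmetric stand-in, and the key and function names are invented for illustration.

```python
import hashlib
import hmac

# Assumption: a trusted signer (e.g. a device attesting that a human produced
# this content) holds a secret key. Anyone holding the same key can check the
# attestation; any edit to the content invalidates it.
SIGNER_KEY = b"hypothetical-signer-key"

def certify(content: bytes) -> str:
    """Return an attestation tag over the content."""
    return hmac.new(SIGNER_KEY, content, hashlib.sha256).hexdigest()

def verify(content: bytes, tag: str) -> bool:
    """Constant-time check of the tag against the content."""
    return hmac.compare_digest(certify(content), tag)

article = b"I wrote this paragraph myself."
tag = certify(article)
print(verify(article, tag))         # True: content unchanged
print(verify(article + b"!", tag))  # False: content was altered
```

The design point mirrors the argument in the conversation: certifying provenance at creation time is tractable, whereas reliably detecting machine-generated content after the fact is not.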

48:47

Speaker D

Basically watermarking or something, some way to say a human made this. Exactly. What would you be working on if you were not working in biology?

49:24

Speaker C

Besides drug development? Education. Using artificial intelligence to democratize access to education.

49:33

Speaker D

What have you seen that has been impressive or compelling to you in that regard?

49:41

Speaker C

There are lots of little examples so far, really countless. There's what's happening at Khan Academy. There are many examples of AI applied to education problems in places like China, for example. And you have a bunch of very compelling examples in fiction, a book I really like by a guy named Neal Stephenson, The Diamond Age: Or, A Young Lady's Illustrated Primer, that I recommend if you just...

49:46

Speaker D

Everybody in AI talks about that.

50:11

Speaker C

Well, now they do. Yeah.

50:13

Speaker D

Yeah. Well, now they do. You liked it before? It was cool, I'm sure.

50:15

Speaker C

At one point I thought it was really, really important to ensure that Neal Stephenson knows that we are about to be able to build the Primer. And so I ended up having coffee with him to tell him, which was great. So, at the end of the day, maybe the biggest inspiration there is my daughter. She's four and a half now, and I think she could read okay today, but she could read at grade-school level if she had access to an AI tutor teaching her how to read.

50:18

Speaker D

Does your daughter use AI, use AI chatbots?

50:55

Speaker C

Not directly, without me. But we've actually used ChatGPT to implement an AI reading tutor that works reasonably well. I mean, we basically did what they call it now, vibe coding. We vibe coded it, and she wasn't there for all of it, it took some time, but she was there for some of it.

51:00

Speaker D

Oh, you vibe coded it with her?

51:20

Speaker C

Yeah, well, I mean, she was there. She witnessed a good chunk of it, yes. Although she was more interested in the image-generation parts. But yeah, we have a sketch of one that she quite enjoys. That's kind of the extent of her using AI directly at this age.

51:22

Speaker D

Jakob Uszkoreit is the CEO and co-founder of Inceptive and the co-author of the paper "Attention Is All You Need." Just a quick note: this is our last episode before a break of a couple of weeks, and then we'll be back with more episodes. Please email us at problem@pushkin.fm. We are always looking for new guests for the show. Today's show was produced by Trina Menino and Gabriel Hunter-Chang. It was edited by Alexander Garrison and engineered by Sarah Brugiere.

Speaker A

This is an iHeart podcast. Guaranteed Human.

53:39