The AI-Powered Biohub: Why Mark Zuckerberg & Priscilla Chan are Investing in Data, from Latent.Space
Mark Zuckerberg and Priscilla Chan discuss the 10-year anniversary of the Chan Zuckerberg Initiative and their pivot to focus primarily on AI-powered biology through the Biohub network. They announce the acquisition of EvolutionaryScale and their vision for creating virtual cells that could revolutionize precision medicine by enabling N-of-1 treatments tailored to individual biology.
- The convergence of frontier AI labs with frontier biology labs creates unprecedented opportunities for scientific discovery when designed to work in sync rather than separately
- Building comprehensive biological data sets now is critical infrastructure for future AI models, similar to how decades of protein data enabled AlphaFold's success
- The traditional scientific funding model fails to bring together interdisciplinary teams needed for major breakthroughs, requiring new institutional approaches like the Biohub model
- Precision medicine's future lies in moving from clinical trial-and-error to predictive modeling based on individual genetic variants and cellular behavior
- Private philanthropy plays a unique role in funding long-term, high-risk scientific research that government and traditional funding mechanisms cannot support
"Our mission is to cure, prevent all diseases. And that's not going to happen just in our four walls. So the strategy has to be how do we make every single scientist and everyone better and more effective?"
"I think part of what we're trying to unlock here with biohub is the idea of what happens if you do frontier biology and frontier AI in sync together"
"That's the future I want to live in, where we can actually understand individuals as individuals and use the biology and science very directly to keep them well"
"When you ask biologists, there's a lot of questions around, okay, that's really ambitious. Are we going to be able to do that? And then when you ask AI people, it's like, that should be really easy."
Hello and welcome back to the Cognitive Revolution. Today I'm excited to share a special crossover episode from the Latent Space Podcast. Latent Space, surveys say, is the number one podcast for AI engineers, and I find hosts swyx and Alessio a consistently outstanding source of insight into the latest trends in AI-powered coding and AI application development. Today's episode, though, is a bit different: a conversation with Mark Zuckerberg and Priscilla Chan, who are celebrating the 10-year anniversary of the Chan Zuckerberg Initiative, about why they are doubling down on the interdisciplinary Biohub with the goals of leading a new era of AI-powered biology and ultimately equipping scientists to cure or prevent all disease in the coming decades. In this conversation, Mark and Priscilla describe their perspective on the current state of biology, the role they see AI playing going forward, and the strategy underlying their investments, with highlights including: why the traditional funding model fails to bring scientists, engineers, and AI experts together to tackle the most important problems in the way we might hope; their vision for a frontier biology lab that works in sync with a frontier AI lab; the acquisition of EvolutionaryScale, creators of the leading protein model ESM3, and the appointment of CEO Alex Rives to lead the combined program; their plan to develop new data collection techniques, which will naturally give rise to massive data sets on which new AI models can be trained; the roadmap to a virtual cell capable of simulating biological responses in silico, potentially revolutionizing not just drug discovery but our understanding of biology in general; and their ultimate vision for precision medicine, moving from clinical trial and error to true N-of-1 treatments designed based on each individual's unique biology.
While the conversation itself focuses on the intersection of AI and biology, for me it also serves as an important reminder of the unique role that private capital often plays in scientific progress and, considering the current moment, the importance of classical liberal values more broadly. Ironically, for all we hear that the US must win the AI race to ensure that the best AI models project American rather than Chinese values around the world, I see actors across the political spectrum pushing America toward a more Chinese model of state dominance. The civil rights violations and abuses of power we're seeing right now from the federal government are plainly un-American, and I've been glad to see prominent voices in the AI space, including Jeff Dean at Google, Dario and Chris Olah at Anthropic, and various researchers at OpenAI, speaking up against them. If anything, I think the AI industry ought to consider doing more, starting by signaling that they would be willing to withhold their technology from a government that proves itself unworthy to wield such power. But at the same time, and certainly to a much lesser degree right now, I do also worry that recent proposals for confiscatory taxes, if enacted, would make moonshots like the Biohub much rarer. I do agree, of course, with Warren Buffett that it's absurd that he pays a lower tax rate than his secretary. But at a time when the federal government is cutting research budgets and generally acting against medical advice, society stands to benefit tremendously when self-made tech billionaires turn their formidable talents and immense resources to solving global problems and providing public goods. And even more generally, as fears of concentration of AI power grow, it only seems prudent that society should maintain a diverse set of independent power centers which can hopefully balance and exist in equilibrium with one another.
Such checks and balances were of course central to the Framers' vision for the US government, and in my view, they remain essential for societal dynamism and resilience today. With that, I'm grateful to swyx for allowing me to cross-post this conversation. I of course recommend subscribing to the Latent Space feed, where they've just brought on new hosts to cover AI for science on a dedicated basis. And I hope you enjoy this preview of the future of AI-driven biology and medicine with Mark Zuckerberg and Priscilla Chan from the Latent Space Podcast.
0:00
Hey everyone, welcome to the Latent Space Podcast. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space.
4:28
Hello. We're so delighted to be in the Imaging Institute of CZI with literally C and Z.
4:34
Welcome Mark and Priscilla, thanks for having us. Thanks for getting nerdy.
4:40
Yeah, we're excited to do this.
4:43
We so don't often get to see this side of you and so thank you for taking some time out to talk about this. And it's like sort of the 10 year anniversary kind of of CZI, so I just wanted to introduce people if people have not been caught up. One of the interesting things that we found out just from talking to your teams. There's an interesting difference between how you guys started CZI and the Gates foundation and I heard that Bill Gates is a mentor of yours, so maybe you could tell that story of deciding to start CZI and deciding to pursue basic science instead of translational work.
4:45
Well, I mean I think one of the core things for us with CZI was just getting started earlier or we got some advice that basically philanthropy and doing science just like any other discipline requires practice and you're not going to be good at it overnight. So we should just kind of dig in and start doing a few different iterations on it and see what we enjoy and where we think we can have an impact and go from there. So, yeah, I mean, like you mentioned, I mean, this is. We're coming up on in November, the 10 year anniversary of when we started CZI. And you know, there's a lot of work that we're really proud of that we've been a part of, including work in education and supporting communities. But when we reflect on it, we feel like the work that we've done in science really has had the biggest impact and in a lot of ways is accelerating. And especially with all the advances in AI that are coming, I think the ability to have an even bigger impact over the coming decade, it just seems really clear like this is coming into focus. So for the next period, we really want to make science the main focus of what we're doing. And specifically the biohub organization that we're really proud of, this model that we've helped pioneer that we can go into detail on is really going to be like the main focus of our philanthropy. And it's just something that we're very excited about.
5:19
Yeah, when we started 10 years ago, we had this idea like, okay, I bring experience as a physician. Mark's an engineer and he builds things. And we have an opportunity to give back resources to make an impact on this world. And we sort of just, we tried a bunch of things. And the thing that in running a philanthropy, I'm incredibly envious of people who run companies is that like, you guys can have a dashboard and there's like financial results and people tell you if you're on the right track, on the wrong track, and there's clarity. But in philanthropy, there's so much you can do and it takes a long time for you to get a sense of like, what has momentum, what are we doing that is actually bringing all of our both skills and resources to maximal impact. So over the past 10 years, I would say we've been getting a sense of what is that thing that really allows us to have the impact and makes the most of what we bring to the table. And it's really been around AI and biology where we're like, oh my gosh, this is it. And you know, the ecosystem is big. We really think our ability to bring great scientists, great AI researchers together between the wet lab and the compute, the ability to bring physicians and patients into the picture, that's a unique niche for us at the biohub. And that's, you know, we need others to take the work to translation. The Gates foundation has a strong focus on translation and the field. And we have had a number of really awesome collaborations and continue to where we really look at sort of the basic fundamental research. And being able to partner with someone who's thinking about the translation layer is incredible.
6:41
We kind of see the first decade, and I would love to get your take as a decade of creating data, creating a science ecosystem, and then starting to work on some of the models. And the next decade maybe is more of the applied modeling side. At what point did you decide that just doing the tooling was better versus you could have cured malaria in Africa too, or some other disease?
8:30
Yeah, I mean, take a step back. And this is kind of related to your first question too. Like Priscilla was saying, the space is huge. You know, there are lots of other philanthropies, including Gates, who I think would say that they're primarily focused on public health and sort of administering. Like, once you know what a cure is, just getting it out to the world is a huge thing too. And someone needs to do that. And that's a lot of work and a lot of resources, and it's good that they're doing that. Basic science is another completely different part of the kind of innovation funnel to enable that. And our view is that the federal government basically dwarfs everyone else in terms of how much they invest through NIH. But there's a certain pattern to how they invest, which is really enabling a lot of individual investigators to do work. And our kind of observation was that if you look at the history of science, a lot of major advances are basically preceded by new tools or new ways of observing things. So the initial telescope allowed a lot of advances in astronomy. The microscope, the invention of that allowed a lot of understanding of biology. And similarly, I think we're at a point in history where a lot of new tools are being built, computational tools, tools to instrument the body in different ways and understand things. And often that tool development just takes a longer-term timeframe and sometimes a larger commitment of capital. And the way to do it isn't necessarily just to make grants to a lot of different people. You need to really operate it yourself. Which I think is one thing that's different about the way that we've operated than others: most times when you think about philanthropy, you think about giving money away in terms of grants. And a lot of what we're doing is actually building up these institutes and building labs to do that kind of research ourselves by bringing in leading scientists and engineers and all that.
But that's kind of the strategy. We feel like there's a lot of new tools to develop. There's sort of been a hole in the ecosystem where tool development and kind of the 10-to-15-year runway that you need to do that and often hundreds of millions of dollars to build things like the microscopes and imaging that you're seeing in this institute here. I think that that's been sort of underfunded and that's where we think that if we do that kind of work, it can just give all these other scientists way more tools to accelerate the pace of research, hopefully discover cures. And then you have folks who are focused on public health who bring that out to the world and kind of deploy it to everyone.
8:54
Yeah, I mean our mission is to cure, prevent all diseases. And that's not going to happen just in our four walls. So the strategy has to be how do we make every single scientist and everyone better and more effective? And you know, the strategy Mark talked about is sort of where we landed on how to actually maximally move the field forward.
11:24
Yeah.
11:46
The mission is cure, prevent all diseases. By the way, a lot of people outside of the CZI world still kind of find this concept very alien, but talking to the CZI people, they really truly believe it. And it's impressive how you pick the right mission to motivate everyone to work towards this enormous task.
11:47
Well, it's kind of a funny thing. We like to talk about the mission as like helping scientists do it.
12:03
Right.
12:09
Because we're not actually curing the diseases, we're just trying to build the tools.
12:09
Tools, data models.
12:12
Yeah, like basically accelerating scientists' work towards that. But you know, a funny thing about it is we had this initial time frame of by the end of the century. And you know, when you ask biologists, there's a lot of questions around, okay, that's really ambitious. Are we going to be able to do that? And then when you ask AI people, it's like, that should be really easy. Like, why are you so unambitious that you're shooting for just the end of the century? And I do think that at the pace that AI is improving things, I think it might be possible significantly sooner than that. I mean, I don't think it's necessarily worth putting a number on it or a date. But I think that to your point, the first decade was sort of about doing work like the cell atlas to be able to help understand basically all of the kind of specifics and data about all the different configurations of every cell in the body. When we did that, we kind of had this vague notion that that would be useful to advance science. But I think that, like a lot of people in the tech industry, we have even been impressed by how quickly AI has accelerated. But that ended up being a really valuable thing to have done over the last 10 years, especially for where AI is now, and now the models that can get built with that.
12:14
But the thing that's interesting, don't you agree, is like, okay, so from a. I totally agree that in our intersection of AI and biology, the AI folks are like, yep. The biologists are like. And I think it's actually that confluence of conversations that lead both the biologists to be like, okay, I'm really uncomfortable about this idea and timeline, but if I'm really pinned down to think about it, what are. Like, you really force people to think through? Like, okay, what are actually the barriers? What would you need to do? And you're forcing that conversation from the biologist side and from the AI side, really getting a sense of, okay, data is not just data. You guys know this. You need to know sort of how the data was collected and from where. And being able to connect the AI researchers to the folks who are actually gathering the data on a daily basis makes their work better. And so it's that conversation that's happening here that I think makes people outside so excited about this because it's credible, and they sort of have worked, really dug in and thought through how that would work, and they're excited and they believe. And believing is the first step.
13:28
Believing is the first step. There's a general pattern of software eating the world, and I think AI eating the world is kind of like the next version of this. I was talking with Garrett outside, who says he's a biologist, but I think he's using models like SAM from Meta.
14:43
You're like, you don't look like only a biologist.
14:55
What does a biologist look like?
14:58
I don't know.
15:00
I'm just working on models out there. And that's like, biologists are working using models, right? They're not just in imagination, like, just using in the wet lab.
15:00
Yeah, totally. I think, referencing the wet lab, one of the key approaches that you're pursuing is the virtual cell, turning things from mostly wet lab into something in silico. How far along are we?
15:09
I mean, it's pretty early, right? I mean, I think the first step, which I think is easy to overlook is basically what Priscilla was talking about of just getting these folks together. It almost. It's worth taking a beat just to talk about this, just because I think most people assume that this is like, obviously you would go do that, but it's somewhat novel in science because of, I think, the way that a lot of funding has been done that is basically you grant individual teams, relatively small grants, and people do a lot of science independently. It is, I think, pretty amazing how much progress you can make if you just have people from different disciplines sit together. I mean, this is like over my career, both at Meta and here. It's like you have teams that are not working together for some reason or they disagree on something. It's like, okay, physically just have them next to each other. And it actually is super helpful. So here, what are we doing? It's not just bringing together the biologists and the engineers, which was a core part of the initial biohub model, but it was also unlocking the ability for people to work together across institutions. So the first biohub that we started out here between Stanford, UCSF and Berkeley, allowed a lot more collaboration between scientists and engineers at those universities than was in practice happening before. And it's like, you can look at this and be like, all right, that seems really obvious. But it actually was sort of an interesting and novel experiment and one that I'm really happy to see others also implementing, because I think it's just such a clear win, just the kind of the human side of bringing people together and having them sit together. 
So anyway, that I would say is kind of step one or step zero and is probably quite overlooked, but is sort of a fundamental part of the model that I guess also goes back to this idea of we're not just kind of granting funds to other people, we're building an institution and we're having people sit together. So then you get that, and then you get these people who are like half biologist, half AI engineer, because they kind of have some experience doing it. And I don't know. I mean, we can talk through the specific models, and there's a lot of exciting stuff there, but I'd say it's an early glimpse of where this is all going. I think you want to kind of build up these models hierarchically, so you give them a lot of data about specific proteins, and they can model specific proteins in the cells, and then you can model different cell behavior. And then eventually you kind of zoom out and you're modeling a virtual immune system or something like that. And it's sort of hard to simulate the immune system without having a good understanding of how a cell might work. And it's kind of hard to understand or simulate how a cell might work if you don't really understand how the proteins interact. So you kind of need systems that understand data at all different levels of this and then you kind of pull them together. And then if you look at the different models, there are versions that are kind of focused on, all right, which parts of the genome are being expressed in different ways. The cryo model that I think is very interesting, that's built off of the data here, is the only model that I'm aware of that's a spatial model of basically how these cells work. And you just want to be able to look at stuff from different perspectives and then put them together, and you build a richer and richer model of how these cells work. But we are definitely at the beginning of this journey.
15:24
But it's like slow and fast. Slow and fast, right? So when we built the Human Cell Atlas, we started 10 years ago; it was one of our first RFAs. And actually, the first RFA was to fund the methodologies of how you would get a single-cell transcriptome. And it took us about 10 years to get to a place where we now have one of the largest corpora of RNA transcriptomes. 125 million cells cost a lot of money. And the really cool thing we discovered through that process was if we could seed the effort and make it easy for people to contribute, it happened. That's CELLxGENE. We actually were responsible for maybe 25% of the data and the rest of the ecosystem contributed 75% of that. That's an incredible asset and has been very important in modeling work. Similarly, if you look at AlphaFold, they built off publicly available data that was collected for 30 years prior. Right. So that takes a long time. But now we're doing the Billion Cells Project, and that is taking months and at a fraction of the price. You know, really slow to fast, but it's a single dimension, and cells are so complicated. And here we're looking, like Mark said, at the three-dimensional imaging, and it's slow and expensive, but with the cryo model it will get fast again, and you just have to repeat it. And so I think we'll get growth spurts, but it's all happening just faster and faster.
20:52
How do you think about the layers? So you have compute, and we'll talk about that later. On the data side, you build these amazing microscopes. I learned that they're all built for you to spec; they're not off-the-shelf designs. How much of a bottleneck is that still? Can we convert the world of atoms into bits at an acceptable rate now, or do we need more work on the microscopes themselves too?
23:45
I mean you're never done, right?
24:12
Yeah, well, here speed has been a big question, just getting the process through. So here we've worked on the speed at which we can look at tomograms and the contrast and resolution, and that's where the laser phase plate comes in, to be able to make the data better and get the data faster. But it's a bottleneck insomuch as there are only, I don't know the exact number, maybe tens of these microscopes in the world. So that's one bottleneck. And I think really, like when I was saying it's slow and then fast, there are so many other dimensions that we don't have yet. The cool thing here is that with the transcriptome work, we're looking at cellular expression, and with the imaging work, you're able to localize it in space, and now you want to connect them. But that's still just two dimensions connected. Time is another dimension. We need to get dynamic imaging in place.
24:14
Oh God, that's so much resolution.
25:11
Yeah, right. But like really cool biological innovation. We need innovation in the way we can look at things, like stain-free, dye-free, so we can look at things without human intervention. Time as a dimension is another, because we are not frozen slices. So I think it's just continuously looking at what the next dimension is that we want to be able to either understand deeply or connect to our existing corpus of data and knowledge.
25:14
And obviously the ideal would be you want to increasingly be able to image things inside living cells. Right. So I mean you can simulate it a bit by, okay, you can take a cell out or some culture. It's all destructive. Yeah, it's like, okay, it's living for a little bit or something. But I mean, you really want to be able to kind of as much as possible actually understand what's going on in living organisms.
25:44
Can that be done? What are the approaches?
26:06
Well, the better it gets.
26:08
Well, there's this cool methodology. There is a really high-intensity X-ray methodology you can use. The organ has to be dead. So you can just shoot high-intensity X-rays at, like, a lung and understand at a sort of molecular level how the lung is assembled. And then you can correlate that with living imagery, right, MRIs of the lungs, CTs of the lungs, and look at the associations between the living images in real patients and the sample that you put into the high-intensity X-ray. So that's another example of correlating data types so that we can get that sort of high-level specificity with clinical data that impacts humans.
26:09
But I mean in some level that's sort of the point about building these AI biological models is you can have a lot of data and you can interpolate on that space and understand that.
26:55
Yes.
27:06
So one of the models that, again, I mean it's really early work, but the rBio model, the idea of doing reasoning is that then you don't just get correlation, but you get some understanding of the logic of how these things fit together too. So yeah, I mean, I think it's probably going to be a while, and people don't have great hypotheses on how you'd actually do molecular imaging of a cell deep inside a living organism. But the goal is to be able to approximate that as much as possible with this kind of surround view of different things that you can image.
27:07
You guys like to see cool stuff. It's not here, but at our San Francisco site we do image see-through fish called zebrafish.
27:43
Zebrafish, yeah.
27:52
That's another good example. It's another good model. All right, it's like, what's a good way to image a living thing? Take a see-through thing.
27:53
Take a see-through thing and then use a model to say, how does this see-through thing actually relate to us? Right. Like, I'm not that interested in curing, preventing, or managing all disease for zebrafish. I am very interested. Mark's pro-zebrafish. I'm okay on zebrafish. But another application of large language models is looking at what is conserved and what is actually relevant and important to the way human biology works. In a fish model, being able to have that translation be more effective, so we don't waste our time on things in a model organism that won't apply, is another really interesting way to elevate biology.
28:00
On the data side, can you just give an overview of how far along we are? Like, what percentage of all cells have we imaged? What's the distribution of them? When you say 150 million to 1 billion cells, is that a lot? Is that 10%?
28:43
The funny thing is, until recently we didn't know how many cell types.
29:00
Yeah, I mean, it's kind of a wild thing. I mean, this was a big part of the cell atlas project. Is like there wasn't even. It's kind of like imagine the periodic table in chemistry. But you, you know, it doesn't end well.
29:03
It's.
29:13
You don't have the squares.
29:14
We know it's billions. We know there are billions of cell types in a human and we've only truly looked at a fraction of them. And we looked at it in largely healthy cells. And so like just the number of permutations of like age, well, species, because not all research is in humans. Right. So species ancestries, like, what is your sort of genetic background? Age, like babies are different than old people, Gender, all of those things actually are permutations. Environmental exposures, all of those things are permutations on the cell that actually you want to be able to understand in healthy and disease states. I feel confident that we are at the beginning of this.
29:15
I'll ask a little bit of obvious question in terms of the intersection of AI and bio, which is don't we want precision in biology? Don't we want some grounding in a world model maybe that we don't normally get in a language model?
29:58
Yeah, I mean, I think that that's sort of the point of doing all the measurement and being able to have all this real. So you have the diffusion model for generating cells that we put out. It's like one of the recent models. And it's cool because you can basically, you have a model now that you can describe the conditions and it'll basically give you a synthetic cell. But yeah, you want it to be increasingly grounded. And that's a lot of the point of the biology and the engineering that we're doing is to be able to have these different facets of that. So the Imaging Institute is one part that gets you the spatial data that's very helpful. And the work that we're doing in the other biohubs on cellular engineering and instrumenting inflammation and things like that, it's basically, it's scientific work to build new types of tools that allow us to measure new types of things that generate data that allow us to ground the models in different ways. One framing that we have on this that I think is pretty interesting is that there's this concept of a frontier AI lab that is like, okay, it's building AI models that are sort of at the frontier of what's possible. And I think you can think about biology in that way too. And there's sort of a concept of a frontier biology lab. Like, what is the idea of labs that are kind of at the cutting edge of building the most advanced imaging, measuring inflammation, or doing cellular engineering in the most advanced ways, whatever the problem space is that you're at. And then I think that there's this interesting problem space of what happens if you're at the intersection of those two areas. You mentioned the work that DeepMind did on AlphaFold, which is great. That's an example of a Frontier AI lab using a data set that was just generated by other scientists over decades. 
But I think part of what we're trying to unlock here with biohub is the idea of what happens if you do frontier biology and frontier AI in sync together, and you're designing the tools on the frontier biology side in order to specifically collect and be able to learn types of data that you then want to feed into specific types of models that you want to build so that it can understand the cells and the body at different types of resolution. I think you can just kind of, I don't know, it's like a much more integrated approach that allows designing the things that you need that should eventually get towards more grounding and not just allowing folks who are good at AI to do the best they can with whatever biological data happens to be available.
30:13
What's the hill climbing in this scenario? So, like with language models, you have benchmarks. You look at the benchmark, you just make that go better. With these things, you have to bring it back to the real world. So as you build these models, how do you bring the two teams together to get feedback?
32:51
I think it's very similar to what Mark just said. You want to be able to validate on the accuracy question. We don't expect that these models... they will get increasingly accurate, but you want to be able to have feedback. And it's not as easy as saying, you know, this output doesn't make sense. You have to actually take it to the wet lab, run the experiment, find out if it actually happened as predicted, and feed it back into the model. And that's the virtuous cycle we want to build to help the AI best serve the biologists and the biologists be part of continuously improving the models.
33:07
From, like, a numbers perspective, in a language model, you can run tens of thousands of tests.
33:42
Very false.
33:48
Yeah.
33:49
And we have to build a lot of them out. Yeah.
33:49
And then on going to the wet lab, what do you think that's going to be like the feedback cycle? As you start to have more of these things to be tested in the wet lab, do you feel like that's going to be a bottleneck, that we cannot take that many?
33:52
I don't know the answer to that yet. I think the throughput on sort of established metrics in the wet lab is actually getting quite fast. You can parallelize a lot of experimentation, but it's not easily at the tens of thousands of verifications. But we actually have to see, and we'll probably need to be smart about how we do it.
34:05
But I mean, there are a lot of people who I think often take these things to the extreme and are like, okay, pretty soon if you have these models, you're just going to be able to run experiments with the models without even having to go to a wet lab. And it's like, no, I mean, I think that that's sort of the biological version of "eventually AI is going to automate every single thing in society." It's like, look, maybe you get there, right. And I think that there's some chance over time, but well before you do, you're going to be able to have models that can help generate hypotheses, and scientists can apply their taste on which of the ideas or suggestions that come from this are worth testing. And then you test them and then you feed it back into the model, which I think is basically the way that every AI model is deployed, even in code and in other places. Totally.
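The loop described here (a model proposes, a scientist selects, the wet lab tests, and results feed back) can be sketched as a toy simulation. Everything below is a hypothetical stand-in: `run_cycle`, the candidate names, the scores, and the "ground truth" outcomes are invented for illustration, and the feedback step simply overwrites a prediction with the measured result rather than retraining anything.

```python
def run_cycle(scores, ground_truth, budget, rounds):
    """Toy predict-then-validate loop (illustrative only).

    scores:       candidate -> model's predicted chance of an effect
    ground_truth: candidate -> what the wet lab would actually observe
    budget:       how many experiments the lab can run per round
    """
    tested = {}
    for _ in range(rounds):
        untested = [c for c in scores if c not in tested]
        # "Scientific taste" is reduced here to picking the top-scoring
        # untested candidates; a real scientist would weigh much more.
        batch = sorted(untested, key=lambda c: scores[c], reverse=True)[:budget]
        for candidate in batch:
            observed = ground_truth[candidate]   # run the experiment
            tested[candidate] = observed
            # Feed the measurement back: replace the prediction with the result.
            scores[candidate] = 1.0 if observed else 0.0
    return tested


# Hypothetical candidates and outcomes, not real data.
scores = {"A": 0.9, "B": 0.2, "C": 0.7, "D": 0.4}
truth = {"A": True, "B": True, "C": False, "D": False}

results = run_cycle(scores, truth, budget=2, rounds=1)
print(results)  # {'A': True, 'C': False}
```

With a budget of two experiments, the lab confirms one prediction and refutes another, and the low-scored candidate "B" (which would have worked) goes untested. That gap is the kind of miss that better-grounded models, or a larger wet-lab budget, would be expected to shrink.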
34:30
Right now, because the wet lab is so expensive and relatively slow compared to computational experimentation, people are choosing, like, "I need something to hit." So people are going for hypotheses or ideas that are, to use a sports analogy, singles or doubles. Anything bigger is just too risky. They only have so much grant funding and they need something to help move their work along. But if we have a model that can help de-risk some of the bigger, riskier ideas, that's going to move science faster. Those ideas can be sourced with AI as a tool, but really it's about making the scientists less hesitant to explore big ideas.
35:20
Yeah, obviously that's a lot of the success of the CZI model, which is serving this part of research that is underserved because there was basically no benefactor or funding mechanism by which to do this. One thing that we're announcing when we release this podcast is this unification of the biohub model. I think it's very analogous to the foundation model and the frontier lab approach, where you bring together people of different disciplines. You have much longer time horizons than anyone else. Are there any other key elements to the strategy of the biohub that you're taking?
36:08
Well, I mean, one thing that we haven't talked about is the EvolutionaryScale team, Alex Rives and his team, joining.
36:41
Let's talk about the announcement.
36:47
Yes, this is probably the most talented team working at the intersection of AI and biology, with a good biology background. And they've been working on ESM3, some of the top protein models, for a long period of time. Yeah. I mean, I think if you want to build an organization that is doing frontier biology and frontier AI, you need to have world-leading AI researchers. And we're doing that by basically combining the team that we have, which has already put out all the models that we're talking about today, with the EvolutionaryScale team, which is very renowned. And Alex is basically going to be running the program. So I think it's sort of an interesting decision to have the AI person be running the overall program, partnering with these leading biologists. I think that gives a sense of how optimistic we are about the AI work being very fundamental to this. But we're very serious about building out a leading lab on the AI side as well. That goes for both the talent and the compute. I think we were probably the first to build out a large-scale compute cluster for biological research. I think now there are some others who are doing it too, but we're also building on that and we plan to release frontier models.
36:49
Do you see that as the 10 year output? Like in the next 10 years?
40:34
We look back at that yesterday. They say it's faster than that.
40:38
But AI people are always, they're always in a hurry.
40:42
We have AGI in two years. Would that be a satisfactory result for you guys? You fast forward 10 years, you have the three best models in biology. Or is there a further goal that you want to have as an output of the foundation?
40:45
I have to bring it back to the patient. I think we will be very excited if we have great models and scientists are using them. But you really want to make sure that it's accelerating clinical impact. That's the goal. Right? The AI models are a very challenging milestone that we are working very hard on, and we will get there. But how do you actually take those models and apply them to change the way people live? There are two variants that I think about in the application of these models. Why are they important? One is that each of our genetics is incredibly diverse and different. First of all, all four of us are unique people, but we also have things that are known indicators of disease and unknown indicators of disease. And I actually find the variants of unknown significance to be the most interesting and the most frustrating. Say someone that you love is sort of a diagnostic mystery. They need to go in and look at the genetics. Most likely they'll come back and say, there are these three things that are not usual, but we also don't know why. And you're like, okay, should I panic? Should I not panic? What do I do now? And what you really want to do, and I think these models will be able to do, is look at those variants and actually model out what the impact is in the different cells, how it influences cellular behavior, and whether or not that is tied to a pathway to disease. That's a big deal. And I think we should be doing that. That is actually the future of medicine, where we think about each person's biology based on their genetics and their exposures, and how that predisposes them or not to disease. That's huge. And we want to be able to see that clinical application, but we can't. It's too expensive, too hard, impossible to model each person in the lab. But if we can build models around this, it is possible.
And then we can start thinking with extreme precision. And I'm not just talking about rare disease. There are common diseases. I'll just say depression right now. It's empirical, right? We just say, you're depressed. Let's try this antidepressant. And it's usually the one that the doctor's more familiar with or maybe one that you've heard of. But, like, and then you have to try it for months before it's like, did it work? Did it not work?
40:59
Months, yes, that's the cycle. I don't have familiarity with this. It's horrible.
43:28
And meanwhile, if it doesn't work, it means the person's suffering. And this applies to, like, almost every disease. Right. There has to be some biological explanation as to why some medications work and don't. So can we actually then look at each patient and say, based on who you are, we think this medication is going to work best for you. That's the future I want to live in, where we can actually understand individuals as individuals and use the biology and science very directly to keep them well, yeah.
43:32
So if there's a name for this tool that has the clinical impact that is on the scale of the electron, how do you envision it? I guess I feel like it's almost going to be the CZI app, I guess.
44:04
Oh, well, it won't be. First of all, that's not what we're building right now. We're building the basics; we're understanding cells and molecules. So I'm painting a picture; someone else will do it. We need partnerships. You asked about the ecosystem before: there are experts all along this pathway. We are at the fundamental research side, and you need to be able to partner with folks to bring this all the way through to impact. People call it different things, but essentially you want to get to medicine that is truly precision medicine. It's N of 1. We're understanding you and designing therapeutics for you.
44:19
Yeah, I like the mission of Rare As One as well. That's a great framing.
44:58
Do you feel like that's possible, like almost treating the body as like a compiler? It's like, because I know exactly what it looks like, I know exactly what's going to happen. Or is the body just like there's too many outside inputs and over time it kind of deviates from what you have?
45:02
Well, I think we'll see how far we can get, but I'm pretty optimistic that we'll be able to make a bunch of progress. And, yeah, what format does this take? Technologically, I would imagine you're taking these different types of virtual cell models and eventually merging them into the equivalent of a biological omni model. Kind of like how on the language model side you had people who did language and then people who did different kinds of media models and perception and all that, and then eventually you just kind of merged that, and you aim to get positive transfer by merging it. So that way it's not just combining capabilities, but getting everything else to be stronger. So, yeah, I mean, technologically, I think that's basically what it looks like: over whatever it is, a five or ten year period, we're building up a series of biohub models that increasingly get all these different dimensions of data and capabilities that can be used to help run individual science experiments and potentially eventually help with finding individual therapies for patients. Although we're going to be less on the clinical side, we're going to be more on the scientific tool development side. And the main tool, if you will, is these biohub virtual cell models.
45:17
I would say five years ago, without the large language models supporting this, I don't think it would have been possible, because biology is incredibly complex. What we're essentially trying to do is move it from a discovery-based science, where you kind of get lucky, you kind of get clever, and you sort of figure out a hack to learn something new, to something closer to an engineering problem: this is how the system works, and when this breaks, what happens to the rest of the system? But like you said, there are far too many dimensions for us to hold in our brains. That's why we're so excited about this intersection at this moment, because it is possible to consider so many more dimensions, matching the complexity of biology.
46:34
What is the role of the doctor in that future? Right. If you can predict everything out and then if you take personal superintelligence seriously, do you kind of distribute some of the diagnosis and all of that work or how do you envision that?
47:22
I've been thinking about this a lot. One thing is that the model's not going to take you all the way. You're still going to need to really look at individual clinical situations, and the doctor is going to be a form of data input into the model. Right. So there's some judgment that comes into play, but there are already a lot of models that make doctors really good at what they do. For instance, looking at your skin: AI is really, really good at detecting lesions in your skin that are concerning. It's excellent. Retinal issues: it is excellent. So the AI modeling and mapping is really, really good. It's already happening. So I think about what future doctors should be trained to do. And I really think it's care and compassion and walking patients through understanding. I think understanding why leads to trust in both the science and the clinical pathway. And really walking alongside patients on that journey was the original calling of physicians: to be healers and to use great tools to heal patients.
47:37
Well, so bedside manner, ultimately.
48:52
I mean, I also think you can zoom out, though, from the role of a doctor to. I think everyone wants the health system to be more proactive and less reactive. Right. So today it's like you show up when you're sick and then you have someone treat you or understand what's going on. I think the goal with a lot of these systems is to be much more proactive about this. So when we say that the vision is to try to help scientists cure and prevent all diseases, it doesn't mean that there's going to be no bacteria in the world and no one ever starts to get an infection. It's just that, all right, ideally you can kind of understand all of that really early. Right. Similarly, if someone gets a mutation, it looks like it might become cancerous, then you can just treat it a lot better if you know that early, rather than showing up to a doctor when it's already metastasized and you have a bunch of issues on that. So, I don't know. I think that there are going to be a lot of opportunities to fundamentally improve the healthcare system overall. But I agree with everything that you said on this. And I also just think that when we say that we think it's going to be possible to prevent and cure all diseases, it's not literally that no one ever gets the beginning of a sickness. It's just that it kind of can be managed in a way where everything is sort of manageable.
48:55
I think we discover more diseases the longer we live. Is it possible to not die? Obviously, that's a meme that's coming to fruition. If you theoretically cure all diseases, maybe death is a disease.
50:16
Mark just said we had extreme alignment, which I love. Thank you, honey.
50:30
This is one that we don't necessarily.
50:36
This is one that I'm not sure we have extreme alignment on. I, in fact, just haven't thought about this one very much because I think there is so much.
50:37
There's other things to do.
50:45
There's so much to do in terms of, like, you know, I'm a pediatrician, I think about babies and, like, very sad things happen to very small people. And, like, I think a lot about that. And how do we, like, maximize life quality and the things that harm small people. I'm biased and I haven't thought as much on the other end of the spectrum, but I don't know. I'm 40, maybe I should, but I feel like I can still focus on the little ones.
50:47
I think the strategy is the same it's like we're basically choosing to not focus on any specific disease and verticalize. Our strategy is one of trying to accelerate scientific progress overall. And I think that there are a lot of people who are going to focus on each of these individual things. So I don't know, but we don't.
51:14
Have to because that's not our strategy. Our strategy is to make sure that we have tools that make people do the best science possible out there.
51:33
I'll put to you that because aging and environments and mutations are so diverse, you have a high concentration of grouping in the early years and it should have more diversity in terms of the cell types and the problems that you face in the later years. And, and so there might be some imbalance in terms of where all these things happen, but I'm not pitching it any particular direction.
51:40
No, I think it's clear if you look at the trend over the last, I don't know what it is, 100 years. I mean, there was this flip, if you pay attention to the history of science, where it changed to a hypothesis-driven scientific method: we're going to run tests and have controlled experiments. And since that happened, the average life expectancy has basically increased by, I think it's about a quarter of a year every year over the last hundred years. Now a lot of that, like Priscilla said, is basically making it so that a lot of people don't die young. So far it's had somewhat less of an impact on extending the maximum human lifespan. Although the oldest people today, I do think, are in general older than the oldest people, you know, 20 or 30 or 40 years ago. But there's been a little bit less of an increase there and more just making it so that people don't suffer and die prematurely from things. But I mean, I think there are other things that you want to focus on here too. It's not just how long you live, it's the quality of the life while you're, you know. So I think you can live a full life and have that be high quality, or you can get sick in different ways that add up over time. And I think there are lots of different ways to improve. There are all these different analogies that you could throw at this, but I think there's just a lot of room to improve here.
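As a quick back-of-the-envelope check on that figure, the arithmetic below takes the speaker's two rough numbers (about a quarter of a year gained per calendar year, over about a hundred years) at face value. Both inputs are the approximations quoted in the conversation, not measured statistics, and the variable names are just for illustration.

```python
# Rough figures as quoted in the conversation (approximations, not data).
gain_per_year = 0.25   # years of life expectancy added per calendar year
span_years = 100       # length of the period being described

# A steady gain compounds linearly: rate times duration.
total_gain = gain_per_year * span_years

print(total_gain)  # 25.0
```

So the claim implies roughly 25 additional years of average life expectancy over the century, most of which, as noted above, comes from fewer people dying young rather than from a longer maximum lifespan.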
52:06
And then the other element I wanted to come back to is on the engineering side: when you're presented with a high-dimensionality problem, you want to reduce things into little boxes that you can manipulate at a higher abstraction. And that's something I tried to do with the folks outside. And we really struggled, because over here you're imaging on the atomic level, and then you're also worrying about proteins, and then you're also trying to build a cell model. Is every abstraction leaky? Where are the boxes I can move around and not worry about? My physics analogy is that in the regular world you don't have to worry about quantum physics, but here we kind of do.
53:33
I think you want to build it up a little bit hierarchically. When you're trying to understand proteins, understanding molecules makes a big difference. At some level you can kind of just look at correlations in cells. But if you want to really have the most accurate model and if you want to be able to reason about things, then you probably also want to understand proteins well. And then I think that kind of extends. But yeah, I mean, that's part of the interesting challenge of this: it's not just one resolution that you're looking at. I think in order to do it well, yeah, I mean, you have some amount of abstraction, but I think you want the models, just like language models or I think how our brains work, to basically build up different levels of abstraction and pattern matching. And that applies here too. And you basically just need to have some basic excellence and understanding at each of these different levels.
54:07
It's weird the number of levels at which you have to telescope up and down. It's mind boggling. And I think when people say dimensions they typically mean orthogonal dimensions. But here it's sort of like nested.
54:57
Just different scales that are oddly different disciplines to understand each specific scale. And it's in a way that the people who are good at understanding one scale are like, I've never spoken to.
55:10
People at the next scale.
55:22
Yeah.
55:23
Physics is there, chemistry is here, bio is there. It's nice to hear about it. But when you see it and you meet the people, you're like, oh, this is real. And they are actually working together.
55:25
And then there's this goal of the virtual immune system that you're working towards. I would love for you to chat about that. And also if that happens, what should other people build? So there's obviously CRISPR and some of that technologies that people should maybe ramp throughput for. How do you think about the future?
55:36
The virtual immune system, I think of as a subset of the generalized model we'll eventually get to. But the virtual immune system is super interesting for a couple of reasons. One, it's individual cells interacting with each other. There are a number of cells that we don't even fully understand what they do: B cells, T cells, NK cells. And so we can use our current technologies to understand these cells at a more granular level. So that's cool from a biology standpoint, but the clinical impact of understanding the immune system is huge, because biology, it turns out, has already given us a way to keep the body healthy. And it also sometimes goes awry and causes disease, with autoimmune disease. Right. And so it's a very complex system that has to stay in balance. And if it goes out of balance in either direction, you get sick. It's also a privileged system that is mobile and can go into places like your brain, your pancreas, your heart to either do maintenance or collect signal. It's built in. So if we can understand this system, we can use it to keep people healthy. We already kind of do. There are CAR T cells, where we reprogram T cells to go in and fight cancer. And at our New York biohub, we're doing cellular engineering to say, hey, can you go into this person's heart, check if they have plaques that are causing problems, read it into your DNA, self-lyse, and then we can read out the signal as cell-free DNA and get a binary answer, yes or no. Then we can put in other engineered cells, and imagine you go in and clear out the plaques using engineered immune cells that are your own. That is incredible. That is a tool that is realistic too. I know it sounds sci-fi. It is realistic, it is happening. And then on the other end, understanding the balance: there are so many autoimmune diseases. MS, lupus, those are examples of ones we know.
I think there are other things that are autoimmune that we don't understand. Like dementia: autoimmunity can play a large role in that. And so if we can understand the fine balance that the system needs to be kept in, then we can actually impact a lot of the ways the human body is maintained. So I think it's both interesting from a biology perspective and feasible to model, and probably one of the highest-impact systems right now if we can learn how to manipulate it.
55:53
Amazing.
58:30
But it's only one system, right? I mean it's, I think it's like the.
58:31
So it's a subset.
58:33
If you're focused on curing and preventing diseases, the immune system is a pretty important one. And I think it's also interesting and unique in a lot of the ways that you said. But there are lots of other parts of the body to understand too.
58:34
And I think we're running out of time, so we have two questions to close. One, again: 100 years, maybe it's too long, right? What would it take to do it in 50, in 25? And to make those happen, what should other people build to support your work?
58:46
I mean, I think a lot of this is going to end up coming down to how far a lot of these AI methods get.
59:01
Right.
59:06
I think that there's like, people have. There's just this constant ongoing debate around what are the time frames for getting to very strong AI. And I think if you get that, then I think it's pretty optimistic that with the right investments in frontier biology, you should be able to get these systems that can allow you to have virtual cells that allow you to do the kind of precision treatments and preventative care that can achieve this kind of mission significantly sooner. But at the end of the day, I think a lot of that time frame will probably come down to the AI timeframe. There's obviously a ton of stuff to do in biology, but it's not. I mean, I think that what should other people do? I mean, other people doing more frontier biology and helping to collect this type of data and solve these problems is super helpful to that too. It doesn't automatically happen, but I guess if we're predicting whether it's going to take 10 or 20 or 40 years, that is probably more a function of the pace of AI development than it is a pace of the pure biology side.
59:06
Yeah, I was going to agree with you. I think we're on a path to get a lot of important biological data through advances in laboratory technique, but it's not a given. And there are different groups that are expert at this all across the nation and across the world. And so we need to continue to push the research and the methodologies. And I want to say that, you know, the cell atlas was not glamorous work. People were not going to get their tenure-track paper by analyzing the hundred-and-twenty-millionth cell. That is just not it. Right. And so rethinking the way that this work gets done collaboratively, doing big things together in science, that's what is going to need to happen to get the knowledge we need to build models that give us this type of insight.
1:00:11
I guess one thought on the type of biology that I think should get done is that there is a certain orientation around choosing problems that will help generate data that can help make the models a lot smarter. You do that when you are very optimistic about the pace of progress and what AI is going to enable. Because the classic reason that scientists generated data sets is so that they could basically look through the data sets to make advances. So it is a little bit of an inversion in the thinking, which is like, I'm now going to do this so I can help train this other thing to be better and create more advances. And I think in a world where you really believe that there's going to be very significant AI progress, more frontier biology should be done in that way. But these data sets aren't going to get created by themselves. There's a lot of work that needs to get done and a lot of investment there. And at some level, you could probably have the smartest AI model in the world, but if it doesn't actually have the data to understand this stuff, it's like, okay, you can't just reason from first principles about all these things. I mean, a lot of human knowledge comes empirically, not from first-principles reasoning. This is kind of the whole biohub network idea that we're building. And I've been really happy to see other folks, especially a lot of people in technology, who I think have this orientation too. They believe a lot in AI. They believe in technological progress. They've generated some significant wealth building their companies, and now they're investing in science research. And I think that's great. And I think doing it in this way, where you're building up these networks to basically build specific tools that generate data that make the models better, is one approach.
It's not that all science should go in that direction, but it's one of the things that I'm quite optimistic about that I think is going to make a very big difference. Cool.
1:01:05
That's probably all the time we have, but I'll just leave it to you guys for any calls to action. Anything that you want biologists or engineers to check out.
1:03:01
I mean, check out the models. Check out the models.
1:03:10
The tooling.
1:03:12
Yeah. I mean, they're early, but I think it's kind of an interesting sense of where things are going and we'd love feedback on it and it'll kind of just help this feedback loop of, like, what we should build next.
1:03:13
Yeah, I would say let's do this together. We need lots of people coming together to do this work.
1:03:28
Well, thank you for organizing it and solving and curing all diseases.
1:03:39
Trying to help others do it. All right, thank you.
1:03:39
Thank you.
1:03:42
If you're finding value in the show, we'd appreciate it if you'd take a moment to share it with friends, post online, write a review on Apple Podcasts or Spotify, or just leave us a comment on YouTube. Of course, we always welcome your feedback, guest and topic suggestions, and sponsorship inquiries, either via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. The Cognitive Revolution is part of the Turpentine Network, a network of podcasts, which is now part of a16z, where experts talk technology, business, economics, geopolitics, culture, and more. We're produced by AI Podcasting. If you're looking for podcast production help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at aipodcast.ing. And thank you to everyone who listens for being part of the Cognitive Revolution.
1:03:53