Latent Space: The AI Engineer Podcast

🔬Beyond AlphaFold: How Boltz is Open-Sourcing the Future of Drug Discovery

81 min
Feb 12, 2026
Summary

Gabriele Corso and Jeremy Wohlwend from Boltz discuss their journey from AlphaFold 2's breakthrough in protein folding to creating open-source alternatives like Boltz-1 and building a commercial platform for drug discovery. They cover the technical evolution from structure prediction to protein design, the importance of experimental validation, and their mission to democratize access to AI-powered molecular design tools.

Insights
  • Open-source AI models in biology can build thriving communities that accelerate research while still supporting viable commercial products through infrastructure and user experience layers
  • Experimental validation across diverse targets and labs is crucial for establishing credibility in computational biology, requiring significant coordination and partnership efforts
  • The transition from structure prediction to molecular design represents a shift from problems with evolutionary hints to truly novel design challenges requiring different validation approaches
  • Inference-time scaling and ranking models are becoming critical for improving molecular design results, similar to trends in other AI domains
  • Specialized architectures still outperform general transformers in structural biology despite the 'bitter lesson', due to domain-specific physics and geometric constraints
Trends
  • Shift from regression to generative modeling in protein structure prediction
  • Inference-time scaling becoming critical for molecular design quality
  • Open-source foundation models enabling commercial platform businesses
  • Integration of multiple AI models into agentic workflows for drug discovery
  • Experimental validation becoming a competitive differentiator in AI biology
  • Democratization of advanced molecular design tools beyond large pharma
  • Community-driven development accelerating AI biology research
  • Specialized GPU infrastructure becoming essential for molecular screening
  • Collaborative interfaces enabling medicinal chemist adoption of AI tools
  • Expansion from single-chain proteins to complex molecular interactions
Companies
DeepMind
Created AlphaFold 2 and 3, the breakthrough protein folding models that revolutionized structural biology
Boltz
Public benefit company founded by the guests to democratize AI-powered molecular design tools
Isomorphic Labs
DeepMind spinoff that kept AlphaFold 3 proprietary for commercial drug discovery applications
MIT
Academic institution where both guests completed their PhDs and conducted foundational biology AI research
Genesis
Provided computational resources to help complete Boltz-1 model training when academic compute was insufficient
Adaptive Biotechnologies
CRO partner that conducted experimental validation testing for Boltz models across multiple targets
Harvard University
Collaborated through Nick Polizzi's group on developing better benchmarks for protein-small molecule interactions
People
Gabriele Corso
Co-founder of Boltz, MIT PhD graduate who transitioned from theoretical ML to structural biology after AlphaFold
Jeremy Wohlwend
Co-founder of Boltz, MIT PhD graduate focused on generative biology and molecular design validation
Hannes Stark
Boltz team member who developed the innovative atomic encoding approach for simultaneous structure and sequence prediction
Sergey Ovchinnikov
MIT researcher who provided insights into AlphaFold's pairwise architecture and contact prediction mechanisms
Nick Polizzi
Harvard researcher who collaborated on developing improved benchmarks for protein-small molecule binding prediction
Andrew White
Researcher mentioned for discussing the extensive computational efforts that preceded AlphaFold's breakthrough
Tim O'Donnell
Community member who proposed innovative inference-time search techniques for improving antibody-antigen predictions
Devon
CEO of Genesis who provided crucial computational resources to complete Boltz-1 model training
Quotes
"Actually we only trained the big model once. That's how much compute we had. We could only train it once."
Jeremy Wohlwend
"It's impossible to reproduce now. Yeah, yeah, no, that model has gone through such a curriculum that, you know, it's learned some weird stuff. But yeah, somehow, a miracle, it worked out."
Jeremy Wohlwend
"When we say that we design new proteins or we say that we design new molecules, go and bind these particular targets, we should be very clear these are not drugs, these are not things that are ready to be put into a human."
Gabriele Corso
"I think at the end of the day, for people to be convinced, you have to show them something that they didn't think was possible."
Jeremy Wohlwend
"The great thing about kind of structure prediction is that, a bit like CASP was doing, basically the way that you can evaluate these models is that you train the model on structures that were released, across the field, up until a certain time."
Gabriele Corso
Full Transcript
4 Speakers
Speaker A

Actually we only trained the big model once. That's how much compute we had. We could only train it once. And so like while the model was training, we were like finding bugs left and right, a lot of them that I wrote. And like I would. I remember like us like sort of like, you know, doing like surgery in the middle, like stopping the run, making the fix, like relaunching and yeah, we never actually went back to the start. We just like kept training it with like the bug fixes along the way, which was.

0:00

Speaker B

So it's impossible to reproduce now.

0:26

Speaker A

Yeah, yeah, no, that model is like, has gone through such a curriculum that, you know, it's learned some weird stuff. But yeah, somehow a miracle it worked out.

0:29

Speaker C

It's a pleasure to have with us today Gabriele Corso and Jeremy Wohlwend. They recently founded Boltz, a company trying to democratize and bring state-of-the-art structure prediction in biology to, you know, the masses. They were both recent PhD grads from MIT and have been working on all sorts of foundational papers in, like, generative biology. Anyway, pleasure to have you here. Thanks for coming.

0:38

Speaker A

Thank you.

1:04

Speaker C

I guess we're maybe what, six years post AlphaFold 2 right now, which was kind of a big moment, is that right?

1:06

Speaker A

I think. Was it 2021? So, yeah, going on five years.

1:13

Speaker D

Five years.

1:18

Speaker C

Five years, yeah. So maybe for the audience, let's go back to that moment in time and explain: what was this big moment, and why was it interesting? Why was everyone so excited? And I think you two were probably quite excited. So why were you personally excited?

1:18

Speaker D

I would start on kind of why that was interesting from a scientific standpoint. So maybe first, as an introduction for the ones in the audience who are not structural biologists: the idea of structural biology is that we want to try to understand how proteins and other molecules take shape inside our cells and how they interact. And structural biology is sort of this beautiful discipline where we are somehow able to understand these minuscule structures at kind of atomic detail using incredibly complex methods like X-ray crystallography. And the dream of computational biology has always been: can we understand the structures without having to grow these crystals, shoot X-rays, and so on? And so AlphaFold was a real breakthrough in this problem of protein folding, which is trying to understand the structure of a single protein. And to me it was exciting across many dimensions. One, I was a computer scientist, I was working a lot on machine learning, and I saw the impact that work somewhat similar to what I was doing could have on a long-standing scientific problem. And on the second perspective, from a more personal side, seeing the structures coming out of these models, where you see this beautiful creation of life, is something that was very inspiring to me. And so that was one of the things that led me to start working on structural biology, and in particular with machine learning.

1:33

Speaker C

Were you a structural biologist before AlphaFold came out? I mean, you did machine learning, but it was not in structural biology. So that actually shifted your career quite dramatically.

3:26

Speaker D

Yeah, very dramatically. I was working on some pretty theoretical, methodological things, and I was starting to see some of the challenges in doing somewhat theoretical or methodological work, and seeing the potential impact of applied work. AlphaFold was really a machine learning breakthrough, but an applied machine learning one. And so that led me to want to start working on applications.

3:36

Speaker A

Our group at the time was working a lot on small molecules already. And I think AlphaFold is kind of what triggered this shift to working on biologics. And at the time, I think it opened as many questions as it answered. In a sense, the immediate follow-ups were: okay, can we do this on other things than proteins? Can we do interactions of small molecules with proteins, nucleic acids with proteins? Can we model more complex protein systems? And I think very rapidly after AlphaFold, people realized that machine learning could really target this problem very differently than previous methodologies.

4:06

Speaker C

Going back to the AlphaFold 2 moment, I remember this very well. I was at NeurIPS when, I guess, the results of this famous competition came out. So why don't we talk about CASP and what it is and why it was so interesting and exciting.

4:48

Speaker A

I think every couple of years, the goal has always been to find protein structures that are a little bit different from what's known. So CASP over the years has put in a lot of effort to gather structures from academic groups and even industry groups to try to create sort of a test set that would be difficult for different methods. And CASP14 was when AlphaFold 2 really blew everything out of the water. The improvement was so large over the previous methods and also over the previous competitions. And now CASP continues. We've had CASP15, we've had CASP16. And what's happened now is that it's really expanding to also all these other modalities, like I was mentioning: protein with small molecules, nucleic acids. But the goal remains to really challenge the models: how well do these models generalize? And we've seen in some of the latest CASP competitions that, while we've become really, really good at proteins, especially monomeric proteins, other modalities still remain pretty difficult. So it's really essential in the field that there are these efforts to gather benchmarks that are challenging. It keeps us honest about what the models can do or not.

5:05

Speaker C

It's interesting you say that. In some sense, at CASP14, a problem was solved, and pretty comprehensively, right? But at the same time, it was really only the beginning. So can you explain what was the specific problem you would argue was solved, and then what is remaining, which is probably quite open.

6:27

Speaker A

I think we'll steer away from the term solved because we have many friends in the community who get pretty upset at that word, and I think fairly so. But the problem that a lot of progress was made on was the ability to predict the structure of single-chain proteins. So proteins can be composed of many chains, and single-chain proteins are just a single sequence of amino acids. And one of the reasons that we've been able to make such progress is also because we take a lot of hints from evolution. So the way the models work is that they sort of decode a lot of hints that come from evolutionary landscapes. So if you have some protein in an animal and you go find the similar protein across different organisms, you might find different mutations in them. And as it turns out, if you take a lot of these sequences together and you analyze them, you see that some positions in the sequence tend to evolve at the same time as other positions in the sequence, sort of this correlation between different positions. And it turns out that that is typically a hint that these two positions are close in three dimensions. So part of the breakthrough has been our ability to decode that very, very effectively. What it implies also is that in the absence of that co-evolutionary landscape, the models don't quite perform as well. And so I think when that information is available, maybe one could say the problem is somewhat solved from the perspective of structure prediction. When it isn't, it's much more challenging. And I think it's also worth differentiating, because sometimes we confound them a little bit: structure prediction and folding. Folding is the more complex process of actually understanding how it goes from this disordered state into a structured state. And I don't think we've made that much progress on that. But the idea of, like, yeah, going straight to the answer, we've become pretty good at.

6:49

Speaker B

So there's this protein that is like just a long chain and it folds up, and so we're good at getting from that long chain, in whatever form it was originally, to the thing. But we don't know how it necessarily gets to that state. And there might be intermediate states that it's in sometimes that we're not aware of.

8:49

Speaker A

That's right. And that relates also to our general ability to model dynamics. Proteins are not static. They move, they take different shapes based on their energy states. And I think we are also not that good at understanding the different states that a protein can be in, and at what frequency, what probability. So I think the two problems are quite related in some ways. Still a lot to solve. But I think it was very surprising at the time, you know, that even with these evolutionary hints we were able to make such dramatic progress.

9:10

Speaker B

So I want to ask, why does the intermediate states matter? But first I kind of want to understand why do we care what proteins are shaped like?

9:45

Speaker D

Yeah, I mean, proteins are kind of the machines of our body. The way that all the processes in our cells work is typically through proteins, sometimes other molecules, through intermediate interactions. And through those interactions we have all sorts of cell functions. And so when we try to understand a lot of biology, how our body works, how diseases work, we often try to boil it down to: okay, what is going on in the case of normal biological function, and what is going wrong in the disease state? And we boil it down to proteins and other molecules and their interactions. And so when we predict the structure of proteins, it's critical to have an understanding of those interactions. It's a bit like the difference between having a list of parts that you would put in a car and seeing the car in its final form. Seeing the car really helps you understand what it does. On the other hand, going to your question of why we care about how the protein folds, or how the car is made, to some extent: sometimes something goes wrong. There are cases of proteins misfolding in some diseases and so on. If we don't understand this folding process, we don't really know how to intervene.

9:55

Speaker A

There's this nice line, I think it's in the AlphaFold 2 manuscript, where they discuss why we were even hopeful that we could target the problem in the first place. And there's this notion that for proteins that fold, the folding process is almost instantaneous, which is a strong signal that, yeah, we might be able to predict this very constrained thing that the protein does so quickly. And of course, that's not the case for all proteins, and there are a lot of really interesting mechanisms in the cells. But yeah, I remember reading that and thought, yeah, that's somewhat of an insightful point.

11:30

Speaker D

I think one of the interesting things about the protein folding problem is that it used to be studied, and this is part of the reason why people thought it was impossible, as kind of a classical example of an NP problem. There are so many different types of shapes that these amino acids could take, and this grows combinatorially with the size of the sequence. And so there used to be a lot of more theoretical computer science thinking about and studying protein folding as an NP problem. And so it was very surprising, also from that perspective, seeing machine learning succeed. So clearly there is some signal in those sequences, through evolution, but also through other things that we as humans are probably not really able to understand, but that these models have learned.

12:10

Speaker B

So Andrew White, we were talking to him a few weeks ago, and he said that he was following the development of this, and that there were actually ASICs that were developed just to solve this problem, and that there were many, many millions of compute hours spent trying to solve this problem before AlphaFold. And just to be clear, one thing that you mentioned was that there's this kind of co-evolution of mutations, and that you see this again and again in different species. So explain: why does that give us a good hint that they're close by to each other?

13:07

Speaker A

Yeah, like think of it this way, that, you know, if I have some amino acid that mutates, it's going to impact everything around it. Right. In three dimensions. And so it's almost like the protein, through several probably random mutations in evolution ends up sort of figuring out that this other amino acid needs to change as well for the structure to be conserved. So this whole principle is that the structure is probably largely conserved because there's this function associated with it. And so it's really sort of like different positions compensating for each other.

13:40
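The co-evolution signal described above can be sketched numerically: in a toy multiple sequence alignment, positions that mutate together carry high mutual information, while a fully conserved column carries none. Everything here, the tiny alignment and the scoring function, is an illustrative assumption, not the actual feature pipeline of AlphaFold or Boltz.

```python
import numpy as np

def mutual_information(msa, i, j):
    """Mutual information between columns i and j of an MSA.

    msa: list of equal-length sequences (strings of amino-acid letters).
    High MI between two columns is the classic hint that the two
    positions co-evolve and are therefore likely close in 3D.
    """
    n = len(msa)
    pi, pj, pij = {}, {}, {}
    for seq in msa:
        a, b = seq[i], seq[j]
        pi[a] = pi.get(a, 0) + 1 / n          # marginal of column i
        pj[b] = pj.get(b, 0) + 1 / n          # marginal of column j
        pij[(a, b)] = pij.get((a, b), 0) + 1 / n  # joint distribution
    return sum(p * np.log(p / (pi[a] * pj[b])) for (a, b), p in pij.items())

# Toy alignment: columns 0 and 2 always mutate together (A pairs with L,
# G pairs with V), while column 1 is perfectly conserved.
msa = ["AKL", "GKV", "AKL", "GKV", "AKL", "GKV"]

coupled = mutual_information(msa, 0, 2)     # high: co-evolving pair
uncoupled = mutual_information(msa, 0, 1)   # ~0: conserved column
print(coupled > uncoupled)  # True
```

Real contact-prediction methods use corrected statistics (e.g. direct-coupling analysis) rather than raw mutual information, but the underlying signal is the same.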

Speaker B

I see. Those hints in aggregate give us a lot of information about what is close to each other. And then you can start to look at what kinds of folds are possible given the structure, and then what is the end state, and therefore you can make a lot of inferences about what the actual total shape is.

14:16

Speaker A

Yeah, that's right. It's almost like you have this big three-dimensional valley where you're sort of trying to find these low-energy states, and there's so much to search through that it's almost overwhelming. But these hints, they sort of maybe put you in an area of the space that's already kind of close to the solution, maybe not quite there yet. And there's always this question of how much physics are these models learning versus just pure statistics. And I think one of the things, at least I believe, is that once you're in that approximate area of the solution space, then the models have some understanding of how to get you to the low-energy state. And so maybe you have some light understanding of physics, but maybe not quite enough to know how to navigate the whole space well. So we need to give it these hints to get it into the right.

14:34

Speaker B

Valley and then it finds the minimum or something.

15:28

Speaker D

One interesting explanation of how AlphaFold works, which I think is quite insightful, though of course it doesn't cover the entirety of what AlphaFold does, is one I'm going to borrow from Sergey Ovchinnikov at MIT. The interesting thing about AlphaFold is it's got this very peculiar architecture that we have since reused, and this architecture operates on this pairwise context between amino acids. And so the idea is that probably the MSA gives you this first hint about what potential amino acids are close to each other.

15:31

Speaker B

MSA is multiple sequence alignment.

16:06

Speaker D

Exactly, this evolutionary information. And from this evolutionary information about potential contacts, it's almost as if the model is sort of running some kind of Dijkstra-like algorithm where it's sort of decoding: okay, these have to be close; okay, then if these are close and this is connected to this, then this has to be somewhat close. And so you decode this, and that becomes basically a pairwise distance matrix. And then from this rough pairwise distance matrix, you decode the actual potential structure.

16:09
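The decoding intuition in that turn, if this pair is close and that pair is close, then the third pair cannot be far, can be sketched as triangle-inequality propagation over distance upper bounds. This is a toy illustration of the idea, not the model's actual mechanism; the 8 Å contact threshold is an arbitrary assumption.

```python
import numpy as np

def propagate_bounds(n, contacts, contact_dist=8.0, far=1e9):
    """Tighten pairwise distance upper bounds from predicted contacts.

    Each contact (i, j) is treated as an upper bound of `contact_dist`
    on the residue-residue distance; a Floyd-Warshall pass then applies
    the triangle inequality d(i,k) <= d(i,j) + d(j,k) to derive bounds
    between residues that were never directly predicted to be in contact.
    """
    bounds = np.full((n, n), far)
    np.fill_diagonal(bounds, 0.0)
    for i, j in contacts:
        bounds[i, j] = bounds[j, i] = contact_dist
    for j in range(n):
        # broadcast: column j + row j gives all paths through residue j
        bounds = np.minimum(bounds, bounds[:, [j]] + bounds[[j], :])
    return bounds

# Contacts 0-1 and 1-2 imply residues 0 and 2 lie within 16 Å of each
# other, even with no direct 0-2 contact; residue 3 stays unconstrained.
b = propagate_bounds(4, [(0, 1), (1, 2)])
print(b[0, 2])  # 16.0
```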

Speaker B

Interesting. So there's kind of two different things going on in the kind of coarse grain, and then the fine grain optimization is interesting. Yeah, very cool.

16:44

Speaker C

Yeah, you mentioned AlphaFold 3, so maybe it's a good time to move on to that. So AlphaFold 2 came out and it was, I think, fairly groundbreaking for this field. Everyone got very excited. A few years later, AlphaFold 3 came out. And maybe for some more history, what were the advancements in AlphaFold 3? And then I think maybe after that we'll talk a bit about how it connects to Boltz.

16:53

Speaker D

But anyway, yeah, so after AlphaFold 2 came out, Jeremy and I got into the field, and with many others, the clear problem that was obvious after that was: okay, now we can do individual chains. Can we do interactions? Interactions between different proteins, proteins with small molecules, proteins with other molecules.

17:16

Speaker C

And so why are interactions important?

17:38

Speaker D

Interactions are important because, to some extent, that's the way that these machines, these proteins, have a function. The function comes from the way they interact with other proteins and other molecules. Actually, in the first place, the individual machines are often, as Jeremy was mentioning, not made of a single chain but of multiple chains. And then these multiple chains interact with other molecules to give them their function. And on the other hand, when we try to intervene in these interactions, think about a disease, think about a biosensor, or many other settings, we are trying to design molecules or proteins that interact in a particular way with what we would call a target protein, or target. After AlphaFold 2, this became clearly one of the biggest problems in the field to solve, and many groups, including ours and others, started making contributions to this problem of trying to model these interactions. And AlphaFold 3 was a significant advancement on the problem of modeling interactions, and one of the interesting things they were able to do: while some of the rest of the field really tried to model different interactions separately, how proteins interact with small molecules, how proteins interact with proteins, how RNA or DNA take their structure, they put everything together and trained very large models with a lot of advances, including changing some of the key architectural choices, and managed to get a single model that set a new state-of-the-art performance across all of these different modalities. Whether that was protein with small molecules, which is critical to developing new drugs, protein with protein, or understanding interactions of proteins with RNA and DNA, and so on.

17:41

Speaker B

To satisfy the AI engineers in the audience. What were some of the key architectural and data changes that made that possible?

19:41

Speaker D

Yeah, so one critical one, which was not necessarily unique to AlphaFold 3 (there were actually a few other teams in the field, including ours, that proposed this), was moving from modeling structure prediction as a regression problem, where there is a single answer and you're trying to shoot for that answer, to a generative modeling problem, where you have a posterior distribution of possible structures and you're trying to sample from this distribution. And this achieves two things. One, it starts to allow us to model more dynamic systems. As we said, some of these systems can actually take multiple structures, and so you can now model that by modeling the entire distribution. But on the second hand, for more core modeling questions, when you move from a regression problem to a generative modeling problem, you are really tackling the way that you think about uncertainty in the model in a different way. So if the model is undecided between different answers, what's going to happen in a regression model is that it's going to make an average of those different answers it had in mind. When you have a generative model, what you're going to do is sample all these different answers and then maybe use separate models to analyze those different answers and pick out the best. So that was one of the critical improvements. The other improvement is that they significantly simplified the architecture, especially of the final model that takes those pairwise representations and turns them into an actual structure; it now looks a lot more like a traditional transformer than the very specialized equivariant architecture that was in AlphaFold 2.

19:49
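A minimal numeric sketch of the regression-versus-generative point above: with a bimodal posterior, a mean-squared-error regressor averages the two modes into a physically meaningless answer, while sampling plus a separate ranking score lands on a real mode. The one-dimensional "conformation" and the stand-in energy function are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a "conformation" is a single number, and the true posterior
# is bimodal: two equally likely states at -1 and +1.
samples_from_posterior = rng.choice([-1.0, 1.0], size=1000)

# A regression model trained with MSE converges to the posterior mean,
# which sits at ~0: between the modes, matching neither real state.
regression_answer = samples_from_posterior.mean()

# A generative model instead draws candidates from the posterior, and a
# separate scoring model ranks them. Here the scorer is a stand-in
# "energy" (lower is better) with minima at the two true states.
def energy(x):
    return min((x - 1.0) ** 2, (x + 1.0) ** 2)

candidates = samples_from_posterior[:20]
best = min(candidates, key=energy)
print(best)  # lands exactly on one of the modes, -1.0 or +1.0
```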

Speaker B

So this is a bitter lesson a little bit.

21:42

Speaker D

There is some aspect of a bitter lesson, but the interesting thing is that it's very far from being a simple transformer. This field is one of the arguably very few fields in applied machine learning where we still have architectures that are very specialized. And there are many people that have tried to replace these architectures with simple transformers. And there is a lot of debate in the field, but I think most of the consensus is that the performance that we get from the specialized architectures is vastly superior to what we get from a simple transformer. Another interesting thing, staying on the modeling and machine learning side, which I think is somewhat counterintuitive coming from some of the other fields and applications, is that scaling hasn't really worked the same in this field. Now, models like AlphaFold 2 and AlphaFold 3 are still very large models, but at the same time, in terms of parameters, they're actually not very big. They are definitely below a billion parameters. If you hear these days in the LLM space about a model with less than a billion parameters, you'd think it can't do anything. But on the other hand, when you look at the computational cost of running these models, they are actually a lot more expensive to run than language models, because, as Jeremy was saying, we go from quadratic operations to cubic operations. And so it's interesting how right now in the field, and this is maybe related to having less data or needing more inductive biases, we have this ratio of amount of computation to parameters that is much, much higher than in other places.

21:45
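The compute-to-parameter point can be made with back-of-envelope arithmetic: per layer, token-level attention scales roughly quadratically in sequence length L, while operations on an L x L pairwise representation (like triangle updates) scale cubically, so the same parameter budget buys far more flops. The numbers below are illustrative scaling factors, not measured costs.

```python
# Rough per-layer cost models (leading terms only, constants dropped):
# token attention mixes L tokens pairwise, pairwise/triangle ops mix
# L*L entries each against L others.

def attention_cost(L, d):
    """~L^2 * d: every token attends to every other token."""
    return L * L * d

def triangle_cost(L, d):
    """~L^3 * d: every (i, j) pair is updated via all intermediates k."""
    return L * L * L * d

L, d = 512, 128  # illustrative sequence length and channel width
ratio = triangle_cost(L, d) / attention_cost(L, d)
print(ratio)  # 512.0: the pairwise stack costs L times more per layer
```

This is why a sub-billion-parameter structure model can be more expensive to run than a much larger language model.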

Speaker C

If I recall, AlphaFold2 is like what, 70 million parameters? Something like that.

23:36

Speaker A

Yeah, it's something like that. It's quite small, around 100 million or so. Yeah.

23:41

Speaker C

These decisions of triangle layers for AlphaFold 2, this interesting equivariant architecture, really were priors that baked in a lot of the physics of the system. And co-evolution data, I think people have argued, is almost like a database lookup of sorts. So that provides, in some sense, more parameters as well.

23:46

Speaker A

Yeah, I mean, definitely the amount of pure compute, the flops, is very high, and it's almost more reasoning-based, maybe, than just information extraction. I think part of the reason the LLMs are so large isn't just because of their reasoning capability, but also because of the sheer quantity of information that they store. I think here there's a little bit less of that, and it's more about decoding this input rather than memorizing as much of it.

24:09

Speaker B

So is there a loop in the architecture that allows it to compute more per parameter? How does that work?

24:39

Speaker D

Part of it is just this fact that instead of having operations that operate on the single chain, they operate on the pairwise representation. And so instead of having a quadratic number of interactions, you have a cubic number of interactions. And so that on its own leads you to have smaller representation sizes but more representations, which leads to more flops but fewer parameters. On the other hand, there is actually also this idea, somewhat similar to reasoning, where you recycle these operations, from AlphaFold 2 but also AlphaFold 3. They have this interesting framework where, as we were discussing, the input to the model is sort of this initial understanding of the interactions, either from the evolution of the multiple sequences, but also potentially from what we call templates, which are basically database lookups of similar structures. And so the way the model works is that it decodes these and tries to understand a good potential rough structure of the pairwise interactions. And then what you can do is basically do this recycling, where you feed this understanding back to the input of the model and then try to decode it again. And people do this three or four times, and in some cases have even tried to do it tens of times. And so you can see it as a very early version of reasoning.

24:46
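The recycling idea above can be sketched as control flow: run the trunk, feed its own output back in as part of the next input, and repeat a few times. The `trunk` below is a stand-in toy (it just moves the current estimate halfway toward a fixed target each pass), purely to show the iterative-refinement loop, not the real network.

```python
import numpy as np

# Pretend "true" pairwise distance map the toy trunk refines toward.
TARGET = np.array([[0.0, 6.0], [6.0, 0.0]])

def trunk(msa_hints, prev_pair):
    # A real trunk combines fresh inputs (MSA, templates) with the
    # recycled pairwise representation; this toy only refines the
    # recycled part, halving its error on each pass.
    return prev_pair + 0.5 * (TARGET - prev_pair)

def predict(msa_hints, n_recycles=4):
    pair = np.zeros_like(msa_hints)
    for _ in range(n_recycles):
        pair = trunk(msa_hints, pair)  # previous output re-enters as input
    return pair

hints = np.array([[0.0, 10.0], [10.0, 0.0]])
print(predict(hints, n_recycles=4)[0, 1])  # 5.625, approaching 6.0
```

Each extra recycle tightens the estimate, which is why running more recycles can be viewed as a primitive form of inference-time reasoning.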

Speaker C

Yeah, so AlphaFold 2, really cool. AlphaFold 3, really cool. But AlphaFold 3 came with a catch, and I think this catch was important for the development of Boltz and so on.

26:17

Speaker D

Yeah, the catch was that it was an amazing paper, a Nature paper, but unfortunately they decided not to release the model. AlphaFold 2 was open source and has since been used by, I think the reported number is, more than a million scientists. AlphaFold 3, for commercial reasons, DeepMind, which has since spun off Isomorphic Labs, which is now trying to become sort of a new pharmaceutical company, decided to keep internal and only use internally. And we were in the field and building on top of models like AlphaFold, and so now we no longer had the base starting point to build on top of. But even more importantly, everyone in both academic research and in industry no longer had access to these incredible models that were really useful for trying to understand biology, but also for trying to develop new therapeutics. We decided to take the matter into our own hands and try to obtain a model of similar accuracy. And so, largely using a lot of the information that was in the AlphaFold 3 manuscript, we went ahead and built Boltz-1, which was the first fully open-source model to approach the level of accuracy of AlphaFold 3. And along the way, and we can talk about it more, we realized that it was probably too ambitious to see this as an academic project, and there were a lot of things that were missing. And so we decided to also start a public benefit company to push this mission of democratizing access to these models that we started with Boltz-1.

26:30

Speaker C

Quick interjection. I mean, I remember this. It was actually shocking how fast you got Boltz-1 out. It was just like two or three months, right?

28:27

Speaker A

I think we started in late May and it came out in November, if I remember correctly, so slightly longer. But yeah, it was relatively quick. I mean, for what it's worth, we were working on some similar ideas at the time. I think, for example, this idea of having a diffusion model on top of this pairwise trunk was something that we were exploring independently. Now, when the paper came out, it was really clear, especially for example on the data pipelines, there was so much that we were not really doing. And so there was a lot to catch up on. But we were already in a place, I think, where we had some experience working with the data and working with these types of models. And that put us already in a good place to produce it quickly. And I would even say I think we could have done it quicker. The problem was for a while we didn't really have the compute, and so we couldn't really train the model. And actually we only trained the big model once. That's how much compute we had. We could only train it once. And so while the model was training, we were finding bugs left and right, a lot of them that I wrote. And I remember us sort of doing surgery in the middle, stopping the run, making the fix, relaunching. And yeah, we never actually went back to the start. We just kept training it with the bug fixes along the way, which was interesting.

28:36

Speaker B

Impossible to reproduce now.

30:02

Speaker A

Yeah, yeah, no, that model has gone through such a curriculum that, you know, it's learned some weird stuff. But somehow, by a miracle, it worked out.

30:04

Speaker D

The other funny thing is that we trained most of that model on a cluster from the Department of Energy, which is a shared cluster that many groups use. So we would basically train the model for two days, then go back into the queue and wait a week. It was pretty painful. Toward the end, I was talking with Devon, the CEO of Genesis, telling him a bit about the project and about this frustration with the compute. Luckily he offered to help, and so we got help from Genesis to finish up the model. Otherwise it probably would have taken another few weeks at least.

30:13

Speaker A

Yeah, yeah.

30:57

Speaker B

Boltz-1, how did that compare to AlphaFold 3? And then there's some progression from there.

30:58

Speaker D

Yeah, so I would say Boltz-1, and also the other set of models that came out around the same time, were a big leap from the previous open-source models, really approaching the level of AlphaFold 3. But I would still say that, even to this day, there are some specific instances where AlphaFold 3 works better. One common example is antibody-antigen prediction, where AlphaFold 3 still seems to have an edge in many situations. Obviously these are somewhat different models; you run them, you obtain different results. So it's not always the case that one model is better than the other, but in aggregate, especially at the time, AlphaFold 3 still had a bit of an edge.

31:07

Speaker B

We should talk about this more when we talk about BoltzGen. But how do you know one model is better than the other? I make a prediction, you make a prediction. How do you know?

32:03

Speaker D

Yeah. So that's the great thing about structure prediction; once we get into the design space of designing new small molecules and new proteins, this becomes a lot more complex. But the great thing about structure prediction is that, a bit like CASP was doing, the way you can evaluate models is to train on the structures that were released across the field up until a certain time. One of the things we haven't talked about that was really critical in all this development is the PDB, the Protein Data Bank. It's this common resource, a common database where every biologist publishes their structures. So we can train on all the structures that were deposited in the PDB until a certain date, and then look at recent structures and ask: which of these look pretty different from anything published before? Because we really want to understand generalization. And on those new structures, we evaluate all these different models.
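The temporal split described here can be sketched in a few lines. This is a hedged toy version, not Boltz's actual data pipeline: the entry fields, cutoff date, and "novelty" test (no shared sequence cluster with the training set) are all illustrative stand-ins.

```python
from datetime import date

def temporal_split(entries, cutoff, is_novel):
    """Split PDB-like entries into a train set (released before cutoff) and
    an eval set of post-cutoff structures unlike anything in train."""
    train = [e for e in entries if e["release_date"] < cutoff]
    recent = [e for e in entries if e["release_date"] >= cutoff]
    # Keep only recent structures dissimilar to everything in train,
    # so the benchmark measures generalization rather than memorization.
    test = [e for e in recent if is_novel(e, train)]
    return train, test

# Toy data: "novel" here means no shared sequence-cluster ID with train.
entries = [
    {"id": "1ABC", "release_date": date(2020, 1, 1), "cluster": 7},
    {"id": "2DEF", "release_date": date(2023, 6, 1), "cluster": 7},   # similar to train
    {"id": "3GHI", "release_date": date(2023, 6, 1), "cluster": 42},  # genuinely new
]

def no_shared_cluster(e, train):
    return all(e["cluster"] != t["cluster"] for t in train)

train, test = temporal_split(entries, date(2021, 9, 30), no_shared_cluster)
# train keeps 1ABC; 2DEF is recent but too similar; only 3GHI lands in test
```

The point of the novelty filter is exactly what's said above: without it, a recent structure that closely resembles training data would make memorization look like generalization.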

32:12

Speaker B

So you just need to know when AlphaFold 3 was trained, and you intentionally train to the same cutoff, or something like that.

33:17

Speaker D

Exactly.

33:23

Speaker B

Right, yeah.

33:23

Speaker D

And so this is the way you can somewhat easily compare these models. Obviously, that assumes that, you know, the training set...

33:24

Speaker C

You've always been very passionate about validation. I remember DiffDock, and then there was DiffDock-L and DockGen. You've thought very carefully about this in the past. Actually, I think DockGen is a really funny story. I don't know if you want to talk about that; it's interesting.

33:32

Speaker D

Yeah. I think one of the amazing things about putting things out open source is that we get a ton of feedback from the field. Sometimes we get great feedback of people really liking the model, but honestly, most of the time, and maybe this is also the most useful feedback, it's people sharing where it doesn't work. At the end of the day, that's critical. And this is true across other fields of machine learning: to make progress, you set clear benchmarks, and as you start making progress on certain benchmarks, you need to improve the benchmarks and make them harder and harder. That's how the field operates. The example of DockGen: we published this initial model called DiffDock in my first year of PhD, which was one of the early models to try to predict binding interactions between proteins and small molecules, about a year after AlphaFold 2 was published. On the one hand, on the benchmarks we were using at the time, DiffDock was doing really well, outperforming some of the traditional physics-based methods. But on the other hand, when we started giving these tools to biologists (one example was our collaboration with the group of Nick Polizzi at Harvard), we noticed a clear pattern: for proteins that were very different from the ones we trained on, the model was struggling. It seemed clear that this was where we should put our focus. So we first developed a new benchmark with Nick and his group, and then asked: okay, what can we change about the current architecture to improve this pattern of generalization? And that's the same thing we're still doing today. Where does the model not work? Once we have that benchmark, we try to throw every idea we have at the problem.

33:50

Speaker A

There's a lot of healthy skepticism in the field, which I think is great. And I think it's very clear that there's a ton of things the models don't really work well on. But I think one thing that's probably undeniable is just the pace of progress and how much better we're getting every year. And so I think if you assume any constant rate of progress moving forward, I think things are going to look pretty cool at some point in the future.

36:15

Speaker C

ChatGPT was only three years ago.

36:42

Speaker A

Yeah, I mean, it's wild, right? It's one of those things where, even being in the field, you don't see it coming. And hopefully we'll continue to have as much luck as we've had the past few years.

36:44

Speaker B

So this is maybe an aside, but I'm really curious. You get this great feedback from the community by being open source, right? My question is partly, okay, if you open source, then everyone can copy what you did. But it's also maybe about balancing priorities, right? The community says, I want this, there are all these problems with the model. And meanwhile, well, my customers don't care.

36:56

Speaker C

Right.

37:23

Speaker B

So how do you think about that?

37:23

Speaker D

Yeah, so I would say a couple of things. One is that part of our goal with Boltz, and this is also established as the mission of the public benefit company we started, is to democratize access to these tools. But one of the reasons we realized Boltz needed to be a company, that it couldn't just be an academic project, is that putting a model on GitHub is definitely not enough to get chemists and biologists across academia, biotech, and pharma to use your model in their therapeutic programs. A lot of what we think about at Boltz, beyond just the models, is all the layers that come on top of the models to get from those models to something that can really enable scientists in industry. That means building the right workflows that take in the data and directly answer the questions the chemists and biologists are asking, and also building the infrastructure. All this to say that even with models fully open, we see a ton of potential for products in the space. And a critical part of a product is that even with an open-source model, running the model is not free. As we were saying, these are pretty expensive models. And especially these days, and maybe we'll get into this, we're seeing pretty dramatic inference-time scaling of these models, where the more you run them, the better the results are. At that point, compute cost becomes a critical factor. So putting a lot of work into building the right infrastructure and optimizations really allows us to provide a much better service than the raw open-source models.

That said, even though with a product we can provide a much better service, I do still think, and we will continue to put a lot of our models out as open source, because the critical role of open-source models is helping the community make progress on the research, from which we all benefit. So on the one hand, we'll continue to open source some of our base models so the field can build on top of them, and as we discussed earlier, we learn a ton from the way the field uses and builds on our models. On the other hand, we'll try to build a product that gives the best possible experience to scientists, so that a chemist or a biologist doesn't need to spin up a GPU and set up our open-source model in a particular way. Even though I am a computer scientist, a machine learning scientist, I don't necessarily take an open-source LLM and spin it up myself; I just open the ChatGPT app or Claude Code and use it as an amazing product. We want to give the same experience to scientists.

37:25

Speaker B

I heard a good analogy yesterday: a surgeon doesn't want the hospital to design a scalpel, right? They just buy the scalpel.

40:40

Speaker A

You wouldn't believe the number of people, even in my short time between AlphaFold 3 coming out and the end of my PhD, who would reach out just for us to run AlphaFold 3 for them, or Boltz in our case, just because it's not that easy to do if you're not a computational person. Part of the goal here is that we obviously continue to build an interface for computational folks, but the models are also accessible to a larger, broader audience. And that comes from good interfaces and things like that.

40:50

Speaker C

I think one really interesting thing about Boltz is that with the release, you didn't just release a model, you created a community. And that community grew very quickly. Did that surprise you? What has the evolution of that community been, and how has it fed into Boltz?

41:27

Speaker A

If you look at its growth, it's very much that when we release a new model, there's a big jump. But yeah, it's been great. We have a Slack community with thousands of people in it, and it's actually self-sustaining now, which is the really nice part, because it's almost overwhelming to answer everyone's questions and help; it's really difficult with the few people we were. But it ended up that people would answer each other's questions and sort of help one another. So the Slack has been self-sustaining, and that's been really cool to see. That's the Slack part, but also on GitHub we've had a nice community. I think we aspire to be even more active on it than we've been in the past six months, which has been a bit challenging for us. But the community has been really great, and a lot of papers have also come out with new evolutions on top of Boltz. It surprised us to some degree, because there are a lot of models out there, and people converging on ours was really cool. I think it also speaks to the importance, when you put code out, of putting a lot of emphasis on making it as easy to use as possible. That's something we thought a lot about when we released the code base. It's far from perfect, but, you know.

41:45

Speaker B

Do you think that was one of the factors that caused your community to grow? Just the focus on ease of use, making it accessible?

43:07

Speaker A

I think so, yeah. And we've heard it from a few people over the years now. Some people still think it should be a lot nicer, and they're right. But yeah, I think at the time it was maybe a little bit easier than other things.

43:13

Speaker D

The other part that I think led to the community, and to some extent to the trust in what we put out, is the fact that it hasn't really been just one model, and maybe we'll talk about it. After Boltz-1, there were maybe another couple of models released or open sourced. Soon after, we continued that open-source journey and released Boltz-2, where we were not only improving structure prediction but also starting to do affinity prediction: understanding the strength of the interaction between these different molecules, which is a critical property you often want to optimize in discovery programs. And then, more recently, also a protein design model. So we've been building this suite of models that come together and interact with one another, where there is almost an expectation, which we take very much to heart, of always having the best, or among the best, models out there across the entire suite of tasks, so that our open-source tools can be the go-to models for everybody in the industry.

43:29

Speaker C

I really want to talk about BoltzGen, but before that, one last question in this direction. Was there anything about the community that surprised you? Was someone doing something where you thought, why would you do that, that's crazy? Or, that's actually genius, I never would have thought of that?

44:46

Speaker A

I mean, we've had many contributions. One of the interesting ones: we had this one individual who wrote a complex GPU kernel for part of the architecture. The funny thing is that piece of the architecture had been there since AlphaFold 2, and I don't know why it took Boltz for this person to decide to do it, but that was a really great contribution. We've had a bunch of others: people figuring out ways to hack the model to do cyclic peptides. I don't know if there's any other.

45:01

Speaker D

Interesting one, cool one. This was something initially proposed as a message in the Slack channel by Tim O'Donnell. There are some cases, for example the antibody-antigen interactions we discussed, where the models don't necessarily get the right answer. What he noticed is that the models were somewhat stuck predicting the antibody to interact with a part of the antigen that was incorrect. So he ran an experiment: in this model you can condition, you can give hints. He basically gave systematic hints to the model: okay, you should bind to the first residue, or you should bind to the 11th residue, or the 21st residue, and so on, every 10 residues. Residues are the amino acids, so the first amino acid, the 11th amino acid, and so on. It's like doing a scan across the entire antigen, conditioning the model on each hint, then looking at the confidence of the model in each of those cases and taking the top one. It's a very crude way of doing inference-time search, but surprisingly, for antibody-antigen prediction it actually helped quite a bit. There are some interesting ideas where, as the developer of the model, you say, wow, why would the model be so dumb? But it's very interesting, and it leads you to start thinking: okay, how can I do this not with brute force, but in a smarter way? And so we've also done a lot.
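The scanning trick described above is simple enough to sketch. This is a hedged illustration: `predict_with_hint` is a hypothetical stand-in for a Boltz-style conditional predictor that returns a structure and a confidence score, not a real API.

```python
def scan_epitopes(antigen_len, predict_with_hint, stride=10):
    """Condition the model on a binding hint at every `stride`-th antigen
    residue (1, 11, 21, ...), then keep the most confident prediction."""
    best = None
    for offset in range(0, antigen_len, stride):
        residue = offset + 1  # 1-indexed residue hint
        structure, confidence = predict_with_hint(binding_residue=residue)
        if best is None or confidence > best[2]:
            best = (residue, structure, confidence)
    return best  # (residue hint, structure, confidence) of the top run

# Toy predictor: pretend the true epitope is near residue 31, so hints
# close to it yield higher confidence.
def fake_predict(binding_residue):
    confidence = 1.0 - abs(binding_residue - 31) / 100
    return f"structure@{binding_residue}", confidence

hint, structure, conf = scan_epitopes(100, fake_predict)
# The scan recovers residue 31 as the most confident binding site
```

This is exactly the "crude inference-time search" being described: no gradient, no learned search policy, just enumerate conditionings and rank by the model's own confidence.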

45:40

Speaker A

Of work in that direction. That speaks to the power of scoring, which we're seeing a lot; I'm sure we'll talk about it more when we get to BoltzGen. Our ability to take a structure and determine that that structure is good, somewhat accurate, whether that's a single chain or an interaction, is a really powerful way of improving the models. If you can sample a ton, and you assume that if you sample enough you're likely to have the good structure in there, then it really just becomes a ranking problem. Part of the inference-time scaling that Gabriele was talking about is very much that: the more we sample, the more the ranking model ends up finding something it really likes. So I think our ability to get better at ranking is also what's going to enable the next big breakthroughs.
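The "sample a ton, then rank" framing can be captured in a minimal best-of-N sketch. Both pieces here are toy stand-ins: `samples` pretends to be draws from a diffusion sampler and `score` pretends to be a ranking or confidence model; neither reflects Boltz's actual interfaces.

```python
import random

def rank_candidates(samples, score):
    """Return candidate designs sorted best-first by the ranking model."""
    return sorted(samples, key=score, reverse=True)

random.seed(0)
# Pretend each sample is a structure whose quality the ranker can score.
samples = [random.uniform(0, 1) for _ in range(64)]
score = lambda s: s

# Inference-time scaling in one line: the best of 64 samples is at least
# as good as the best of the first 8, which is at least the first sample.
assert rank_candidates(samples[:64], score)[0] >= \
       rank_candidates(samples[:8], score)[0] >= samples[0]
```

The monotonicity in the final assertion is the whole argument: as long as the ranker is any good, more samples can only raise the quality of the top-ranked candidate, which is why ranking quality becomes the bottleneck.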

47:23

Speaker B

Interesting. But my understanding is there's a diffusion model, you generate some stuff, and then, I guess it's just what you said, you rank it using a score, and then you finally pick. Can you talk about those different parts?

48:17

Speaker D

Yeah. First of all, one of the critical beliefs we had when we started working on Boltz-1 was that structure prediction models are somewhat our field's version of foundation models: they learn how proteins and other molecules interact, and we can leverage that learning to do all sorts of other things. With Boltz-2, we leveraged that learning to do affinity prediction: understanding, if I give you this protein and this small molecule, how tight the interaction is. For BoltzGen, what we did was take the foundation model and fine-tune it to design entirely new proteins. The way that works is that, for the protein you're designing, instead of feeding in an actual sequence, you feed in a set of blank tokens, and you train the model to predict both the structure of that protein and, with the structure, what its different amino acids are. So the way BoltzGen operates is that you feed in a target, a protein (or DNA, or RNA) that you may want to bind to, and then you feed in a high-level design specification of what you want your new protein to be. For example, it could be an antibody with a particular framework, it could be a peptide, it could be many other things.

48:34

Speaker B

And is that with natural language, or...

50:10

Speaker D

That's basically prompting. We have this sort of spec that you specify, and you feed the spec to the model. The model translates it into a set of tokens, a set of conditioning inputs plus a set of blank tokens, and then, as part of the diffusion model, decodes a new structure and a new sequence for your protein. Then we take that and, as Jeremy was saying, try to score how good a binder it is to the original target.

50:12

Speaker B

That you give it. So you're basically using Boltz to predict the folding and the affinity to that molecule, and that gives you a score.

50:51

Speaker D

Exactly. So you use this model to predict the structure, and then you do two things. One is that you predict the structure with something like Boltz-2 and then compare that structure with what BoltzGen predicted. In the field this is called consistency: you want to make sure that the structure you're predicting is actually what you're trying to design, and that gives you much better confidence that it's a good design. So that's the first filter. The second filter in the released BoltzGen pipeline is that we look at the confidence the model has in the structure. Now, unfortunately, going to your question about predicting affinity, confidence is not a very good predictor of affinity. One of the things where we've actually made a ton of progress since we released BoltzGen, and we have some new results we're going to announce soon, is the ability to get much better hit rates when, instead of relying on the confidence of the model, we directly try to predict the affinity of the interaction.
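The two filters described above can be sketched as follows. This is an assumption-laden toy: `refold` stands in for re-predicting the designed sequence with a model like Boltz-2, and the RMSD and confidence cutoffs are illustrative values, not those of the released pipeline.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two matched 3D coordinate lists."""
    assert len(a) == len(b)
    sq = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
             for (ax, ay, az), (bx, by, bz) in zip(a, b))
    return math.sqrt(sq / len(a))

def passes_filters(design, refold, rmsd_cutoff=2.0, conf_cutoff=0.8):
    """Apply the two-stage filter: self-consistency, then model confidence."""
    refolded_coords, confidence = refold(design["sequence"])
    # 1) Consistency: re-predicting the designed sequence should recover
    #    (roughly) the structure the design model intended.
    consistent = rmsd(design["coords"], refolded_coords) < rmsd_cutoff
    # 2) Confidence: the predictor should itself be sure of that structure.
    return consistent and confidence > conf_cutoff

# Toy design with two atoms; the toy refold returns a near-identical,
# confidently predicted structure, so the design passes both filters.
design = {"sequence": "TOYSEQ", "coords": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]}
def toy_refold(seq):
    return [(0.1, 0.0, 0.0), (1.4, 0.1, 0.0)], 0.9
```

Note the asymmetry the speakers point out: this pipeline can tell you a design folds as intended and is confidently predicted, but neither check says anything direct about binding affinity, which is why the follow-up work predicts affinity explicitly.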

51:03

Speaker B

Okay, just backing up a minute. So your diffusion model actually predicts not only the protein sequence, but also its folding?

52:23

Speaker D

Exactly. And one of the big things we did differently compared to other models in the space, and there were some papers that had done this before, but we really scaled it up, was to somewhat merge structure prediction and sequence prediction into almost the same task. The way BoltzGen works is that the only thing you're doing is predicting the structure, so the only supervision we give is supervision on the structure. But because the structure is atomic, and the different amino acids have different atomic compositions, from the way the model places the atoms we recover not only the structure but also the identity of the amino acid the model believed was there. So instead of having two supervision signals, one discrete and one continuous, that somewhat don't interact well together, we built an encoding of sequences in structures that lets us use exactly the same supervision signal we use for Boltz-2, which is largely similar to what AlphaFold 3 proposed and is very scalable, and use that to design new proteins.
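The core idea, that amino-acid identity can be read off from which side-chain atoms get placed, can be illustrated with a lookup. The per-residue heavy-atom lists below are standard chemistry for these four residues, but the decoding itself is a deliberate simplification of BoltzGen's actual all-atom encoding, which operates on placed coordinates rather than name sets.

```python
# Heavy side-chain atoms for a few residue types (PDB atom naming).
SIDE_CHAIN_ATOMS = {
    "GLY": (),            # glycine has no side-chain heavy atoms
    "ALA": ("CB",),
    "SER": ("CB", "OG"),  # serine: beta carbon + hydroxyl oxygen
    "CYS": ("CB", "SG"),  # cysteine: beta carbon + thiol sulfur
}

def identity_from_atoms(placed_atoms):
    """Recover residue identity from the set of side-chain atoms the model
    chose to place: the composition itself encodes the discrete sequence."""
    placed = tuple(placed_atoms)
    for name, atoms in SIDE_CHAIN_ATOMS.items():
        if atoms == placed:
            return name
    return None  # ambiguous or unknown composition

# Placing CB+OG is, by composition alone, a serine.
assert identity_from_atoms(("CB", "OG")) == "SER"
```

This is why a purely structural (continuous) loss can still supervise the (discrete) sequence: once atom placement determines composition, predicting atoms is predicting amino acids.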

52:31

Speaker B

Interesting.

53:58

Speaker A

Maybe a quick shout out to Hannes Stark on our team, who did all this work.

53:58

Speaker C

Yeah, that was a really cool idea. Looking at the paper, there's this encoding where you add a bunch of, I guess, generic atoms, which can be anything, and then they get sort of rearranged and basically placed on top of each other, and that encodes what the amino acid is. There's a unique way of doing this. It was such a cool, fun idea.

54:04

Speaker A

I think that idea had existed before. Yeah, there were a couple of papers.

54:29

Speaker D

That had proposed this, and Hannes really took it to the large scale in the paper.

54:34

Speaker B

A lot of the BoltzGen paper is dedicated to validation of the model. In my opinion, all the people we talk to feel that wet-lab, real-world validation is the whole problem, or not the whole problem, but a big giant part of it. So can you talk about some highlights from there? Because to me the results are impressive, both from the perspective of the model and from the sheer effort that went into the validation by a large team.

54:40

Speaker D

First of all, I should start by saying that both when we were at MIT, working with Tommi Jaakkola and in Regina Barzilay's lab, and at Boltz, we're not a bio lab, and we're not a therapeutics company. So to some extent we were forced to look outside our group, our team, for the experimental validation. One of the things Hannes on the team really pioneered was the idea: can we go not just to one specific group with one specific system, where you might overfit a bit to that system while validating, but instead test these models across a very wide variety of settings, so it's meaningful for anyone in the field? Protein design is such a wide task, with all sorts of applications from therapeutics to biosensors and many others. So can we get validation that spans many different tasks? He basically put together, I think, something like 25 different academic and industry labs that committed to testing some of the designs from the model, some of this testing is still ongoing, and to giving the results back to us, in exchange for hopefully getting some great new sequences for their task. He was able to coordinate this very wide set of scientists, and already in the paper we shared results from, I think, eight to ten different labs: designing peptides targeting ordered proteins, peptides targeting disordered proteins, proteins that bind to small molecules, and nanobodies, across a wide variety of targets. That gave the paper a lot of validation for the model, validation of real breadth.

55:18

Speaker B

And would those be therapeutics for those animals, or are they relevant to humans as well?

57:39

Speaker D

They're relevant to humans as well. Obviously you need to do some work to, quote unquote, humanize them, making sure they have the right characteristics so they're not toxic to humans and so on. There are some approved medicines on the market. There are...

57:44

Speaker A

There's a general pattern, I think, in trying to design things that are smaller: they're easier to manufacture. At the same time, that comes with other potential challenges, maybe a little less selectivity than something with a larger binding interface. But there's this big desire to design mini-proteins, nanobodies, and small peptides, which are just great drug modalities.

58:02

Speaker B

Okay, I think we left off talking about validation in the lab, and I was very excited about seeing all the diverse validations you've done. Can you go into more detail about some specific ones?

58:27

Speaker A

Yeah, the nanobody one, I think we did. What was it, 15 targets. Is that correct?

58:43

Speaker D

14.

58:49

Speaker A

14 targets tested. Typically the way this works is that we make a lot of designs, on the order of tens of thousands, then we rank them and pick the top N; in this case N was 15 for each target. Then we measure success rates both on how many targets we were able to get a binder for, and, more generally, out of all the binders we designed, how many actually proved to be good binders. Some of the other ones: we had a cool one where there was a small molecule and we designed a protein that binds to it, which has a lot of interesting applications, for example the biosensing that Gabriele mentioned. We had a disordered protein, I think you mentioned that also. And yeah, those were maybe some of the highlights.

58:49

Speaker D

Yeah. So the way we structured some of those validations was, on the one hand, validations across a whole set of problems that the biologists we were working with came to us with. For example, in some of the experiments we were designing peptides that would target RAX C, a target involved in metabolism. We had a number of other applications where we were designing peptides or other modalities against other therapeutically relevant targets, and we designed some proteins to bind small molecules. Then some of the other testing was really about getting a broader sense of how the model performs, especially under generalization. One of the things we found across the field was that a lot of validation, outside of validation done on specific problems, was done on targets that have many known interactions in the training data. So it's always a bit hard to understand how much these models are really just regurgitating, or imitating, what they've seen in the training data versus really being able to design new proteins. So one of the experiments we did was to take nine targets from the PDB, filtered down to cases where there is no known interaction in the PDB: the model has never seen this particular protein, or a similar protein, bound to another protein. So there is no way the model can just tweak something from its training set and imitate a known interaction. We took those nine proteins, worked with a CRO, Adaptive, and tested 15 mini-proteins and 15 nanobodies against each one. The very cool thing we saw was that on two thirds of those targets, from just those 15 designs, we got nanomolar binders.

Nanomolar, roughly speaking, is a measure of how strong the interaction is; a nanomolar binder has approximately the binding strength you need for a therapeutic.
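The two success metrics mentioned (per-target hit rate and per-design hit rate) are worth separating explicitly. The sketch below uses the campaign shape from the conversation (9 novel targets, 15 designs each), but the per-target hit counts are invented purely to illustrate the arithmetic; only the "two thirds of targets hit" figure echoes the transcript.

```python
def campaign_stats(results, designs_per_target=15):
    """results maps target -> number of tested designs that bound.
    Returns (fraction of targets with >=1 binder, fraction of all
    designs that bound)."""
    targets_hit = sum(1 for hits in results.values() if hits > 0)
    total_binders = sum(results.values())
    total_designs = designs_per_target * len(results)
    return targets_hit / len(results), total_binders / total_designs

# Hypothetical outcome for 9 novel targets: 6 of 9 (two thirds) yield
# at least one binder out of 15 designs each.
results = {f"T{i}": hits for i, hits in enumerate([3, 0, 1, 2, 0, 4, 1, 0, 2])}
target_rate, design_rate = campaign_stats(results)
# target_rate == 2/3; design_rate == 13/135 in this made-up example
```

The distinction matters because a campaign can have a low per-design hit rate while still solving most targets, and for a design tool, "can I get at least one good binder per target" is usually the metric that counts.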

59:43

Speaker C

Yeah. So maybe switching directions a bit: Boltz Lab was just announced this week, or was it last week? This is your first, I guess, product, if you want to call it that. Can you talk about what Boltz Lab is and what you hope people take away from it?

1:02:21

Speaker A

Yeah. As we mentioned at the very beginning, the goal with the product has been to address what the models don't do on their own. There are largely two categories there; actually, I'll split it into three. The first one: it's one thing to predict a single interaction, a single structure; it's another to very effectively search a design space to produce something of value. What we found building this product is that there are a lot of steps involved, and we need to accompany the user through them. One of those steps, for example, is the creation of the target itself: how do we make sure the model has a good enough understanding of the target so we can design something? And there are all sorts of tricks you can do to improve a particular structure prediction. So that's the first stage. Then there's the stage of designing and searching the space efficiently. For something like BoltzGen, you design many things and then rank them. For small molecules, the process is a little more complicated: we also need to make sure the molecules are synthesizable. The way we do that is with a generative model that learns to use appropriate building blocks, so that it designs within a space we know is synthesizable. So there's this whole pipeline of different models involved in being able to design a molecule. That's the first part. We call them agents; we have a protein design agent and a small molecule design agent, and that's really the core of what powers the Boltz Lab platform.

1:02:43

Speaker B

So these agents, are they a language model wrapper, or are they just your models, and you're calling them agents because they sort of perform a function on behalf of the user?

1:04:22

Speaker A

They're more like a recipe, if you wish. I think we use that term because of the complex pipelining and automation that goes into all this plumbing. So that's the first part of the product. The second part is the infrastructure. We need to be able to do this at very large scale. For any one group doing a design campaign, let's say you're designing 100,000 possible candidates to find the good one, that's a very large amount of compute. For small molecules, it's on the order of a few seconds per design. For proteins, it can be a bit longer. Ideally you want to do that in parallel, otherwise it's going to take you weeks. So we've put a lot of effort into our ability to have a GPU fleet that allows any one user to do this kind of large parallel search.
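The back-of-envelope math here is worth making concrete. Assuming the episode's figures (100,000 designs at a few seconds each) and perfect parallel scaling, which is an idealization:

```python
def screening_time_hours(n_designs, secs_per_design, n_gpus):
    """Wall-clock hours for an embarrassingly parallel screen, assuming perfect scaling."""
    return n_designs * secs_per_design / n_gpus / 3600

# 100,000 small-molecule designs at ~3 seconds each:
serial = screening_time_hours(100_000, 3, 1)       # about 83 hours on a single GPU
parallel = screening_time_hours(100_000, 3, 1000)  # about 5 minutes across 1,000 GPUs
```

Because each candidate is scored independently, the screen is embarrassingly parallel, which is exactly why a shared GPU fleet turns a multi-day job into minutes.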

1:04:33

Speaker B

So you're amortizing the cost over your users.

1:05:24

Speaker A

Exactly. And to some degree, using 10,000 GPUs for a minute costs the same as using one GPU for God knows how long. Right? So you might as well parallelize if you can. A lot of work has gone into that, making it very robust, so we can have a lot of people on the platform doing that at the same time. And the third one is the interface. The interface comes in two shapes. One is an API, and that's really suited for companies that want to integrate these pipelines, these agents, directly into existing workflows or user interfaces that they have. We're already partnering with a few distributors that are going to integrate our API. The second is a new user interface, and we've put a lot of thought into that too. This is what I mentioned earlier about broadening the audience; that's what the user interface is about. We've built a lot of interesting features into it, for example around collaboration. Eventually you have multiple medicinal chemists going through the results and trying to pick out which molecules to go and test in the lab, and it's powerful for them to each provide their own ranking and then do consensus building. So there are a lot of features around launching these large jobs, but also around collaborating on analyzing the results, that we try to solve with that part of the platform. Boltz Lab is a combination of these three objectives into one cohesive platform.
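The consensus-building step described here, several chemists each submitting a ranking and then merging them, is a rank-aggregation problem. A simple Borda-count sketch (illustrative only; the episode doesn't specify which aggregation method Boltz Lab uses):

```python
from collections import defaultdict

def borda_consensus(rankings):
    """Aggregate several reviewers' rankings into one consensus ordering.

    Each ranking is a list of molecule IDs, best first; a molecule earns
    more points the higher each reviewer places it.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, mol in enumerate(ranking):
            scores[mol] += n - position
    return sorted(scores, key=scores.get, reverse=True)

# Three chemists rank the same three candidates:
chemists = [
    ["mol_A", "mol_B", "mol_C"],
    ["mol_B", "mol_A", "mol_C"],
    ["mol_A", "mol_C", "mol_B"],
]
consensus = borda_consensus(chemists)  # mol_A wins the consensus
```

Borda counting is attractive here because it rewards candidates that every reviewer rates reasonably highly, rather than ones a single reviewer champions.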

1:05:26

Speaker B

Who is this accessible to?

1:06:53

Speaker A

Everyone. You do need to request access today; we're still ramping up usage, but anyone can request access. If you're an academic in particular, we provide a fair amount of free credit so you can play with the platform. If you're a startup or a biotech, you can also reach out, and we'll typically hop on a call just to understand what you're trying to do, and also provide a lot of free credit to get started. And of course, with larger companies we can deploy the platform in a more secure environment; those are more custom deals that we make with partners. That's at the ethos of Boltz, this idea of serving everyone and not just going after the really large enterprises. That starts with the open source, but it's also a key design principle of the product itself.

1:06:55

Speaker C

One thing I was thinking about with regard to infrastructure: in the LLM space, the cost of a token has gone down by, I think, a factor of 1,000 or so over the last three years. Right?

1:07:48

Speaker A

Yeah.

1:07:57

Speaker C

Is it possible that you can exploit economies of scale in infrastructure, so that you can make it cheaper to run these things yourself than for any one person to roll their own?

1:07:58

Speaker A

100%. Yeah. I mean, we're already there. Running Boltz on our platform, especially in a large screen, is considerably cheaper than it would likely cost anyone to take the open-source model and run it themselves. And on top of the infrastructure, one of the things we've been working on is accelerating the models. Our small molecule screening pipeline is 10x faster on Boltz Lab than it is in the open source. That's also part of building a product, something that scales really well. We really wanted to get to a point where we could keep prices very low, in a way that makes it a no-brainer to use Boltz through our platform.

1:08:07

Speaker C

How do you think about validation of your agentic systems? Because as you were saying earlier, AlphaFold-style models are really good at, let's say, monomeric proteins where you have coevolution data. But now suddenly the whole point is to design something that doesn't have coevolution data, something that's really novel. So you're basically leaving the domain that you know you're good at. How do you validate that?

1:08:52

Speaker A

Yeah, I'll let Gabri complete this, but there's obviously a ton of computational metrics that we rely on. Those only take you so far, though. You really have to go to the lab and test: okay, with method A and method B, how much better are we? How much better is my hit rate? How much stronger are my binders? It's not just about hit rate; it's also about how good the binders are, and there's really no way around that. I think we've really ramped up the amount of experimental validation that we do, so that we track progress in as scientifically sound a way as possible.

1:09:22

Speaker D

Yeah, I think one thing that's unique about us, and maybe companies like us, is that we're not working on a couple of therapeutic pipelines where our validation would be focused only on those. When we do an experimental validation, we try to test across tens of targets. On the one hand, that gives us much more sensitive, statistically significant results, and it really allows us to make progress on the methodological side without being steered by overfitting on any one particular system. And of course, we always try to choose targets and problems that are at the frontier of what's possible today. You don't want something too easy, and you don't want something too hard, otherwise you're not going to see progress. So this is a somewhat evolving set of targets. We talked earlier about the targets that we looked at with BoltzGen; now we're trying even harder targets, both for small molecules and proteins. We try to keep ourselves on the boundary of what's possible.
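The point about tens of targets giving "statistically significant results" can be made concrete with a standard two-proportion z-test. The numbers below are invented for illustration; the episode gives no specific hit counts:

```python
import math

def hit_rate_z(hits_a, n_a, hits_b, n_b):
    """Two-proportion z-statistic comparing the hit rates of methods A and B."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical campaign: method A hits 12 of 30 targets, method B hits 5 of 30.
z = hit_rate_z(12, 30, 5, 30)  # z is about 2.0, borderline significant at the 5% level
```

With only a handful of targets, the same hit-rate difference would be well inside the noise, which is why validating across tens of diverse targets is what makes method comparisons trustworthy.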

1:10:00

Speaker C

So do you have your own lab infrastructure, or is it that you have a lot of different partnerships with academic labs, and you're just going to keep pushing on these?

1:11:07

Speaker D

We do this partially through academic labs. More and more, we do it through CROs, partly because we need replicability: often going after the same targets multiple times to see the progress from one month to the next.

1:11:15

Speaker A

And speed, speed of execution. Yeah.

1:11:33

Speaker C

So what happens if you start getting a bunch of really strong binders against therapeutic targets? What do you do, release them in the open?

1:11:37

Speaker A

Yeah, I mean, when we say we have no interest in making drugs, we're serious. When it was with the academic labs, basically they keep the results and do whatever they want with them. And with the CROs so far, yeah, we've been very open about releasing.

1:11:47

Speaker D

I will also say, and this has been a bit of an issue I have with some of the things said in the field: when we say that we design new proteins, or that we design new molecules that go and bind these particular targets, we should be very clear that these are not drugs. These are not things that are ready to be put into a human, and there's still a lot of development that goes with it. We see ourselves as building tools for scientists. At the end of the day, it really relies on the scientist having a great therapeutic hypothesis and then pushing through all the stages of development, and we try to build tools that can accompany them on that journey. It's not a magic box where you just turn a crank and get FDA-approved drugs.

1:12:03

Speaker B

Yeah, but that actually brings up an interesting question I've been wondering about: do you see yourselves staying in this, for lack of a better way of saying it, layer? Or do you think you'll start to look, in the physical sense, at different layers of the virtual cell, so to speak? Or also: there's the development process that goes design, preclinical, clinical, approval. Are you thinking about improving performance throughout that process based on the designs? Is that a direction you're pushing?

1:13:07

Speaker D

Yeah. So as Jeremy said, we are not a therapeutics company, and we want to stay that way, always at the service of all the different companies, including therapeutics companies, that we serve. To some extent, that does mean we need to go deeper and deeper in getting these models better and better. One of the things that we, like many others in the field, are doing, now that we're starting to be good at designing relatively tight binders, both small molecules and proteins, is looking at all these other properties, called developability or ADMET, that we care about when developing a drug, and asking: can we design for them from the get-go? The thing about those properties is that for some of them, you need to start having an understanding of the cell. So on the one hand, that's why we need that understanding. But the way we also think about complex diseases is that these models and tools we're building have a good understanding of biomolecular interactions. At the same time, every disease is often unique, and every therapeutic hypothesis is unique. Maybe you want something that needs to hit a particular target, let's say in a virus, in a particular way, but you don't know exactly what way. So maybe in the first set of designs you try to target different epitopes in different ways, then you test them in the lab, maybe directly in vivo, and you see which ones work and which ones don't. Then you need to bring those results back into the models, and the models can start to have a wider understanding, not just of the biophysics of the antibody interacting with that target, but of how that is shaped within the entire cell.
First of all, that means we need these loops, and that's partially how we designed the platform. But it also means we need to start understanding more and more higher-level things. I wouldn't say we're working on a virtual cell the way others are, but we're definitely thinking very deeply about how the way we target certain proteins interacts with the pathways that exist in the cell.

1:13:45

Speaker C

One question that has come up: you talk a lot about user interface and so on, and I think this is really important. But my experience dealing with medicinal chemists, when you give them machine learning models, is that they are the most superstitious, skeptical, pseudo-religious people I've ever talked to when it comes to doing science.

1:16:25

Speaker B

Sorry to the medicinal chemists listening.

1:16:45

Speaker C

Yeah, they're amazing, absolutely. I've worked with some spectacular medicinal chemists who pull magic out of their hat again and again, and I have no idea how they do it. But when you bring them a machine learning model, it's sometimes quite tricky to get them to engage with it. How has your interaction been with this, and how have you thought about building Boltz Lab to work with the skeptics?

1:16:47

Speaker D

One of the great value unlocks for us and for our product has been when we brought a medicinal chemist onto the team; his name is Jeffrey. On day one, he obviously had a lot of opinions on how we should change both the way the agents worked and the way the platform worked. But it's been really amazing, once we started shaping the platform with that feedback, to see how he went from fair skepticism to actually using more compute than any of the computational folks on the team. At times he's running all these hypotheses: okay, maybe I can hit this protein this particular way; actually, I can hit it another way; for this particular molecular space, let me try to optimize for these particular interactions. So he ends up running several screens in parallel, using hundreds of GPUs, on his own. It's been pretty incredible to see. The way I was thinking about the problem was, okay, you're just trying to design a binder, a small molecule, to a particular protein. He thinks about it much more deeply, trying all these different hypotheses. And once he gets the results from the model, he doesn't just take the top 15; he really looks them over and tries to understand the different designs. Then, when we select some designs to bring forward, we have something that both the models think is good and he does as well. That's also why we built the platform to be an interface for this kind of chemist, and a collaborative experience.

1:17:10

Speaker A

I think at the end of the day, for people to be convinced, you have to show them something they didn't think was possible. Until you have that aha moment, the skepticism will remain. But every once in a while there's a result that really surprises people, and then it's like, oh wow, okay, I can actually do something with this.

1:19:09

Speaker C

So you just get it in their hands, have them try it out, and they'll be convinced.

1:19:28

Speaker A

Yeah, or like maybe once the lab.

1:19:33

Speaker C

Results come back. Or their friend, yeah, or maybe one of their colleagues is convinced.

1:19:35

Speaker A

I think it takes going to the lab at some point. There's no avoiding that. As beautiful as the platform can be, as nice as the molecules might look that the model predicted. I think what really convinces people is hits.

1:19:40

Speaker C

Yeah, you see the results.

1:19:54

Speaker A

Exactly.

1:19:56

Speaker C

Cool. Thank you for taking the time to chat with us. Is there anything that you would like your audience to know?

1:19:57

Speaker D

First of all, we're just getting started and continuing to build the team. So we're definitely always looking for great folks, both on the software and machine learning side, but also scientists, to join the team and help us shape it.

1:20:05

Speaker C

On the infrastructure side too.

1:20:25

Speaker D

Indeed.

1:20:27

Speaker C

And if you want a new challenge: this is not just next-token prediction, this is really a new engineering challenge.

1:20:28

Speaker D

No matter how much experience you have with biology and chemistry, if you want to come help us shape what biology and chemistry will hopefully look like in five or ten years, we'd love to hear from you. Go to Boltz Bio and come join the team.

1:20:36

Speaker C

Cool. Thank you.

1:20:56

Speaker B

Awesome.

1:20:57

Speaker A

Thank you so much. Thank you.

1:20:58