VAEs Are Energy-Based Models? [Dr. Jeff Beck]
47 min
Jan 25, 2026
Summary
Dr. Jeff Beck discusses energy-based models, variational autoencoders, and their relationship to Bayesian inference, while exploring philosophical questions about agency, planning, and counterfactual reasoning. The conversation covers geometric deep learning, physics discovery algorithms, and the future of AI including continual learning, recursive self-improvement, and safe AI development through inverse reinforcement learning.
Insights
- Energy-based models provide an inductive prior by constraining input-output relationships, unlike pure function approximation which assumes any mapping is possible
- Agency is best understood as a spectrum of sophistication in internal state representations and policy computation rather than a binary distinction
- Joint Embedding Prediction Architectures (JEPA) represent a shift toward learning compressed representations rather than pixel-perfect predictions, aligning with how humans understand the world
- True artificial intelligence requires the ability to generate new models on-the-fly for novel situations and combine existing models in creative ways, mirroring biological brain evolution
- Safe AI development depends on empirically estimating reward functions from human behavior rather than naively specifying objectives, using inverse reinforcement learning approaches
Trends
- Shift from test-time inference to treating model weights as latent variables during deployment, moving toward energy-based model principles
- Non-contrastive learning methods (BYOL, Barlow Twins, VICReg) gaining traction as alternatives to expensive contrastive approaches
- Self-supervised learning emphasis on maintaining information fidelity and richness rather than task-specific optimization
- Growing recognition that continual learning and online adaptation are critical missing elements in current AI systems
- Movement toward modular, specialized AI systems that can be composed and recombined rather than monolithic general-purpose models
- Increased focus on experimental design automation and hypothesis testing as the next frontier beyond data correlation discovery
- Transduction and test-time optimization becoming more prominent in model deployment strategies
- Integration of physics-based inductive biases and geometric deep learning into neural network architectures
- Emphasis on interpretability through energy functions and Bayesian frameworks rather than black-box function approximation
- Cybernetic transhumanism vision in which AI and humans co-evolve as partners rather than adversaries
Topics
- Energy-Based Models and Bayesian Inference
- Variational Autoencoders (VAEs)
- Joint Embedding Prediction Architectures (JEPA)
- Geometric Deep Learning and Symmetries
- Agency and Counterfactual Reasoning
- Physics Discovery Algorithms
- Non-Contrastive Learning Methods
- Self-Supervised Learning
- Continual Learning and Online Adaptation
- Inverse Reinforcement Learning
- Modular and Compositional AI
- Experimental Design Automation
- AI Safety and Reward Function Specification
- Transduction and Test-Time Optimization
- Collective Intelligence and Specialization
Companies
OpenAI
Mentioned in context of thinking models that defeated the ARC benchmark challenge
People
Yann LeCun
Advocate of Joint Embedding Prediction Architectures (JEPA) and energy-based models; published a monograph on energy-based models
Geoffrey Hinton
Credited with developing negative sampling and contrastive learning methods for representation learning
Yoshua Bengio
Developer of GFlowNets, generative models capable of creating new latent variables for novel situations
Francois Chollet
Interviewed about ARC challenge version 2 and intelligence benchmarking
Clement Bonnet
Implemented VAE on ARC challenge with latent space search through decoder
Daniel Dennett
Philosopher whose intentional stance concept relates to modeling agents and agency
Quotes
"Science is about prediction and data compression and nothing else."
Dr. Jeff Beck•Mid-episode
"An agent is a really sophisticated object, right? It has internal states that represent things over very long time scales."
Dr. Jeff Beck•Early discussion on agency
"I do believe that an agent needs to be physical. Absolutely. I don't believe you can have a model of agency and not have an agent."
Dr. Jeff Beck•Agency discussion
"The fundamental feature of an agent is that it's engaged in planning counterfactual reasoning."
Dr. Jeff Beck•Agency definition
"I worry a lot more about somebody building some insane virus and takes down the internet. I'm more worried about malicious human actors than malicious AI actors."
Dr. Jeff Beck•AI safety discussion
Full Transcript
Geometric deep learning is a big part of the stack, if for no other reason than that modeling the physical world means incorporating the symmetries that exist in the physical world. We're highly motivated to employ a lot of those methods and techniques. But is the world written in code, or do you mean exploiting the regularities in the code that seem to have symmetries? Exploiting the regularities. Look, the world is translation invariant. The world is rotation invariant. Well, not really, because there's gravity, so there is a principal axis, but it's certainly rotationally invariant in the XY plane. And if you want a good model of the world as it actually is, it should incorporate those features. Of course, you can discover them in a brute-force way, but the mathematician in me really wants to build the symmetries in. And fortunately, we've got a lot of great tools developed over the last several years that can do that.

What's your view on agency? If I'm being an FEP purist, I have to say there's no difference between an agent and an object in a very real way, or at least there's nothing structurally distinct between how we model an agent and how we model an object. It's really just a question of degrees. An agent is a really sophisticated object: it has internal states that represent things over very long time scales, it has sophisticated policies that are context dependent, which is basically saying really long time scales again, and things like that. There's the philosophical, highbrow notion of agency, where we introduce notions of intentionality and self-causation. The no-nonsense version of agency is that it's just a thing which acts and performs some kind of computation. And I guess you could almost model anything as an agent. Well, if your definition of an agent is something that executes a policy, then anything is an agent. A rock is an agent. A policy is just an input-output relationship. When many people talk about agents, they're adding a few additional elements that have a lot to do with how the policy is computed. So, for example, when we think about the difference between us and, say, amoebas, we often cite things like planning, counterfactual reasoning, and goal-oriented behavior. We're specifying things that are all related to how we compute our policies: latent variables that represent policies, the kind of thing that's compatible with reinforcement learning. And that's the defining characteristic of an agent. But you could very easily say, from an outside perspective: if you can't look at how someone or something is doing the computations, if the only thing you observe is the policy, does that mean you can never conclude that something is an agent? I would say no. You'd still like to be able to conclude that this is an agent, even though the only thing I ever get to measure is its policy. But do you think we should have some notion of the strength of an agent? The strength of an agent, like a measure of agency? Is that what you mean? Yeah.
So I think you could use notions like transfer entropy to estimate the timescale over which something is incorporating information, or the degree to which it exhibits context-dependent behavior, and things like that. And that would be a pretty good measure. Is it normative? No. But it is a measure, and you could use things like that. But at that point you're really just talking, again, about policy sophistication, not: does it have a reward function, is it actually executing planning? Yeah, I mean, certainly intuitively, agents to me seem to be kind of causally disconnected, because they're planning into the future. They are not impulse-response machines. They're not just part of the mass of things going on around them. They are just obviously disconnected from the locality. So here the trick is: okay, I've got this agent and I know exactly what it does. It takes into account information, it internally rolls out a whole bunch of future consequences of various different actions or plans that it could take, it selects the best one, and then it executes it. All of those variables that occurred inside, from the outside perspective, just looked like a function transformation. Unless I'm somehow going in and recording and demonstrating that the manner in which it calculates its policy involved doing those rollouts, I wouldn't be able to show that it's actually doing those rollouts. I would just be able to conclude it has a really sophisticated policy. So the question is how you identify something that's actually doing planning, as opposed to having an incredibly sophisticated policy. I think that's a really hard question. My intuition is that a simple input-output mapping can't be an agent. And in a way this is related to what we were talking about with grounding. It seems that when things are physically embedded in the world, they're more likely to be agents. This functionalist idea that it's just a bit of computer code running on a machine, it kind of feels like that can't be an agent. It does. So suppose I coded it up so it was doing all of that planning: it gets its inputs, does some crazy massive Monte Carlo tree search, picks the best policy possible, and then executes it. Now, you don't observe any of that. Because you know what's going on, you could say: oh, it's clearly doing planning and counterfactual reasoning, look, there it is, because you coded it, so you know it's doing it. But if you're looking at it from the outside, if you don't know what's happening inside, all you have access to is: here's the action it took given this long series of inputs.
And so it's really hard to identify something as an agent per se from the outside. You kind of have to know what's going on inside. This, by the way, is why I don't think these sorts of prediction-based approaches to AI are necessarily agentic: you could say it's not really doing anything even remotely agentic unless it's executing planning and counterfactual reasoning. So your chess program: clearly it's doing some planning and counterfactual reasoning, because you know it's doing it. But I could describe the exact same set of behaviors just with a policy function. I think the counterfactual thing is an important feature here, because we could take something which was conscious, or which had agency, and just take a trace of the actual path it found. And now we've just got, this is a reductio ad absurdum, but now we've just got a computational trace, and that thing clearly has lost whatever agency or consciousness it had. So there's something about considering all of the possibilities. Yeah, I think so. In my mind, that is the fundamental feature of an agent: if you can show that it's engaged in planning and counterfactual reasoning, then it's definitely an agent. My argument is simply that that's hard to do unless you crack it open and see what's going on inside. Now, you could take a pragmatic view and say: if the simplest computational model of the behavior models it as if it were doing planning and counterfactual reasoning, then you can draw the implicit conclusion that, yes, you may as well say it's an agent. And that's kind of the approach that I've taken. So one of the things that comes out of the physics discovery algorithm is that you apply it to agents, and what do you get? Well, you get a model. Now, bear in mind, I called them all objects before, and I didn't change anything to make it special to an actual agent. But what I do have the ability to do, because of the model, is look at the internal states associated with that object I want to call an agent and see how sophisticated they are. And that degree of sophistication is what allows me to go ahead and call it an agent. And I like the whole idea. It's a great idea. Let's have a metric. I'm sure it would effectively be something like transfer entropy: a metric on how sophisticated the internal states were that were necessary in order to generate this output. And if it's above some threshold, we'll call it an agent. I don't like thresholds, but you could just speak of a degree of agency, a degree of sophistication. And coming back to Dennett's intentional stance, this is the idea that there is a level of representation which serves as a useful explanation even though it's not actually the microscopic causal graph. And maybe we can agree that no agent can possibly be the cause of its own actions. But when there is a degree of planning sophistication, macroscopically, it's as if it's the cause of its own actions. Yes. And that's why this "as if" phrase comes up a lot. It's important to remember that no matter how clever your model is, no matter how clever your approach is, and how clever the words are that you use to describe it, a lot of this stuff is "as if". This is the best model. This is why I repeat it over and over again and grind it into the students.
Science is about prediction and data compression and nothing else. And the same thing is going on here. Just looking at behavior, you'll never know for sure, in any meaningful way, whether it's doing a function transformation or whether it's engaged in planning and counterfactual reasoning. But suppose your best model of it is: well, I tried to model it as a function transformation, but goddammit, it had a lot of parameters; then I tried to model it as something doing Monte Carlo tree search on the inside and giving the answer, and that had, like, 40 parameters. Well, that's the model I'm going to go with, and now I'm going to call it an agent.

If we had a physical agent in the real world that was doing all of this planning and so on, would that have some kind of primacy over a computer simulation of agents doing all of this planning? Oh, is this like: if I uploaded my brain onto a computer and didn't connect it to the world, would it still be thinking even though it's doing all of those things? Is that the idea here, or am I way off? That works. So let's say a high-fidelity computer simulation of Jeff. Would Jeff be an agent? No. Oh, I wasn't expecting that. Because I'm the agent. And if you upload it, no. Now, if you do a high-fidelity computer simulation of my brain and you put it in my body, then I think I would have to say it's an agent. If it's doing exactly the same calculations, then from a purely phenomenological perspective it's the same; it's indistinguishable. Okay, so agents need to be physical. So I do believe that an agent needs to be physical. Absolutely. I believe you can have a model of agency and not have an agent. You can put that model in a computer and run it and make predictions as to what an agent would do, and it might even be 100% correct, but I still wouldn't call it an agent. But again, this is getting into philosophy. And philosophy frustrates the Bayesian, because philosophy is not probabilistic. Philosophy is really about drawing clear lines and distinctions, and in my world those don't really exist. Everything has an error bar. There isn't a clear delineation between an object and an agent. From this modeling perspective, it's really just a question of degrees, and philosophy is terrible at handling questions of degree. My friend Keith is a big fan of computability theory. He thinks that an agent is basically a type of computation: it has access to ambient state, it can take action, and there's this kind of cybernetic loop. And for him, the strength of the agency in the system is the compute type the thing is doing. So if it's a finite state automaton, it's a weak agent; if it's a Turing machine, it's a strong agent. Yeah, it's the degree of sophistication of the compute. Pretty much. Does that ring true to you? I mean, if you forced me at the point of a gun to put a measure on agency, it'd probably look a lot like that. Yes. Jeff, let's talk about energy-based models. Sure. So Yann LeCun had a monograph out, I think in 2006, talking about this. Oh, yeah. He's been talking about this for a long time.
When you fit your neural network to data via gradient descent, you have written an energy function in weight space and you are following it to its energetic minimum. The advantage of taking an energy-based approach, as opposed to a straight-up function approximation approach, is that an energy-based model comes with something like an inductive prior. If you're just doing function approximation, you're basically saying: there is some mapping from x to y, where x is my inputs and y is my outputs, any mapping whatsoever is out there, and I just want to figure out what it is. In an energy-based model, you're effectively placing constraints on what that input-output relationship can be. The way I like to think about the distinction between an energy-based model and a traditional feed-forward neural network is that it comes down to where your cost function is applied. In a traditional neural network, you take in your inputs, you get your outputs, and the cost function is just a function of the inputs and the outputs; the only thing you're optimizing is the weights. In an energy-based model, your cost function also operates on something else: the internal states of your model. And as a result, in order to find the best answer, you actually have to do two minimizations. One finds the energetic minimum associated with the part of the cost function that operates on the internal states, like the hidden nodes of your network, and the other is your effective prediction error. This is very much consistent with the approach a Bayesian would take: you have a prior probability distribution, which gives you an energy function over every single latent variable in your model, and you are optimizing with respect to all of them. So you take a probabilistic approach. A good example of this is the variational autoencoder. A variational autoencoder, I think, is the best example of the most commonly used energy-based model out there. Why? Because you have an encoder network and a decoder network, and part of your cost function is based on the difference between inputs and outputs. On its own, that's just a regular autoencoder. But another part of your cost function is a function of the actual internal representation. In a traditional VAE it's: how Gaussian is it? You want that internal representation to be as Gaussian as possible. If it's a VQ-VAE, then it's more like a mixture of Gaussians. But it's still a cost function that is applied to the internal states as well as to the inputs and outputs.
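To make the point about a cost on the internal states concrete, here is a minimal PyTorch sketch of a VAE objective, not code from the episode: a reconstruction term on inputs versus outputs plus a KL term that asks "how Gaussian is the latent?". The class and variable names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    """Minimal VAE: the loss has two parts, one on input/output, one on the latent."""
    def __init__(self, x_dim: int = 784, z_dim: int = 16):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs mean and log-variance
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        x_hat = self.dec(z)
        # Term 1: prediction error on inputs vs. outputs (reconstruction energy).
        recon = F.mse_loss(x_hat, x, reduction="sum")
        # Term 2: cost on the internal state itself, KL(q(z|x) || N(0, I)),
        # i.e. "how Gaussian is the latent representation?"
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl

vae = TinyVAE()
loss = vae(torch.randn(32, 784))   # one training step would backprop through this
loss.backward()
```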
Very cool. So a VAE is a fairly canonical example of an energy-based model. And you were saying that the whole deep learning world is obsessed with test-time inference at the moment, and in a way that is a step towards what you're talking about. Yeah, you're treating some of the weights of your model as if they're latent variables, because when you show it a new input, you're allowed to change some of the weights without looking at the output. So what are you doing? You're treating the weights as latents, which makes it a great trick, in my opinion. It's like: oh, great, they're moving in the direction of energy-based models. I love it. The only thing I don't like about test-time training is how the vast majority of the training is done. In a traditional energy-based model, you always find the minimum with respect to the latent variables, these extra weights, which in the case of test-time training is the subset of weights that you're allowed to change at test time. When you train a traditional energy-based model, you're allowed to make those changes throughout the entire course of training. The way we're often doing test-time training these days is that we just do regular old neural network learning, and then finally, when we get to the deployment phase, we suddenly turn on these additional latents, which are basically some of the weights of the network, and we do an additional bit of learning at that point. This seems monumental. Now, again, I'm not an expert here, but this seems unwise to me, and the reason it seems unwise is that you didn't train the original network with that turned on; you trained it in a completely supervised way. Now, I'm sure people are aware of this and it's been addressed in the literature, but I'm not personally aware of that. I don't think that's how it's used in practice. We should also introduce this term transduction. My definition of transduction is that you're actually doing search or optimization as a function of the test samples. I interviewed Clement Bonnet: he had a VAE on ARC, searching latent spaces, and he actually searched through the decoder as a function of the test sample. Because these models are maximum likelihood estimators, they're always giving you a kind of smoothed-out average, and there's so much information in the test sample.
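As an illustration of the transduction idea just described, searching a latent space through a decoder as a function of the test sample, here is a minimal, hypothetical PyTorch sketch (not Bonnet's actual ARC implementation): freeze a trained decoder and optimize the latent code for a single test input.

```python
import torch

def test_time_latent_search(decoder, x_test, z_dim=16, steps=200, lr=0.05):
    """Transduction sketch: optimize a latent code z (not the weights) so that
    the frozen decoder's output matches a single test sample."""
    z = torch.zeros(1, z_dim, requires_grad=True)      # start from the prior mean
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_hat = decoder(z)                              # decode the candidate latent
        # Energy to minimize: reconstruction error plus a Gaussian prior term on z,
        # i.e. a MAP estimate of the latent for this one test input.
        energy = ((x_hat - x_test) ** 2).sum() + 0.5 * (z ** 2).sum()
        energy.backward()
        opt.step()
    return z.detach()

# Usage with a toy frozen decoder standing in for a trained VAE decoder:
decoder = torch.nn.Linear(16, 784)
for p in decoder.parameters():
    p.requires_grad_(False)
z_star = test_time_latent_search(decoder, torch.randn(1, 784))
```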
Let's just riff on the relationship between energy-based models and Bayesian inference. Of course, they have the advantage that you don't need to do the expensive, intractable normalization. Yes. Tell me about that. My take is that an energy-based model and a Bayesian model have a lot in common. Literally, in physics, energy is log probability, up to a sign. Now, of course, there's the normalization factor that you don't need to worry about if you're just minimizing energy. In a Bayesian framework, that's like saying: I'm not actually going to treat some of these latent variables in a probabilistic way, I'm just going to do maximum a posteriori (MAP) estimation on some of my variables and be okay with that. And that's one way to interpret the relationship between an energy-based model and a properly Bayesian model. There's a happy medium here, though: you don't have to just minimize the energy function; you can also calculate the curvature down there, do a Laplace approximation, and call yourself a Bayesian again. Yes, there is more computation involved, but we've got a lot of great tricks for making that totally tractable.

What's the relationship between the free energy in the free energy principle and the energy in energy-based models? A regularization term, I think, is the short answer. If you're being very, very pedantic, the difference between minimizing energy and minimizing free energy is that free energy has this additional entropy penalty term. Now, if you're just doing maximum likelihood estimation, minimizing your energy function with respect to some particular variable, let's pretend there's only one variable and I'm just going to get a point estimate and call it a day, some kind of MAP estimation, then there's not that big of a difference, because there is no probability distribution over the latent that allows you to compute that regularization term. But that's the only difference. Are you regularizing or not: that, I think, is the easiest way to think about it. So LeCun is a big advocate of JEPA, these joint embedding prediction architectures using non-contrastive learning, where essentially the learning objective is comparing the latents of observed and unobserved parts of the space. This is an architectural design. Well, okay, so what does JEPA stand for? It's Joint Embedding Prediction Architecture. There we go. So what's the joint embedding bit about? The joint embedding bit is: I'm going to take my inputs and my outputs, embed them both in some space, and then learn a prediction between the two embeddings. And that's a great idea, because it has some of the flavor of what we would like to get out of our models. In many situations, and I should be very particular about this, in many situations we're not interested in predicting every single pixel in the image. We want something a little more gestalt, a little more high-level, a little more of a conceptual understanding of what's going on. So emphasizing the goal of predicting every single pixel, which is what's typically done in generative modeling right now, might lose some of the abstractive power of these networks. The whole point of JEPA, as I understand it, and I'm sure there are other points, is that you're going to compress your inputs and compress your outputs and then do all the learning in this compressed space. Love it. Science is about prediction and data compression; let's make that compression explicit on the front end and the back end. The downside of this approach is that it doesn't work out of the box, because it's very easy to find an embedding of the inputs and an embedding of the outputs for which prediction is perfect: just make both of them zero. And so other tricks need to be employed in order to make it work.
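To see why the collapsed solution is so easy to find, and what kind of "other trick" fixes it, here is a minimal, hypothetical PyTorch sketch: a pure latent-prediction loss is minimized by constant embeddings, so a variance-preserving penalty in the spirit of the non-contrastive methods discussed next is added to keep the embeddings informative. Names and weights are illustrative, and the learned predictor network of a real JEPA is omitted.

```python
import torch

def jepa_style_loss(z_x, z_y, var_weight=1.0, eps=1e-4):
    """z_x: embeddings of the observed part, z_y: embeddings of the target part.
    The prediction term alone collapses (both encoders can output a constant);
    the variance term penalizes embedding dimensions whose std falls below 1."""
    pred = ((z_x - z_y) ** 2).mean()                 # predict y's latent from x's
    std = torch.sqrt(z_x.var(dim=0) + eps)           # per-dimension spread in the batch
    var_penalty = torch.relu(1.0 - std).mean()       # hinge: push std up toward 1
    return pred + var_weight * var_penalty

# Collapsed embeddings get zero prediction loss but a large variance penalty:
collapsed = torch.zeros(64, 32)
print(jepa_style_loss(collapsed, collapsed))         # ~= var_weight * 1.0
# Spread-out embeddings pay almost no variance penalty:
z = torch.randn(64, 32)
print(jepa_style_loss(z, z + 0.1 * torch.randn(64, 32)))
```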
Yes, I remember LeCun was talking about this. So there's the traditional contrastive method, which is kind of Hinton's idea, apparently, with the negative sampling and so on. And that's very expensive, because you actually have to do lots and lots of sampling, versus this non-contrastive thing. Yeah, and this, by the way, is what he should have won the Nobel Prize for, in my opinion. Because the whole point of the wake-sleep algorithm and contrastive divergence was that it's actually biologically plausible. It was an end run around the need to do backprop, and that's what made it so clever and interesting, in my opinion. LeCun is a big fan of this non-contrastive approach where you work in the latent space. There are many different algorithms that do this. We had a whole load of shows all about non-contrastive learning. There's VICReg and BYOL and Barlow Twins, and there's an entire thread of research around that. In many different ways, what they're trying to do is avoid this collapse problem that you're talking about, and they use different forms of regularization. There's an old-school way of accomplishing the same thing, and that's called pre-processing. This is something a lot of people do: you take your data, and in fact we do this all the time with vision language models. We want to use an LLM and we want to predict images, so what do we do? The first thing we have to do is tokenize the image. And how do we do that? We run a VAE as a pre-processing step, and that step is completely independent from the actual algorithm that's going to be tasked with solving the problem of interest. And that's not something we necessarily have to stick with. It would be very nice if there was a way of doing it jointly, and we're getting right back to JEPA again. What we'd like to do is choose our preprocessing algorithm in a manner that's not a priori, not done first; we'd like to choose the preprocessor that works best in this space. And I think that's the ultimate motivation for a lot of this work: what's the right embedding? One of my favorite tricks, of course, is that I pre-process with VAEs all the time. In fact, every time someone hands me a new neural data set, the first thing I do, and I'm not ashamed to admit it, is run PCA on it, pass it through a VAE, and then take a look. It's the first thing you do with your data, because it gives you a good idea of what the signal-to-noise ratio is in the data set itself. And then what do I do? I subsequently do most of my analysis in that discovered embedding space. I don't see a huge problem with that from a purely pragmatic perspective, but it's certainly cleaner to have a single algorithm and approach rather than stringing these sorts of things together in an ad hoc way. PCA is a really great example of this. There's a failure mode for principal component analysis which is actually really common in neural data, because principal component analysis basically says: where's the most variability? Okay, I'm going to worry about that, and all the stuff that's not varying very much I'm just going to throw away; dimensions with low variability are not important. Well, it turns out that in neural data, the dimensions with very little variability are some of the most important dimensions. And so pre-processing with PCA runs a risk of throwing out the most valuable information in your data set. Yes.
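A tiny numerical illustration, not from the episode, of the PCA failure mode just described: the high-variance dimension is pure noise, while a low-variance dimension carries all of the label information that PCA's top component throws away.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Dimension 0: huge variance, pure noise (task-irrelevant).
noise = rng.normal(scale=10.0, size=n)
# Dimension 1: tiny variance, but it carries the class label entirely.
labels = rng.integers(0, 2, size=n)
signal = 0.1 * labels + rng.normal(scale=0.01, size=n)
X = np.column_stack([noise, signal])

# PCA via SVD of the centered data: the top component aligns with the noisy axis.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
top_pc_scores = Xc @ Vt[0]

# Correlation with the label: the top PC is useless, while the discarded
# low-variance dimension predicts the label almost perfectly.
print("corr(top PC, label)      =", round(abs(np.corrcoef(top_pc_scores, labels)[0, 1]), 3))
print("corr(low-var dim, label) =", round(abs(np.corrcoef(X[:, 1], labels)[0, 1]), 3))
```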
And so there's a lot of wisdom in jointly fitting your pre-processing model as well as your inference and prediction model. On this subject of not throwing things away: JEPA and non-contrastive learning are part of this bigger field of self-supervised learning, and we want to learn representations that maintain fidelity and richness. LeCun's hypothesis is that when you do something like supervised learning with some particular downstream task in mind, the neural network gets wise: it discards all of the long-tail stuff that isn't relevant for that particular task. So when you train these models, what you're trying to do is maintain enough ambiguity that the information gets compressed, but also enough fidelity to work broadly for different things. Yes, and that is a laudable goal, and I certainly share it. Fortunately, networks are so big these days that we don't run the risk of overfitting as much as we used to. But the last thing you want to do is train your network to toss information that you might need down the road. That said, the vast majority of what the brain does, just like these neural networks, is decide what information is currently task-irrelevant. But that's all the more reason to do things in a self-supervised or unsupervised way, because you're not telling it what's task-relevant and task-irrelevant.

So I interviewed Chollet about version two of the ARC challenge, and one thing that struck me is that I think of intelligence as being multi-dimensional. Version one got saturated. ARC was actually really amazing because it's the only intelligence benchmark that survived for five years before being defeated; since the advent of these thinking models it has been defeated very quickly. But they're working on version three, and there will be a version four and a version five. Will there always just be something left over? That sounds like another philosophical question, so yes is my answer: there will always be something left over, in the sense that this has been the trajectory for a really long time. We get algorithms that do amazing new, cool things, and then someone comes along and says: yeah, but it can't pull a rabbit out of a hat. And then, of course, what does someone do? They figure out a new training protocol or a slightly different architecture, or they just train it to pull rabbits out of hats, and suddenly it can. Then someone proposes a new challenge, and a new challenge, and a new challenge, and it's always this game of one-upmanship. So the question becomes: what's the point at which there are no more new challenges? I'm not entirely certain we're ever going to get there. It may very well be the case that we get algorithms capable of replicating the complete suite of human behaviors, and then someone will come up with some criticism like: yeah, but it's not really doing X, it's just faking it. That's just the direction things go, because people really do think these things are important. Do you think that the concept of recursive self-improving intelligence is a valid one? Yes, I do think it is.
So I think that one of the most critical missing elements right now is some form of continual learning. At the end of the day, you really want an algorithm that doesn't just learn on the training set and then get deployed. You want something that runs around in the world, comes across things it doesn't understand, and is then able to append to its model in some sense. There are approaches to this based on Bayesian non-parametrics and Dirichlet process priors and things like that, where you see something that's surprising or unique or different, something you didn't expect, and it causes you to say: I need to turn learning on, because I've got to figure this out. That is an absolutely critical element that we need to be developing, and we are developing it. And it turns out that's one of the nice things about this object-centered physics discovery approach: because it's object-centered, if it comes across a new situation that it does not understand, it is capable of instantiating a completely brand-new object just to explain the new situation. Right. Continually learning agents can acquire new knowledge autonomously, and the whole thing just learns more knowledge. But intelligence feels different. It feels like, in the system we've been describing, the intelligence is the way we're implementing the Bayesian updates and actually building the algorithms. Could the systems on their own metaprogram themselves and develop better algorithms, or something like that? That's a very good question. Something that would be closer to true artificial intelligence than what we currently have would be capable of building models on the fly to deal with new situations, and of taking things it knows about and combining them in new and different ways. There are approaches that have some of that aspect to them. GFlowNets, from Bengio's group, are a great example of something that, at least in principle, is a generative model of generative models. It's sort of like: oh, I might actually need a new node; it's time to create a new latent variable, because the current set just isn't cutting the mustard anymore. Those are things that I think are hallmarks of true intelligence. I don't ever want to make the statement: as soon as it's got that, it's truly intelligent. I will never, ever say that. But I do think a critical component that needs to be present is the ability to generate new models on the fly to deal with novel situations and data, as well as the ability to combine old models, previous models, in new and interesting ways. This is actually how the brain evolved. We started out with really simple brains, and there were different regions that solved different problems. What eventually happened as we evolved is that these different regions of the brain learned to communicate with each other in new ways, and through that communication acquired new abilities, which eventually evolved into new capabilities. I often like to point out that olfaction is the sense that's not studied nearly enough. It's an incredibly old part of the brain, and arguably it's the first part of the brain that evolved the ability to do proper associative processing.
Odor space, unlike visual space, where there are translation symmetries and things are smooth, has none of that; it's really combinatorial and complicated. And the part of the brain that evolved to solve the olfactory problem, arguably, is the part that evolved into our frontal cortex. Don't quote me on that; there's a lot of disagreement there. That's just my take. But it certainly has a lot of the features we associate with associative cortex. Wow, I just used three different forms of the word "associate" in that sentence, but I think you see what I mean. It was all about taking old capabilities, combining simple models and modules, to create something more complex. And over time, that was what made the brain work: taking little things that worked and combining them in new and different ways in order to evolve emergent properties, emergent computational abilities, and an emergent understanding of the world in which we live. And I do think that when we get to the point where we start really saying, oh, this is actually truly intelligent, it's going to have that feature. It's going to have a modular description of the world, and it's going to have the ability to combine those modules in a way that creates a more sophisticated understanding. It's like Lego: the bricks all connect in certain ways, and out of them I can build all sorts of new and amazing things that were never built before. That's the capability that we have, and that's the essence of creativity. It's why I refer to systems engineering as the thing we really want our AI models to be able to do.

Collective intelligence is a bit different. We have this plasticity: we can adapt our behavior day by day. We might see some kind of meta-learning or some kind of change in our organizational dynamics. Maybe some agents will specialize, and it might be an existence proof of this kind of recursive superintelligence that we're talking about. Yeah, I think that's absolutely correct. Specialization is great. In fact, I would argue that specialization is how we got all of this, and I'm pointing at London, in case there was some confusion there. It was really the interconnected, highly specialized intelligences that are people, and their ability to learn how to work together, that gave rise to the technological revolution. The brain is the same way. In my view, it's highly specialized little modules or agents that are capable of being repurposed and reused, capable of communicating with one another in order to solve really complicated problems. But there's always a benefit to specialization. I don't believe in AGI; AGI seems like a bit of a misnomer to me. What we really want is not artificial general intelligence; we want collective specialized intelligences. What about scientific discovery? What would the world look like when we could discover new drugs, when we could discover new knowledge in science? Right now, the way we're doing that is largely focused on summarizing vast troves of data and looking for correlations present in it. I think the next major milestone in this trajectory is experimental design.
Not just: here are some correlations you may not have seen because they're really small. That's what computers are good at; they're really good at identifying small but highly relevant correlations. The next step is constructing a system that tests these hypotheses explicitly and generates the experiments that will fill in the gaps of our knowledge. And all of this, I believe, can in fact be automated in a very sensible way. I don't see any major obstacles to automating empirical inquiry, other than that we probably want to place some safety constraints on it when we start letting the AIs run the labs. Because you never know: you could end up with an AI that decides the most effective experiment to determine whether something is correct is to set off a nuke, and that would be bad. Yes. Say a robot comes across a beach ball, and it's never seen a beach ball in its entire life. What you'd like is for the robot to know how to figure out that it's a beach ball and to figure out what its properties are. If you tell the robot, "if you see something new, just stop," then that's no good. What you really want is a relatively non-invasive procedure for the robot to do what a child would do. What does a kid do when they see a beach ball? They run up and poke it and say: oh, right, it moved. They actually experiment with their environment for the purpose of identifying the properties of the objects that exist in it. Now, I do think we probably want to test this out virtually before it's deployed in the real world, because you never know: it might very well be that the optimal experiment is to run up and kick it as hard as you possibly can, and we certainly want to avoid that. But something along those lines, a robot that is able to test the theories it has about how things work in an online way and learn from those results in an online way, is definitely part of the goal.

Looking forward, what do you think the future will look like when we have more autonomous AIs among us? A lot of people worry about enfeeblement, loss of control, it making us dumb, all of this kind of stuff. I do worry about AI making us dumb. Offloading your thinking onto a machine, which is something that AI allows, is potentially a big problem. I don't want a situation where humans are reduced to value function selectors, just going: oh, no, I don't like that outcome, do this instead. I do want to see a future where we have AI that actually improves our understanding of the world. Simply automating everything runs the risk that you specified: it runs the risk of people becoming couch potatoes who just watch TV and occasionally say, yeah, these chips are no good. That seems like a bad outcome to me. I worry less about that than some, I think, because people are remarkably adaptable. You have all these arguments about how a new technology comes along and it's going to completely destroy some way of life, and that's going to be awful for people. And maybe it is in the short term. I think of tractors. Or just go back: how many hundred years do you have to go back to when 99 percent of people were involved in agriculture? And now it's, what, two percent?
I consider that a solid improvement, because it allowed us to do a bunch of other things that we find more satisfying and more interesting. It allowed us to spend some time reading a book instead of laboring in the fields all day. That's the future I see and the future I hope for: one in which all of these artificial agents running around and doing things autonomously are there to free us up to pursue more interesting things, to improve ourselves in more interesting ways. At the end of the day, it's just another technology. At least initially, it'll just be another technology, like the tractor. Now, a hundred years from now, who knows? What will the value of work be if the AIs can do everything and there's nothing left for us to do? I don't think it will ever be the case that the AIs can do everything. Like I said, the future I worry about is one where the sole role of people is sitting around making sure the AIs aren't going rogue, and things like that, which I don't consider a good outcome. I would really like to see human improvement. I envision a future of, I don't know, cybernetic transhumanism, if I'm going to go sci-fi on this, where the technology and us evolve together in a way that's beneficial for both. That's the goal. Are there dystopic possibilities, where you ask: what are humans in a world where everything can be done by a robot? Yeah, that's a good question. At the end of the day, they end up just becoming reward function selectors, just saying: I don't like this and I do like that. That's another nightmare scenario. I don't like talking about these dystopian futures, because honestly, I think people are too clever, too motivated, too interested in how the world really works, and too interested in actually understanding things ever to stop. AI will become a partner, not an adversary or a crutch. That's what I think will happen. But that's a statement more about my beliefs about humans than about my beliefs about the development of AI. I am a techno-optimist, if you will, not a pessimist. I believe that we will find a way to adapt to an ever-changing world, as we have done for millions of years, including a world with technology that alleviates most of our labors.

On that, there's an AI literacy issue, because AI has moved so quickly that certainly my parents don't understand anything about it. But by the same token, policymakers don't understand anything about it either. There are people saying AI is going to kill everyone, people making negative arguments, people making positive arguments. So there's a bit of a fog of war now, because there are so many people saying different things about AI. How should they make sense of all of this? We are now well outside my area of expertise, so I'm just going to say that before I say anything else. AI is developing very quickly, but I am much more concerned about what people will do with the new technology than I am with what the technology will do all by itself.
I don't have this big concern that Skynet's going to take over, that the internet's going to suddenly become conscious and kill us all, in part because AI is not that advanced, but also because we are still in the position where we specify the goals of the system. That will likely continue for a very long time, and it will always be the case that these systems are subject to review. We will always keep an eye on them. They will always, at least initially, be released in relatively restricted domains where we're keeping a close eye on what they are and are not doing. So I don't worry too much about the going rogue. I worry a lot more about somebody building, it's sort of like a virus, which we already have to deal with: somebody builds some insane virus and takes down the internet. I'm more worried about malicious human actors than malicious AI actors, because at the end of the day, all of these algorithms simply do what they are told. We train them. We tell them: here's your objective function. As long as we are specifying the objective function and we understand the objective function, we're probably going to be okay. I think the safest way to deal with AI concerns is to tell people: hey, look, this AI is just doing what we told it to. We set it up to make really good predictions and to achieve these outcomes. Now, is it dangerous to specify these outcomes without being very, very careful? Yes, it is. This is the whole "hey, Skynet, end world hunger" scenario, where it kills all humans. That is a real possibility. But whose fault was that? The fault lies with the person who very naively specified the goals. There are, in fact, relatively straightforward ways to specify the reward function that don't run that risk nearly as badly. And the best one is, are you familiar with maximum entropy inverse reinforcement learning? I like to call it active inference, because it's really similar. There, what you're doing is observing someone's policy and then fitting a maximum entropy model of the reward function itself. At the end of the day, what ends up happening when you do this, and this is why it's basically just like active inference, is that you get a reward function. So you have some organism or whatever, and you're trying to do this for it, and it's got some stationary distribution over actions and outcomes; its inputs and outputs have a stationary distribution. That becomes your reward function. Not directly, there's some math involved, but basically your reward function is a function of the steady-state distribution of actions and outcomes. So we could do this: we could take the current manner in which humans are making decisions and write down the current estimate of the stationary distribution of our actions and outcomes. This would include things like: this number of people are going hungry, and all the statistics that describe the inputs and outputs of our policies. And then we could just ask an AI: your reward function is the one that results in the same outcomes that we currently have, on average.
And it would execute it, and to the extent that it works, it would ultimately result in an AI algorithm that is just mimicking human behavior, or at least achieving the same outcomes that we were achieving before. Now, here's the safe way to improve the situation. You don't say "end world hunger." You perturb that distribution over outcomes, and just a little bit, and then you evaluate the consequences. That's all you're doing. You make these little changes to an empirically estimated reward function rather than specifying one by hand, because that's the dangerous thing. Jeff, thank you so much for joining us. It's my pleasure. Amazing.
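As a footnote to the closing discussion, here is a minimal, hypothetical sketch of empirically estimating a reward function from the observed stationary distribution over outcomes and then perturbing it slightly, rather than hand-specifying a goal. It is illustrative only, not Beck's actual formulation; the function names and the use of log-probability as the reward are assumptions in the maximum-entropy spirit.

```python
import numpy as np

def empirical_reward(observed_outcomes, n_outcomes, smoothing=1e-3):
    """Estimate the stationary distribution over outcomes from observed behavior
    and turn it into a reward: r(o) = log p_hat(o). Matching this reward on average
    reproduces the status quo rather than some naively specified goal."""
    counts = np.bincount(observed_outcomes, minlength=n_outcomes) + smoothing
    p_hat = counts / counts.sum()
    return p_hat, np.log(p_hat)

def perturb(p_hat, outcome, delta=0.01):
    """The 'safe improvement' step: nudge the target distribution a little toward
    a desired outcome, renormalize, and re-derive the reward before evaluating."""
    p_new = p_hat.copy()
    p_new[outcome] += delta
    p_new /= p_new.sum()
    return p_new, np.log(p_new)

# Usage with toy data: outcomes 0..3 observed from current behavior.
obs = np.random.default_rng(1).integers(0, 4, size=10_000)
p_hat, reward = empirical_reward(obs, n_outcomes=4)
p_target, new_reward = perturb(p_hat, outcome=2)   # slightly favor outcome 2, then evaluate
```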