Fixing AI’s Bottlenecks: Memory, Scale, and Sparsity

59 min

•Apr 23, 2026

Summary

A panel of neuromorphic computing experts discusses how the field has shifted from expecting intelligence to emerge from scale toward a more engineered approach, identifying memory and connectivity as the primary bottlenecks in AI hardware rather than computation itself. The discussion covers emerging technologies, sparse computation, and the role of biological inspiration in designing efficient AI systems.

Insights

The neuromorphic field has matured from a 'scale and intelligence will emerge' philosophy to a pragmatic, engineered approach requiring careful algorithm design and training
Memory and synaptic connectivity, not neurons or raw computation, are the critical bottlenecks limiting AI efficiency and scalability
Sparsity in both space and time is fundamental to biological efficiency and could reduce AI energy consumption by ~1000x if properly leveraged
ADCs/DACs in analog computing are not the limiting factor once crossbar arrays scale to thousands of elements, contrary to common architectural assumptions
The field is too method-driven rather than goal-driven, with researchers publishing on technologies known ahead of time to be unscalable due to publish-or-perish pressures

Trends

Shift from neuromorphic as pure biology emulation to hybrid analog-digital systems optimized for specific computational tasks
Growing focus on in-memory computing and emerging memory technologies (PCM, memristors) as core AI accelerators rather than peripheral components
Mixture-of-experts and conditional computation architectures enabling sparse parameter activation in production language models
Emergence of energy-based models and diffusion-based approaches as alternative computing primitives beyond matrix multiplication
Integration of neuromorphic systems into embodied robotics and edge AI applications (automotive, defense, security) rather than pure data center acceleration
Recognition that KV cache and attention mechanisms are becoming the bottleneck in transformer architectures, not weight communication
Increased skepticism about adding biological complexity to neural primitives without clear algorithmic pathways to leverage that complexity
Public funding shifting toward applied science over basic research, creating gaps in long-term technology development
Consolidation around proven neuromorphic platforms (SpiNNaker, Loihi, BrainScaleS) as 'neuromorphic zoos' for application development
Debate over whether AI bubble will burst but underlying technology will survive, with neuromorphic positioned as longer-term bet

Topics

In-Memory Computing and Emerging Memory Devices
Sparse Neural Networks and Conditional Computation
Analog vs. Digital Computing Trade-offs
Neuromorphic Hardware Architectures
Attention Mechanisms and Transformer Optimization
ADC/DAC Efficiency in Crossbar Arrays
Spiking Neural Networks and Temporal Dynamics
Energy-Based Models and Diffusion Computing
Edge AI and Embodied Robotics Applications
Biological Inspiration in AI Hardware Design
Scalability of Emerging Technologies
KV Cache and Sequence Length Bottlenecks
Mixture-of-Experts Model Architectures
State-Space Models vs. Transformers
Academic Funding for Long-Term AI Research

Companies

Google DeepMind

Maxence Ernoult represents the company; discussed analog computing and physical learning research for AI acceleration

IBM Research

Julian Büchel is a research staff member; discussed in-memory computing, crossbar arrays, and ADC/DAC efficiency in neu...

Intel

3DXPoint PCM technology mentioned as promising but discontinued neuromorphic memory technology

Micron

Co-developer of 3DXPoint PCM technology that was discontinued despite initial promise

TSMC

Mentioned as actively researching alternatives to SRAM with higher density for neuromorphic applications

Cerebras

Developed all-SRAM accelerators addressing SRAM scaling plateau in AI hardware

Graphcore

Mentioned as company developing neuromorphic-inspired AI accelerators

Logitech

Used neuromorphic trackball technology from Andre van Schaik for 20+ years in commercial products

Normal Computing

Company betting on noise as a computing primitive for neuromorphic systems

Extropic

Developing energy-based models and using subthreshold transistors as computing primitives for diffusion models

Microsoft Research

Proposed analog optical units for solving fixed-point equilibrium problems in deep equilibrium models

OpenAI

Referenced in context of AI bubble and funding sustainability for large language models

Kioxia

Actively researching SRAM alternatives with higher density for neuromorphic computing

SK Hynix

Mentioned as researching alternatives to SRAM scaling plateau

People

Sunny Bains

Chaired panel discussion on AI hardware bottlenecks and neuromorphic computing at Atoms to Bits conference

Giulia D'Angelo

Co-hosted podcast and participated in neuromorphic computing panel discussion

Damien Querlioz

Panel member discussing neuromorphic computing with emerging memory devices and shift from scale-based to engineered ...

Julian Büchel

Panel member discussing in-memory computing, crossbar arrays, ADC/DAC efficiency, and sparse neural networks

Tamalika Banerjee

Panel member and startup founder discussing integrated memory chip hardware for edge AI inference applications

Maxence Ernoult

Panel member discussing physical learning, analog computing, energy-based models, and scalability challenges in neuro...

Steve Furber

Panel member with 25 years developing SpiNNaker; emphasized sparsity and connectivity as key to biological efficiency

Ralph Etienne-Cummings

Post-panel commentator discussing biological complexity, sparsity, and alternative AI paradigms beyond transformers

Taylor Marvin

Introduced and hosted the EE Times Current podcast episode on neuromorphic computing

Andre van Schaik

Developed neuromorphic trackball technology used in Logitech products for 20+ years; took over chair at Manchester

Zico Kolter

Known for deep equilibrium models; discussed challenges in implementing implicit architectures on digital hardware

Elisa Donati

Led benchmarking framework for embodied neuromorphic agents published in Nature Machine Intelligence

Quotes

"Biology is always sparse, and the matrices that represent connectivity in biology are populated at a few percent, and only a few percent of neurons are ever active at any one time. One of the problems with today's AI is it computes everything all the time, which is wasting about a thousand times more energy than is required."

Steve Furber•Panel discussion

"The field is way too method driven rather than goal driven. I think that identifying good research questions is really a challenge of its own."

Maxence Ernoult•Panel discussion

"When your hardware is working, you're about 10% of the way there. The biggest deal of all is building the software stack."

Steve Furber•Panel discussion

"If we want to move from artificial stupidity to artificial intelligence, we need to go and look at biology."

Ralph Etienne-Cummings•Post-panel commentary

"I'm sick and tired of hearing 'you don't have a killer application.' We do have applications. We demonstrated for health, lots of applications. We demonstrated for vision lots of applications."

Ralph Etienne-Cummings•Post-panel commentary

Full Transcript

You are listening to EE Times On Air, and this is EE Times Current. I'm Taylor Marvin. Welcome to Brains and Machines, a deep dive into neuromorphic engineering and biologically inspired technology. In this episode, Dr. Sunny Bains from University College London chairs a live panel talking about AI hardware. Co-hosting the podcast and discussion is Dr. Giulia D'Angelo from the Czech Technical University in Prague.
Welcome to Brains and Machines. I am Giulia D'Angelo.
And I'm Sunny Bains.
In today's episode, rather than our usual one-to-one interview, you'll hear an edited version of a discussion Sunny chaired at the Atoms to Bits conference at the University of Manchester, held in February this year. The panelists work across the full stack, from device physics to algorithms. After that conversation, we will be talking to Ralph Etienne-Cummings from Johns Hopkins University about the issues raised.
Thanks, Giulia. The panelists will introduce themselves in a moment. But in the discussion, you'll hear them debate whether the field's shift from a "scale it and intelligence will emerge" mindset to a more engineered approach is a good or bad sign. They'll agree that memory and connectivity, not neurons, have become the real bottleneck, and they'll consider whether adding more biological complexity to our neural primitives is essential progress or a distraction. There are links to some of the papers mentioned at our website. You can check them out at BrainsAndMachines.net.
Welcome to Brains and Machines at the Atoms to Bits conference hosted by the University of Manchester. Today we're exploring emerging technologies for AI and asking whether the various options are real potential hardware, laterware that the next generation will have to develop, or vaporware that will evaporate on exposure to the sunlight of scrutiny. So today on our panel, we have five really fantastic speakers. I'm Sunny Bains from University College London and your host for today.
Hi, I'm Damien Querlioz. I'm a CNRS Research Director at the University of Paris-Saclay, and I work on neuromorphic computing with new memory devices.
Hello, my name is Julian Büchel. I'm a research staff member at IBM Research in Zurich.
Hello, I'm Tamalika Banerjee. I'm the founder of IMChip, Integrated Memory Chip, and a professor at the University of Groningen in the Netherlands.
Hello, I am Maxence Ernoult, and I am a senior research engineer at Google DeepMind.
Hi, I'm Steve Furber. I'm an emeritus professor in computer engineering at the University of Manchester.
So I'm going to start with Damien. You've seen a lot of different technologies come and go during your career, and you're also unusual in that you look at the physical layer and the physics at the bottom all the way up to the algorithms and architectures at the top in your work, which I think is very impressive. So you've seen all these technologies come and go. Can you mention one that looked promising but turned out not to be so? And maybe another one that really looks like it will make some impact? That can be something you're working on or something you've seen from other people.
That's a really interesting question. So I think the way we think about neuromorphic and hardware neural networks has changed, and that has changed which kind of technology we are looking at. When I joined the field, we really had the idea that if we have simple elements and we really go to scale, something really interesting is going to happen. So we were really focused on this question of scale, and we were looking at things like random networks made with molecules or with nanowires, and the idea was that if we just nail it and then make it very big, it'll be intelligent. I think that's all but disappeared. Now we understand we actually need to engineer and train more. And the things that we were all really excited about, also with the beginning of nanoelectronics, not as much.
On the other hand, at the beginning, we were not that focused on memory and synapses. We were focused on neurons and imitating a Hodgkin-Huxley kind of thing. We have more and more focus on memory, and this one, I think, has only gotten stronger. At this point, we had some talks here that were literally all about memory. And yeah, here I see real staying power in the memory developments for neuromorphic applications.
Any other contributions from the panel?
In terms of the technologies that maybe came up and then disappeared, I haven't been in the game for a long time. But one example that came to my mind is 3D XPoint, the 3D stacked PCM technology from Intel and Micron, I think, that got discontinued. It was, I think, promising back then, and they even made it into a product, but then it disappeared. I'm not exactly sure why, but I just wanted to throw that in.
So your work is focused on solving the sort of weight communication problem in neural networks. So that's a very specific problem that is quite well known. We have a lot of people who are working on emerging technologies, and I'm wondering what other problems they could be addressing within the neural net, neuromorphic kind of AI space.
The activation communication problem is also significant, especially when you look at today's architectures. You have huge KV caches that need to be moved. This is something that we cannot address because the KV cache changes dynamically, so we cannot quickly program it into the crossbars of our arrays because that takes too much time and energy. Of course, there are sometimes large KV caches that are always the same, so you can program these, like system prompts or tool definitions, whatever. So that's, I think, a problem that is worth addressing because, especially for very long sequence lengths, the bottleneck of the workload actually shifts to attention. Still, if you solve the weight communication problem, you make room for longer sequences, a larger KV cache, and so on.
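The KV cache point above can be illustrated with rough arithmetic. The sketch below uses the standard accounting of one key tensor and one value tensor per layer per cached token; the model shape (32 layers, 8 KV heads, head dimension 128), the 8-billion-parameter weight count, and the 2-byte storage format are hypothetical values chosen only for illustration, not figures from the panel.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim,
                   bytes_per_value=2, batch=1):
    """Bytes needed to hold keys and values for every layer and cached token."""
    # The factor 2 accounts for storing one key tensor and one value tensor.
    return 2 * batch * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, for illustration only.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 32, 8, 128
# Assumed weight footprint: 8 billion parameters at 2 bytes each.
WEIGHT_BYTES = 8e9 * 2

for seq_len in (4_096, 131_072, 1_048_576):
    cache = kv_cache_bytes(seq_len, N_LAYERS, N_KV_HEADS, HEAD_DIM)
    print(f"{seq_len:>9} tokens: KV cache {cache / 2**30:7.2f} GiB, "
          f"{cache / WEIGHT_BYTES:5.2f}x the weight bytes")
```

The weights are a fixed cost, while the cache grows linearly with sequence length, so under these assumptions the data that must be moved per generated token is eventually dominated by the cache rather than by the weights.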
But you can also solve this by going to different architectures. For example, state-space models don't have that problem. So that's one thing I see: the activation problem, the KV cache, and so on. I think that's very worth addressing, and people have already tried addressing it. And the second one, which I also saw in a lot of talks, is accuracy. David's talk just looked into accuracy degradation. We work a lot on accuracy degradation as a result of noise because of analog computation. You also have accuracy degradation from just using a different neural network model. For example, just by using spiking neural networks, you already have accuracy degradation most of the time because maybe you struggle to train them. So I think accuracy is one of the most interesting parts to focus on, because right now when you use a large language model, for example, all you care about is speed. But more importantly, you care about accuracy. It needs to be able to solve the problem.
Okay, thanks very much. So, Tamalika, you've recently launched a startup. Can you talk about how you keep grounded within that startup so that you know that the products that you're developing within the company are what your customers actually want and need?
It's a great question, because that's also something that we are constantly researching ourselves. But thankfully, we also have a business segment in IMChip, the startup that I'm representing here now. And there, the go-to-market, the GTM, specifically hovers around not the training but the inference market. So that's the edge AI applications, mostly around, let's say, defense and security applications, as well as automotive, the car industry. So these are the ones that are cognition or latency driven. These are the critical needs, and that's what we are looking at.
So I'm going to push you just a little bit further.
One of the things that I've noticed from talking to engineers is that business development people, salespeople, are often the problem as well as the solution, in the sense that on the one hand, they go out and they find customers. But on the other hand, they promise things that you can't deliver, and they don't necessarily always bring in the information that you need as an engineer within your own company to satisfy the client. Now, obviously, you're very new; you've been running for a year or so now. How is it that you're working with your potential customers to make sure that you really deeply understand their technical problems, rather than just having them mediated by some business guy who, yeah, is an engineer, sort of?
Again, a great question. The thing is that the hardware that we are developing is going to be a game changer in many senses. So it's not only having in-memory computing, so the memory and the processing in the same die, but the architecture itself would render itself in a way that you don't need the extra footprint from the common transistors. And also the way you read out at the end could be by various different mechanisms. So that opens much more space for the market, in fact. And for the business people, I meant that we do have our own business researchers who are looking into the space. And we have spoken to a lot of people. And of course, it's all about KPIs. It's all about benchmarking. These are more often the questions that we are asked about. So how much better is the performance? What is the latency? So all of these questions we take in, and we reflect a lot and we keep on iterating on that.
Right. So you're keeping the client in the loop of your development. So Maxence, you work for Google DeepMind, obviously a very well-known company, but your work is much more kind of basic science and speculative than certainly anyone else's on the panel here. How do you see your work in the context of the wider industry?
Are you just part of that effort to explore the state space of all the possibilities? Or do you see one of the technologies that you're working on as being particularly relevant for near- or medium-term future computing?
Okay, so it's an excellent question. First, I want to say that this line of research around physical learning was originally motivated by finding biologically plausible theories for learning, and then it was later repurposed into accelerating training. But whether it is still relevant today or not is highly questionable, because now in the realm of foundation models, you really no longer need to fine-tune your model at the edge, since the techniques that are proposed now rely on inference, on gradient-free techniques. Do we really need to do something other than just backprop in the cloud? I'm not sure. So first, this research is highly dependent on whether we need to train in analog, to put it this way. Second, I think this research also speculates on new model building blocks. Right now, attention prevails as a building block. It's clearly an extremely powerful inductive bias. But you could question whether we could use other building blocks like state-space models, which have a lower computational and memory cost than attention, or even implicit models, which however are extremely costly to run on standard hardware. So I think that there is some research that is required to assess whether, by using building blocks other than attention for language processing, we can be on par in terms of accuracy while consuming less power and compute budget, memory budget, and so on. And lastly, the third component is that even if we can show in simulation that we can maintain some level of accuracy with new model building blocks, you then need to have some physical substrates to map those compute primitives onto. So these are the three main assumptions of the research that I'm doing, I would say, outside of DeepMind, and there are lots of ifs, I should say.
Very good. And last, but definitely not least, we have Steve Furber. So you've spent almost two decades developing SpiNNaker and SpiNNcloud, which are ARM chip-based AI platforms that use a brain-inspired interconnect to make them more efficient. Can you say something about the difficulty of that journey you've taken, especially considering that the technologies you were using were actually relatively mature to begin with, at least compared to some of the new materials and devices that we've been talking about in this conference?
Sure. So almost two decades is an understatement. It's now 25 years. And the first five years of work on SpiNNaker were basically just thinking about the architecture. The huge challenge is that SpiNNaker's focus was on supporting brain science, so building real-time models of brain subfunctions. And the brain is very highly connected. If you look at the cortex, the outer layer of the brain, cortical neurons typically make 5,000 to 10,000 connections to other neurons. So for engineers, that's a fan-in and fan-out of 5,000 to 10,000. And that's extremely difficult to support with conventional computer communications. So the key breakthrough, if you like, in thinking about the SpiNNaker architecture was how we support that very high degree of connectivity. The other major challenge, of course, is building a machine. We set ourselves the goal of putting a million ARM processors into a computer for brain modeling. So building that on an academic research budget was quite a challenge. The goal at the outset was to aim for a budget of a pound per processor, and we pretty much did that; we went slightly over. But you need to be very careful with costs if you're trying to build a machine at that scale. And we've written a whole book about the decisions we made to get there. I probably don't have time to read it out loud. But keeping costs under control and supporting the very high connectivity, those are architectural issues and engineering issues.
But of course, the biggest deal of all, and this is the one you always forget when you're a hardware guy, is building the software stack. Having got the hardware, we then needed a very talented team with many years of work. From 2016, we opened a completely open user service under the auspices of the Human Brain Project, and we have hundreds of users of the machine from around the world. Developing that kind of software and continuing to support users who come in with new requirements on a regular basis is a huge amount of work. So just the advice to the audience: when your hardware is working, you're about 10% of the way there, right?
So I started to ask Julian about other problems in AI. We've talked about storing weights and fan-in, fan-out interconnections. Is that where all the action is? Is it all about interconnection and that problem of the scaling of the interconnection weights, that essentially for every extra neuron you want to add to your network, the number of connections you need to keep it fully connected grows as n squared? Or are there other issues within the neural network space, within the AI space, that we are just not grappling with, and we could do something new and interesting if we were able to solve those problems?
Yeah, I would say that in modern neural network architectures, you almost always have conditional computation nowadays. That means you never use all of the parameters to process information. So if you add a neuron, if you add weights to the neural network, the activation vectors that you need to propagate through the network don't grow as you add more weights. For hardware, this is really interesting because it means that you can keep the computational power the same while scaling the amount of memory in your system.
Steve, you obviously chose to use your ARM chips to do all the computation.
So I guess we could say that from your point of view, the computation problem is solved and you focused on the communication. So, 25 years later, do you see other problems now that we have these systems working and we're trying to optimize them for power and things? Do you see particular places where there are bottlenecks that you think need to be addressed?
Yeah, so our decision to use small general-purpose processors for modeling the neurons and synapses was basically because we were building a research platform. And even today, there's no real consensus over what the right model of a neuron is, or what the right model of a synapse is, or what the right learning rule is. 20 years ago, there was even less certainty. So for a computer engineer, the obvious way to build flexibility in, so that you can track improvements in models of both neurons and synapses, is to implement them in software. And yes, there's some efficiency loss from software, but the flexibility gain is enormous. I think the bottlenecks now are really in algorithms. You talked a little while ago about N-squared fully connected networks. If you've got a fully connected network, you're doing it wrong. Biology is always sparse, and the matrices that represent connectivity in biology are populated at a few percent, and only a few percent of neurons are ever active at any one time. One of the problems with today's AI is it computes everything all the time, which is wasting about a thousand times more energy than is required to actually do the necessary computation. So we need to go sparse, and we need to go sparse in space and time. That's what biology does, and it's the key to its efficiency.
Maxence and then Damien.
So when I talk with people at Google about analog, of course, they are interested and curious. But oftentimes, an argument that comes back over and over again is tile interconnection, because you can't route information in analog across tiles.
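To put rough numbers on the sparsity argument made just before this exchange, the sketch below compares a dense update, where every neuron pair is computed at every step, with an event-driven update where only active neurons send events along the synapses that actually exist. The specific 3% connectivity and 3% activity figures are assumptions standing in for "a few percent", and counting synaptic operations is only a crude proxy for energy.

```python
def dense_ops(n_neurons):
    """Synaptic operations per step when every pair is computed every time."""
    return n_neurons * n_neurons


def sparse_ops(n_neurons, connectivity, activity):
    """Operations per step when only active neurons drive existing synapses."""
    active_sources = n_neurons * activity          # sparsity in time
    targets_per_source = n_neurons * connectivity  # sparsity in space
    return active_sources * targets_per_source


n = 100_000                       # hypothetical network size
connectivity, activity = 0.03, 0.03  # assumed values for "a few percent"

ratio = dense_ops(n) / sparse_ops(n, connectivity, activity)
print(f"dense / sparse operation count: about {ratio:.0f}x")
```

Because the two sparsity factors multiply, a few percent in each dimension lands in the neighborhood of three orders of magnitude, which is consistent with the "about a thousand times" figure quoted in the discussion.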
So you need those costly ADCs and DACs. And that's, I think, a huge limitation. I don't have exact numbers off the top of my head because it's highly implementation dependent, and I'd like to have Julian's take on this, but ADCs and DACs are extremely costly, and they may possibly offset the gain from going into analog for vector-matrix multiplication. I think this is a huge limitation, but yeah.
Can you elaborate on that, Julian?
Yeah, sure. DACs and ADCs, especially ADCs, consume quite a lot of power and area. The nice thing about ADCs when you compare them to the weights in your crossbar is that for every ADC that you add, you get a quadratic increase in the weights. So if you scale up your crossbar size to, let's say, 1,000 by 1,000, you have 1,000 ADCs. The scaling in the ADCs is linear, while the scaling in the weights, as you increase the crossbar size, is quadratic, which is very nice. So it kind of amortizes. Also, the fact that nowadays neural network layers are in the thousands, right, hidden layer dimensions of 16,000 or 32,000 in dense models, plays into this. So you can design larger and larger crossbar arrays. To Steve's point that biology is sparse, that's true, but mixtures of experts are sparse by definition, and researchers have actually found that mixtures of experts are Pareto optimal, in the sense that you want to increase the number of experts, using very small experts, and you want to activate only very few of these experts, so parameters, weights. Nowadays, if you look at state-of-the-art models, also open-source models, I think 10% of the total weights are being used in active computation. So in that sense, artificial neural networks, also in industry, have taken that inspiration.
Just to follow up on this, when you talk to architecture people, I have found that they tend to overestimate how bad ADCs are because they see ADCs from instrumentation that are really precise and very linear.
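The amortization argument above is easy to check with a few lines of arithmetic. This sketch assumes one ADC per crossbar column, which matches the "1,000 by 1,000, you have 1,000 ADCs" example; real designs may share or multiplex converters, and the function name is purely illustrative.

```python
def crossbar_stats(n):
    """Weights, ADCs, and weights served per ADC for an n-by-n crossbar.

    Assumes one ADC per column (an illustrative simplification).
    """
    weights = n * n  # grows quadratically with the array dimension
    adcs = n         # grows linearly with the array dimension
    return weights, adcs, weights // adcs


for n in (64, 256, 1_000, 4_096):
    weights, adcs, per_adc = crossbar_stats(n)
    print(f"{n:>5} x {n:<5} weights={weights:>10,} "
          f"ADCs={adcs:>5,} weights per ADC={per_adc:>5,}")
```

Under this assumption each ADC serves as many weights as the array has rows, so the converter overhead per stored weight shrinks in proportion to the array dimension.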
But actually, ADCs that are developed for in-memory computing are much more compact and much more energy efficient. I think we really see that in the work of IBM, and also in similar chips by TSMC. I mean, at this point, this analysis has been done. And for me, the ADC is not the deal breaker. A bigger concern that people in architecture sometimes raise for me is that the architectures that have actually been developed are not really adapted for KV cache or attention. So they solve only part of the problem if you want to go toward a transformer. And here that would be maybe my message. I mean, transformers arrive and there are these really big, scary models. But when you really look into it, they are not that scary, and the way it's done, when you look at the equations in the paper, doesn't necessarily have to be this way. So I feel that for the more academic community, it's actually time we go into this and maybe find ways that could be a lot more efficient than what is done now, because the way language models and this type of analysis are done is not so efficient, and some really creative ideas could make a difference.
Okay, I'm going to ask one more question, and then I'm going to open it up to the audience. We keep saying that in order for people building new materials, building new devices, to really be able to thrive in industry, to be able to make it in the industry that we have, at least in the near term, they need to be presenting some kind of computational primitive. Whether that's a tiny one, the equivalent of a memory cell, or a bigger one, like a crossbar with very compact weights, doesn't really matter. But it has to be something that computer scientists and electrical engineers can understand so that they can build it into their systems. Would you agree with that? And what kind of primitives might there be that we haven't already been talking about?
We've talked a lot about memory in this conference, and I think the wider audience of Brains and Machines knows a lot about memory as being important. What other primitives could we be replacing with innovative physical materials, Maxence?
It's a super interesting question. So I think, of course, I have no general answer to that, but I just want to give one example. I actually went to NeurIPS in 2024, and there was this ML with New Compute Paradigms workshop. The keynote speaker was Zico Kolter from Carnegie Mellon University, and he's now also at OpenAI. He's known, among other things, for deep equilibrium models. I don't know if you know about this, but essentially, instead of having a feedforward architecture, you have an architecture that is implicit, that solves for an equilibrium point. And when you think about it, the way that you solve for an equilibrium point is using some fixed-point algorithm. So if you look at the computational graph that it spans, namely the neural network that it spans, it's an infinitely deep architecture with shared parameters along the direction of the computational graph. In a nutshell, these architectures emulate infinitely deep architectures with shared parameters. And this paper is cited probably over 500 times, so it was clearly noticed in the deep learning literature. But Zico Kolter started his lecture saying, hey, the deep equilibrium models are a failure. Where do you see them? We don't see them. And his point was that unless you could find a primitive for this architecture, there was no point using deep equilibrium models instead of feedforward architectures, because solving for equilibrium on a digital accelerator is just so much more costly than just running a forward pass. So this was just to give an example. To answer your question directly, if we could find a piece of hardware that can solve for a fixed point, that would be super interesting.
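As a minimal sketch of why solving for an equilibrium costs more than a forward pass on digital hardware, the code below finds the fixed point z* = tanh(W z* + x) of a single shared layer by naive iteration. The layer form, the size, the random weights, and the row scaling that makes the map a contraction are all assumptions made so that convergence is guaranteed; practical deep equilibrium models typically use faster root-finding solvers, which this does not attempt to reproduce.

```python
import math
import random


def layer(W, z, x):
    """One application of the shared layer: f(z, x) = tanh(W z + x)."""
    return [math.tanh(sum(w * zj for w, zj in zip(row, z)) + xi)
            for row, xi in zip(W, x)]


def solve_fixed_point(W, x, tol=1e-10, max_iter=1000):
    """Iterate z <- f(z, x) until successive iterates differ by less than tol."""
    z = [0.0] * len(x)
    for step in range(1, max_iter + 1):
        z_next = layer(W, z, x)
        if max(abs(a - b) for a, b in zip(z_next, z)) < tol:
            return z_next, step
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


random.seed(0)
n = 8  # hypothetical hidden size
W = []
for _ in range(n):
    row = [random.uniform(-1.0, 1.0) for _ in range(n)]
    scale = 0.5 / sum(abs(v) for v in row)  # absolute row sum 0.5: a contraction
    W.append([v * scale for v in row])
x = [random.uniform(-1.0, 1.0) for _ in range(n)]

z_star, steps = solve_fixed_point(W, x)
residual = max(abs(a - b) for a, b in zip(layer(W, z_star, x), z_star))
print(f"converged after {steps} layer evaluations, residual {residual:.2e}")
```

Each iteration costs about as much as one layer of a feedforward pass, so the equilibrium solve multiplies the per-layer cost by the number of iterations. A physical system that relaxes to its equilibrium directly would replace that whole loop, which is the kind of primitive the speaker is asking for.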
I know that recently Microsoft Research proposed an analog optical unit which could do that. Of course, it was not shown at that scale, on very big problems, but only on combinatorial optimization problems. But this is definitely an interesting primitive to look at.
So I have a little bit of a more conservative view on this. I think when you develop new devices, it's really important to look into industry and see what people need. One example of this is that people found that SRAM scaling has come to a plateau. Okay, the N2 process now increased it a little bit in terms of density. So this is definitely a challenge, right? SRAM is very fast, like nanoseconds, even sub-nanoseconds. But the density has plateaued. So now it's obviously interesting, considering that Groq and Cerebras came up with all-SRAM accelerators. Can we find an alternative to SRAM? It's just an example, right? Can we find an alternative to SRAM that has higher density but similar properties to SRAM? And I know that TSMC, SK Hynix, and Kioxia are doing active research there. And this is the kind of research that I personally would like to see. I know it's less romantic, but...
Yeah, but also we've talked about memory. I want to get off memory just for a little bit, because Maxence raised an interesting point. In a sense, you're talking about a bigger accelerator that is sort of a standalone piece of computation, right? And there have been attempts to do things like that, but that's different from a primitive. Oh, Tamalika.
So indeed, you were asking about computing primitives. What I read commonly in the literature is about also using heat as a computing primitive. And my personal favorite would be to use spin also as a computing primitive and to design all kinds of logic. But heat as a computing primitive distinctly strikes me. I know too little, and so I would like to also know how.
But with respect, that's a physicist talking, right? It's not a computing primitive.
It's a method by which we can modulate the behavior of a piece of material, right? So spin is one thing we can look at. We can look at heat. That's definitely true. But what I'm saying is, if I had a chip designer sitting here saying, I have to set up the architecture, I have certain pain points, how are you going to make my life easier? I totally accept Julian's point that memory is one of the big issues. But what I'm trying to get at is, what are the other big issues? And then we can also get to the people here who are working on completely different modes, right? So working on accelerators that would be standalone, like the skyrmions, right? I'm assuming that those would not be part of a broader electronic system. They would be part of a skyrmion system, right? Which is very interesting, but that's a very different thing and certainly not as near term. Steve, you want to comment on that? The thought that's crossing my mind is that you could use a physical system to implement the bifurcating dynamics of a spiking neuron. So that's clearly a vital property of biological neurons. Most of them spike. They don't all do. And Eugene Izhikevich showed that very simple dynamical equations can deliver this result. They're not that bad to compute, but they cost quite a few compute cycles if you're modeling them that way. So some kind of physical primitive that would integrate inputs and then bifurcate at an appropriate point would be quite useful, I think, particularly for analog neural systems. Damien, and then Maxence after that? I just want to say I think the question you're asking is an excellent question, and it's really a research question in itself. Because in analog types of computing, we used to do really small primitive demos, and now we can do bigger things. So we want to do something that is more demonstrative, but obviously we cannot really yet make a fully application-relevant neural network.
So we are trying to target this intermediate demonstration that will solve an important problem. And each time I have found that identifying the problem we actually want to solve is a significantly hard part of the research. Every time I'm like, okay, we just have to take a simpler data set and it'll be easy, but finding something that is hard enough, that is demonstrative, but at a scale that can be done with an emerging technology, often ends up being a much bigger part of the research than I thought it would. Maxence? Yeah, I totally agree with Damien about this. And perhaps a general comment about neuromorphic computing, analog computing, physical computing, whatever we call it, is that I think the field is way too method-driven rather than goal-driven. And I think that identifying good research questions is really a challenge of its own. I wanted to comment a little bit more about the primitives. So, yeah, actually, there are some companies like Normal Computing and Extropic which are betting on noise as a primitive. And I think that Damien's research is also aligned with this idea that instead of eliminating analog noise, you want to leverage it. So be it for epistemic uncertainty quantification, to be able to flag out-of-distribution samples, or for diffusion modeling. Right now, diffusion models are ubiquitous for image generation, video generation, and much more, including language models. There are lots of proposals for diffusion-based language models. And in that case, for example, Extropic developed energy-based models as a primitive. So essentially, they unfold the encoding and decoding process of diffusion. They unfold this process on different tiles, where each of the tiles is an energy-based model. So here, the compute primitive is an energy-based model of its own that you can stochastically sample out of the physics of the system.
And in particular, Extropic is using subthreshold transistors as a core device component. So that was just another example. And to build on Damien's point about using noise. Hi, my name is Patrick Parkinson. I'm based here at the University of Manchester, and I'm something of an amateur in this field, so please excuse the kind of broad question. But I think it's really fascinating on this panel. We've got people with such a range of backgrounds and places that they currently work. So I was wondering: if the goal for a technology like neuromorphics is a paradigm shift, right, and nothing less, then to achieve that and to demonstrate that, you have to have an application, right? To get the application, we have different ways of funding this. So we look at companies, we look at venture capitalists: short cycles, lots of money. What I'm going to ask is something completely different, which is, in that picture, what should publicly funded research be focusing on, looking at the risk-reward profile? I'd be interested in your opinion. Go for it, Steve. My strong encouragement for those with public funding is to build a neuromorphic zoo, right? And basically have a place, virtual or physical, where you have lots of neuromorphic systems. There are many available already today on the market, where people can go and play and understand what they do, develop applications and so on. And by the way, it is worth pointing out that there have been neuromorphic applications since the 90s. The earliest touchpads on laptops had neuromorphic systems that interpreted the finger movements. In the 90s, Andre van Schaik, who's taken over the chair at Manchester, developed a chip that was for about 20-odd years in the Logitech trackball systems, using neuromorphic techniques to monitor the movements of the ball. So commercial neuromorphics have been out there for 40 years. It's just that the explosion hasn't quite happened yet.
So the way I see it, maybe in my very limited focus, is, from the zoo of memristors or these materials, to choose the ones which can be integrated and can be upscaled. I guess that is one of the challenges that remains and will remain. And I guess, as researchers, we can actually look into that modestly. And just a very quick thing: I feel we should also be careful not to only provide solutions to accelerate or make more efficient what industry is doing, but also propose alternative paths to AI, maybe not only the model of AI that is centered in the data centers of these industry giants, but one that maybe is more personal. We should really dare to propose other ways. I'm Christoforos Moutafis and I'm based here in Manchester. Following up on what Damien said about what we should be working on, and what you asked about computing primitives: for me, computing primitives are either top-down, where we need hardware to accelerate multiply-accumulate operations and hardware for the attention layer, or bottom-up, where, as Steve suggested with bifurcation dynamics, rich dynamics, you find some physical substrate that emulates these dynamics. So either being inspired by biology, or top-down, trying to accelerate something. It seems to me that either way, whatever your preferred way is, you need emerging technologies and physical substrates in order to plug that in. From what I understand, whether we serve existing technology like transformers or we try to enable bottom-up, richer dynamics, we need the physical substrate and the emerging technologies. Does the panel agree with that or not? All of these richer dynamics and devices that can do more than, let's say, memristors and the crossbar executing matrix-vector multiplications.
I'm typically skeptical about devices that bring more to the table in terms of, let's say, non-linearity and all of that. Because from what I've seen, also from working with academic partners that work on devices that have some intrinsic non-linearity and are supposed to be bringing more to the table, typically they just hurt performance and you cannot really leverage it. And if you look at exploiting these factors, for example, by, I don't know, training a large model that can do interesting things, training a model and then bringing that model to use the dynamics of these devices is very hard. I haven't seen a large demonstration that can successfully exploit these features. So if I can just quickly follow up, I find what you say very interesting. I also find it interesting because most of the work you talked about is about deploying in-memory computing. So it's deploying memories with CMOS. So in some sense, what you're telling us is that richer dynamics, et cetera, we don't know how to leverage that. But your work is also about devices. So I find it a bit odd, what you're saying. We're just trying to design devices that work reliably. So maybe the algorithm is the bottleneck, as Steve said. Okay, Maxence wants to come in, and then I want to take one last question. I want to double down on your comment, Julian, about scalability. I think that a fundamental issue, in my eyes, in the neuromorphic computing community, physical computing, whatever we call it, is that all the studies are biased by the scale of the proofs of concept. And of course, people may argue you need to start somewhere, you need to start small and then scale up. But in my eyes, and I guess it's controversial, so much time is wasted on writing academic papers on technologies that we know ahead of time will not scale. But the publish-or-perish kind of logic pushes you to publish about these things.
And I think that if we could tell upfront, you know, investigate in the very first place the scalability of the technology before investing time and money into it, I think that'd be best. So I'm just going to say to that, that's not something you want to say to a bunch of physicists. Sometimes I think it's the paper, more than the technology, that they're going for. Come on, we're going to have our last question from an engineer. All right, good afternoon. So the question is less technical. Neuromorphic computing is directly linked to AI. And you might have seen articles stating that the AI market exhibits the characteristics of a bubble, similar to that of the dot-com era back at the end of the 90s. And Steve might have more to say on that. So the question is, do you think there is truth in these claims? Because at the same time, NVIDIA sells tons of GPUs, and memory chips are required for AI applications. Or can we safely ignore these threats or claims? Nice ending question. Let's have some short but honest answers. Steve, why don't you start? So is there going to be an AI bubble burst? Almost certainly in stock market terms, right? The current valuation of the AI companies is clearly stupidly unsustainable. So that's going to come crashing down. AI as a whole is not going to go away. OK, AI will survive that crash, because that crash is just losing stockbrokers money. Right. And slightly more importantly, my pension fund. And mine. Yeah, that matters a bit more. Neuromorphics is moving to sort of join the AI game. And so it will, in some sense, share in that risk, although at the moment it's got nothing like the stock valuation of the big AI companies. So it's not as much of a threat. But fundamentally, long term, if we want artificial stupidity to become artificial intelligence, we've got to understand the biology better. And neuromorphic systems are a route to that. OK, Tamalika, you said a short answer, so time will tell. Julian.
Yeah, I have a much more optimistic opinion on AI. I think, yeah, sure, it's maybe a bubble. And there are some VCs that make wrong bets on, I don't know, some startups, some companies that are just not selling good products. And maybe, you know, OpenAI will also at some point run out of money. But definitely there are going to be some key players that will stay. I think AI will creep into every aspect of the world. It's going to automate so many workflows in companies and governments, and it will make everything much more efficient. Yeah, I always come back to coding, right? If you're using coding agents, you can already see there are people who are not even coding anymore. Long story short, I believe, okay, maybe there's a bubble. I'm not sure it's relevant to talk about a bubble, but AI is a fundamental, transforming technology that is here to stay, yeah. I think I have to agree on that, although I have no knowledge of economic bubbles. So many times in recent years, AI has been able to achieve what I thought just one year before would be impossible to achieve with current technology. There's something here. Maybe there is some type of bubble, but we are living through a life-changing situation and there's something. Tamalika Banerjee, Julian Büchel, Maxence Ernoult, Steve Furber and Damien Querlioz, thank you so much for coming on to Brains and Machines, and thanks to the Atoms2Bits audience. Thank you, Sunny. It has been an incredible panel session. And for more about the panel, please go to brainsandmachines.net. And now we welcome back our regular commentator, Professor Ralph Etienne-Cummings from Johns Hopkins University. Hello, guys. How are you? It's been a little while. Good to see you. Hey, Ralph. So, guys. So yes, a little while. So let's start, as usual, with Ralph's impressions, and then we go ahead, because I have my points too. Excellent. I enjoyed the discussion. There were some parts that I agreed with strongly and some parts that I disagreed with strongly as well.
So I like Steve Furber's perspective that if we want to move from, what did he say, artificial stupidity to artificial intelligence, we need to go and look at biology. I totally agree with that. I love his perspective as well that sparsity is everything, in connectivity as well as in the number of units and so on. So those are the two pieces that I loved. On the disagreement side, Julian argues that adding complexity to the primitives is not one of his preferences, in a sense arguing that most of the time it doesn't improve the functionality of the system. And for me, I tend to take a very different perspective on that. I think that ultimately, yes, we may not have a theory that explains how the additional complexity plays a role. But I think that's really the missing link in everything that's going on with artificial intelligence. We cannot just be depending on matrix-vector, ReLU computation. We need to get beyond that and add some of the other richness that we find in biology, be it different neuron behavior, different synapse dynamics, different glia components, as well as the biochemistry. All of that is going to make for better artificial intelligence, in my view. And that's complexity. Yeah, so I have three big points on my side, and then I will start with the questions for you guys. So the first point is that the field has shifted from "scale something simple and intelligence will emerge" to a more engineered, trained approach, which is what you were referring to, which I think is a very pragmatic pivot, but we possibly lose creativity, and therefore discovery of what is important for us. Then my second point is Furber and SpiNNaker's 25-year journey around one problem, which is supporting fan-in and fan-out and connectivity, which I think is the point that we should prioritize at the moment. And then the third one is just a comment from Max, method-driven versus goal-driven.
And I understand the comment that he's making, that we are all suffering from this publish-or-perish problem. Unless we collectively do something about it, we will still suffer from this problem. So it's a systemic problem, not something that a single researcher can fix. So let's go then to the questions that I have. One of the most important points that Furber was making, and I think it's the point for neuromorphic, is that biology is sparse. And the sharpest line of this panel to me is the fact that AI is wasting a lot more energy than necessary because it computes everything all the time. But then, since we do everything with sparsity, right, in space and in time, and we use this biological efficiency, I wonder why we haven't made our point clearer, right? I want to say that we all agree about the sparsity shell and all that. We talk about it every month. But to put the other side of it, sparsity only works when the actual signal is sparse. And there are times when signals are not sparse. And if you have to engineer a system that is going to be reasonably efficient when it's at that really dense level of signals, then sometimes it can be easier to just engineer for that. Now, I don't agree with that as a paradigm. I'm just saying that's the reason why it's not an easy switch. I'm just trying to show the other side of that. So I guess my perspective is that I think every signal is sparse. It's just a function of finding in what domain it is sparse, right? And then representing it in that domain. That's what I refer to as a sparsity shell: locating that one domain in which it is sparse. And then after that, you just have to make sure that you use the right connectivity, the right unit sparsity, in order to do the computation. I love that idea, Ralph. So can you give us an example? Absolutely. I mean, you think about speech, this continuous speech, right? You are talking and so on and so forth.
But if I break it up into spectrograms and I localize it in terms of phonemes, phonemes are sparse, right? And they become pieces of speech. And I think you can make that same kind of argument in different cases, even in vision, right? You look at a video, you think things are moving in particular, but there are some specific frequency components that appear in certain ways that you can then take advantage of. That's what people do for sparse reconstruction or compressive sampling. All of that takes advantage of these types of ideas. So they exist in the world. We just have to find the right transformation. And then that's where we go. So I've been thinking a lot about some of these issues and the engineering trade-offs that surround them. And one of them has to do with Innatera and why they've done things the way that they've done, where they've got it part analog, part digital. And I've started to look more into Tianjic as well. And they also have a similar idea, which is, to paraphrase, pay Caesar what is due to Caesar. So for long time scales, for dense computations, you're maybe passing more to the digital. When power is absolutely at the crux of it and it's more sparse, maybe you can do it more in analog. And I'm trying to imagine a scenario in which we can take all of these different technologies that we're working on, including stuff that's more weird and at the edge. And I don't mean at the edge in the conventional sense of little tiny sensors, but at the edge of what people think is normal technology, which would be something like BrainScaleS, which is at the extreme end of the kinds of things that we do. And imagine how we could be integrating those and getting the best of all of our different uses of silicon without always having to choose just one methodology. So is that the zoo? Yeah, accepting the zoo.
So you could imagine BrainScaleS almost being like the brain, and have other kinds of neuromorphic elements through the body of an embodied system, and have that not be a bad combination. It's just making the most of speed where you need it, power efficiency where you need it, that kind of thing. Yeah, I think I agree generally with that concept. I think it's also true for the type of implementation that we use. What are the primitives? You pick the right one for the right place. You know, where analog makes sense, it's analog; where digital makes sense, it's digital; where some other materials make sense, it's that, right? But I guess the difficulty is being able to make that partition efficient, or to optimize the decisions there. Maybe that's where humans still win compared to AI at this point in the design. We're not ready to let AI do the design of AI itself just yet. So on that point, together with and led by Elisa Donati, we have published a benchmarking framework for embodied neuromorphic agents in Nature Machine Intelligence that responds exactly to that, right? Having a soft robot that uses the power of neuromorphic for control and adaptation, working with the environment, with the environment working for the body, and the distribution of the sensors having an intelligence. So that is all linked. So I want to mention it also because it's a good piece of science done recently. One point that I want to make is linked with what we were talking about before, about Furber's zoo. One question stuck in my head, stayed with me, and I found it brilliant: what should be publicly funded now? And I want to challenge you on this, because I think that Steve's answer is pragmatic and community-oriented. Fine. Damien's is much more provocative: let's not just make existing AI more efficient, but propose alternative paths to AI, which I honestly agree with slightly more. So I would like to know what you think.
And then I have a very provocative point that I will go later. Yeah, funding right now is difficult, right? It's going to be hard to convince at least maybe venture capitalists to fund work that is only going to become relevant in 10, 15 years. But that's the academic goal, right? Academic mission should not be trying to compete with the startups that like similar to what Malika is trying to do and so on. Right. That is not the role of academia, I think. We need to look beyond. We need to look at the next implementation version, the next primitive set, the next algorithm perspectives and so on. Right. Funding agencies like, I guess, the equivalent of National Science Foundation here in America, maybe the European Science Foundation and so on should be funding those things. Now, of course, there's been a twist recently, at least in America, where the funding has actually moved away from basic science in some way and looking more as applied science. Now, I have no problem with applied science. I think that's where you actually demonstrate your ideas. But the basic science that is looking for the materials for the future still needs to be real and is refunded. And theory, that's another place where, you know, we need to fund theory because just thinking about the same question that was raised, you know, in terms of how do you take into consideration nonlinearities in the AI of the future? Well, the reason that Julian may argue that it's not effective is because we don't have the theory to support it, in my view. Look for the theory that actually makes it real, provides a pathway to better algorithms, better implementations. And those things can be funded by non-commercial funding entities at this point. Yeah, on this point, I have a provocative point in the sense that I'm sick and tired of hearing, oh, you don't have a killer application. I'm sick and tired of hearing it. We do have applications. I think that we demonstrated for health, lots of application. 
We demonstrated for Vision lots of applications. So we demonstrated that we can work in very teeny tiny spaces like cars or mobile phones and etc. We did prove something and nobody's asking for quantum computing to show the killer application. But whilst we are asked for that, so possibly it's because there is really some value in what we are doing. And I think that this is linked with the fact that we need more funding, because the problem is that we are not yet ready to show all of the possibilities, because we need to understand still how to move to this paradigm perfectly. I am not a fan of quantum computing. You guys know that. But quantum computing's absolute biggest benefit is that we've always known exactly what its killer application is, right? Its killer application is this ability to decrypt things super fast to emulate physical systems, super fast in parallel, all of that stuff. So I think it is true that sometimes if you have something like Alice and Bob and Eve for quantum communication, I think if you have these stories to tell, it can be really nice in terms of getting you funding. If you're just talking pragmatically from that point of view, having a killer application, and yes, we know we don't want to kill anyone, but having an application that really grabs everybody's interest, that is useful. So I think these things that we come up with, the things that Ralph and I are discussing in the book, things like implants, things like prosthetics, things like care robots, all these things that are for good. They're literally the opposite of killer applications. They're applications for good, for the good of mankind. These kinds of things, we need to talk about them more. We need to focus on them more. Or maybe we need to have a sort of like, Ralph, your revolutionizing prosthetics project. We need a focus on care robots. We need these focuses on implants and the issues around that. And I hope, Ralph, you're about to tell me that there are such projects. 
But these are the things we need, I think, in order to really make progress, is to have people working together on all of that. I think you're absolutely right. So the reason that RP 2009 made such a big movement and such a big splash because it was a major, it was what, $75 million put in by government to move the technology forward. Now, for the case of care robots that you referred to, there was a moment where that was as similar, maybe not quite $75 million at that time. This was, I think around 2010, was around $30 million that was being invested by the National Science Foundation here. It was an ERC, which stands for Engineering Research Center, that was at CMU doing what was called quality of life technologies, which was about helping people that are disabled and are in the room, elderly, and so on. But that was, again, funded by the National Science Foundation at that scale. And it was always intended to be a demonstration example, right? You put together the technology and demonstrate the effectiveness. And hopefully then with the appropriate technology links to commercial entities, then you transfer the technology out and it becomes the next robot or whatever. But it hasn't happened. With those investments, it still hasn't progressed to the point. And it might have been just that it was too early. Timing is everything. And when they were doing this, it was just a little too early. Now would be the time to fund these things. But the interest is elsewhere. Of course, your friend Elon Musk is saying that he's going to be shipping robots by the end of the year. Is it the end of this year, the end of next year? So we'll see how many people he knocks out. Anyway, the zoo, right? I want to come back to the zoo because I think it's an important aspect of what the panel talked about. And I want to plug Telluride. Telluride has been that zoo for the past 30 years. This is where new technologies came together. People got to play with it. 
So the zoo is real, and I think we need to continue funding it. And this is an example where funding through government agencies has made it possible. Right. So that's one thing. The next thing was, I think the bubble is real. I think it's going to come, and the good technologies are going to survive. The ones that are not effective or not powerful enough are not going to survive. And that's normal. That is expected. So we're going to see the cyclic aspect of that. And the last part that I want to mention is something that I couldn't quite understand. I don't remember whether it was Max or Damien who was talking about the accuracy degradation. It sounded to me like they were talking about basically the resolution of the computation, right? The fact that you have noise, the fact that you don't have 32 bits to do hardware in, right? You may do it in five bits or less. But I don't see that as accuracy degradation, right? I see it more as: the resolution of your computation might be lower, but then you need to use collective computation to regain that accuracy, that representation. To wrap it up, I'd like to emphasize the fact that Telluride, but also, for example, the Capo Caccia Neuromorphic Workshop. Capo Caccia, of course. The Bangalore Neuromorphic Workshop, and a new one, the Canada Neuromorphic Engineering Workshop, that will be starting this year, are a playground for us to develop the future of everything. And companies, especially companies in neuromorphics, should care about these workshops because this is the playground where, for free, they get out what they can do later in the year. I'm talking to the companies. Think about that. That is to an extremely receptive audience. That's the biggest thing. These are the people who are going to use your stuff. Exactly. I keep saying this to all of the companies. Anyway, let's stop it here for today. Thank you so much, Sunny, for sharing this amazing discussion. And Ralph, for your comments as usual.
In the next episode, Sunny, we talk to Dr. Ryad Benosman in New York. We hope you will join us then. That brings another episode of EE Times Current to its end. Thank you for listening, and thanks again to our guests. EE Times Current is available through all the major podcast platforms, but if you get to us at our website at eetimes.com, you'll find the transcript along with direct links to the other stories we've mentioned and other resources. Thanks for listening.