Unlocking the Data Layer for Agentic AI with Simba Khadder

49 min

•Apr 21, 2026

Summary

Simba Khadder, AI Strategy Lead at Redis, discusses how context management has become the defining challenge in agentic AI. Rather than pre-loading all context upfront (naive RAG), agents should dynamically retrieve data when needed through a context engine architecture built on materialized views, semantic layers, and adaptive memory systems.

Insights

Context, not model capability, is now the primary bottleneck for agentic AI—agents can handle increasingly complex tasks if supplied with relevant, current information on-demand
Materialized views with semantic layers are replacing direct database access for agents, enabling safer, more controlled data pipelines without exposing production systems
Engineering practices must shift upstream to architecture and specification design, with behavior-driven testing replacing traditional code review as the primary quality gate
Memory systems should extract and compact information asynchronously from agent traces, creating tunable projections that improve context over time rather than treating all memories equally
The shift from linear RAG to context-based architectures represents the next major phase in AI application development, requiring fundamental changes to how teams build and review code

Trends

Agents moving from minutes to hours of unsupervised task execution, with capability doubling every 6 months
Materialized views and semantic layers becoming standard infrastructure for agentic data access
Design-first, specification-driven development replacing traditional code-centric engineering workflows
Async memory extraction and compaction as core pattern for improving agent context over time
Non-technical users (PMs, marketers) becoming capable of building functional AI applications without coding
Architecture and interface design becoming higher-leverage than implementation details in AI-driven development
Behavior-driven testing and end-to-end testing prioritized over traditional unit test coverage
MCP (Model Context Protocol) standardization enabling interoperability across agent frameworks and tools
Internal tools and personal agents becoming primary use case for agentic AI before autonomous production systems
Engineering skill shift from debugging/optimization to maintaining architectural coherence at scale

Topics

Context Engines for Agentic AI
Materialized Views and Data Pipelines
Semantic Layers for Agent Data Access
Agent Memory Systems and Async Extraction
RAG vs. Dynamic Context Retrieval
Feature Stores and Feature Management
Specification-Driven Development
Behavior-Driven Testing for AI Systems
Code Review in AI-Powered Development
MCP (Model Context Protocol) Standards
Data Access Control and Row-Level Security
Redis Data Structures for Agents
Materialized View Maintenance and ETL
Agent Orchestration and Multi-Agent Systems
Temporal Memory Decay and Memory Scoping

Companies

Redis

Simba leads AI strategy; acquired Featureform in 2025; building context engines with materialized views and semantic ...

Featureform

Feature store platform co-founded by Simba; acquired by Redis in 2025; enables defining features as code for producti...

Google

Simba's first employer as a software engineer where he solved complex technical problems

Anthropic

Referenced for advancing agent reasoning capabilities; claims agents can handle 1-hour tasks, doubling every 6 months

OpenAI

Mentioned as competitor to Anthropic in advancing agentic AI capabilities

Spotify

Used as example of personalized recommendations requiring feature engineering and context management

Salesforce

Mentioned as example of external system that could be part of materialized view architecture

Mento

KBall (Kevin Ball) is Vice President of Engineering; host of the episode

LangGraph

Agent framework mentioned as successful due to ecosystem integration and extensibility

Cursor

AI-powered IDE mentioned as tool for coding agents; Simba notes it's no longer the only bleeding-edge option

People

Simba Khadder

Co-founder of Featureform (acquired by Redis in 2025); discusses context engines and agentic AI architecture

Kevin Ball

Host; co-founded two companies as CTO; founded San Diego JavaScript Meetup; organizes AI in Action discussion group

Quotes

"Context is all that matters. Context is all that matters."

Simba Khadder•End of episode

"The limit is less the model's ability to pay attention and more like, how do you keep relevant tasks in front of it?"

Simba Khadder•Mid-episode

"If the behavior tests are correct and they pass, then the code is right."

Simba Khadder•Code review discussion

"The moat is who can build this context mode that really separates them out from everyone else."

Simba Khadder•Context engine discussion

"I haven't ran the line of code in like maybe almost a year now, which is crazy for me to even like imagine. But why would I? It's lower leverage."

Simba Khadder•Agent-driven development discussion

Full Transcript

AI agents are increasingly capable of reasoning and performing autonomous work over long periods. However, as agents take on more complex, longer-horizon tasks, keeping them supplied with the right information becomes the core engineering challenge. The industry is moving away from preloading context up front toward a model where agents dynamically navigate and retrieve the data they need, when they need it. Redis is approaching context management using a context engine, an architecture built around four pillars: on-demand context retrieval, data that is always current, fast retrieval, and a memory layer that improves over time. In practice, this means building materialized views of data with a semantic layer on top, rather than giving agents direct access to production databases. A memory system sits alongside this, extracting and compacting information asynchronously as the agent works.

Simba Khadder leads AI strategy at Redis, and he previously co-founded the feature store platform Featureform, which was acquired by Redis in 2025. In this episode, Simba joins Kevin Ball to discuss why context has become the defining challenge in agentic AI, how context engines differ from traditional RAG architectures, how materialized views underpin reliable agent data pipelines, how memory systems can improve through async extraction and compaction, and how engineering teams need to adapt their practices as AI-driven development accelerates.

Kevin Ball, or KBall, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow KBall on Twitter or LinkedIn, or visit his website, kball.llc.

Simba, welcome to the show.

Thanks for having me.

Yeah.
Well, let's get started learning a little bit about you and your background, and then we can move into Redis and talking about that. So how do you explain yourself to folks?

When I grow up, I want to surf all day. But for today, I'm in AI.

Okay. I mean, I could resonate with that. Surfing is a blast.

Yeah. I started as a software engineer. I'm technical. When I started my career, I was at Google, a software engineer there. I solved a lot of really fun technical problems, worked with some super smart people. But I was always kind of itching to go and learn on a slope that I felt like I had been learning on before. So I left Google. I started my first company, kind of did well there, started a company after, which was Featureform, which then was acquired by Redis. And I've always loved Redis as a product. So it's kind of awesome to become a huge part of the AI strategy here. But yeah, that's kind of the short of it.

All right. Well, can you quickly describe Featureform and what it is so that we have that as well as background?

Yeah. So Featureform, we were a feature store company, VC-backed, that style of company. The problem that we solved was around anyone building models and getting them to production. Every model, like when you open Spotify, you get all these personalized recommendations. You can imagine that every time you open the app, it's looking up your favorite song, favorite category, favorite genres, all these concepts about you, these signals. And you can imagine that there's this whole team at Spotify whose whole job is to optimize these signals, come up with new ones, et cetera. Those signals are called features. And what Featureform was and is (the whole team is here and we're growing it and hiring, by the way), what Featureform does is it enables a data scientist to define their features as code. Even the name: I wanted Terraform for features. And it deploys those to production and keeps them up to date.
It runs the compute on your compute, whether it's Spark or Snowflake or whatever. It maintains a view, like a materialized view in Redis, so that you can get your features up to date, but you have your training data as well. What's interesting, and I'm sure we'll get into it, is that if you swap "feature" for "context variable," it looks not too dissimilar, excluding the training part, from what people are building today for agents.

Yeah. So that actually is a nice transition into what I think is a lot of the meat of this, right? Five years ago, you didn't have to worry about what was getting fed to your model or how it was representing things or any of this, unless you were deep down in data science or integrating it. Nowadays, we're all having to learn about machine learning. How do you put things in there? What does context even mean? All these different things. So I'd be curious to get your take on the environment that we're in right now. As devs, we're at the bleeding edge of it. So maybe we start there, with how this is changing the world of software development. And then we can dive into how those feature pieces are becoming context and what that looks like.

Yeah. There's a few things happening in parallel, and they're all really interesting. One is, it's even funny to say "us as devs," because I almost feel like everyone's a dev now in an interesting way. Our jobs are changing. If you need to build a website or an app, my uncle could go do that, my uncle could go put something together. Things have changed. Now, could they build a database? No. So anyway, this is kind of an aside. The main thing that we're seeing in terms of the overarching landscape is, a couple of years ago, even last year, you could trust an agent unsupervised for maybe a few minutes. And that was cool. I mean, that was a huge change from before.
Imagine five minutes of an agent running by itself on software tasks. Now, I think the actual metric is about an hour. You can trust an agent, unsupervised, to solve a software task that takes up to an hour to complete, which roughly tracks complexity. And the interesting thing is that Anthropic says that number is going to double every six months. So if it's an hour now and we do a repeat, you know, in a year, it's going to be four hours of unsupervised agents working. And so what's changed with that is, if you do RAG, like traditional RAG, naive RAG, what you're going to find is you're not going to be able to get enough context to feed it for four hours.

Right. The limit is less the model's ability to pay attention and more like, how do you keep relevant tasks in front of it?

Exactly. And my take, and Redis' take, has been that for many use cases, you give the agent access to the context. Let it find what it needs and use it. Don't try to put it all up front. It's not just a context window problem, it's almost tool-based context retrieval. And that's really the thing that's happening now. And if your agents don't look like that and your AI applications don't look like that, then you are not taking advantage of this reasoning wave that is happening. And so your really naive RAG app is pretty much capped out. It's not going to get better, because it's not like the agents are getting smarter in the sense of having more inherent facts. I guess maybe a little bit. But what they're really good at is they're able to stay coherent for longer. They're able to reason about more complex tasks that have a longer time horizon and solve them end to end. That's the thing that's changed. It's almost like the RL, the post-training, has gotten better, but the pre-training is about the same. So that's, I think, the fundamental thing that's changing. And because of that, all that matters is context.
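The contrast drawn here, between retrieving once up front and letting the agent pull context as it goes, can be sketched in a few lines. This is a toy illustration only: the corpus, the keyword-matching `search` tool, and the `follow_ticket_ids` function (a stand-in for the model deciding what to look up next) are all assumptions made for the example, not anything described in the episode.

```python
from typing import Callable, List, Optional

# Toy corpus; the documents and keyword matching are illustrative assumptions.
DOCS = {
    "deploy": "Deploys are blocked while ticket INC-7 is open.",
    "INC-7": "INC-7: the staging database migration failed on step 3.",
    "lunch": "The cafeteria closes at 2pm.",
}


def search(query: str) -> List[str]:
    """Hypothetical search tool: return documents whose key appears in the query."""
    return [text for key, text in DOCS.items() if key.lower() in query.lower()]


def naive_rag(question: str) -> List[str]:
    """Retrieve once, up front, and hand whatever came back to the model."""
    return search(question)


def agentic_retrieval(
    question: str,
    next_query: Callable[[List[str]], Optional[str]],
    max_steps: int = 4,
) -> List[str]:
    """Start from the same search, then keep calling the tool while the agent asks for more."""
    context = search(question)
    for _ in range(max_steps):
        query = next_query(context)
        if query is None:
            break
        context.extend(doc for doc in search(query) if doc not in context)
    return context


def follow_ticket_ids(context: List[str]) -> Optional[str]:
    """Stand-in for model reasoning: look up any key that is mentioned but not yet loaded."""
    for key, text in DOCS.items():
        mentioned = any(key in doc for doc in context)
        if mentioned and text not in context:
            return key
    return None
```

With the question "Why is the deploy blocked?", `naive_rag` returns only the document that says a ticket is blocking deploys, while `agentic_retrieval` follows the ticket reference and also loads the root cause. The second hop is the part that a single up-front search cannot anticipate.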
Yeah, so I think you're describing something that I've also seen, where we've shifted from this model of trying to pre-compute all the things that we can then do a single-step pattern match on, to instead saying, hey, this thing can reason now, give it the ability to pull what it needs and explore. And we're seeing this in the ways that things are being managed. It's no longer, as you say, naive RAG, where you do a search up front, dump it in the context window, and run. It's: here's the search tool, call it when you need it, get some things back. We're even seeing that in terms of how you manage local stuff. Instead of, here's all the things I want you to do, it's, here's a set of skills, and you can progressively disclose context when you need it, not beforehand. So yeah, that totally tracks. What I'd be interested in is, where's the limitation there? What is it that you can do at Redis or in another environment to facilitate agents' ability to do this?

There's a pattern I'm seeing emerge. And the term that we use is a context engine. A context engine has, I would say, four pillars. One, and arguably the most important, is that agents should be able to navigate and retrieve the context that they need. And this doesn't have to, by the way, be tool-based. Well, it's always tool-based in some way, but it can be a CLI, it can be MCP. We have opinions, and with our newer products we typically do both MCP and CLI, but that's almost an implementation detail of a context engine. So that's one: you need to be able to navigate and find data. Two is that your data always has to be up to date. Three is that the data should be fast. For many use cases, speed is really important to have the UX and everything feel natural. This is more obvious if you think of an extreme where every single tool call is a whole Spark query, which takes three minutes.
It just fundamentally doesn't feel like an agent anymore. And four is that the context should get better with time. This means a lot of things. It could mean personalization. It could mean that it's keeping track of decisions and errors it's made so it doesn't make them again. There's a lot in there. But in the end, what you end up with is, I have a surface of context that I can go navigate and look through. I can retrieve the context that I need. And I will always feel like that context is either the source of truth or a view that's always up to date. And then on top of that, it's always just going to get better. This is the moat. If the reasoning is solved, well, not solved, but constantly getting better and doubling every year, the moat is who can build this context mode that really separates them out from everyone else. Arguably, this is literally the difference between Anthropic and OpenAI and everyone else: as they get better, they get more data and their contexts get better and they're able to build better models.

So let's maybe break down those steps. I think probably most of our audience is familiar with a coding tool of some sort. So we can use a coding tool and work through what that actually means in that context. So imagine we're building the Redis-based version of Claude Code or something like that. First step, agents should be able to navigate and retrieve the context that they need. What does that look like in a code setting?

I think code, and we can go through code, I think the thing that makes code a little unique is that it's almost always going to just be on your local file system. So the retrieval step and the sync step and a lot of those things are solved just using Git. I think a good one might be a customer support agent. So with a customer support agent, what you'll find is, one, the information you need. Someone asks, why is my order late?
I use this example a lot because if you think of RAG and how you'd build a RAG app, you would take a knowledge base, you would chunk it up, create embeddings. Someone asks, why is my order late? And you would say, here are common reasons for delays, which is terrible.

Exactly. Generic search against it.

Yeah, totally. And that's what most people do. And what you end up with, the last iteration of AI apps, were kind of glorified summarizers. We were pretty much just showing off what LLMs can do. The thing is that most people kind of know what LLMs can do now. I mean, not everywhere, but for a lot of tech-forward places, like San Francisco, New York, people already get it. So now it's like, okay, when someone asks, why is my order late? I need to go get the order. I might need to go get information about the order, the deliverer. I need to look up our policies. All those things are going to be in different places. So firstly, I need to give the agent tools to be able to access all of those things. Now what you find is there's a major question that pops up there. It's like, do we just let our agent have access to our Postgres DB directly and just run queries? Like, what do we do?

What could possibly go wrong?

Yeah, exactly. It's like, huh, where is that prod database? Did you delete it? And it's like, you're absolutely right. So don't do that. So what we see more often, and this is where the context engine architecture comes up, is that people are building materialized views of data, where they might have all these systems of record, and they don't want to deal with scale and all the other things that come up with a ton of agents hitting them. So it's like, let's create a materialized view. And then on top of the materialized view, let's create almost like a retriever service, a set of tools that can go and access these things.
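A minimal sketch of that shape, using the late-order example: several systems of record are projected into one agent-facing view, and the agent's tool reads only the view. Everything named here (`ORDERS_DB`, `CARRIER_API`, `POLICY_DOCS`, `refresh_view`, `get_order_context`) is hypothetical, and plain dictionaries stand in for Postgres, an external API, and the view store.

```python
from typing import Dict

# Hypothetical systems of record; in practice these might be Postgres tables or external APIs.
ORDERS_DB = {"A100": {"customer": "c1", "carrier": "acme", "promised": "2026-04-18"}}
CARRIER_API = {"acme": {"status": "held", "reason": "weather delay at hub"}}
POLICY_DOCS = {"late_delivery": "Orders more than 3 days late qualify for a shipping refund."}

# Stands in for the materialized view that agents are allowed to read.
VIEW: Dict[str, dict] = {}


def refresh_view() -> None:
    """ETL step: project the systems of record into one agent-facing record per order."""
    for order_id, order in ORDERS_DB.items():
        carrier = CARRIER_API.get(order["carrier"], {})
        VIEW[f"order:{order_id}"] = {
            "promised_date": order["promised"],
            "carrier_status": carrier.get("status", "unknown"),
            "delay_reason": carrier.get("reason"),
            "applicable_policy": POLICY_DOCS["late_delivery"],
        }


def get_order_context(order_id: str) -> dict:
    """Retriever tool exposed to the agent; it reads the view only, never the sources."""
    return VIEW.get(f"order:{order_id}", {})


refresh_view()
print(get_order_context("A100")["delay_reason"])  # weather delay at hub
```

The design point is that the agent never holds credentials for the underlying systems. The view can be reshaped, indexed, and scaled for agent traffic without touching production, at the cost of having to keep it refreshed as the sources change.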
At Redis, the things that we are doing are more around putting almost like a semantic layer on top of Redis. Semantic layers are not new, but we've never seen a semantic layer on top of something like Redis, because it would have made zero sense before agents. It would have been a very strange thing to do. So anyway, one piece is having almost an ETL synchronization layer. Redis has a product called RDI, which is pretty much an ETL. It builds views and maintains those views as information changes. Now I have a materialized view of context. I control what's in that context. I set rules around how that context is accessed. I don't have to worry about scaling a ton of random systems for agent use, and I own it. It's my thing. It's not like, this part's in Salesforce and this part's here and this part's here. So in the customer support use case, I might have some Postgres databases. I might have some different APIs, et cetera. I build a materialized view. I describe what this information is. It's almost like a semantic layer on top, which we can compile into a set of tools, an MCP endpoint, or a CLI. Have the agent connect to that. And now I have a fundamental ETL built for context with a retriever set.

So a couple of things I'd be interested to know on that. One, are you building a single set of materialized views that you're exposing to all these different agents? Or are you able to customize that down to, like, this agent gets a materialized view with literally what the current customer could see? How fine-grained can you get this thing?

Very fine-grained. You can use ACLs at the row layer. You can do RBAC at the agent layer so that different agents can see different things. You can mix and match different forms. But yeah, for sure, you definitely have to make sure that you nail it down. But it's a lot easier to nail down when it's in something that you built for context. The issue is, if you give access to Postgres, you then have to try to nail down Postgres to work in a generic way. And the thing is that people don't typically build Postgres databases with the idea that any random person can query them. And so they're not well set up for that. And so then it's like, do you try to set that up, or do you build a whole different API suite for it? And then how do we deal with all the new indices that we're going to want to create for all the kind of unusual search patterns that we expect agents to have? So that's where we see the materialized view use case come up.

Yeah, that makes a lot of sense. I'd be interested, you mentioned the semantic layer on top of it. And one of the things that I think has been fascinating to see with LLMs is how much semantics matter, right? The more you can shape your data into something that linguistically makes sense to the agent, or is able to live in that part of the LLM's training data that is well covered, the better it's going to be able to use it. So what does that look like for Redis, and how do you see it used?

Firstly, on the point of this thing, it works really well. I agree. And actually, you said for agents to understand; I would go further and say for humans to understand too. A lot of times we define tables in ways that are optimized for compute. And really what we need here is things that are optimized for the ability to grok them, both as a human and as an agent.

What? You mean you don't naturally think in perfectly denormalized tables?

I would love to, but I unfortunately do not. And so I'm happy I'm not so deep in that world anymore. But yeah, that's a big thing that you need to build. And the other interesting thing is, a lot of people, when they think of Redis, they think of speed. And they should. I mean, Redis is the fastest database.
But from my perspective, the thing that always made Redis kind of special to me was the data structures that we have, which are kind of unique. It doesn't really look like any other database. There's not really anything out there that looks like Redis unless it's essentially a fork or a clone.

Let's actually dive into that. I have more recently gotten into using Redis for a few different things, and I'm using Redis Streams and other stuff. But when I first encountered Redis, I just thought of it as a key-value store. That was it. It was just keys and values and nothing more. But there's a lot more.

There's so much more. We even have some statistical data structures, like HyperLogLog. We have sets, hashes. We have indices, I mean, vectors, we support vectors and vector search. But the structure is much more around data structures. It doesn't look like anything else. Redis is really almost its own brand and category. The thing that's interesting, though, is that the reason people loved Redis is that it fit very naturally to a lot of problems and how people structure things in their heads. Oh, this is a set of things. Oh, this is a hash. This is a JSON object. This is searchable text. It fits really nicely in that way. It's not like we have columns and primary keys and foreign keys and cascading deletes and trying to reason about all that stuff. So the thing is that it's really nice for humans. And it turns out it's also really nice for agents. The only thing that was missing was the semantic layer. The semantic layer was essentially the application code before. The application that was writing the data just knew what was there. And the code was almost self-documenting what the data was. But now we need an agent to be able to look at something like Redis and understand what this is and what's in there.
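One way to picture a semantic layer over Redis-style data is a small schema that records what each key pattern means, which can then be compiled into tool descriptions for an agent. The sketch below is an assumption about the general shape, not Redis's actual product: the `Entity` class, the schema entries, and the tool dictionary format are all invented for illustration and do not follow any particular MCP or Redis API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """One entry in a hypothetical semantic layer: what a key pattern means."""
    name: str
    key_pattern: str   # e.g. "customer:{id}"
    structure: str     # Redis-style structure: "hash", "set", "json", ...
    description: str
    fields: Dict[str, str] = field(default_factory=dict)


SCHEMA = [
    Entity(
        name="customer",
        key_pattern="customer:{id}",
        structure="hash",
        description="Profile of one customer.",
        fields={"name": "display name", "tier": "support tier (free, pro, enterprise)"},
    ),
    Entity(
        name="open_tickets",
        key_pattern="customer:{id}:tickets",
        structure="set",
        description="IDs of the customer's open support tickets.",
    ),
]


def compile_tools(schema: List[Entity]) -> List[dict]:
    """Turn schema entries into plain tool descriptions an agent could be given."""
    tools = []
    for entity in schema:
        text = f"{entity.description} Stored as a {entity.structure} at {entity.key_pattern}."
        if entity.fields:
            field_docs = "; ".join(f"{k}: {v}" for k, v in entity.fields.items())
            text += f" Fields: {field_docs}."
        tools.append({
            "name": f"get_{entity.name}",
            "description": text,
            "parameters": {"id": "string"},
        })
    return tools
```

The knowledge that used to live implicitly in application code (this key is a hash, these are its fields, this set holds ticket IDs) becomes explicit data, so the tool surface can be regenerated when the schema changes instead of being hand-maintained.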
And the thing is, some people use Redis as their primary DB, but more often Redis is used either as a cache or, very often, as a materialized view, in which case the system of record is actually somewhere else. And Redis is almost like the delivery mechanism. And so in that way, it really makes sense to put a semantic layer on top of the context delivery mechanism.

Yeah, that's interesting. And I think it points to another thing, which is it allows you to maybe not have to separately define to the agent what it can read or not. Because if the database itself is semantic, you can say, hey, go look at my database. You can figure out what to do with it.

And that's kind of what you need. Otherwise you end up building a million tools. Those tools keep changing. If, you know, MCP changes dramatically and you start using tool search on top of tools, then you need to completely change how your descriptions work and how things are defined. But most MCP servers really boil down to data delivery, or just context servers. And it makes a lot more sense from our perspective to just define this as a schema. There are really two concepts that we merged together in what we're thinking about. One is thinking almost like an ORM, less the writing part and more the mapping. An agent would work really well with that sort of thing, for the same reason humans use ORMs instead of writing SQL: it's much easier to reason about. And then two is GraphQL. GraphQL is interesting because it didn't really take off the way some people thought it would. It's obviously still very popular and all, but it hasn't killed REST or, you know, other ways of building APIs. And the reason why, I think, having used GraphQL, is that there are a lot of axes of freedom in what you have to write for and how you can query.
And in my opinion, I just like when things have input and output. It's really simple. I know that this input creates this output. I can write really nice tests. I'm not going to miss anything. Performance is easy to control. But agents would actually love something that looks like GraphQL, because they can take advantage of all that gray space to get exactly what they need and solve for a variety of different problems. The problem is that you don't really want to give them an actual GraphQL server because, one, they don't write good GraphQL queries. And a lot of those servers still aren't built to handle the scale that you would expect of an agent. But if you take those concepts and put them together, you'll be like, okay, we should cache this data, build views of data. We should build a semantic layer. We should enable this layer to hit the materialized view, but if needed, go hit the systems of record. And then you end up with this thing that looks like a context engine minus the memory loop, the loop that makes the context better. But that's kind of how we think about it.

So I was going to dig into that. How do you make the context better, right? Because so far we've talked about Redis in the form of essentially a materialized view, which is a good thing, right? It's read-only. It's a view that has been transformed in some way or another. But there's no iteration in that. There are no writes necessarily in that model yet. So what does that end up looking like here?

One is, I think a lot of it actually doesn't come from the agent writing. Some of it will. But I think a lot of it happens async. I'll give you an example. If you're using a lot of agents to write code, and I know I do, you've probably landed on something that sort of smells like spec-driven development, whether it's exactly that or not. At the very least, you're using markdown files everywhere. You're keeping track of those markdown files. You're editing that way.
In many ways, your prompt is really just pointing at a file. And one thing that's annoying is that over time, you have to compact that stuff, make sure it's still accurate.

Like prune, archive.

Yeah. And the same thing, you could imagine, would be very true for a customer support bot. It's like, oh, I learned this from this one. I learned this here. I learned that I should not go and call this tool, I should call this other tool in this sort of situation. Over time, not only will it be too much, it will be conflicting. And so what you need is something that, as stuff is happening, is extracting information. And then it's also compacting information. And these are typically LLM-style problems. And so in a way, I almost think of memory as, there's one piece of this that's personalization, which I think is how most people are using it today. But I think where it's going to go is around almost a unique type of ETL. I joke that every vector DB is more or less a materialized view. You have raw documents, then the chunking and embedding. It's a weird transformation historically, but I mean, that's a transformation. So everything in the vector DB is projected exactly from your raw data. It's a projection, it's a materialized view. And I think what will happen is we're going to see the same thing, where it's like, hey, this is everything that we've talked about and all the decisions that were made, pretty much raw traces. And we are going to make these projections that can be tuned, right? I don't think that there's one way to do it. I don't believe that you can just say, hey, here's the memory server and it just works for everything. You can tune it to do what you want it to do. And likely what we'll have is different engines for different use cases. I know people talk a lot about decision traces or context graphs, and there's all these concepts that are really just extraction.
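The extract-then-compact loop over raw traces can be sketched as follows. This is a toy under stated assumptions: the `extract` function is a simple string rule standing in for what would typically be an LLM prompt, the trace format is invented, and "compaction" here means only "newest memory per topic wins", which is one of many possible policies.

```python
import asyncio
from typing import Dict, List

# Raw traces as they might be logged while an agent works (illustrative data).
TRACES = [
    {"ts": 1, "text": "decision: use tool lookup_order for order questions"},
    {"ts": 2, "text": "user said hello"},
    {"ts": 3, "text": "decision: use tool order_context for order questions"},
]


async def extract(trace: dict) -> List[dict]:
    """Stand-in for an LLM extraction prompt: keep only lines that record a decision."""
    await asyncio.sleep(0)  # yield control, as a real model call would
    if not trace["text"].startswith("decision:"):
        return []
    body = trace["text"].split(":", 1)[1].strip()
    topic = body.split(" for ", 1)[-1]
    return [{"topic": topic, "memory": body, "ts": trace["ts"]}]


def compact(memories: List[dict]) -> List[dict]:
    """Keep the newest memory per topic so conflicting older entries drop out."""
    latest: Dict[str, dict] = {}
    for memory in sorted(memories, key=lambda m: m["ts"]):
        latest[memory["topic"]] = memory
    return list(latest.values())


async def build_projection(traces: List[dict]) -> List[dict]:
    """Run extraction over all traces concurrently, then compact the results."""
    batches = await asyncio.gather(*(extract(t) for t in traces))
    return compact([m for batch in batches for m in batch])


projection = asyncio.run(build_projection(TRACES))
```

Here the two conflicting decisions about which tool to use for order questions collapse into the later one, and the greeting is dropped entirely. Because the projection is derived from the raw traces, a different extraction rule can be applied later without losing the underlying history.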
There are different types of extractions from raw data that you then fit and feed back to the agent as context.

I want to replay a little bit of what I heard, because this is a meaty one. I want to dig into this. So first, what I'm hearing right now is you're saying, okay, take these historical traces, the conversations, the things that you want to learn from and have this thing adapt from, and store those in some sort of raw layer. So this is just history, log data, essentially. And then you are going to apply the same tooling concepts that you've been using here, where you say, okay, let's transform that in some way. Let's materialize that. And that is now part of what's feeding this agent.

Yes.

Okay. Two things that I'm curious about. So one is the transformation process, the materialization piece. Is that something that itself can be modified or evolved by the agent? Like, for example, the agent is trying to answer these questions and says, hey, the shape of this materialized view is wrong for the types of questions that I'm asking. Maybe I'm consistently screwing this up, and this is caught in an async analysis, like, oh, this is a consistent source of errors. Or maybe it's the agent itself being like, gosh, I wish I could ask this question. Can you change the way that views are being materialized as a part of your feedback loop?

So today we have Agent Memory Server, which is open source. How it works is you can define different types of, I forget what they call them, but I always think of them as transformations, though they're actually prompts. So you're defining strings describing what you want it to accomplish, like, hey, I want you to keep track of decisions that were made. I want you to think of seasonality concepts. I want you to think of grabbing facts. I want you to be optimized for personalizing for users. So there are different types of transformations you can write, and you can write custom ones. I do think it's really interesting to start.
Sure. A funny problem with building agents is a lot of times you try to stack agents on top of each other to solve problems. A good example of this is LLM as a judge. It's like, okay, well, we have this problem where we can't tell if these are good responses and there's too many of them for humans to read. So why don't we have an agent go and do it? But how about if the LLM judge is bad? Could we write an agent on top of the LLM judge, to judge the judge, so it gets even better? And then you just keep taking the derivative forever and putting agents on top. I do think that we will have self-healing memory in a way where even the memory engine gets better over time. But I think where we're at today is that you can just control the memory engine. And this is part of building a context engine, which is you tuning those knobs. But yeah, for sure. I mean, the layers of abstraction, they're getting higher. We're already seeing it. Like, I find myself having to orchestrate: when I'm building, I'll be orchestrating like 15 agents and they'll be running. I use Codex today, and they ping, they literally make a sound, which is kind of annoying because I feel like I'm a waiter. It's like I'm running from table to table to go wait on them. Well, we could talk about that, because I think there's some meat there too. So the second question related to this is around controllability, because I think one of my personal frustrations with most of the memory systems that I've used to date is that they treat all memories the same. I'll use ChatGPT as an example. They added memory, I've had many conversations with it, and now I'm asking for a perspective on this thing and it's referencing some completely unrelated concepts that I talked about three weeks ago. Like, I have a lot of different interests. I would like to scope your memories, please. So don't tell me what you think I wanted because I previously talked about 
cooking, right? Don't make a cooking metaphor when I'm asking you about code. Just talk about code, or vice versa. So I'm kind of curious how you all think about the control knobs for users on memory. Yeah, actually, my story, which is funny, was I used it when I was planning my wedding. I was using it as anyone would for all the random details. How should I do this? And how should I do that? And whatever. And to this day, it still is like, since your wedding is coming up. I know, right? And you're like, no. The LLMs are terrible with time. Yeah, they're terrible with time. They can't deal with it. You're like, I want this to be scoped, keep it closed. So there are a lot of concepts we have that are tunable around memories being able to fade away temporally. Like, if a new memory comes in that contradicts an old memory, then the new one should be used. You also talked a bit about having separate categories and subcategories and almost layering these things up. Some of the stuff exists already in the memory server. Some of the stuff is clearly where we're going. There's always UX things, right? Because if you start adding those layers of, like, okay, well, there's, let's say, team-wide memory and then individual memory and then there's org memory, it's like, okay, well, how do you choose which one goes where? What happens if the agent remembers something that's really personal and puts it at the org level? And now everyone knows that you love Katy Perry. And it's those sorts of things that really make it tricky. These are problems that we have super smart people working on constantly. Well, nobody gets it right, right? Spotify thinks I love Katy Perry because my 10-year-old does. The thing that's funny is, like, I was in feature stores before. I feel like I was kind of broadly in distributed systems before that. I think I've always found myself gravitating towards problems that just don't have a solution. There is no right answer. 
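The control knobs mentioned above (scoping, temporal fading, and a newer memory overriding an older one it contradicts) can be sketched roughly as follows. This is an illustrative toy, not how any Redis product implements it: `ScopedMemory`, the scope strings, and the half-life decay rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

DAY = 86_400.0  # seconds


@dataclass(frozen=True)
class Memory:
    scope: str         # hypothetical scope label, e.g. "user:alice/wedding"
    key: str           # what the memory is about
    value: str
    created_at: float  # epoch seconds


class ScopedMemory:
    def __init__(self, half_life_s: float) -> None:
        self.half_life_s = half_life_s
        self._latest: Dict[Tuple[str, str], Memory] = {}

    def remember(self, memory: Memory) -> None:
        """A newer memory for the same (scope, key) replaces the older one."""
        slot = (memory.scope, memory.key)
        current = self._latest.get(slot)
        if current is None or memory.created_at >= current.created_at:
            self._latest[slot] = memory

    def weight(self, memory: Memory, now: float) -> float:
        """Exponential fade: the weight halves every `half_life_s` seconds."""
        age = max(0.0, now - memory.created_at)
        return 0.5 ** (age / self.half_life_s)

    def recall(self, scopes: List[str], now: float,
               min_weight: float = 0.1) -> List[Memory]:
        """Return only memories in the requested scopes that have not faded."""
        hits = [m for m in self._latest.values()
                if m.scope in scopes and self.weight(m, now) >= min_weight]
        return sorted(hits, key=lambda m: self.weight(m, now), reverse=True)


store = ScopedMemory(half_life_s=30 * DAY)
store.remember(Memory("user:alice/wedding", "date", "wedding is coming up", 0.0))
store.remember(Memory("user:alice/wedding", "date", "wedding already happened", 100 * DAY))
store.remember(Memory("user:alice/cooking", "style", "likes sous vide", 390 * DAY))
store.remember(Memory("user:alice/code", "editor", "uses Vim", 395 * DAY))

soon_after = store.recall(["user:alice/wedding"], now=101 * DAY)
code_only = store.recall(["user:alice/code"], now=400 * DAY)
faded = store.recall(["user:alice/wedding"], now=400 * DAY)
```

Asking only for the code scope keeps the cooking memory out of the answer, and the wedding memory drops out on its own once enough half-lives have passed. Which scope a new memory should land in (individual, team, or org) is the hard part the conversation points at, and this sketch does not attempt it.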
And that's fun because you can keep pulling at it and you can keep learning more and you can be creative. And I think that's where we're at with memory: at Redis, we're being really creative and creating unique solutions for it. So on the subject of problems where there's no good solutions, let's go back to your point about orchestrating 15 agents. Because I feel like this is what, not everybody, but most of us are trying to do. We're orchestrating all these different things and they're going in parallel. And I don't know about you, but it fries me. Like, I get too many of these things going, and then by three o'clock in the afternoon, I'm like, all right, I'm done. I have no brain left. What's funny is I love it. It actually really fits into how my brain works. And I see it really frustrate one of our best engineers. Some of the things that made him one of our best engineers are actually getting in the way, and he's learning how to pivot those things into this new style. Because the thing he's good at is he could single-thread a problem really well. You could give the hardest problem in the world to him and he'd be able to single-thread it. But now it's like, yeah, I still need you to do that sometimes, but a lot of what I need you to do is actually do that on eight things at once and jump between them and context switch. And as a founder, that's all I did all day. So I feel like I've been training myself for this day without knowing it. But not everyone has had similar experiences. It's actually interesting, because people who've been PMs, people who've managed engineers, they actually tend to do better with agents on this problem than people who haven't. And it's because, in some ways, there are lots of skills you've got to learn to manage engineers which are really similar here. You have to context switch constantly. 
As much as you'd like, sometimes you can't sit down and take the keyboard and just go do it. So you have to learn how to like help build a system or like kind of prompt your way into getting the results you want. Obviously, it's different, but there are skills that are transferable. So the short is like, I think everyone deals with it differently. I think that it's fundamentally changing how we do things, even at Redis on the org level of how we build products. And I think it's only going to accelerate. Can I ask you a little bit about those changes? Because certainly like having conversations, And it's interesting, right? Because I have lots of conversations with people who are pushing the bleeding edge. And then I'll talk with somebody, maybe not on a podcast, and they'll be like, is that really how people are doing? Am I so behind? Am I there? So I think it's really helpful to have real life examples of like, how is this changing what you're doing at Redis? You're not at OpenAI. You're not at one of these bleeding edge, but you are adopting and changing everything. So how is it working? So I think there's a lot of aspects to it. One aspect to it, and it's funny because it's actually one of our predictions for the year, and then it just happened naturally internally, and we're definitely seeing the results, which is that everyone will become a coder. Everyone can build. When I work with PMs, I don't want to read a PRD. I'm like, you could take this PRD and turn it into a prototype in the same amount of time. Let's do that. There are times when a PRD makes complete sense, but you can go build a prototype. And the other thing is features. 
Like, it's very easy now (this wasn't as much of a problem before) to get feature happy. You're like, oh, I'll just add this, it's just one prompt. I'll just add this thing, I'll add this thing. And so taste is really important. Like, what are we trying to do here? What does good feel like? And good is not, oh, you have every single knob in the world, because you can do that now. It's that it all works together really nicely. And so the people who do really well at it are able to move that fast, because fundamentally you can just move so much faster now. You can think of the user, who could be a literal customer or could be an internal user, but you're thinking from their perspective: what does good feel like? And you're just constantly asking, where can I use AI to make this better? A good example of this is actually a story from my wife, where she had all these spreadsheets. She does marketing. She was trying to figure out events to focus on. And before, she would try to scan them or just try to figure out how to use Excel in a way that was good enough to be able to get an answer. And she was like, wait, I have Claude. So she went and built this, actually, a website. She doesn't know how to write a line of code. She's never written a line of code before. She built an entire website and shared it with the marketing team, where they could click into things, and it had graphics. And I was like, wait, you built this? Like, it blew my mind that that was possible. So I think everyone will become one. And I think that's one thing we're seeing, is that it's not just an eng thing. So that's one aspect, which is that every single person should be using these tools. And the thing is, actually, most of the value is going to come from the engineers. I think a lot of them get it. The different layers, there's a lot to learn. 
But take someone who always has all these ideas and has always wanted to build something, or they're like, oh man, if only I just had an engineer assigned to me so I could have them build this one-off thing. Now they can just go do it. And so that's where we're seeing a ton of change. That's one. Two, on the eng side, focusing on our team, we write a lot of specs. We review those specs. We actually find the architecture, both on the software level and actually on every level, is the thing that matters the most. If we can define an interface and you can define acceptance criteria for that interface, I'm like 98% confident that the agent will be able to build something that solves for that interface with the acceptance criteria. If it's wrong, it's because it wasn't clear: the interface wasn't good, or the acceptance criteria were missing things. In which case, that's a design problem. And so anyway, I could go way deeper, but those are some things that are top of mind. Yeah, it tracks with what I'm seeing as well. I think the non-technical example is kind of interesting because it does mean, you know, I think for core product, the temptation when you can build everything faster is to build everything. And as you highlight, that's not necessarily a better experience. But when it comes to tools for yourself, it's phenomenal. And to go back to your customer service agent or something like that, maybe you have a real customer service person, but they're building themselves an agent to help them do their work. So I'm kind of curious, are there ways in which the context engine that you want to expose to people should be different for those non-technical use cases? We talked a lot about, okay, we're building a product or we're building a customer service agent that's going to run autonomously. But is there an internal tools context engine? What would that look like? How does that vary? 
A lot of our internal Slack bots, especially the new ones coming out, are all built on top of context engines. We had one example of a bot, actually this is a personal bot someone built for themselves. They would ask questions about their calendar on their Google Cal. And they'd have all these API calls that they hadn't made. And they'd ask questions like, what are my most important meetings this week? And, oh, when did I meet with this person last? And they'd ask these questions and it would take 20 minutes, because of the response time, and because the Google API is not made for the sets of queries you'd ask. So then they actually had the agent build a sync thing so that every time their calendar changes, within a few seconds it just syncs it into Redis. And then they put a context surface on top and they were able to connect to it that way. So that's one example. For internal apps, I think we will see something. It's going to look different, because a lot of what the Redis style is today is much more oriented towards people building with some sort of scale in mind beyond a single user. I think that you can use it, and a lot of people are using it, in a kind of one-off way: oh, I have my own Redis database. And that's just because the data structures are so nice, because it's so easy to reason about, because of the interface, where you can just go write commands. Like, you can go write the wire format in English, you know. It makes a lot of things easier and really fast. So I think people are doing it that way, but I do think it will look different. I think the thing that will be really unique, and something that we are thinking about, is what happens if everything looks like a coding agent. Like, OpenClaw is essentially a coding agent. That was the whole idea. It's like, if we give a WhatsApp interface to a coding agent, it's essentially a generic, general-purpose agent. And really then you start thinking, okay, well, what do we need? We need workers, cloud workers. 
We need a cloud file system. We need the ability to pull data from different places and make that data accessible. So there's something there. But I think that it's still early days. I get super excited just thinking about what's possible. Yeah. If we can solve the data sandboxing and data access problem, because, like, the big challenge with OpenClaw is, okay, you put this on your box, and now it has access to everything. But an agent that can write code and run code is a general-purpose agent. It can do anything. And if we solve the data layer around that, so that it can be safe and somebody doesn't have to think too much about it, that's beautiful. So with that concept in mind, then, are there particular frameworks for building coding agents that you see going forward? Or, you know, if everybody's going to build their own personalized coding agent, what are the tools they're going to use to do that? So on the agent framework side, I think the ones that are going to be most successful are the ones that work best with the ecosystem. I think that's really what it comes down to, because what we're finding is that to make these agents successful, they need to interface with the world, and agent frameworks that make it easier to either extensibly do that or have built-in things to do that make them much more powerful. For example (this isn't true, I'll use a contrived example to show something that would not succeed), let's say someone's like, MCP sucks, we're just not going to support it, we're going to build our own version of MCP. They might be able to build something that, whatever, is more efficient or better in some way, but it doesn't matter. The thing that's unique about MCP and valuable about MCP is the standardization. Everyone's doing it. And so what matters more right now is standards, because if we can build so fast, the only thing you want to avoid is having these proprietary blocks in front of everything. 
Make everything speak standards, and, it doesn't have to be standards, it's like, make it clear how to contractually touch all the things. And then I can think of and build any agent I want. So in terms of the frameworks themselves, I think the ones that are most extensible will be most successful. And we see LangGraph a lot. Obviously super powerful, a very early player in the space. They have a deep agents framework, which I think is really interesting. We see ADK a lot, obviously. A lot of the cloud providers are releasing their own solutions, and ADK is really good. I like ADK a lot, actually. There's some kind of more off ones, but I like one called Agno, for example. That's another agent framework. But in short, it matters, but I think it matters less than people think. I think as engineers, we get so caught up in our tools. I talk to people who, with their coding agent setup, have this crazy setup that they put together. I'm like, yeah, man, I just have like 100 tmux windows. I switch between them in this way. I have the most simple ad hoc setup because it's what works for me. I don't need any... In fact, I didn't use Cursor for way too long because I didn't want to switch to an IDE. I was like, I use Vim. I was so happy when the CLI-based versions caught up because, yeah, I was the same way. I'm a longtime Vim user, and Cursor, like, you had to use it because it was at the bleeding edge. It was the best. And even though it sucked, it was still the best, but it's not the best anymore. And so I'm happy back down in my also tmux-enabled, like, whatever. I have a bunch of worktrees and a bunch of tmux sessions. I'm like, go. Same. And it's funny, because I also haven't written a line of code in maybe almost a year now, which is crazy for me to even imagine. But why would I? It's lower leverage. What's funny is I also feel like I've become a better engineer at the same time because I'm no longer caught up. 
I feel like the skill of an engineer was almost, can you Google-fu your way into some weird random library to solve this weird random error, or kind of dig deep through the internals of a code base and find this random bug. And I almost feel like that's not that important anymore, because the agent can go do that. The thing the agent's really bad at is maintaining a consistent architecture that scales well and makes sense across the code base over time, especially with the backdrop of: I can put out 100,000 lines of code in a week by myself, times that by the team size at Redis. And it's actually easy to almost overwhelm the system with productivity. It's not meant to handle that. And so the issues actually move downstream. And this comes back to this question of, how are you all adapting, right? Because once again, reviewing 100,000 lines of code in a week times N engineers is incredibly hard. So now your reviews maybe have to move upstream to the design or something like that. I don't know. How are you all handling review and the feedback loop there? So there's a few pieces of this. One, it depends on the code base, actually. I think different languages and different severities matter. With an internal tool, you can get away with a lot more, and Redis core is different. It's a different beast. That said, the way I think about it is, one, we will do design sessions. We do them every day. I don't like standups, so we just do standups async, but the design sessions I like doing sync. Every day someone comes up and says, this is the system I'm building, this is the project I'm working on. They'll show me the interfaces. They'll show me how it's all going to work together. I make sure that we really deeply understand what we're going to do and what the acceptance criteria are for all of it. If that's good, then I'm like, yeah, just go manage some agents to do it. 
As long as we all agree on what we're doing and it makes sense, then your job kind of becomes, okay, now go manage the agents, go do it. So the first piece is that the engineering rigor is much higher. Another example of this is, it used to be that someone would make a PR and it's not exactly what I want. It's not bad, but it's not what I would have done. I don't really like it. And I might even tell them, and they might be like, yeah, you're right, do you want me to go rewrite this? And you're just like, we have so much to do, this is fine. And that doesn't happen anymore. Now I'm like, I mean, the code's cheap. Throw it all away, change the spec, and regenerate it. Or, if I keep it, say that you want the same behavior, but you want it to work this way, and then figure out the path to get there. So that's one. Two is tests. Behavior tests are the most important thing. A lot of times when I start, I will start with: what is the golden path, both on the error cases and the happy cases, that would show with high certainty that this thing is going to do what it's supposed to do end to end? So I write a lot of end-to-end tests, and those are the ones I manually look at and make sure are good on every PR. I'm like, okay, what do we know works? Did we miss anything? And then obviously I still expect test coverage to be high. I expect a lot of integration tests, a lot of unit tests, a lot of fuzzing on parameters, stuff like that. But I think the flow has changed, where I almost would go as far as to say, if the behavior tests are correct and they pass, then the code is right. And if you're like, well, how about the speed, and how about this? It's like, well, did you have a behavior test for it? If those are important, they should be captured in the behavioral tests, right? Yeah. They should be captured. Yeah. Because then it's doing exactly what it's designed to do. So all that matters is the behavior tests. 
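To make the "interface plus acceptance criteria" workflow concrete, here is a small sketch. The interface is written first, the behavior tests encode the golden path (happy and error cases) against that interface only, and any implementation, agent-written or not, is accepted if it passes. `CalendarIndex`, `Event`, and `check_acceptance` are hypothetical names loosely borrowed from the calendar example earlier in the conversation, not code from Redis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Protocol


@dataclass(frozen=True)
class Event:
    event_id: str
    attendee: str
    start: int  # epoch seconds


class CalendarIndex(Protocol):
    """The agreed interface: this is what gets reviewed in the design session."""

    def upsert(self, event: Event) -> None: ...

    def last_meeting_with(self, attendee: str, before: int) -> Optional[Event]: ...


def check_acceptance(make_index: Callable[[], CalendarIndex]) -> None:
    """Behavior tests written against the interface, not an implementation."""
    index = make_index()
    index.upsert(Event("a", "sam", 100))
    index.upsert(Event("b", "sam", 200))

    # Happy path: the most recent meeting strictly before the cutoff wins.
    found = index.last_meeting_with("sam", before=250)
    assert found is not None and found.event_id == "b"

    # Updating an event replaces it instead of duplicating it.
    index.upsert(Event("b", "sam", 300))
    found = index.last_meeting_with("sam", before=250)
    assert found is not None and found.event_id == "a"

    # Error cases: unknown attendee returns None, empty id is rejected.
    assert index.last_meeting_with("nobody", before=250) is None
    try:
        index.upsert(Event("", "sam", 1))
    except ValueError:
        pass
    else:
        raise AssertionError("empty event_id must be rejected")


class InMemoryCalendarIndex:
    """One candidate implementation; it could be thrown away and regenerated."""

    def __init__(self) -> None:
        self._events: Dict[str, Event] = {}

    def upsert(self, event: Event) -> None:
        if not event.event_id:
            raise ValueError("event_id must be non-empty")
        self._events[event.event_id] = event

    def last_meeting_with(self, attendee: str, before: int) -> Optional[Event]:
        candidates = [e for e in self._events.values()
                      if e.attendee == attendee and e.start < before]
        return max(candidates, key=lambda e: e.start, default=None)


check_acceptance(InMemoryCalendarIndex)
```

If something like latency matters, under this approach it would be added to `check_acceptance` as another behavior rather than checked by reading the implementation.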
So I'm with you, but I'm going to bring it back to a slightly different thing, which you talked about. The really challenging thing, the really high engineering thing here, is how do you keep a system architecturally coherent, maintainable, and evolvable over time? I have seen agents build things that completely match my behavioral tests and yet create all sorts of entangling of concerns that are going to make it very hard to do things. So how do you square that piece of the challenge of the code base? Yeah. So it's almost like there's three steps. Step one is the architecture review, and we do that synchronously. The other thing that really benefits, which a lot of people don't talk about, is that really good engineers who embrace this fully are ridiculously productive right now. But there are a lot of people who are trying. I see them, they're embracing it, but they're not able to hit the same layer. I didn't even realize how big the separation can get until I saw it. So a lot of the design sessions are, one, to make sure that everyone's looked at the architecture, that it makes sense, that we didn't miss anything. Two, it kind of works as knowledge osmosis, which is really important. But anyway, the first piece is architecture. So I'm assuming that by the time we get to the behavior tests, we've already decided on the architecture, and I'm expecting to see the same interfaces that we talked about, and the spec, which we have written and looked at, which is, by the way, in plain English and not super long. It's long enough, but a lot of people end up implementing the thing in the spec, which is not the point. The spec should be readable by a human and reviewable by a human. And then what's funny is, I used to still treat code review with the same rigor I always had. And then what I would find is that BugBot and similar products would actually find things that I missed. And then it just kept happening. 
And I, you know, I spent a lot of time code reviewing it. Some of the things it finds are like, oh, there's this race condition. And I looked at how it figured out what the race condition was, and I was just like, I give up. There's no world where I would have been able to put all those things together to catch this race, which, by the way, would like never actually be triggered, but it's there. Whereas this is, like, crazy. It is real. Yeah, no, I agree. Bugbot in particular is shockingly good. I don't use the Cursor IDE anymore, but I would not survive without Bugbot. It's phenomenal. But yeah, so I think that is really interesting. And I think one of the things that is really important there, that you're calling out, is the fact that these things writing the code has not lowered the requirement on engineering. If anything, it's increased it. And you're doing these very synchronous discussions and debates. You slow down in some areas in order to be able to speed up and just generate code quickly. Yeah. The trick is, how do you do that? And I end up in a waterfall-like system, which some people push back on. And the idea is that we come back and edit the specs. We don't always get them right. And sometimes, going through it, you can actually get better. You can iterate on them. A lot of times the size of the work is not one thing. It would take a few things. So we kind of break them up into almost sub-specs, and there's a whole process there. But the cool thing is, for us at Redis, since we've gotten so good at this and we've embraced it fully, one thing that we do, and our leadership does, which is really amazing here, is that the default answer is yes on AI, in the sense that we try to enable everyone. Like, I talked to a company where it's like, yeah, we're just getting access. Some people have select access to, you know, this. 
And I'm like, you don't even realize how behind you are if you're not letting your engineers use this stuff. I understand, you know, why people might feel that, oh, we need to control this more. But it's night and day. So it lets us build amazing products really quickly. And because we learned how to do it early on, we haven't let the quality drop. That's the other thing. There's a difference between agentic-powered engineering and vibe coding. And there's nothing wrong with vibe coding. I mean, all my demos are vibe coded, but they're demos. I don't vibe code database code. Yeah, there's pace layers, right? There's, okay, this thing can be vibed, and if you don't like it or it breaks, throw it away and do it again. This thing, you've got to keep it working right. Yeah. Awesome. I think that's super helpful. We've covered a lot. We're getting close to the end of our time here. Is there anything we haven't talked about that you think we should discuss before we wrap? I think the biggest takeaway is that in this next phase, which nowadays could be another year or two, the big change is going to be the switch of systems from linear, RAG-style to context-based. So everyone's going to be building these context engines. And I think that that's just true. I mean, we at Redis are obviously enabling it and building a ton of awesome products to enable it. But I think that is the biggest shift. The "Attention Is All You Need" paper is what kind of got us here. And I think if you're building real products, you just assume that someone else is dealing with attention, and for you, context is all that matters. Context is all that matters. Love it.