From Code Search to AI Agents: Inside Sourcegraph's Transformation with CTO Beyang Liu

47 min

•Jan 20, 20264 months ago

Summary

Sourcegraph CTO Beyang Liu discusses the company's evolution from code search to AI agents, highlighting their coding agent AMP which recently topped benchmarks for merged pull requests. The conversation explores the shift in software development toward AI agents, the challenges of non-deterministic programming, and concerns about US dependency on Chinese open-source AI models due to policy constraints.

Insights

AI agents represent a fundamental shift from deterministic programming to stochastic subroutines where developers abdicate correctness and logic to AI systems
The future of software development involves developers becoming orchestrators of multiple AI agents rather than writing code line-by-line
US policy around AI safety may be inadvertently creating competitive disadvantages in open-source AI models, with Chinese models dominating the agentic AI space
Different AI models excel at different points on the Pareto frontier of intelligence vs speed, requiring specialized agents for different tasks
Code review interfaces need fundamental redesign as AI agents generate code at unprecedented rates, making traditional file-by-file review obsolete

Trends

Shift from monolithic AI models to specialized agent architectures with different models for different tasksGrowing dominance of Chinese open-source models in agentic AI applicationsPost-training on open-source models becoming standard practice for product companiesAdvertisement-supported AI coding tools emerging as viable business modelsDeveloper workflow transformation from code writing to AI orchestration and code reviewFlattening of model capabilities leading to increased competition at the model layerState-by-state AI regulation creating compliance complexity for startupsOpen-source model development being constrained by copyright and liability concernsAI coding tools reaching 90%+ code generation rates for experienced developersEmergence of hybrid pricing models combining usage-based and ad-supported tiers

Topics

AI Coding Agents Code Search Technology Open Source AI Models Chinese AI Model Dependency AI Safety Policy Software Development Workflow Transformation Agent Architecture Design Model Post-Training AI Evaluation Methodologies Code Review Interface Design Agentic Tool Use AI Model Pricing Strategies Regulatory Compliance for AI Developer Productivity Tools AI Model Benchmarking

Companies

People

Quotes

"This is the first time in computer science I can think of where we've actually abdicated, like, correctness and logic to us. Like, in the past it was a resource, right? So maybe the performance is different, maybe the availability is different, but, like, whatever I put in, I'm going to get back out. But now we're like, figure out this problem for me."

Martin Casado

"You talk to some devs and they're like, you know, I've never been more productive. But coding isn't fun anymore. That's one of the things that we're trying to solve for."

Beyang Liu

"The United States invented the AI revolution. We built the chips, trained the frontier models and created the entire ecosystem. For right now, if you're a startup building AI products, you're probably writing your code on Chinese models."

Narrator

"By sheer lines of code volume, probably more than 90% of the code that I write these days is through amp. And I think it's only going to get higher and higher level over time."

Beyang Liu

"The best thing we can do is to take a step back and let the free market function. And so to that end, ensuring there's kind of like a standard nationwide set of regulations that's clear and well specified to be going after specific applications and application areas rather than you know, like general, you know, existential risk at the model layer."

Beyang Liu

Full Transcript

4 Speakers

Speaker A

This is the first time in computer science I can think of where we've actually abdicated, like, correctness and logic to us. Like, in the past it was a resource, right? So maybe the performance is different, maybe the availability is different, but, like, whatever I put in, I'm going to get back out. But now we're like, figure out this problem for me.