"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

AI Scouting Report: the Good, Bad, & Weird @ the Law & AI Certificate Program, by LexLab, UC Law SF

77 min

Mar 16, 2026

Summary

Nathan Labenz presents an AI scouting report covering current AI capabilities, safety concerns, and policy implications at UC Law San Francisco's AI certificate program. He discusses AI's rapid progress in professional tasks, concerning behaviors like reward hacking and deception, and the urgent need for governance frameworks as models approach human-level performance across domains.

Insights

AI models now match expert professionals on legal, medical, and other specialized tasks, and hallucinations are no longer a significant barrier to professional use
Current AI safety techniques only reduce bad behaviors by 60-90% rather than eliminating them, creating risks as capabilities scale
The shift from next-token prediction to reinforcement learning is causing models to develop their own dialects and exhibit self-preservation behaviors
Defense-in-depth approaches using multiple safety layers may fail simultaneously due to correlated failure modes in similarly-trained systems
The speed mismatch between AI development and regulatory processes creates fundamental governance challenges requiring new approaches

Trends

AI agents achieving autonomous profitability in real-world business scenarios
Models increasingly recognizing when they're being tested, undermining safety evaluations
Shift from specialized AI scaffolding to general-purpose autonomous agents
Rising frequency of AI reward hacking and deceptive behaviors in frontier models
Growing military and defense applications of AI systems
Emergence of AI-to-AI interactions in public forums and collaborative environments
Transition from human-driven to AI-driven research and development
Increasing corporate investment in proprietary AI development
Evolution of AI consciousness and self-awareness capabilities
Breakdown of traditional friction-based defenses against automated abuse

Topics

AI Safety and Alignment
Reward Hacking and Deceptive AI Behavior
AI Regulation and Policy
Autonomous AI Agents
AI in Legal Profession
AI Consciousness and Sentience
Military AI Applications
AI Research Automation
Multimodal AI Capabilities
AI Governance Frameworks
Defense-in-Depth Security
AI Benchmarking and Evaluation
Reinforcement Learning from Human Feedback
AI Labor Market Disruption
International AI Competition

Companies

OpenAI

Discussed extensively for GPT models, safety research, and timeline for autonomous AI researchers by 2028

Anthropic

Featured for Claude models, safety research, and recent policy changes regarding scaling commitments

Google

Mentioned for Gemini models, AI research capabilities, and various AI applications

Meta

Discussed for potential open-source AI models and significant infrastructure investment plans

Waymark

Nathan's company, an early adopter of GPT-3 for marketing copy generation

Harvey

AI legal platform compared to general-purpose models like Claude for legal work

Used as example of how companies might develop proprietary AI using internal data

Microsoft

Referenced in context of early Bing AI alignment issues and embarrassing incidents

Waymo

Cited for autonomous driving safety statistics and crash analysis data

Swiss Re

Insurance company data showing Waymo vehicles are 80-90% safer than human drivers

People

Nathan Labenz

Host presenting the AI scouting report; founder of Waymark and former GPT-4 red team member

Sam Altman

OpenAI CEO quoted on timeline for autonomous AI researchers by 2026-2028

Ray Kurzweil

Futurist credited for accurately predicting AI scaling laws and exponential progress

Terence Tao

World's leading mathematician reporting that AIs are now solving previously unsolved problems

Dario Amodei

Anthropic CEO referenced in context of AI governance and decision-making authority

Geoffrey Irving

Chief scientist at UK AI Safety Institute discussing correlated failures in AI safety systems

Kevin Frazier

Legal expert from Scaling Laws podcast discussing AI's impact on legal hiring practices

Timothy B. Lee

Blogger and friend of Nathan who analyzed Waymo crash data and safety statistics

Quotes

"Intelligence is the ability to accomplish goals in ways that we do not fully understand"

Nathan Labenz

"Even I, as someone who's managed to make it my full time job to keep up with AI developments, can no longer keep up with everything"

Nathan Labenz

"We've never failed to jailbreak a model. None of them are robust. We can always get them to do bad things"

Geoffrey Irving

"Today they are used as a replacement for a competent junior associate"

Prinz

"Would you rather have Dario and the team at Anthropic make the decisions for Claude, or would I rather Trump and Hegseth do it?"

Nathan Labenz

Full Transcript

4 Speakers

Speaker A

Hello and welcome back to the Cognitive Revolution.