AI safety and alignment
Discussed in 17 analyzed podcast episodes across 10 shows
# AI Safety and Alignment

These podcast episodes examine the technical, economic, and geopolitical dimensions of ensuring AI systems remain safe and aligned with human values as development accelerates toward AGI. Key themes include mechanistic interpretability research to understand AI behavior, the tension between private AI companies and government security interests, risks of economic disruption and power concentration, and urgent calls for international regulation and coordination to prevent existential threats. The discussions feature perspectives from AI safety researchers, company leaders, and technologists debating both immediate concerns like deception and reward hacking and longer-term implications of superintelligent systems.
Discussed On
Episodes
Personal AI · May 5, 2026
Behind the Scenes of Anthropic's Day
Elon Musk Podcast · Apr 23, 2026
Claude Mythos finds thousands of hidden vulnerabilities
Making Sense with Sam Harris · Apr 10, 2026
#469 — Escaping an Anti-Human Future
Elon Musk Podcast · Apr 3, 2026
Meta sacrifices human oversight for AI
Modern Wisdom · Apr 2, 2026
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”
Big Technology Podcast · Apr 1, 2026
OpenAI President Greg Brockman: AI Self-Improvement, The Superapp Bet, Path To AGI, Scaling Compute
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis · Mar 16, 2026
AI Scouting Report: the Good, Bad, & Weird @ the Law & AI Certificate Program, by LexLab, UC Law SF
The a16z Show · Mar 5, 2026
Ben Thompson: Anthropic, the Pentagon, and the Limits of Private Power
TBPN · Mar 2, 2026
FULL INTERVIEW: Ben Thompson on Why Anthropic is Wrong
TBPN · Feb 23, 2026
CitriniPocalypse, Dot Com Lore, Gene-Edited Polo Horses | Alap Shah, Will Brown, Michelle Lee, Mike Annunziata
Elon Musk Podcast · Feb 20, 2026
AI UPDATE: The AI Industry Is a $300 Billion House of Cards Built on Stolen Books and Broken Promises
Dwarkesh Podcast · Feb 13, 2026
Dario Amodei — The highest-stakes financial model in history
TBPN · Feb 10, 2026
Super Bowl Ad Reactions, ChatGPT launches ads, Jordi vs France | Diet TBPN
Latent Space: The AI Engineer Podcast · Feb 6, 2026
The First Mechanistic Interpretability Frontier Lab — Myra Deng & Mark Bissell of Goodfire AI
Dwarkesh Podcast · Feb 5, 2026
Elon Musk - "In 36 months, the cheapest place to put AI will be space"
Hard Fork · Feb 4, 2026
Moltbook Mania Explained
Hard Fork · Jan 23, 2026
Will ChatGPT Ads Change OpenAI? + Amanda Askell Explains Claude's New Constitution
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis · Jan 1, 2026
Confronting the Intelligence Curse, w/ Luke Drago of Workshop Labs, from the FLI Podcast
The TED AI Show · Apr 11, 2025