High-level overview of the current AI Safety landscape:
13 - First Principles of AGI Safety with Richard Ngo
How Does Claude 4 Think? — Sholto Douglas & Trenton Bricken
Why AI alignment could be hard with modern deep learning
Introductions to various sub-fields of AI Safety:
The True Story of How GPT-2 Became Maximally Lewd
An overview of 11 proposals for building safe advanced AI — AI Alignment Forum
Adversarial Machine Learning explained! | With examples.
AI Control: Improving Safety Despite Intentional Subversion — AI Alignment Forum