An Overview of AI Safety

Higher-level overview of the current AI safety landscape:

13 - First Principles of AGI Safety with Richard Ngo

How Does Claude 4 Think? — Sholto Douglas & Trenton Bricken

Why AI alignment could be hard with modern deep learning

The True Story of How GPT-2 Became Maximally Lewd

An overview of 11 proposals for building safe advanced AI — AI Alignment Forum

Adversarial Machine Learning explained! | With examples.

AI Control: Improving Safety Despite Intentional Subversion — AI Alignment Forum

Zoom In: An Introduction to Circuits

Toy Models of Superposition

What is mechanistic interpretability? Neel Nanda explains.