Introductory Technical AI Safety Fellowship

Every semester and summer, AISST runs an 8-week introductory reading group on AI safety, covering topics such as neural network interpretability, learning from human feedback, goal misgeneralization in reinforcement learning agents, and eliciting latent knowledge. The fellowship meets weekly in small groups, with dinner provided and no additional work outside of meetings.

See here for the curriculum from last spring (subject to change).

Applications for spring 2025 have closed.

For people interested in AI policy or governance, we recommend our Policy Fellowship. It is possible to participate in both fellowships.

We also host joint AISST and MAIA workshops, where members and intro fellows discuss AI alignment and interact with researchers from Redwood Research, OpenAI, Anthropic, and more. Learn more here.