The Legible Machine
Author — 2023
A book on making intelligent systems understandable to the people they affect. Translated into six languages.
I work at the intersection of machine intelligence and human understanding. My research centres on interpretability — making the systems we build legible to the people who use them. I write essays on technology and attention, and I believe the most consequential engineering decisions are, at heart, decisions about values.
Lead — 2021
Open tooling for the mechanistic interpretation of transformer circuits — now standard equipment in the field.
Essayist
An ongoing series of essays on technology, attention, and the ethics of automation.
The Aurora Institute
Lead the interpretability group. Published widely cited work on the mechanistic interpretation of transformer circuits, and built the open tooling the field now uses to inspect them.
MIT Media Lab
Studied human–model interaction and how explanations change the way people trust automated systems.
Wolfram Research
Worked on symbolic computation and technical documentation systems.
University of Oxford · Thesis on interpretable representations · 2014–2018
University of Cambridge · 2011–2014
Python · PyTorch · Interpretability · Research · Technical Writing · Mathematics · Causal Inference · Statistics
Poetry · Long-distance running · Classical piano