I am a research scientist and founding member at Apollo Research, where I work on evaluating large language models for deception capabilities. Previously, as a MATS scholar working with Owain Evans, I evaluated out-of-context reasoning and co-discovered the Reversal Curse.

Recently, I worked on the GPT-4 "insider trading deception demo", presented to policymakers at the 2023 UK AI Safety Summit, and contributed to a benchmark of LLM situational awareness.

Active Research

I am currently working on:

Past Research