I am an AI researcher interested in building general, open-ended training environments that make language models better at decision-making.
Currently, I'm pursuing a PhD co-advised by Jonas Geiping and Douwe Kiela.
In the past, I have done research at Meta, MATS, and Millennium.
I like honest feedback and enjoy technical discussions, so feel free to reach out.
Highlights
- 2025 Dec Training AI Co-Scientists Using Rubric Rewards
- 2025 Dec Scaling Open-Ended Reasoning To Predict the Future
- 2025 Dec Blog: How to Game the METR Plot
- 2025 Sep The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs
- 2025 Jun Pitfalls in Evaluating Language Model Forecasters
- 2025 Jun Measuring Belief Updates in Curious Agents
- 2025 Jun Answer Matching Outperforms Multiple Choice for Language Model Evaluations
- 2025 May Blog: Incorrect Baseline Evaluations Call into Question Recent LLM-RL Claims
- 2025 Feb Great Models Think Alike and this Undermines AI Oversight
- 2025 Feb Evaluating Algorithmic Reasoning with Counterexample Creation
- 2024 Mar The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
- 2024 Feb Corrective Machine Unlearning
- 2023 Oct Representation Engineering: A Top-Down Approach to AI Transparency
- 2023 Jun Proportional Aggregation of Preferences for Sequential Decision Making
- ...
- 2016 Mar Where it all started: Anastasi, an underwater settlement