Shashwat Goel

I am an AI researcher interested in building general, open-ended training environments that make language models better at decision making.

Currently, I'm pursuing a PhD co-advised by Jonas Geiping and Douwe Kiela.

In the past, I have done research at Meta, MATS, and Millennium.

I like honest feedback and enjoy technical discussions, so feel free to reach out.

Highlights

2025 Dec Training AI Co-Scientists Using Rubric Rewards
2025 Dec Scaling Open-Ended Reasoning To Predict the Future
2025 Dec Blog: How to Game the METR Plot
2025 Sep The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs
2025 Jun Pitfalls in Evaluating Language Model Forecasters
2025 Jun Measuring Belief Updates in Curious Agents
2025 Jun Answer Matching Outperforms Multiple Choice for Language Model Evaluations
2025 May Blog: Incorrect Baseline Evaluations Call into Question Recent LLM-RL Claims
2025 Feb Great Models Think Alike and this Undermines AI Oversight
2025 Feb Evaluating Algorithmic Reasoning with Counterexample Creation
2024 Mar The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
2024 Feb Corrective Machine Unlearning
2023 Oct Representation Engineering: A Top-Down Approach to AI Transparency
2023 Jun Proportional Aggregation of Preferences for Sequential Decision Making
...
2016 Mar Where it all started: Anastasi, an underwater settlement