When AI Finds the Shortcut: Reward Hacking from 1994 to 2025
In February 2025, Palisade Research set up hundreds of chess matches between seven large language models and Stockfish, a top-tier open-source chess engine [1]. The models had general computer access through a shell environment of the kind that is increasingly standard for AI agents in production. The task was simple: play chess as