Why long RL runs collapse and how soft resets fix it. Analysis with examples from MAI-Thinking-1 and BFS-Prover-V2.
A reality check on "AI solved an open Erdős problem" claims, with case studies on what AI actually contributed.
A brief overview of my research interests and current directions.