Two days left to apply for the Inference-Time Compute Hackathon, hosted by Cognition, @mercor_ai, @Etched, and @AnthropicAI, where you get:
• 8x H100s per team
• $100k+ in prizes
• Dedicated Agents Track
Build something that pushes the frontier.
Mercor
289 posts
Organizing human intelligence to power the AI economy.
- Agents are only as good as the environments behind them. At Mercor, we've built deep expertise in the realistic, economically-grounded environments that help agents bridge the gap from the lab to real-world usefulness. We want to put that expertise to work for the broader
- Mercor reposted: Claude Fable 5's progress in coding (APEX SWE) dramatically outpaced progress in other domains like finance, law, and consulting (APEX Agents). The top reason for the fast progress in coding is that we have GitHub, with over 28 million repositories of human-written code. We're
- Replying to @mercor_ai: Sign up for the APEX-Agents newsletter: mercor.com/apex Download the APEX-Agents dataset: huggingface.co/datasets/merco… Open-source infra + eval service (Archipelago): github.com/Mercor-Intelli… Technical report: arxiv.org/abs/2601.14242
- Replying to @mercor_ai: APEX-Agents domain breakdown for Claude Fable 5 (Max), Pass@1: Corporate Law: 40.9% (1st); Investment Banking: 47.7% (2nd); Management Consulting: 46.4% (2nd). With 4 runs, Fable solved 246 of 480 tasks, including 8 that no other model has solved. All 8 tasks are in Law,
- Claude Fable 5 places 2nd on APEX-Agents leaderboard. @claudeai Fable 5 (Max) scores 45.0% Pass@1, behind Gemini 3.5 Flash (49.6%) and ahead of Claude Opus 4.8 (42.5%). Fable 5 reached 2nd overall while spending far fewer tokens. It used 70% less than Gemini 3.5 Flash and 37%
- Mercor reposted: What are the top AI researchers saying about the future of human work? @BrendanFoody, co-founder and CEO of Mercor: "People often have this misconception that we won't need data in three years because we'll have super intelligence... AI better than humans at absolutely
- Mercor's average pay rate just surpassed $100 / hour. We're mobilizing tens of thousands of top software engineers, bankers, lawyers, and doctors to build the next generation of AI models. Every top economist believes that there will be more jobs in 10 years than there are
- Mercor reposted: Claude Fable 5's progress on hillclimbing APEX-SWE is accelerating exponentially. While other models focus on reasoning over a code base, Claude has unparalleled results at reasoning over Linear tickets, observability logs, Slack messages, and Google Drive files alongside the
- Replying to @mercor_ai: The pattern is higher leverage, not more effort. Fable 5 spends less of its budget searching and more validating, testing as it goes and catching issues during edits rather than after. It rewrites less too: 3.8 edit iterations versus 6.0 for Opus 4.8. The result: 10% fewer
- Replying to @mercor_ai: What changed is how Fable 5 works. Where Opus 4.8 reads code one file at a time, Fable 5 investigates in parallel: 10.4 parallel tool calls per trajectory versus zero for Opus 4.8. It searches code, reads files, runs tests, and queries logs at once, instead of chasing one
- Claude Fable 5 takes #1 on APEX-SWE: 65.5% Pass@1 overall. It scores ~18pp higher than Opus 4.8. We tested @claudeai Fable 5 on APEX-SWE which measures whether AI models can do real software engineering work. Fable 5 tops our two APEX-SWE categories: - Integration: 61.3% -
- Mercor reposted: AI progress requires (1) compute, (2) algorithms, and (3) data. - The leading compute company is worth $5 trillion. - The leading model company is worth $1 trillion. - @mercor_ai is the leading data company and is currently valued orders of magnitude lower. There's an










