Mercor
293 posts
Mercor
@mercor_ai
Organizing human intelligence to power the AI economy.
San Francisco
mercor.com/careers
Joined April 2021
27 Following
20K Followers
  • Mercor reposted
    Dwarkesh Patel
    @dwarkesh_sp
    Jun 19
    Narration: the data efficiency black hole.
    00:00:00 – What is really driving AI progress?
    00:03:11 – Comparing human vs AI sample efficiency
    00:08:46 – Does sample efficiency matter?
    Also on pod and YouTube feed.
    174K
  • Mercor reposted
    Brendan (can/do)
    Mercor
    @BrendanFoody
    Jun 19
    72% of users on the Mercor platform come from referrals. What other platforms have a higher referral rate?
    13K
  • Mercor reposted
    Brendan (can/do)
    Mercor
    @BrendanFoody
    Jun 15
    Organizing human intelligence is the most important problem in society. Every enterprise will need humans to constantly refine their system of record for evaluating and specifying agent behavior. As AI becomes more powerful, humans will become more important not less.
    Satya Nadella
    Microsoft
    @satyanadella
    Jun 14
    Article
    A frontier without an ecosystem is not stable
    I’ve been thinking a lot about the future of the firm in an AI-driven economy. This transition is different than any previous platform shift. In the past, we used digital systems to enhance human...
    24K
  • Mercor reposted
    Brendan (can/do)
    Mercor
    @BrendanFoody
    Jun 14
    A year ago, I predicted that we would enter The Era of Evals, but it's now happening much faster than I anticipated. Frontier Labs have scaled their Eval production with us by more than 10X in the last 12 months, on what was already a 9-figure base. Every tech-forward
    Brendan (can/do)
    Mercor
    @BrendanFoody
    Jun 30, 2025
    Mercor (@mercor_ai) is now working with 6 out of the Magnificent 7, all of the top 5 AI labs, and most of the top application layer companies. One trend is common across every customer: we are entering The Era of Evals. RL is becoming so effective that models will be able to
    64K
  • Mercor reposted
    Cognition
    @cognition
    Jun 11
    Two days left to apply for the Inference-Time Compute Hackathon, hosted by Cognition, @mercor_ai, @Etched, and @AnthropicAI, where you get:
      • 8x H100s per team
      • $100k+ in prizes
      • Dedicated Agents Track
    Build something that pushes the frontier.
    30K
  • Mercor
    @mercor_ai
    Jun 11
    Agents are only as good as the environments behind them. At Mercor, we've built deep expertise in the realistic, economically-grounded environments that help agents bridge the gap from the lab to real-world usefulness. We want to put that expertise to work for the broader
    16K
    Mercor
    @mercor_ai
    Jun 11
    Read more on @huggingface:
    The Open Source Community is backing OpenEnv for Agentic RL
    From huggingface.co
    1K
  • Mercor reposted
    Brendan (can/do)
    Mercor
    @BrendanFoody
    Jun 11
    Claude Fable 5's progress in coding (APEX SWE) dramatically outpaced progress in other domains like finance, law, and consulting (APEX Agents). The top reason for the fast progress in coding is that we have GitHub, with over 28 million repositories of human-written code. We're
    51K
  • Mercor
    @mercor_ai
    Jun 11
    Claude Fable 5 places 2nd on APEX-Agents leaderboard
    @claudeai Fable 5 (Max) scores 45.0% Pass@1, behind Gemini 3.5 Flash (49.6%) and ahead of Claude Opus 4.8 (42.5%). Fable 5 reached 2nd overall while spending far fewer tokens. It used 70% less than Gemini 3.5 Flash and 37%
    APEX-Agents | Claude Fable 5
    58K
    Mercor
    @mercor_ai
    Jun 11
    APEX-Agents domain breakdown for Claude Fable 5 (Max), Pass@1:
      Corporate Law: 40.9% (1st)
      Investment Banking: 47.7% (2nd)
      Management Consulting: 46.4% (2nd)
    With 4 runs, Fable solved 246 of 480 tasks, including 8 that no other model has solved. All 8 tasks are in Law,
    APEX-Agents | Claude Fable 5
    4.6K
    Mercor
    @mercor_ai
    Jun 11
    Sign up for the APEX-Agents newsletter: mercor.com/apex
    Download the APEX-Agents dataset: huggingface.co/datasets/merco…
    Open-source infra + eval service (Archipelago): github.com/Mercor-Intelli…
    Technical report: arxiv.org/abs/2601.14242
    APEX - Mercor
    APEX Benchmarks: The AI Productivity Index | Mercor
    From mercor.com
    4K
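A note on the two kinds of numbers in the APEX-Agents thread above: the posts quote a Pass@1 score (45.0%) and, separately, a count of tasks solved "with 4 runs" (246 of 480). The posts do not spell out how either is computed. The sketch below assumes the common reading, where Pass@1 is the average per-run success rate across tasks and "solved" means passing in at least one run. The function names and the toy results are hypothetical and are not taken from Mercor's evaluation code.

```python
from typing import Dict, List


def pass_at_1(results: Dict[str, List[bool]]) -> float:
    """Average, over tasks, of the fraction of runs that passed.

    Assumption: this is the usual pass@k estimate specialised to k = 1,
    not necessarily the exact scoring used on the APEX leaderboards.
    """
    per_task = [sum(runs) / len(runs) for runs in results.values()]
    return sum(per_task) / len(per_task)


def solved_at_least_once(results: Dict[str, List[bool]]) -> int:
    """Number of tasks with at least one passing run."""
    return sum(1 for runs in results.values() if any(runs))


# Invented toy data: 4 tasks, 4 runs each (True = the run passed).
toy_results = {
    "task_a": [True, True, False, True],
    "task_b": [False, False, False, True],
    "task_c": [False, False, False, False],
    "task_d": [True, True, True, True],
}

print(f"Pass@1: {pass_at_1(toy_results):.1%}")  # Pass@1: 50.0%
print(f"Solved at least once: {solved_at_least_once(toy_results)} of {len(toy_results)}")
```

Under this assumed reading, the share of tasks solved at least once (246 of 480 is about 51%) can sit above the Pass@1 score, because a task that passes in only one of four runs counts fully toward the first number but only a quarter toward the second.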
  • Mercor reposted
    MTS
    @MTSlive
    Jun 9
    What are the top AI researchers saying about the future of human work? @BrendanFoody, co-founder and CEO of Mercor: "People often have this misconception that we won't need data in three years because we'll have super intelligence... AI better than humans at absolutely
    Brendan (can/do)
    Mercor
    @BrendanFoody
    May 6
    Mercor's average pay rate just surpassed $100 / hour. We're mobilizing tens of thousands of top software engineers, bankers, lawyers, and doctors to build the next generation of AI models. Every top economist believes that there will be more jobs in 10 years than there are
    9K
  • Mercor reposted
    Brendan (can/do)
    Mercor
    @BrendanFoody
    Jun 9
    Claude Fable 5's progress on hillclimbing APEX-SWE is accelerating exponentially. While other models focus on reasoning over a code base, Claude has unparalleled results at reasoning over Linear tickets, observability logs, Slack messages, and Google Drive files alongside the
    Mercor
    @mercor_ai
    Jun 9
    Claude Fable 5 takes #1 on APEX-SWE: 65.5% Pass@1 overall. It scores ~18pp higher than Opus 4.8. We tested @claudeai Fable 5 on APEX-SWE which measures whether AI models can do real software engineering work. Fable 5 tops our two APEX-SWE categories: - Integration: 61.3% -
    APEX-SWE | Claude Fable 5
    18K
  • Mercor
    @mercor_ai
    Jun 9
    Claude Fable 5 takes #1 on APEX-SWE: 65.5% Pass@1 overall. It scores ~18pp higher than Opus 4.8. We tested @claudeai Fable 5 on APEX-SWE which measures whether AI models can do real software engineering work. Fable 5 tops our two APEX-SWE categories: - Integration: 61.3% -
    APEX-SWE | Claude Fable 5
    117K
    Mercor
    @mercor_ai
    Jun 9
    What changed is how Fable 5 works. Where Opus 4.8 reads code one file at a time, Fable 5 investigates in parallel: 10.4 parallel tool calls per trajectory versus zero for Opus 4.8. It searches code, reads files, runs tests, and queries logs at once, instead of chasing one
    4K
    Mercor
    @mercor_ai
    Jun 9
    The pattern is higher leverage, not more effort. Fable 5 spends less of its budget searching and more validating, testing as it goes and catching issues during edits rather than after. It rewrites less too: 3.8 edit iterations versus 6.0 for Opus 4.8. The result: 10% fewer
    APEX-SWE: AI Rankings for Software Engineering | Mercor
    From mercor.com
    3.1K
  • Mercor reposted
    Brendan (can/do)
    Mercor
    @BrendanFoody
    Jun 6
    AI progress requires (1) compute, (2) algorithms, and (3) data.
    - The leading compute company is worth $5 trillion.
    - The leading model company is worth $1 trillion.
    - @mercor_ai is the leading data company and is currently valued orders of magnitude lower.
    There's an
    77K
  • Mercor reposted
    Brendan (can/do)
    Mercor
    @BrendanFoody
    Jun 4
    I told @HarryStebbings on @20vcFund that @mercor_ai now spends more on tokens for our internal agents than we do on headcount. In a few years, every enterprise will too. We already see that the companies pulling ahead are running like AI labs: an eval for every workflow, agents
    48K

© 2026 X Corp.