stochasm
Arcee.ai
10.9K posts
@stochasticchasm
pretraining lead @arcee_ai • 25 • opinions my own
🌖
stochasm.blog
Joined August 2024
1,711
Following
6,717
Followers
  • stochasm
    Arcee.ai
    @stochasticchasm
    Mar 2, 2025
    kalomaze
    Prime Intellect
    @kalomaze
    Mar 2, 2025
    i'm not "cracked" i'm structurally solid. this egg is tough. my shell won't break. it's my armor
    61K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Oct 31, 2025
    Replying to @scaling01
    Damn he got the modded 4090s
    250K
  • stochasm
    Arcee.ai
    @stochasticchasm
    May 9, 2025
    Single thread to learn CUDA? Seems inefficient…
    16K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Dec 2, 2024
    >be me
    >scaling law supervisor
    >in charge of making sure the models do, in fact, scale
    >occasionally have to burn a billion dollars to check if the scaling laws still hold
    >one day i go to work and the benchmarks are no longer scaling
    >distress.jpg
    >ask my boss what to do
    18K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Apr 14, 2025
    Pass@8192 is a crazy metric
    Jia Li
    @JiaLi52524397
    Apr 14, 2025
    We believe formal math is the future. 🔥Introducing Kimina-Prover Preview, a Numina & @Kimi_Moonshot collaboration, the first large formal reasoning model for Lean 4, achieving 80.78% miniF2F. github.com/MoonshotAI/Kim…
    41K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Jan 6, 2025
    Another win for physics of language models (part 3.3)
    Tanishq Mathew Abraham, Ph.D.
    @iScienceLuvr
    Jan 6, 2025
    Metadata Conditioning Accelerates Language Model Pre-training "MeCo first provides metadata (e.g., URLs like en.wikipedia.org) alongside the text during training and later uses a cooldown phase with only the standard text, thereby enabling the model to function normally
    27K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Jun 4, 2025
    When you know it’s gonna be an interesting arch paper
    192K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Dec 20, 2024
    Replying to @basedjensen
    Actually one day you won’t be able to rinse and repeat, crazy to think about
    30K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Dec 2, 2024
    Replying to @stochasticchasm
    >he says “just scale up the model again”
    >i say “how”
    >he says “i don’t know, you’re the supervisor”
    >rage.jpg
    >quit my job
    >become a neurosymbolic model supervisor
    >first day on the job, check the scaling plots
    >it scales
    1.9K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Oct 21, 2025
    You can just train things
    Rota 🚪🧎‍♂️
    @pli_cachete
    Oct 20, 2025
    Pack it in boys
    27K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Mar 6, 2025
    Why is MCP stuff all over my timeline all of a sudden
    18K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Mar 19, 2025
    They totally didn't compile it
    27K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Nov 15, 2024
    first blog post! around 2000 words, link in replies. first time writing something like this
    24K
  • stochasm
    Arcee.ai
    @stochasticchasm
    Jan 10, 2025
    Replying to @jxmnop
    Well I feel like it’s understandable by the fact that you get more training signal from matching a probability distribution than matching a one-hot vector: less zero outputs means less zero gradients and you’ll get more training signal. You’re kinda using the large model to get
    21K
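
Editor's sketch of the comparison in the reply above (soft teacher targets versus a one-hot label), using only the Python standard library. The student logits and teacher logits are hypothetical numbers chosen for illustration and do not come from the thread. The sketch relies on one standard fact: the gradient of cross-entropy with respect to the logits is `softmax(logits) - target`. With a one-hot target, every wrong class is pushed down by the same rule, whereas a teacher distribution also tells the student how the wrong classes rank against each other.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, logits):
    # H(target, softmax(logits)); target may be one-hot or a soft distribution.
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def grad_wrt_logits(target, logits):
    # Gradient of the cross-entropy above w.r.t. the logits: softmax(logits) - target.
    return [p - t for p, t in zip(softmax(logits), target)]

# Hypothetical values, for illustration only.
student_logits = [0.0, 0.0, 0.0]        # untrained student: uniform prediction
one_hot = [1.0, 0.0, 0.0]               # hard label: class 0
teacher = softmax([2.0, 1.0, -1.0])     # assumed teacher output over 3 classes

hard_grad = grad_wrt_logits(one_hot, student_logits)
soft_grad = grad_wrt_logits(teacher, student_logits)

print("hard-label gradient:", [round(g, 3) for g in hard_grad])
print("soft-label gradient:", [round(g, 3) for g in soft_grad])
```

In this toy setup the hard-label gradient is identical for classes 1 and 2, while the soft-label gradient differs between them, which is one way to read "more training signal" per example.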