Michael Poli
440 posts
Michael Poli
@MichaelPoli6
AI, numerics and systems. Co-founder & Chief AI Scientist @RadicalNumerics.
zymrael.github.io
Joined August 2018
251
Following
3,422
Followers
  • Michael Poli
    @MichaelPoli6
    Mar 7, 2023
    Attention is great. Are there other operators that scale? Excited to share our work on Hyena, an alternative to attn that can learn on sequences *10x longer*, up to *100x faster* than optimized attn, by using implicit long convolutions & gating 📜 arxiv.org/abs/2302.10866 1/
    154K
  • Michael Poli
    @MichaelPoli6
    Feb 19, 2025
    [1/7] Introducing Evo 2, a new foundation model for biology. 🚀 Evo 2 is the largest-scale, fully open-source AI model ever released: 40 billion parameters, over 9 trillion tokens, and a 1 million context length. All the details are public: weights, data, training infrastructure,
    75K
  • Michael Poli
    @MichaelPoli6
    Mar 28, 2024
    📢 New research on mechanistic architecture design and scaling laws. - We perform the largest scaling laws analysis (500+ models, up to 7B) of beyond Transformer architectures to date - For the first time, we show that architecture performance on a set of isolated token
    126K
  • Michael Poli
    @MichaelPoli6
    Apr 27, 2020
    [1/4] Excited to share the first experimental release of *torchdyn* github.com/DiffEqML/torch…, a PyTorch library for all things neural differential equations! torchdyn is developed by the core DiffEqML team. @Massastrello @Diffeq_ml
  • Michael Poli
    @MichaelPoli6
    Aug 25, 2025
    Life update: I started Radical Numerics with Stefano Massaroli, Armin Thomas, Eric Nguyen, and a fantastic team of engineers and researchers. We are building the engine for recursive self‑improvement (RSI): AI that designs and refines AI, accelerating discovery across science and
    30K
  • Michael Poli
    @MichaelPoli6
    Sep 30, 2024
    This is what happens when a world-class team sits down and rethinks the way things are done, from architecture design to post-training. Today, we release three language models pushing the boundaries of quality and efficiency, with SOTA performance, minimal memory footprint, and
    Liquid AI
    @liquidai
    Sep 30, 2024
    Today we introduce Liquid Foundation Models (LFMs) to the world with the first series of our Language LFMs: A 1B, 3B, and a 40B model. (/n)
    35K
  • Michael Poli
    @MichaelPoli6
    Nov 14, 2024
    An absolute privilege to see our work on Evo🧬 highlighted on the cover of the latest issue of Science. Thank you to all the friends and collaborators at Stanford (@StanfordAILab) and the Arc Institute (@arcinstitute) @exnx @BrianHie @pdhsu @HazyResearch @StefanoErmon and more.
    Science Magazine
    @ScienceMagazine
    Nov 14, 2024
    A new Science study presents “Evo”, a machine learning model capable of decoding and designing DNA, RNA, and protein sequences, from molecular to genome scale, with unparalleled accuracy. Evo’s ability to predict, generate, and engineer entire genomic sequences could change the
    49K
  • Michael Poli
    @MichaelPoli6
    Dec 8, 2023
    We've been hard at work pushing the frontiers of efficient architecture design and optimization. StripedHyena-7B is the result: the first alternative architecture truly competitive with the best Transformers of its size or larger. And it's very fast.
    Together AI
    @togethercompute
    Dec 8, 2023
    Announcing StripedHyena 7B — an open source model using an architecture that goes beyond Transformers achieving faster performance and longer context. It builds on the lessons learned in past year designing efficient sequence modeling architectures. together.ai/blog/stripedhy…
    35K
  • Michael Poli
    @MichaelPoli6
    Jun 11, 2022
    Let us embark on a fractal journey about dynamical systems and neural implicit representations... 1/
    GIF
  • Michael Poli
    @MichaelPoli6
    Jun 8, 2023
    Hungry for more content on efficient long context models after @srush_nlp's awesome keynote? We put together some of our perspectives in a short note:
    Sasha Rush
    @srush_nlp
    Jun 4, 2023
    Do we need Attention? (v0 github.com/srush/do-we-ne…): Slides for a survey talk summarizing recent Linear RNN models with a focus on NLP. Tries to cover a lot of different S4-related models (as well as RWKV/MEGA) in a digestible way.
    hazyresearch.stanford.edu
    The Safari of Deep Signal Processing: Hyena and Beyond
    Hyena is a large language model that uses long convolutions and gating to reach attention quality with lower time complexity.
    36K
  • Michael Poli
    @MichaelPoli6
    Dec 12, 2021
    Join us Dec 14th (EST) for the NeurIPS workshop "The Symbiosis of Deep Learning and Differential Equations": dl-de.github.io This is also your chance to submit questions to our great lineup of panelists, via: forms.gle/6seK279g4AxpeM…
  • Michael Poli
    @MichaelPoli6
    Mar 5, 2025
    New version of the StripedHyena 2 paper is out on arXiv. To learn about how we trained large (40 billion parameters) convolutional language models efficiently at one million sequence length, with custom context parallelism: 👇 All code is available
    6.7K
  • Michael Poli
    @MichaelPoli6
    Jul 25, 2020
    [1/n] The community has been hard at work to speed up Neural ODEs, e.g. regularization strategies @DavidDuvenaud @chuckberryfinn to keep the ODE easy to solve. We've also been thinking about the same problem, and we propose a different (compatible!) direction. @Massastrello
    GIF
  • Michael Poli
    @MichaelPoli6
    Dec 10, 2023
    I'm going to be at NeurIPS to present work on efficient model architecture and inference (with @exnx @Massastrello and others) HyenaDNA: arxiv.org/abs/2306.15794 Laughing Hyena: arxiv.org/abs/2310.18780 Excited to catch up with old friends and make some new ones - DM if you'd
    arxiv.org
    HyenaDNA: Long-Range Genomic Sequence Modeling at Single...
    Genomic (DNA) sequences encode an enormous amount of information for gene regulation and protein synthesis. Similar to natural language models, researchers have proposed foundation models in...
    10K
