ScalingOpt: Optimization at Scale
Discover, compare, and contribute to cutting-edge optimization algorithms designed for large-scale deep learning.
Latest News
Recent updates from the ScalingOpt community
How to Set the Batch Size for Large-Scale Pre-training?
A new paper on batch size scheduling for large-scale pre-training, proposing a revised framework for the WSD scheduler and a dynamic batch-size scheduling strategy.
Read Paper
Jianlin Su's Blog Collection
Added a collection of in-depth articles from Jianlin Su (Scientific Spaces) covering optimization theory, Muon, and scaling laws.
Explore Blogs
ScalingOpt Community Growth
Our optimizer database has grown to over 60 implementations! Join us in building the most comprehensive optimization resource.
Join Us
Our Team
Meet the members behind ScalingOpt; we are grateful for their contributions.
Team member information is updated continuously, and we welcome collaboration inquiries by email.
Featured Optimizers
Discover the most powerful and innovative optimization algorithms powering modern AI
Apollo (2)
2024 · SGD-like Memory, AdamW-level Performance
Conda
2025 · Column-Normalized Adam for Training LLMs Faster
Muon
2024 · Orthogonal weight updates via Newton-Schulz iteration (see the sketch after this list)
SOAP
2024 · Improving and Stabilizing Shampoo using Adam
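The Muon entry above refers to approximately orthogonalizing each 2-D weight update with a few Newton-Schulz iterations before applying it. The sketch below illustrates that idea in PyTorch; the quintic coefficients follow publicly available Muon implementations, but the function name and defaults here are illustrative rather than the reference code.

```python
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately orthogonalize a 2-D update matrix G via Newton-Schulz iteration.

    A minimal sketch of the idea behind Muon's update; not the reference implementation.
    """
    a, b, c = 3.4445, -4.7750, 2.0315        # quintic iteration coefficients used in public Muon code
    X = G / (G.norm() + 1e-7)                # scale so the spectral norm is at most ~1
    transposed = X.shape[0] > X.shape[1]
    if transposed:                           # iterate on the wide orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X  # push all singular values toward 1
    return X.T if transposed else X

# Usage: orthogonalize a momentum buffer's raw update before applying it to a weight
# matrix (learning-rate and shape-dependent scaling omitted for brevity).
update = newton_schulz_orthogonalize(torch.randn(256, 512))
```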
Industry-Optimized Implementations
Production-ready libraries with first-class distributed-training support and hardware-specific optimization
Hugging Face
Optimizers integrated into Transformers (AdamW, Adafactor) with native support for distributed training and mixed precision.
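As a quick orientation, the sketch below shows one way to select these optimizers through the Transformers Trainer configuration; the specific `optim` string values and hyperparameters are illustrative and may vary with the installed Transformers version.

```python
from transformers import TrainingArguments

# Minimal sketch: choosing the optimizer via TrainingArguments rather than
# constructing it by hand. Values shown are placeholders, not recommendations.
args = TrainingArguments(
    output_dir="./checkpoints",
    optim="adamw_torch",            # or "adafactor" for the memory-efficient variant
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    bf16=True,                      # mixed precision, if the hardware supports it
)
```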
Meta Research
Cutting-edge optimization algorithms such as Distributed Shampoo, developed by Meta for large-scale model training.
NVIDIA TensorRT
Advanced model optimization toolkit for NVIDIA GPUs, focusing on quantization and inference acceleration.
Why Choose ScalingOpt?
Everything you need to understand, implement, and scale optimization algorithms for modern AI
Extensive Optimizer Library
Explore optimization algorithms, from foundational SGD to cutting-edge Adam-mini and Muon, with detailed implementations and PyTorch code; a minimal usage sketch follows below.
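For readers new to these libraries, the sketch below shows the standard PyTorch training loop that the listed optimizer implementations plug into; the model and data are placeholders.

```python
import torch
from torch import nn

# A minimal sketch of the optimizer interface shared by SGD, AdamW, and the
# library's other entries; swap in any torch.optim-compatible optimizer.
model = nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
for _ in range(3):
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
```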
Research & Learning Hub
Access research papers, tutorials, and educational content covering optimization theory, implementation guides, and latest developments.
Open Source & Community
Contribute to open-source implementations, join GitHub discussions, and collaborate with researchers worldwide on optimization algorithms.
Join the Optimization Community
Connect with researchers and practitioners exploring efficient AI and optimization algorithms. Discover, learn, and contribute to the future of machine learning optimization.