
DiscoBench: An Open-Ended Benchmark For Algorithm Discovery

DiscoBench Logo

This repository contains the code for DiscoBench, a modular benchmark for automated algorithm discovery.

Quick Start

Install DiscoBench:

pip install discobench

List available domains:

discobench get-domains

Create a task:

discobench create-task --task-domain OnPolicyRL

See the full documentation for detailed usage. Note that each task domain has its own set of requirements, which may need to be installed separately.

Every domain includes references in discobench/tasks/<task_domain>/utils/_reference.txt.
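
For example, to view the references for the OnPolicyRL domain (path shown relative to a checkout of this repository; if DiscoBench is installed as a package, locate the discobench/tasks directory in your environment instead):

cat discobench/tasks/OnPolicyRL/utils/_reference.txt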

Task Domains

| Task Domain | Modules | Datasets | Description |
| --- | --- | --- | --- |
| BayesianOptimisation | acq_fn, acq_optimizer, domain, next_queries, surrogate, surrogate_optimizer | 11 synthetic functions (Ackley, Branin, Bukin, Cosine, DropWave, EggHolder, Griewank, Hartmann, HolderTable, Levy) | Optimization of black-box functions using surrogate models to find global minima/maxima. |
| BrainSpeechDetection | loss, networks, optim | 7 LibriBrain tasks | Detecting speech features directly from brain activity data. |
| ComputerVisionClassification | loss, networks, optim, preprocess | CIFAR10, CIFAR10C, CIFAR10LT, CIFAR100, FashionMNIST, MNIST, OxfordFlowers, StanfordCars, TinyImageNet | Image classification on a range of datasets. |
| ContinualLearning | optim, regularizer, replay, sampler, scheduler | PermutedMNIST, SplitCIFAR100, TinyImageNetSplit | Training a model on continually changing data, such that it can adapt to new data without losing old capabilities. |
| GreenhouseGasPrediction | data_processing, model | 4 Mauna Loa time series (CO2, N2O, SF6, CH4) | Time-series forecasting of atmospheric greenhouse gas concentrations. |
| LanguageModelling | loss, networks, optimizer | OPCFineWebCode, OPCFineWebMath, LMFineWeb, TinyStories | Training transformer-based models on code, mathematics, and narrative text. |
| ModelUnlearning | loss | MUSE, TOFU, WMDP_Cyber | Fine-tuning pretrained models to remove specific knowledge or data points while retaining others. |
| OffPolicyRL | q_update, policy, networks, optim, rb, train | 4 MinAtar games | Value-based RL for training an agent in MinAtar. |
| OnPolicyRL | loss, networks, optim, train | 4 MinAtar, 7 Brax, 2 Craftax | Training an RL agent in a range of different environments using PPO-style algorithms. |
| UnsupervisedEnvironmentDesign | sample_levels, train_step, train_loop, config | 3 Kinetix sizes, Minigrid | Generating and curating training environments/levels to improve RL agent generalization. |
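
Any of the domain names above can be passed to the create-task command from the Quick Start (the names are also listed by discobench get-domains), for example:

discobench create-task --task-domain ComputerVisionClassification

discobench create-task --task-domain LanguageModelling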

Development Setup

1. Set Up Your Development Environment

Install the environment and the pre-commit hooks with:

make install

This will also generate your uv.lock file.
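
If you prefer not to go through make, the target should roughly correspond to syncing the environment with uv and installing the pre-commit hooks. The exact steps live in the Makefile, so treat the following as a sketch:

uv sync

uv run pre-commit install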

Contributing

We welcome contributions! DiscoBench grows stronger with more tasks and domains.

See CONTRIBUTING.md for detailed development guidelines.

Citation

If you use DiscoBench in your research, please cite:

@article{goldie2025discobench,
  title={DiscoBench: An Open-Ended Benchmark For Algorithm Discovery},
  author={Alexander D. Goldie and Zilin Wang and Adrian Hayler and Deepak Nathani and Edan Toledo and Ken Thampiratwong and Aleksandra Kalisz and Michael Beukman and Alistair Letcher and Shashank Reddy and Clarisse Wibault and Theo Wolf and Charles O'Neill and Jakob N. Foerster and Shimon Whiteson and Roberta Raileanu},
  year={2025}
}

License

DiscoBench is released under the MIT License.

