This repository was originally forked from the official implementation of Score-based Data Assimilation by François Rozet and Gilles Louppe. Their approach learns a score-based generative model to enable inference over state trajectories of large-scale dynamical systems given noisy state observations. We use their training setup as a benchmark problem for deep learning optimizers on diffusion models.
NOTE: This repository is rather large (~2.4 GB) as it includes the logging files of all experiments we ran.
Training is done in PyTorch. The repo also relies on JAX and jax-cfd to simulate fluid dynamics. All dependencies except jax-cfd are provided as a conda environment file.
For running the benchmark, our setup was as follows:
- local testing on a MacBook Pro M3
- training runs on a SLURM cluster
We provide the environment file that we used for local testing (environment.yml) and the one that we used on the cluster, where CUDA is also needed (environment_cluster.yml). These environments can be installed via
conda env create -f environment.yml
In both cases, jax-cfd must be installed additionally; we recommend installing it via
pip install git+https://github.com/google/jax-cfd
Finally, install the sda package (from this repository) with
pip install -e .
To run all experiments, it is necessary to have access to a Slurm cluster. We also log to Weights & Biases, but this is optional and we always store local logging files (see experiments/kolmogorov/logs/).
Further remarks:
- The code detects whether a SCRATCH environment variable exists and deduces from this whether we are running on a cluster or locally. In a local environment, it sets the PyTorch device to mps. Change this if you want to run locally on a machine other than a MacBook.
- All benchmarking runs used the kolmogorov experiment. The code for the lorenz experiment might therefore be dysfunctional.
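The device detection described in the first remark can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names on_cluster and select_device_name are hypothetical, and both the choice of cuda on the cluster and the cpu fallback are assumptions added for machines without the respective backend.

```python
import os


def on_cluster(env=None):
    """Return True if a SCRATCH environment variable is set.

    Assumption: the presence of SCRATCH is the only signal used to
    distinguish a cluster from a local machine.
    """
    env = os.environ if env is None else env
    return "SCRATCH" in env


def select_device_name(env=None, cuda_ok=False, mps_ok=True):
    """Pick a PyTorch device string (hypothetical helper).

    On a cluster, use CUDA if available; locally, use MPS (Apple silicon).
    The CPU fallback is an addition for machines without either backend.
    """
    if on_cluster(env):
        return "cuda" if cuda_ok else "cpu"
    return "mps" if mps_ok else "cpu"
```

With PyTorch installed, the two flags could be filled from torch.cuda.is_available() and torch.backends.mps.is_available(), and the returned string passed to torch.device.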
The sda directory contains the implementations of the dynamical systems, the neural networks, the score models and various helpers.
The experiments/kolmogorov directory contains the scripts for the experiments (data generation and training) as well as the logs of all training runs.
The plotting directory contains the scripts for generating the plots in our paper.
If you find this code useful for your research, please consider citing
@Article{schaipp2025,
author = {Schaipp, Fabian},
title = {Optimization Benchmark for Diffusion Models on Dynamical Systems},
year = {2025},
archiveprefix = {arXiv},
eprint = {2510.19376},
publisher = {arXiv},
}
Please also consider citing the original paper on which this benchmark is based:
@inproceedings{rozet2023sda,
title={Score-based Data Assimilation},
author={Fran{\c{c}}ois Rozet and Gilles Louppe},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=VUvLSnMZdX},
}