
Generative Ensemble Kalman Filter (Generative EnKF)


Generate pseudo ensembles with an observation-guided diffusion model

Generative EnKF is designed to surrogate data assimilation (DA) for numerical simulations without ensemble runs. Forecasting a real-world system is often achieved by combining a scientific model of the system's time evolution with an estimate of its current state. DA methods such as the Ensemble Kalman Filter (EnKF) iteratively blend simulation states with observation data to produce a reasonable estimate of the current state. Although widely used, EnKF has two critical issues: fragility to model biases (errors) and the cost of ensemble simulations. Here, we propose a DA method that uses pseudo ensembles generated by an observation-guided denoising diffusion probabilistic model (DDPM). Thanks to the variance in the generated ensembles, the proposed method outperforms well-established ensemble DA methods (such as EnKF) when the simulation model is biased. For questions or comments, please contact the authors listed in the AUTHORS file.
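To make the blending step concrete, a stochastic EnKF analysis can be sketched with NumPy as follows. This is a minimal, generic illustration, not the repository's implementation; the ensemble size, bias, observation operator, and noise levels are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_analysis(ensemble, obs, obs_std, H):
    """Stochastic EnKF analysis step: blend forecast members with observations.

    ensemble: (n_ens, n_state) forecast members
    obs:      (n_obs,) observation vector
    obs_std:  observation error standard deviation
    H:        (n_obs, n_state) linear observation operator
    """
    n_ens = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)             # forecast anomalies
    Pf = X.T @ X / (n_ens - 1)                       # sample forecast covariance
    R = obs_std**2 * np.eye(obs.size)                # observation error covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain
    # Perturb observations so the analysis ensemble retains realistic spread
    perturbed = obs + obs_std * rng.standard_normal((n_ens, obs.size))
    return ensemble + (perturbed - ensemble @ H.T) @ K.T

# Toy example: the forecast model carries a constant bias of +1
n_state, n_ens = 8, 32
truth = np.sin(np.arange(n_state))
H = np.eye(n_state)                                   # fully observed state
obs = truth + 0.1 * rng.standard_normal(n_state)
forecast = truth + 1.0 + 0.5 * rng.standard_normal((n_ens, n_state))
analysis = enkf_analysis(forecast, obs, 0.1, H)
```

Because the observations are accurate relative to the biased forecast, the analysis mean is pulled toward the truth, illustrating how DA corrects the state estimate, while the residual bias of the model remains the weak point the text describes.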

Usage

Installation

This code relies on the following packages. As a deep learning framework, we use PyTorch.

  • Install Python libraries numpy, PyTorch, xarray, and netcdf4

  • Clone this repo
    git clone https://github.com/yasahi-hpc/Generative-EnKF.git
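For reference, the dependency list above could be captured in a requirements file along these lines (a sketch; the repository does not pin versions here):

```
numpy
torch
xarray
netcdf4
```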

Preparation

Before running simulations with Generative EnKF, we need to train a diffusion model guided by observations. First, construct a dataset and train the model on it. See deep learning model for details.
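As a rough picture of what the training stage fits, the forward (noising) process of a generic DDPM can be sketched with NumPy. This is textbook DDPM machinery, not this repository's training code; the schedule, array sizes, and timestep are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generic linear beta schedule (illustrative values, not the repository's)
T = 1000
betas = np.linspace(1e-4, 2e-2, T)
alphas_bar = np.cumprod(1.0 - betas)   # decays from ~1 toward ~0 as t grows

def noisy_sample(x0, t):
    """Closed-form forward process q(x_t | x_0) of a DDPM."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
    return xt, eps   # network input and its regression target

# A clean Lorenz96-like state and one training pair at mid-diffusion
x0 = np.sin(np.linspace(0.0, 2.0 * np.pi, 40))
xt, eps = noisy_sample(x0, t=500)
# Training fits a denoiser, conditioned on t and (in Generative EnKF) guided
# by observations, to predict eps from xt; sampling runs the chain in reverse.
```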

Simulation with Generative EnKF

For simulation, we rely on the simulation codes and the pretrained diffusion model. We perform an observing system simulation experiment (OSSE) for the Lorenz96 system. To evaluate data assimilation (DA), you may compare twin simulations with and without DA. For DA, you can use LETKF or Generative EnKF (EFDA); we use 32 ensemble members for LETKF. One needs to run a nature simulation followed by DA simulations. We can perform the OSSE of Lorenz96 in the following manner:

  1. Create observation data with Nature run
python run.py --filename dns.json

The results are stored in netCDF files under <base_dir/case_name>.

  2. DA simulation: perform a simulation with LETKF or Generative EnKF by
python run.py --model_name [LETKF, EFDA] --filename [letkf.json, efda.json]

See simulation codes for details.
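The OSSE loop above can be sketched end to end in a toy form: a Lorenz96 nature run, synthetic observations, and a pseudo ensemble. Everything here is illustrative; in particular, a Gaussian sampler stands in for the trained diffusion model, and the step size, noise levels, and ensemble size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def lorenz96_rhs(x, F=8.0):
    """Lorenz96 tendency: dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.01):
    """One classical fourth-order Runge-Kutta step of the Lorenz96 system."""
    k1 = lorenz96_rhs(x)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2)
    k4 = lorenz96_rhs(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Step 1: nature run -- spin up a 40-variable trajectory from a perturbed
# rest state, then treat it as the "truth" of the twin experiment
truth = 8.0 * np.ones(40)
truth[0] += 0.01
for _ in range(1000):
    truth = rk4_step(truth)

# Synthetic observations: sample the nature run with Gaussian noise
obs_std = 1.0
obs = truth + obs_std * rng.standard_normal(truth.shape)

# Step 2: pseudo ensemble -- a Gaussian sampler stands in for the trained
# DDPM, which would instead draw members consistent with the observations
n_ens = 32
members = obs + 0.5 * rng.standard_normal((n_ens, truth.size))
analysis_mean = members.mean(axis=0)     # state estimate
analysis_spread = members.std(axis=0)    # per-variable uncertainty
```

The key design point mirrored here is that no ensemble of forward simulations is needed: the ensemble variance comes entirely from the generative sampler.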

Brief explanations of scripts

The following table summarizes the major scripts and their inputs.

| Script | Arguments | Explanation |
| --- | --- | --- |
| run.py | -dirname --filename --model_name | Run a simulation |
| train.py | -dirname --filename --model_name --inference_mode | Training or inference |
| post.py | -dirname --filename --model_name | Post-processing for simulations or trained models |
| convert.py | -dirname --filename --mode --start_idx --end_idx | Convert the simulation data into a dataset |
| setup.py | | Install the required packages |
| cleanup.py | symdir --verbose | Erase the result directory and symbolic links (use with caution!) |

Citations

@INPROCEEDINGS{Asahi2023,
      author={Asahi, Yuuichi and Hasegawa, Yuta and Onodera, Naoyuki and Shimokawabe, Takashi and Shiba, Hayato and Idomura, Yasuhiro},
      booktitle={ICML 2023 Workshop SynS and ML},
      title={Generating observation guided ensembles for data assimilation with denoising diffusion probabilistic model},
      year={2023},
      keywords={Deep learning; Graphics-processing-unit-based computing; Data Assimilation; Lorenz96},
}
