
LOB-Bench: Benchmarking Generative AI for Finance – an Application to Limit Order Book Data


  • Peer Nagy*
  • Sascha Frey*
  • Kang Li
  • Bidipta Sarkar
  • Svitlana Vyetrenko
  • Stefan Zohren
  • Anisoara Calinescu
  • Jakob Foerster

ICML 2025

Figure: LOB-Bench overview (./imgs/lobbench_img.png)

Abstract

While financial data presents one of the most challenging and interesting sequence modelling tasks due to high noise, heavy tails, and strategic interactions, progress in this area has been hindered by the lack of consensus on quantitative evaluation paradigms. To address this, we present LOB-Bench, a benchmark, implemented in Python, designed to evaluate the quality and realism of generative message-by-order data for limit order books (LOB) in the LOBSTER format. Our framework measures distributional differences in conditional and unconditional statistics between generated and real LOB data, supporting flexible multivariate statistical evaluation. The benchmark also includes commonly used LOB statistics such as spread, order book volumes, order imbalance, and message inter-arrival times, along with scores from a trained discriminator network. Lastly, LOB-Bench contains “market impact metrics”, i.e. the cross-correlations and price response functions for specific events in the data. We benchmark generative autoregressive state-space models, a (C)GAN, and a parametric LOB model, and find that the autoregressive GenAI approach beats the traditional model classes.
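
The core quantitative comparison can be illustrated with standard tools. The following is a minimal sketch, not the benchmark's own implementation (all function and variable names are illustrative), of scoring a one-dimensional LOB statistic such as the spread by comparing its empirical distribution under real and generated data, using the Wasserstein-1 distance and a histogram L1 distance:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def score_statistic(real: np.ndarray, gen: np.ndarray, bins: int = 100):
    """Compare a 1-D LOB statistic (e.g. spread) between real and generated data.

    Returns a Wasserstein-1 distance and a histogram L1 distance.
    Illustrative only -- LOB-Bench's own scoring may differ in detail.
    """
    # Wasserstein-1 distance between the two empirical distributions.
    w1 = wasserstein_distance(real, gen)

    # Histogram L1 distance: bin both samples on a common support and
    # integrate the absolute density difference.
    lo, hi = min(real.min(), gen.min()), max(real.max(), gen.max())
    p, edges = np.histogram(real, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(gen, bins=bins, range=(lo, hi), density=True)
    l1 = np.sum(np.abs(p - q) * np.diff(edges))
    return w1, l1

# Example with synthetic spreads (in ticks):
rng = np.random.default_rng(0)
real_spread = rng.exponential(2.0, size=10_000)
gen_spread = rng.exponential(2.2, size=10_000)
print(score_statistic(real_spread, gen_spread))
```

Lower is better for both scores, matching the (↓) convention in the leaderboard tables below.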

How to Use

Please refer to the benchmark README for information on how to use the benchmark, along with the tutorial notebook.

Leaderboard

We evaluate models on a subsample of data from January 2023 (except for Coletta, which was trained and tested on January 2019 data). If you would like to add your model to this leaderboard, please contact the authors via email and provide a directory of CSVs containing data_real, data_gen, and data_cond, as shown in the tutorial notebook, so we can reproduce your results. We highly encourage open-source development of models and will provide links to your code or model if provided.
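
As a rough illustration, here is a hypothetical sketch of assembling such a submission directory with pandas. Only the data_real, data_gen, and data_cond folder names come from the paragraph above; the directory name, per-sequence file naming, and example message values are assumptions, and the tutorial notebook remains the authoritative reference for the expected layout:

```python
from pathlib import Path
import pandas as pd

# Hypothetical layout -- check the tutorial notebook for the authoritative one.
submission = Path("my_model_submission")
for split in ("data_real", "data_gen", "data_cond"):
    (submission / split).mkdir(parents=True, exist_ok=True)

# Each split holds LOBSTER-style message CSVs, e.g. one file per sequence.
messages = pd.DataFrame(
    {
        "time": [34200.001, 34200.004],  # seconds after midnight
        "event_type": [1, 1],            # 1 = new limit order in LOBSTER encoding
        "order_id": [101, 102],
        "size": [100, 50],               # shares
        "price": [1000100, 1000000],     # price * 10000, per LOBSTER convention
        "direction": [1, -1],            # 1 = buy, -1 = sell
    }
)
# LOBSTER message files carry no header row.
messages.to_csv(submission / "data_gen" / "seq_000.csv", index=False, header=False)
```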

GOOG

| Model | Wasserstein (↓) | L1 (↓) | Link |
|-------|-----------------|--------|------|
| LobS5 [cite:@nagy2025lobbench;@nagy2023generative] | 0.16 | 0.14 | |
| Baseline [cite:@nagy2025lobbench;@cont2010stochastic] | 0.29 | 0.36 | |
| RWKV4 [cite:@nagy2025lobbench;@peng2023rwkv] | 0.36 | 0.29 | |
| RWKV6 [cite:@nagy2025lobbench;@peng2024eagle] | 0.45 | 0.31 | |
| Coletta [cite:@nagy2025lobbench;@coletta2023conditional] | 0.54 | 0.48 | |

INTC

| Model | Wasserstein (↓) | L1 (↓) | Link |
|-------|-----------------|--------|------|
| LobS5 [cite:@nagy2025lobbench;@nagy2023generative] | 0.19 | 0.13 | |
| RWKV6 [cite:@nagy2025lobbench;@peng2024eagle] | 0.26 | 0.32 | |
| Baseline [cite:@nagy2025lobbench;@cont2010stochastic] | 0.69 | 0.50 | |
| RWKV4 [cite:@nagy2025lobbench;@peng2023rwkv] | 0.81 | 0.61 | |
