Add `DataSet.record()` by quaquel · Pull Request #3295 · mesa/mesa

quaquel · 2026-02-13T15:26:38Z

This PR adds a new record method to DataSet. Building on #3145 and #3156, and the discussion on data collection, this makes for quite an elegant API:

self.recorder = DataRecorder(self)
self.data_registry.track_agents(self.agents, "agent_data", "wealth").record(self.recorder)
self.data_registry.track_model(self, "model_data", "gini").record(
    self.recorder,
    configuration=DatasetConfig(start_time=4, interval=2),
)

Implementation details
I kept this PR as focused as possible. At its core, it adds a new method, DataSet.record(recorder, configuration). Internally, I added BaseDataRecorder.add_dataset(dataset: DataSet, configuration: DataConfig | None = None). I also updated the __init__ of BaseDataRecorder to allow config=None, and I removed the behavior where a recorder automatically records all datasets. This last change is not strictly needed for this PR, but a key design principle is to separate datasets at a given instant from their recording over time. Automatically recording every dataset defeats this purpose.
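The registration flow described above can be sketched in a few lines. This is a minimal illustration, not Mesa's implementation: `SketchDataSet`, `SketchRecorder`, and `SketchConfig` are hypothetical stand-ins, and having `record()` return the dataset itself is an assumption made only to keep the calls chainable.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a per-dataset recording configuration."""

    start_time: float = 0
    interval: float = 1


@dataclass
class SketchRecorder:
    """Hypothetical recorder that only knows datasets registered explicitly."""

    registered: dict[str, SketchConfig] = field(default_factory=dict)

    def add_dataset(
        self, dataset: SketchDataSet, configuration: SketchConfig | None = None
    ) -> None:
        # Fall back to a default configuration when none is given.
        self.registered[dataset.name] = configuration or SketchConfig()


@dataclass
class SketchDataSet:
    """Hypothetical dataset: a named view on the current state."""

    name: str

    def record(
        self, recorder: SketchRecorder, configuration: SketchConfig | None = None
    ) -> SketchDataSet:
        # Delegate to the recorder; returning self is an assumption for chaining.
        recorder.add_dataset(self, configuration)
        return self


recorder = SketchRecorder()
agent_data = SketchDataSet("agent_data").record(recorder)
model_data = SketchDataSet("model_data").record(
    recorder, configuration=SketchConfig(start_time=4, interval=2)
)
untracked = SketchDataSet("scratch")  # never registered, so never recorded
```

In this sketch, a dataset that never calls `record()` stays out of the recorder entirely, which mirrors the separation between a dataset at a given instant and its recording over time.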

I deliberately left out other ideas from the discussion because there is no consensus on those yet.


github-actions · 2026-02-13T15:43:37Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔴 +4.1% [+3.4%, +4.8%]	🔴 +26.6% [+26.1%, +26.9%]
BoltzmannWealth	large	🔵 +0.8% [-0.3%, +1.9%]	🔴 +20.8% [+16.7%, +24.7%]
Schelling	small	🔵 +0.9% [+0.6%, +1.3%]	🔵 +0.7% [+0.6%, +0.9%]
Schelling	large	🔵 +2.4% [+1.9%, +3.0%]	🔴 +11.0% [+9.4%, +12.6%]
WolfSheep	small	🔵 +0.2% [-0.1%, +0.5%]	🔵 -0.7% [-0.8%, -0.6%]
WolfSheep	large	🔵 +1.5% [+0.7%, +2.3%]	🔴 +5.4% [+4.4%, +6.3%]
BoidFlockers	small	🟢 -4.9% [-5.2%, -4.6%]	🔵 -0.6% [-0.7%, -0.4%]
BoidFlockers	large	🔵 -3.4% [-3.9%, -3.0%]	🔵 -0.8% [-0.9%, -0.6%]


github-actions · 2026-02-14T18:09:04Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔵 +1.2% [+0.7%, +1.7%]	🔴 +26.5% [+26.4%, +26.7%]
BoltzmannWealth	large	🔵 +2.3% [+1.4%, +3.0%]	🔴 +20.5% [+16.7%, +23.6%]
Schelling	small	🔵 +1.1% [+1.0%, +1.3%]	🔵 +0.5% [+0.3%, +0.6%]
Schelling	large	🔵 +0.4% [-0.1%, +1.0%]	🔵 -3.4% [-6.3%, -0.7%]
WolfSheep	small	🔵 +0.4% [+0.3%, +0.6%]	🔵 +0.2% [+0.0%, +0.4%]
WolfSheep	large	🔵 +0.6% [-0.7%, +1.7%]	🔵 +1.0% [-1.7%, +3.4%]
BoidFlockers	small	🔵 -1.4% [-1.8%, -1.1%]	🔵 -1.5% [-1.6%, -1.4%]
BoidFlockers	large	🔵 -1.6% [-2.0%, -1.1%]	🔵 -1.0% [-1.2%, -0.8%]

EwoutH · 2026-02-14T20:24:04Z

I'm concerned we're accumulating some API bloat.

This PR adds .record() to DataSet and add_dataset() to BaseDataRecorder, resulting in this API:

self.recorder = DataRecorder(self)
self.data_registry.track_agents(self.agents, "agent_data", "wealth").record(self.recorder)
self.data_registry.track_model(self, "model_data", "gini").record(
    self.recorder, 
    configuration=DatasetConfig(start_time=4, interval=2)
)

In our recent discussion I thought we were moving towards an API like this:

# No explicit recorder construction - handled internally by model.data
self.data.track_agents(Wolf, "wolf_energy", "energy").record()
self.data.track_model(self, "gini", "gini").record(Schedule(interval=5, start=100))

The main difference is that .record() doesn't take a recorder argument; the recorder is managed internally.

With this PR, users now need to understand:

- DataRecorder: construct it explicitly
- DataSet.record(recorder, configuration): new method that takes the recorder
- BaseDataRecorder.add_dataset(dataset, configuration): new public method
- Configuration can be passed to both DataRecorder.__init__() and .record()

Do you see options to simplify both the mental model for users and the API? I think sensible defaults would also go a long way.

quaquel · 2026-02-14T20:33:27Z

> The main difference is that .record() doesn't take a recorder argument; the recorder is managed internally.

Users might use different recorders depending on the backend. So, the dataset needs to know the recorder that is to be used.

Also, at the moment, we don't have a default recorder field on the model, nor a configuration via __init__ to set it up. So, datasets cannot rely on any field on the model, nor do they, in general, have a reference to the model, even if a default recorder field did exist.
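To illustrate why a dataset has to be handed a recorder explicitly, here is a small sketch with two interchangeable backends. The class names (`ListRecorder`, `JsonLinesRecorder`) and the `collect` and `store` methods are invented for this example and are not part of Mesa; the simplified `add_dataset(name, source)` signature is likewise an assumption.

```python
import io
import json
from typing import Any, Callable


class ListRecorder:
    """Hypothetical in-memory backend: keeps snapshots in a list."""

    def __init__(self) -> None:
        self.sources: dict[str, Callable[[], Any]] = {}
        self.rows: list[dict[str, Any]] = []

    def add_dataset(self, name: str, source: Callable[[], Any]) -> None:
        self.sources[name] = source

    def store(self, row: dict[str, Any]) -> None:
        self.rows.append(row)

    def collect(self, time: float) -> None:
        # Take a snapshot of every registered dataset at this instant.
        for name, source in self.sources.items():
            self.store({"time": time, "dataset": name, "value": source()})


class JsonLinesRecorder(ListRecorder):
    """Hypothetical text backend: writes one JSON object per line to a stream."""

    def __init__(self, stream: io.TextIOBase) -> None:
        super().__init__()
        self.stream = stream

    def store(self, row: dict[str, Any]) -> None:
        self.stream.write(json.dumps(row) + "\n")


state = {"gini": 0.25}
memory, buffer = ListRecorder(), io.StringIO()
text = JsonLinesRecorder(buffer)

# The same dataset can be registered with whichever backend the user picked.
for backend in (memory, text):
    backend.add_dataset("model_data", lambda: state["gini"])
    backend.collect(time=0)
```

Because nothing in the sketch lets the dataset find a recorder on its own, the caller has to name the backend at registration time.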

I agree about the end goal of having a clean API. This PR is just a small step toward it. I would argue it already simplifies things quite a bit, because there is no longer any need to pass a complex configuration dict to the recorder. Instead, we tie the configuration directly to the dataset via the new record() method.

A next step, in my view, is to see if we can simplify DatasetConfig, potentially via Schedule, or at least by using the same keywords where appropriate.

EwoutH · 2026-02-14T20:45:56Z

Thanks for the context, sounds good.

> A next step, in my view, is to see if we can simplify DatasetConfig, potentially via Schedule, or at least by using the same keywords where appropriate.

We might create a DataSchedule subclass. Users can then:

- Pass nothing for the default (collecting every 1 time unit)
- Pass a Schedule for simple functionality
- Pass a DataSchedule for more advanced functionality

quaquel · 2026-02-14T20:56:36Z

It's a bit different. The two share start, end, and interval, but for data recording, interval takes only a number, not a callable. Beyond these shared fields, DatasetConfig takes window_size, while Schedule takes count. So inheritance does not seem to be the best solution here.


EwoutH · 2026-02-14T21:00:26Z

Basically Schedule is a data storage object. It can have some fields that are useful/valid for recurring event scheduling, some that are useful for data recording, and some that are useful for both.

quaquel · 2026-02-14T21:17:20Z

> Basically Schedule is a data storage object. It can have some fields that are useful/valid for recurring event scheduling, some that are useful for data recording, and some that are useful for both.

Which is why a basic protocol with start, end, and interval might make sense, but a full hierarchy of classes is probably overkill.
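A structural protocol along these lines could look as follows. This is only a sketch of the idea: `Timing`, `RecordingConfig`, `EventSchedule`, and `is_due` are hypothetical names, the defaults are assumptions, and `interval` is treated as a plain number in both classes (the callable case mentioned for Schedule is left out).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class Timing(Protocol):
    """Hypothetical shared protocol: the three fields both use cases have in common."""

    start: float
    end: float
    interval: float


@dataclass
class RecordingConfig:
    """Hypothetical recording configuration; window_size is specific to recording."""

    start: float = 0.0
    end: float = float("inf")
    interval: float = 1.0
    window_size: int | None = None


@dataclass
class EventSchedule:
    """Hypothetical event schedule; count is specific to scheduling."""

    start: float = 0.0
    end: float = float("inf")
    interval: float = 1.0
    count: int | None = None


def is_due(timing: Timing, time: float) -> bool:
    """Return True if `time` lies on the grid start, start + interval, ... up to end."""
    if time < timing.start or time > timing.end:
        return False
    steps = (time - timing.start) / timing.interval
    # Tolerate floating-point noise when checking for a whole number of intervals.
    return abs(steps - round(steps)) < 1e-9


config = RecordingConfig(start=4, interval=2)
due_times = [t for t in range(10) if is_due(config, t)]
```

Neither class inherits from the other or from `Timing`; both satisfy the protocol simply by exposing the three shared attributes, so code such as `is_due` can accept either without a class hierarchy.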

quaquel and others added 6 commits February 13, 2026 16:16

eb2928c  initial commit
12fc075  [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
68dfb72  ruff fixes
dcc131c  Merge branch 'dataset_record' of https://github.com/quaquel/mesa into dataset_record
8f48ebe  Update model.py
e676596  [pre-commit.ci] auto fixes from pre-commit.com hooks
bb0ece7  Merge branch 'main' into dataset_record

EwoutH mentioned this pull request Feb 14, 2026

Mesa 4 development tracking issue #3132

Open

42 tasks

quaquel and others added 4 commits February 14, 2026 18:31

dc7a03f  Merge remote-tracking branch 'upstream/main' into dataset_record
c252c8a  fix unit tests
24c5100  [pre-commit.ci] auto fixes from pre-commit.com hooks
15b035d  ruff fix

quaquel marked this pull request as ready for review February 14, 2026 17:53

quaquel added the enhancement label Feb 14, 2026

832905e  Merge branch 'main' into dataset_record

EwoutH added the experimental label Feb 14, 2026

EwoutH approved these changes Feb 14, 2026


quaquel and others added 2 commits February 14, 2026 21:57

014ed8e  Merge branch 'dataset_record' of https://github.com/quaquel/mesa into dataset_record
87cff13  [pre-commit.ci] auto fixes from pre-commit.com hooks

quaquel merged commit 42245cd into mesa:main Feb 14, 2026
13 of 14 checks passed

quaquel deleted the dataset_record branch February 14, 2026 21:16

codebreaker32 mentioned this pull request Feb 17, 2026

Remove additional lines of code used for testing #3335

Merged

quaquel merged 14 commits into mesa:main from quaquel:dataset_record
