
Add explicit RUN_ENDED signal for terminal data handling in DataRecorder #3424

Merged
quaquel merged 9 commits into mesa:main from codebreaker32:run_ended
Mar 4, 2026

Conversation

@codebreaker32
Collaborator

This PR fixes the data collection/recorder architecture by ensuring the final state of a model is always captured at the end of a run, and introduces an "overwrite" mechanism to prevent duplicate data rows when the model is paused, modified, and resumed at the same timestep.

Previously, users had to manually call finalise() or collect() to get the last timestep if it didn't perfectly align with the interval.

Key Changes:

  1. Signals (model.py, signal_types.py):
  • Added a new ModelSignals.RUN_ENDED signal.
  • Decorated run_for and run_until to automatically emit this signal when they finish executing.
  2. Base Recorder (basedatarecorder.py):
  • Subscribed the recorder to the RUN_ENDED signal so the final snapshot is captured automatically (_on_run_ended), removing the need for the manual finalise() method.
  • Added _last_collection tracking to detect whether the current timestep is identical to the previous one, passing an is_overwrite flag downstream.
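The signal-and-subscription flow described above can be sketched roughly as follows. This is a minimal stand-in, assuming a simple callback registry; `emits_run_ended`, `subscribe`, and `emit` are illustrative names, not mesa's actual API:

```python
from enum import Enum, auto
from functools import wraps


class ModelSignals(Enum):
    """Illustrative signal enum; only RUN_ENDED mirrors the PR."""
    RUN_ENDED = auto()


def emits_run_ended(method):
    """Decorator sketch: emit RUN_ENDED after the wrapped run method returns."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        self.emit(ModelSignals.RUN_ENDED)
        return result
    return wrapper


class Model:
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, signal, callback):
        self._subscribers.setdefault(signal, []).append(callback)

    def emit(self, signal):
        for callback in self._subscribers.get(signal, []):
            callback(self)

    @emits_run_ended
    def run_for(self, steps):
        # Stand-in for stepping the model; real code advances the scheduler.
        self.steps_run = steps
```

A recorder would then subscribe its snapshot callback to `ModelSignals.RUN_ENDED`, so the final state is captured without any manual `finalise()` call.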

@codebreaker32
Collaborator Author

Hi @quaquel

Do you prefer this approach, or should we wait for the RunConfiguration idea to crystallise?

@github-actions

github-actions bot commented Mar 3, 2026

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔵 -0.1% [-1.0%, +0.7%] | 🔵 -0.4% [-0.8%, -0.0%] |
| BoltzmannWealth | large | 🔵 +2.6% [+1.8%, +3.5%] | 🔴 +9.0% [+6.4%, +11.7%] |
| Schelling | small | 🔵 +1.3% [+0.9%, +1.7%] | 🔵 +1.5% [+1.2%, +1.7%] |
| Schelling | large | 🔵 +1.0% [-0.3%, +2.2%] | 🔵 +4.0% [-0.1%, +7.4%] |
| WolfSheep | small | 🔵 -1.0% [-1.5%, -0.4%] | 🔵 -1.5% [-1.9%, -1.1%] |
| WolfSheep | large | 🔵 +0.6% [-0.4%, +1.8%] | 🔵 +2.0% [-0.2%, +4.3%] |
| BoidFlockers | small | 🔵 -0.7% [-1.1%, -0.3%] | 🔵 -1.4% [-1.7%, -1.1%] |
| BoidFlockers | large | 🔵 -0.9% [-1.5%, -0.3%] | 🔵 -1.0% [-1.2%, -0.7%] |

Member

@quaquel quaquel left a comment


This looks fine to me for now. We might want to revisit this later when we revisit the run_while/stopping condition issue and the run_config idea.

@quaquel
Member

quaquel commented Mar 3, 2026

Can you check the code coverage and add tests if needed? I'll merge later today, or more likely tomorrow.

@codebreaker32
Collaborator Author

codebreaker32 commented Mar 3, 2026

While writing tests, I found an interesting edge case in the parquet backend. In the old logic, we only pop from the in-memory buffer:

if is_overwrite:
    while buffer and buffer[-1]["time"] == time:
        buffer.pop()

If a file flush happens right before the overwrite (e.g., calling get_table_dataframe), the buffer is empty. The while loop skips, and it blindly appends the new row, missing the fact that the old row is already saved in the physical file and causing duplicate data.

The same issue could surface in users' own custom backends as well, and I am convinced this is a delicate approach. Do you think these types of edge cases could be handled by run_configs more easily?
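To make the failure mode concrete, here is a minimal, self-contained sketch; `apply_overwrite` and the row dicts are illustrative, not the recorder's actual code:

```python
def apply_overwrite(buffer, flushed_rows, new_row):
    """Buggy overwrite logic: it only consults the in-memory buffer,
    so a row for the same timestep already flushed to disk survives."""
    time = new_row["time"]
    while buffer and buffer[-1]["time"] == time:
        buffer.pop()
    buffer.append(new_row)
    # What a reader of file contents + buffer would see:
    return flushed_rows + buffer


# A flush already wrote the time=3 row to "disk"; the buffer is now empty.
flushed = [{"time": 3, "wealth": 10}]
buffer = []
combined = apply_overwrite(buffer, flushed, {"time": 3, "wealth": 12})
# combined now holds two rows for time=3 -- the duplicate described above.
```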

@quaquel
Member

quaquel commented Mar 4, 2026

If a file flush happens right before the overwrite (e.g., calling get_table_dataframe), the buffer is empty. The while loop skips, and it blindly appends the new row, missing the fact that the old row is already saved in the physical file and causing duplicate data.

The same issue could surface in users' own custom backends as well, and I am convinced this is a delicate approach. Do you think these types of edge cases could be handled by run_configs more easily?

That's an interesting edge case. In my current conception of run_config, we would not need the overwrite as used in this PR, so it would avoid the edge case. We would not need it because run_ended is only emitted once, and it would leave the model in a state where any subsequent run_x calls would fail.

On the edge case for the current approach, what about never flushing the last item in the buffer?
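That idea can be sketched as follows, assuming a list-based buffer and an append-only sink (the names are hypothetical): flush everything except the most recent row, so an overwrite at the same timestep always finds its target in memory.

```python
def flush_all_but_last(buffer, sink):
    """Move every row except the newest one from the in-memory buffer
    to the sink, keeping the most recent row overwritable in memory."""
    while len(buffer) > 1:
        sink.append(buffer.pop(0))
```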

@codebreaker32
Collaborator Author

On the edge case for the current approach, what about never flushing the last item in the buffer?

Agreed. However, the trade-off for this approach is that users would be strictly required to fetch their data through get_table_dataframe() to see the terminal state. If they bypass the recorder API and load the raw .parquet file directly from disk, that final timestep will be missing.

@quaquel
Member

quaquel commented Mar 4, 2026

Agreed. However, the trade-off for this approach is that users would be strictly required to fetch their data through get_table_dataframe() to see the terminal state. If they bypass the recorder API and load the raw .parquet file directly from disk, that final timestep will be missing.

Fair point. I am unfortunately not familiar with parquet, but it seems the real fix will require checking the file to see whether something needs to be overwritten. The other option is to at least document this behaviour for parquet.

@codebreaker32
Collaborator Author

codebreaker32 commented Mar 4, 2026

I have already implemented the following approach: if the in-memory buffer is empty, the recorder falls back to opening the physical .parquet file, filters out the old timestep (df = df[df["time"] != time]), and rewrites the file before appending the new data.
Are you comfortable with this approach of explicitly manipulating the .parquet file for this edge case, or would you prefer we stick to the simpler buffer logic and just document the quirk?

@quaquel
Member

quaquel commented Mar 4, 2026

As indicated, I have little knowledge regarding parquet, so I'll trust your judgment.

@quaquel quaquel merged commit b69bb53 into mesa:main Mar 4, 2026
13 of 14 checks passed
@codebreaker32 codebreaker32 deleted the run_ended branch March 6, 2026 14:52
@EwoutH EwoutH added experimental Release notes label enhancement Release notes label labels Mar 13, 2026