Fix: Correct data retrieval in batch_run when DataCollector.collect() called multiple times per step by Nithin9585 · Pull Request #3058 · mesa/mesa

Nithin9585 · 2026-01-02T17:03:45Z

fixes #3057

Problem

The batch_run utility incorrectly assumed that the step number equals the list index when retrieving data from DataCollector. This caused wrong data retrieval when DataCollector.collect() was called multiple times within a single model step (e.g., collecting data at initialization, mid-step, and end-of-step).

Example Scenario

Model runs for 5 steps
collect() called 3 times per step (initialization, mid-step, end-of-step)
Total collections: 15 entries in the data list
When batch_run requested data for step 5, it accessed index 5 instead of indices 12-14
Result: Retrieved data from step 2 instead of step 5

Solution

Changes Made

DataCollector (mesa/datacollection.py):

Added _collection_steps list to track which model time each collection occurred at
Appends model.time to _collection_steps on every collect() call
Uses model.time (instead of deprecated model.steps) throughout for consistency with Add model.time as universal source of truth for simulation time #2903

BatchRunner (mesa/batchrunner.py):

Modified _collect_data() to use binary search (bisect.bisect_right)
Maps model time values to correct collection indices via _collection_steps
Handles multiple collections per time step correctly

Tests (tests/test_batch_run.py):

Added test_batch_run_time_dilation to verify the fix
Tests scenario with 3 collections per step over 5 steps
Validates correct data retrieval for specific time steps
Uses model.time in all test model reporters for consistency

…d multiple times per step - Added _collection_steps tracking in DataCollector to record when each collection occurs - Modified batch_run to use binary search (bisect) to map model steps to collection indices - Previously assumed step number equals list index, causing incorrect data retrieval - Added test_batch_run_time_dilation to verify fix with multiple collections per step Fixes issue where batch_run would retrieve wrong data when collect() is called multiple times within a single model step.

github-actions · 2026-01-02T17:11:14Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔵 -0.0% [-0.5%, +0.5%]	🔵 -0.2% [-0.3%, -0.0%]
BoltzmannWealth	large	🔵 -0.8% [-1.0%, -0.5%]	🔵 -1.6% [-2.9%, -0.4%]
Schelling	small	🔵 -0.5% [-1.5%, +0.5%]	🔵 -0.2% [-0.7%, +0.3%]
Schelling	large	🔵 -1.1% [-1.4%, -0.8%]	🔵 -1.2% [-1.8%, -0.6%]
WolfSheep	small	🔵 -0.5% [-0.6%, -0.4%]	🔵 +0.1% [-0.1%, +0.2%]
WolfSheep	large	🔵 -2.6% [-5.6%, -1.0%]	🔵 -0.1% [-1.5%, +0.9%]
BoidFlockers	small	🔵 +1.5% [+0.9%, +2.1%]	🔵 +1.0% [+0.8%, +1.1%]
BoidFlockers	large	🔵 +1.6% [+1.3%, +2.0%]	🔵 +0.1% [-0.0%, +0.2%]

- Added test_batch_run_legacy_datacollector to test backwards compatibility with DataCollectors lacking _collection_steps - Added test_batch_run_missing_step to test sparse collection scenarios where requested steps may not have data - Enhanced batchrunner.py error handling with try-except blocks for IndexError cases - Made datacollection.py backwards compatible by checking for _collection_steps existence before appending - All 11 batch_run tests now pass including edge cases

for more information, see https://pre-commit.ci

quaquel · 2026-01-02T21:12:41Z

mesa/datacollection.py

    def collect(self, model):
        """Collect all the data for the given model object."""
        if self.model_reporters:
+            if hasattr(self, "_collection_steps"):


As per #2903, it would be better to use model.time. model.step is likely to be deprecated in the near future.

Thank you for the feedback! You're absolutely right. I'll update the implementation to use model.time instead of model.steps . I'll push the updated code shortly.

quaquel · 2026-01-02T21:13:47Z

mesa/batchrunner.py

    dc = model.datacollector

-    model_data = {param: values[step] for param, values in dc.model_vars.items()}
+    if hasattr(dc, "_collection_steps"):


What is going in in this if block? The pr description is about data collection, yet this changes the batchrunner.

The fix requires changes in both DataCollector and BatchRunner because they work together to solve the bug.

The Root Cause:
The bug is in BatchRunner's _collect_data() function, specifically this line:

model_data = {param: values[step] for param, values in dc.model_vars.items()}

Nithin9585 · 2026-01-03T07:37:49Z

@quaquel Thank you for the feedback! . I've updated the implementation to use model.time instead of model.steps throughout the fix to align with the deprecation plan from #2903.

Changes made:

In mesa/datacollection.py:

_collection_steps.append(model.time) (line you commented on)
_record_agents(): Changed agent prefix to use agent.model.time
_record_agenttype(): Changed agent prefix to use agent.model.time
collect(): Changed dictionary keys to use model.time for _agent_records and _agenttype_records

In tests/test_batch_run.py:

Updated all test model reporters to use m.time instead of m.steps
Updated conditional logic in test models to use self.time

This ensures consistency throughout the fix and avoids mixing deprecated (model.steps) and new (model.time) approaches.

All tests pass with these changes. Pushing the update now.

- Updated DataCollector to use model.time in _collection_steps tracking - Changed _record_agents and _record_agenttype to use model.time for dictionary keys - Updated all test models to use model.time in reporters and conditions - Ensures consistency with Mesa's deprecation plan (issue mesa#2903) - All tests passing, ruff checks clean

github-actions · 2026-01-05T07:46:08Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔵 +0.2% [-0.9%, +1.4%]	🔵 +2.4% [+2.3%, +2.6%]
BoltzmannWealth	large	🔵 +0.9% [+0.1%, +1.7%]	🔵 +0.1% [-2.1%, +2.0%]
Schelling	small	🔵 +1.6% [+0.7%, +2.6%]	🔵 +1.8% [+1.3%, +2.4%]
Schelling	large	🔵 +2.0% [+1.3%, +2.6%]	🔵 +4.9% [+2.7%, +6.9%]
WolfSheep	small	🔵 +1.4% [+0.9%, +1.8%]	🔵 +0.5% [+0.2%, +0.8%]
WolfSheep	large	🔵 -2.4% [-5.9%, -0.0%]	🟢 -5.1% [-7.3%, -3.0%]
BoidFlockers	small	🔵 +1.6% [+1.1%, +2.0%]	🔵 +2.1% [+1.9%, +2.3%]
BoidFlockers	large	🔴 +3.9% [+3.5%, +4.4%]	🔵 +1.5% [+1.3%, +1.6%]

quaquel · 2026-01-05T07:52:50Z

Can you update the PR description to reflect the shift to using model.time? Otherwise, this seems good to go.

Nithin9585 · 2026-01-05T08:11:53Z

@quaquel I've updated the PR description to reflect the shift to using model.time instead of model.steps.

Resolves test failures after merge with main by properly integrating two fixes that were conflicting: 1. Sparse collection fix (PR mesa#2988 for issue mesa#2987): Handle models that collect data irregularly (e.g., only at steps 0, 5, 10) 2. Time dilation fix (PR mesa#3058): Handle DataCollector.collect() called multiple times per step Changes: - Prioritize _collection_steps when available (modern DataCollector) - Fall back to sparse collection logic when step not found - Support legacy DataCollectors without _collection_steps All 12 batch_run tests now pass, including both test_batch_run_sparse_collection and test_batch_run_time_dilation.

* Fix #2987: IndexError in batch_run with sparse data collection * Add docstrings to test classes to fix ruff linting errors * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix merge conflict: Integrate sparse collection and time dilation fixes Resolves test failures after merge with main by properly integrating two fixes that were conflicting: 1. Sparse collection fix (PR #2988 for issue #2987): Handle models that collect data irregularly (e.g., only at steps 0, 5, 10) 2. Time dilation fix (PR #3058): Handle DataCollector.collect() called multiple times per step Changes: - Prioritize _collection_steps when available (modern DataCollector) - Fall back to sparse collection logic when step not found - Support legacy DataCollectors without _collection_steps All 12 batch_run tests now pass, including both test_batch_run_sparse_collection and test_batch_run_time_dilation. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add test for empty collection edge case to improve coverage Adds test_batch_run_empty_collection_edge_case to cover the scenario where data is requested before any collection has occurred. This tests the IndexError exception handler in _collect_data when idx is out of bounds and model_vars is empty (lines 277-282 in batchrunner.py). This improves patch coverage for the merge conflict fix. --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Jan Kwakkel <[email protected]>

Nithin9585 force-pushed the fix-batchrunner-time-dilation branch from db2f12e to c8fbfa1 Compare January 2, 2026 17:12

[pre-commit.ci] auto fixes from pre-commit.com hooks

10dfb8f

for more information, see https://pre-commit.ci

quaquel added the bug Release notes label label Jan 2, 2026

quaquel reviewed Jan 2, 2026

View reviewed changes

Nithin9585 and others added 2 commits January 3, 2026 13:10

Merge branch 'main' into fix-batchrunner-time-dilation

78a66e0

quaquel added the trigger-benchmarks Special label that triggers the benchmarking CI label Jan 5, 2026

quaquel removed the trigger-benchmarks Special label that triggers the benchmarking CI label Jan 5, 2026

Merge branch 'main' into fix-batchrunner-time-dilation

d508cf9

quaquel approved these changes Jan 5, 2026

View reviewed changes

quaquel merged commit cefa5dc into mesa:main Jan 5, 2026
14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix: Correct data retrieval in batch_run when DataCollector.collect() called multiple times per step#3058

Fix: Correct data retrieval in batch_run when DataCollector.collect() called multiple times per step#3058
quaquel merged 6 commits intomesa:mainfrom
Nithin9585:fix-batchrunner-time-dilation

Nithin9585 commented Jan 2, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Jan 2, 2026

Uh oh!

quaquel Jan 2, 2026

Uh oh!

Nithin9585 Jan 3, 2026

Uh oh!

quaquel Jan 2, 2026

Uh oh!

Nithin9585 Jan 3, 2026 •

edited

Loading

Uh oh!

Nithin9585 commented Jan 3, 2026

Uh oh!

github-actions bot commented Jan 5, 2026

Uh oh!

quaquel commented Jan 5, 2026

Uh oh!

Nithin9585 commented Jan 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Nithin9585 commented Jan 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Example Scenario

Solution

Changes Made

Uh oh!

github-actions bot commented Jan 2, 2026

Uh oh!

quaquel Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Nithin9585 Jan 3, 2026

Choose a reason for hiding this comment

Uh oh!

quaquel Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Nithin9585 Jan 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Nithin9585 commented Jan 3, 2026

Uh oh!

github-actions bot commented Jan 5, 2026

Uh oh!

quaquel commented Jan 5, 2026

Uh oh!

Nithin9585 commented Jan 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Nithin9585 commented Jan 2, 2026 •

edited

Loading

Nithin9585 Jan 3, 2026 •

edited

Loading