Conversation
…d multiple times per step - Added _collection_steps tracking in DataCollector to record when each collection occurs - Modified batch_run to use binary search (bisect) to map model steps to collection indices - Previously assumed step number equals list index, causing incorrect data retrieval - Added test_batch_run_time_dilation to verify fix with multiple collections per step Fixes issue where batch_run would retrieve wrong data when collect() is called multiple times within a single model step.
|
Performance benchmarks:
|
- Added test_batch_run_legacy_datacollector to test backwards compatibility with DataCollectors lacking _collection_steps - Added test_batch_run_missing_step to test sparse collection scenarios where requested steps may not have data - Enhanced batchrunner.py error handling with try-except blocks for IndexError cases - Made datacollection.py backwards compatible by checking for _collection_steps existence before appending - All 11 batch_run tests now pass including edge cases
db2f12e to
c8fbfa1
Compare
| def collect(self, model): | ||
| """Collect all the data for the given model object.""" | ||
| if self.model_reporters: | ||
| if hasattr(self, "_collection_steps"): |
There was a problem hiding this comment.
As per #2903, it would be better to use model.time. model.step is likely to be deprecated in the near future.
There was a problem hiding this comment.
Thank you for the feedback! You're absolutely right. I'll update the implementation to use model.time instead of model.steps . I'll push the updated code shortly.
| dc = model.datacollector | ||
|
|
||
| model_data = {param: values[step] for param, values in dc.model_vars.items()} | ||
| if hasattr(dc, "_collection_steps"): |
There was a problem hiding this comment.
What is going in in this if block? The pr description is about data collection, yet this changes the batchrunner.
There was a problem hiding this comment.
The fix requires changes in both DataCollector and BatchRunner because they work together to solve the bug.
The Root Cause:
The bug is in BatchRunner's _collect_data() function, specifically this line:
model_data = {param: values[step] for param, values in dc.model_vars.items()}|
@quaquel Thank you for the feedback! . I've updated the implementation to use Changes made: In mesa/datacollection.py:
In tests/test_batch_run.py:
This ensures consistency throughout the fix and avoids mixing deprecated ( All tests pass with these changes. Pushing the update now. |
- Updated DataCollector to use model.time in _collection_steps tracking - Changed _record_agents and _record_agenttype to use model.time for dictionary keys - Updated all test models to use model.time in reporters and conditions - Ensures consistency with Mesa's deprecation plan (issue mesa#2903) - All tests passing, ruff checks clean
|
Performance benchmarks:
|
|
Can you update the PR description to reflect the shift to using model.time? Otherwise, this seems good to go. |
|
@quaquel I've updated the PR description to reflect the shift to using |
Resolves test failures after merge with main by properly integrating two fixes that were conflicting: 1. Sparse collection fix (PR mesa#2988 for issue mesa#2987): Handle models that collect data irregularly (e.g., only at steps 0, 5, 10) 2. Time dilation fix (PR mesa#3058): Handle DataCollector.collect() called multiple times per step Changes: - Prioritize _collection_steps when available (modern DataCollector) - Fall back to sparse collection logic when step not found - Support legacy DataCollectors without _collection_steps All 12 batch_run tests now pass, including both test_batch_run_sparse_collection and test_batch_run_time_dilation.
* Fix #2987: IndexError in batch_run with sparse data collection * Add docstrings to test classes to fix ruff linting errors * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix merge conflict: Integrate sparse collection and time dilation fixes Resolves test failures after merge with main by properly integrating two fixes that were conflicting: 1. Sparse collection fix (PR #2988 for issue #2987): Handle models that collect data irregularly (e.g., only at steps 0, 5, 10) 2. Time dilation fix (PR #3058): Handle DataCollector.collect() called multiple times per step Changes: - Prioritize _collection_steps when available (modern DataCollector) - Fall back to sparse collection logic when step not found - Support legacy DataCollectors without _collection_steps All 12 batch_run tests now pass, including both test_batch_run_sparse_collection and test_batch_run_time_dilation. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add test for empty collection edge case to improve coverage Adds test_batch_run_empty_collection_edge_case to cover the scenario where data is requested before any collection has occurred. This tests the IndexError exception handler in _collect_data when idx is out of bounds and model_vars is empty (lines 277-282 in batchrunner.py). This improves patch coverage for the merge conflict fix. --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Jan Kwakkel <[email protected]>
fixes #3057
Problem
The
batch_runutility incorrectly assumed that the step number equals the list index when retrieving data fromDataCollector. This caused wrong data retrieval whenDataCollector.collect()was called multiple times within a single model step (e.g., collecting data at initialization, mid-step, and end-of-step).Example Scenario
collect()called 3 times per step (initialization, mid-step, end-of-step)batch_runrequested data for step 5, it accessed index 5 instead of indices 12-14Solution
Changes Made
DataCollector (
mesa/datacollection.py):_collection_stepslist to track which model time each collection occurred atmodel.timeto_collection_stepson everycollect()callmodel.time(instead of deprecatedmodel.steps) throughout for consistency with Addmodel.timeas universal source of truth for simulation time #2903BatchRunner (
mesa/batchrunner.py):_collect_data()to use binary search (bisect.bisect_right)_collection_stepsTests (
tests/test_batch_run.py):test_batch_run_time_dilationto verify the fixmodel.timein all test model reporters for consistency