-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
[BUG] batch_run silently returns incorrect data #3057
Description
The batch_run utility in mesa/batchrunner.py relies on a critical but incorrect assumption: that the simulation step number N directly corresponds to the list index N in the DataCollector's internal storage.
Specifically, in mesa/batchrunner.py:
# The code retrieves data using the step number as a specific list index
model_data = {param: values[step] for param, values in dc.model_vars.items()}If a model calls datacollector.collect() more frequently (e.g., twice per step for sub-step resolution) or less frequently (e.g., conditionally) than exactly once per step, this assumption breaks.
- Scenario A (Over-collection): If
collect()is called twice per step, Step 5 is at Index 10.batch_runrequestsIndex 5, returning data effectively from Step 2.5. This results in silent data corruption where the time axis is completely decoupled from the data values. - Scenario B (Sparse-collection): If
collect()is called every 10 steps, Step 100 is at Index 10.batch_runrequestsIndex 100, causing anIndexErrorcrash.
Expected behavior
batch_run should retrieve data corresponding to the specific Model Step, regardless of how many items are in the DataCollector list.
If the user requests data for "Step 5", batch_run should look up the record where step == 5, rather than simply grabbing the 5th item in the list.
To Reproduce
Run the following script. It simulates a model that collects data twice per step (e.g., "start" and "end" of a step).
import unittest
from mesa.model import Model
from mesa.datacollection import DataCollector
from mesa.batchrunner import batch_run
class CorruptModel(Model):
def __init__(self, *args, **kwargs):
self.schedule = None
super().__init__()
self.datacollector = DataCollector(
model_reporters={"RealStep": lambda m: m.steps}
)
# Collect INITIAL state
self.datacollector.collect(self)
def step(self):
super().step()
# Collect data TWICE per step to simulate sub-step resolution
# or separate 'move' and 'trade' phases
self.datacollector.collect(self)
self.datacollector.collect(self)
class TestBatchRunnerTimeDilation(unittest.TestCase):
def test_time_dilation(self):
# Run for 5 steps
# Expected collections: Init(1) + 5 steps * 2 collections = 11 items.
# BatchRunner will try to read 'Step 5'.
# It naively grabs index 5.
# Index 5 actually corresponds to the 2nd collection of Step 2 (Step 2b).
results = batch_run(
CorruptModel,
parameters={},
iterations=1,
max_steps=5,
data_collection_period=1,
display_progress=False
)
last_result = results[-1]
reported_step = last_result["Step"] # BatchRunner says this is Step 5
actual_step_data = last_result["RealStep"] # The data proves it is Step 2 or 3
print(f"BatchRunner reports: Step {reported_step}")
print(f"Actual data contains: Step {actual_step_data}")
# This assert WILL FAIL
assert reported_step == actual_step_data, \
f"FATAL: BatchRunner returned data from Step {actual_step_data} when asked for Step {reported_step}"
if __name__ == '__main__':
unittest.main()Output:
BatchRunner reports: Step 5
Actual data contains: Step 3
FATAL: BatchRunner returned data from Step 3 when asked for Step 5
Additional context
- Severity: High.
- Why it is dangerous: In the "over-collection" scenario, the code does not crash. It produces a
DataFramethat looks valid but contains scientifically invalid time-series data (e.g., epidemic curves will appear compressed or dilated). - Proposed Fix: The
DataCollectorstructure might need to support key-based lookup (dictionary of steps), orbatch_runneeds to search the list for the correct step index rather than assumingindex == step.