Skip to content

[BUG] batch_run silently returns incorrect data #3057

@Nithin9585

Description

@Nithin9585

The batch_run utility in mesa/batchrunner.py relies on a critical but incorrect assumption: that the simulation step number N directly corresponds to the list index N in the DataCollector's internal storage.

Specifically, in mesa/batchrunner.py:

# The code retrieves data using the step number as a specific list index
model_data = {param: values[step] for param, values in dc.model_vars.items()}

If a model calls datacollector.collect() more frequently (e.g., twice per step for sub-step resolution) or less frequently (e.g., conditionally) than exactly once per step, this assumption breaks.

  • Scenario A (Over-collection): If collect() is called twice per step, Step 5 is at Index 10. batch_run requests Index 5, returning data effectively from Step 2.5. This results in silent data corruption where the time axis is completely decoupled from the data values.
  • Scenario B (Sparse-collection): If collect() is called every 10 steps, Step 100 is at Index 10. batch_run requests Index 100, causing an IndexError crash.

Expected behavior
batch_run should retrieve data corresponding to the specific Model Step, regardless of how many items are in the DataCollector list.

If the user requests data for "Step 5", batch_run should look up the record where step == 5, rather than simply grabbing the 5th item in the list.

To Reproduce
Run the following script. It simulates a model that collects data twice per step (e.g., "start" and "end" of a step).

import unittest
from mesa.model import Model
from mesa.datacollection import DataCollector
from mesa.batchrunner import batch_run

class CorruptModel(Model):
    def __init__(self, *args, **kwargs):
        self.schedule = None
        super().__init__()
        self.datacollector = DataCollector(
            model_reporters={"RealStep": lambda m: m.steps}
        )
        # Collect INITIAL state
        self.datacollector.collect(self)

    def step(self):
        super().step()
        # Collect data TWICE per step to simulate sub-step resolution 
        # or separate 'move' and 'trade' phases
        self.datacollector.collect(self) 
        self.datacollector.collect(self) 

class TestBatchRunnerTimeDilation(unittest.TestCase):
    def test_time_dilation(self):
        # Run for 5 steps
        # Expected collections: Init(1) + 5 steps * 2 collections = 11 items.
        
        # BatchRunner will try to read 'Step 5'.
        # It naively grabs index 5.
        # Index 5 actually corresponds to the 2nd collection of Step 2 (Step 2b).
        
        results = batch_run(
            CorruptModel,
            parameters={},
            iterations=1,
            max_steps=5,
            data_collection_period=1, 
            display_progress=False
        )
        
        last_result = results[-1]
        reported_step = last_result["Step"] # BatchRunner says this is Step 5
        actual_step_data = last_result["RealStep"] # The data proves it is Step 2 or 3
        
        print(f"BatchRunner reports: Step {reported_step}")
        print(f"Actual data contains: Step {actual_step_data}")
        
        # This assert WILL FAIL
        assert reported_step == actual_step_data, \
            f"FATAL: BatchRunner returned data from Step {actual_step_data} when asked for Step {reported_step}"

if __name__ == '__main__':
    unittest.main()

Output:

BatchRunner reports: Step 5
Actual data contains: Step 3
FATAL: BatchRunner returned data from Step 3 when asked for Step 5

Additional context

  • Severity: High.
  • Why it is dangerous: In the "over-collection" scenario, the code does not crash. It produces a DataFrame that looks valid but contains scientifically invalid time-series data (e.g., epidemic curves will appear compressed or dilated).
  • Proposed Fix: The DataCollector structure might need to support key-based lookup (dictionary of steps), or batch_run needs to search the list for the correct step index rather than assuming index == step.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions