DataCollector Mutable Reference Leak #3035
Description
The DataCollector fails to create deep copies of agent-level data when collecting. When an agent reporter returns a mutable object (list, dict, numpy array, etc.), the DataCollector stores a reference to that object instead of a copy. As the agent modifies this object in subsequent steps, all historical records are retroactively updated to reflect the current state, completely invalidating longitudinal data analysis.
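The aliasing described above can be reproduced without Mesa. The following is a minimal sketch in plain Python; the names `history_by_reference`, `history_by_copy`, and `grades` are illustrative only and do not come from Mesa.

```python
import copy

history_by_reference = []  # stores the same list object on every step
history_by_copy = []       # stores an independent snapshot on every step
grades = []                # stands in for a mutable agent attribute

for step in range(1, 4):
    history_by_reference.append(grades)            # reference: later appends show up here
    history_by_copy.append(copy.deepcopy(grades))  # snapshot: frozen at this step
    grades.append(step)                            # the "agent" mutates its attribute

print(history_by_reference)  # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
print(history_by_copy)       # [[], [1], [1, 2]]
```

Every entry in `history_by_reference` is the same object as `grades`, which is why the earlier records appear to change after the fact.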
Expected behavior
When collecting agent data at each step, the DataCollector should preserve the state of mutable objects at that specific time step. Historical records should remain unchanged when agents modify their attributes in later steps.
For example:
- Step 1: Agent has grades = [], DataCollector should record []
- Step 2: Agent has grades = [85], DataCollector should record [85]
- Step 3: Agent has grades = [85, 92], DataCollector should record [85, 92]
When reviewing historical data, Step 1 should still show [], Step 2 should show [85], and Step 3 should show [85, 92].
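Until the collector copies values itself, one way to get this behavior from the user side might be a reporter that returns a snapshot rather than the attribute itself. This is a sketch in plain Python and a workaround the issue does not propose; `Student` and `snapshot_reporter` are hypothetical names, and the loop stands in for repeated collection.

```python
from copy import deepcopy


class Student:
    """Hypothetical stand-in for an agent with a mutable attribute."""

    def __init__(self):
        self.grades = []


def snapshot_reporter(agent):
    # Return an independent copy so later mutations cannot reach the record.
    return deepcopy(agent.grades)


student = Student()
records = []
for grade in (85, 92):
    records.append(snapshot_reporter(student))  # collect first
    student.grades.append(grade)                # then the agent changes state
records.append(snapshot_reporter(student))      # final collection

print(records)  # [[], [85], [85, 92]]
```

A callable like `snapshot_reporter` could presumably be passed in `agent_reporters` in place of a lambda that returns the attribute directly, at the cost of one copy per agent per collection.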
To Reproduce
from mesa.datacollection import DataCollector
from mesa.model import Model
from mesa.agent import Agent


class TestAgent(Agent):
    def __init__(self, model):
        super().__init__(model)
        self.my_list = []  # Mutable attribute

    def step(self):
        self.my_list.append(self.model.steps)


class TestModel(Model):
    def __init__(self):
        super().__init__()
        self.agent = TestAgent(self)
        # Track the mutable list
        self.datacollector = DataCollector(
            agent_reporters={"MyList": lambda a: a.my_list}
        )

    def step(self):
        self.datacollector.collect(self)
        self.agent.step()


# Run simulation
model = TestModel()
model.step()  # Step 1: list is [] when collected
model.step()  # Step 2: list is [1] when collected
model.step()  # Step 3: list is [1, 2] when collected

# Check historical data
df = model.datacollector.get_agent_vars_dataframe()
print(df)

# BUG: All steps show [1, 2, 3] instead of their historical values
# Expected:
# Step 1: []
# Step 2: [1]
# Step 3: [1, 2]
# Actual:
# Step 1: [1, 2, 3]
# Step 2: [1, 2, 3]
# Step 3: [1, 2, 3]
Additional context
- This bug affects any mutable data type: lists, dicts, sets, numpy arrays, custom objects, etc.
- The DataCollector already uses deepcopy() for model-level reporters (line 330 in datacollection.py) but not for agent-level reporters
- The fix is to apply deepcopy() to agent reporter results in the _record_agents() method (around line 284)
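The shape of the proposed fix can be sketched as follows. This is not Mesa's actual `_record_agents()` code; `record_agents` and `FakeAgent` are hypothetical stand-ins that only illustrate wrapping each reporter result in `deepcopy()` before storing it.

```python
from copy import deepcopy


class FakeAgent:
    """Hypothetical agent; only the attributes used below are modeled."""

    def __init__(self, unique_id):
        self.unique_id = unique_id
        self.my_list = []


def record_agents(agents, reporters, step):
    """Hypothetical stand-in for the agent-recording step, not Mesa's code."""
    records = []
    for agent in agents:
        # deepcopy() detaches each stored value from the live agent attribute.
        values = tuple(deepcopy(reporter(agent)) for reporter in reporters.values())
        records.append((step, agent.unique_id, *values))
    return records


agent = FakeAgent(unique_id=1)
reporters = {"MyList": lambda a: a.my_list}
history = []
for step in range(1, 4):
    history.extend(record_agents([agent], reporters, step))  # collect
    agent.my_list.append(step)                               # then mutate

print(history)  # [(1, 1, []), (2, 1, [1]), (3, 1, [1, 2])]
```

A deep copy rather than a shallow one is used here because nested containers (for example a dict of lists) would otherwise still share their inner objects with the agent.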
This bug silently corrupts historical data, making research conclusions based on Mesa simulations potentially invalid when tracking mutable agent attributes.