Fix: IndexError in batch_run with sparse data collection #2988

Merged
quaquel merged 12 commits into mesa:main from
Nithin9585:fix-batchrunner-sparse-collection
Jan 6, 2026

@Nithin9585
Contributor

Fixes #2987

This PR resolves an IndexError that occurs in batch_run() when models use sparse data collection (collecting data only at specific steps rather than every step).

Problem

The _collect_data() function in batchrunner.py incorrectly used step numbers as list indices:

model_data = {param: values[step] for param, values in dc.model_vars.items()}

When a model collects data sparsely (e.g., only at steps 0, 5, 10), the model_vars list has only 3 items (indices 0, 1, 2). Attempting to access values[5] or values[10] causes an IndexError.
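The failure can be reproduced without Mesa. The sketch below uses a plain dictionary as a stand-in for dc.model_vars; the "Gini" key, its values, and the helper name lookup_by_step are made up for illustration and are not taken from the PR.

```python
# Stand-in for dc.model_vars after collecting only at steps 0, 5 and 10.
# The "Gini" key and its values are illustrative, not taken from Mesa.
model_vars = {"Gini": [0.00, 0.21, 0.35]}


def lookup_by_step(step):
    """Mimic the old behaviour: treat the step number as a list index."""
    return {param: values[step] for param, values in model_vars.items()}


print(lookup_by_step(0))  # works: index 0 exists

try:
    lookup_by_step(5)  # data was collected at step 5, but it lives at index 1
except IndexError as exc:
    print(f"IndexError: {exc}")  # the list has only 3 entries
```

In this stand-in, steps 1 and 2 do not raise at all; they return the entries collected at steps 5 and 10, which is a quieter form of the same indexing mistake.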

Solution

Changed the implementation to map step numbers to their corresponding collection indices:

available_steps = sorted(dc._agent_records.keys())
if step not in available_steps:
    step = max((s for s in available_steps if s <= step), default=0)

try:
    collection_index = available_steps.index(step)
except ValueError:
    collection_index = 0

model_data = {param: values[collection_index] for param, values in dc.model_vars.items()}
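To see which index each requested step resolves to, the same logic can be wrapped in a standalone function. This is a minimal sketch: the function name collection_index_for is hypothetical, and available_steps is passed in directly instead of being read from a DataCollector.

```python
def collection_index_for(step, available_steps):
    """Return the list index of the latest collection at or before `step`.

    `available_steps` is the sorted list of steps at which data was collected.
    Falls back to index 0 when nothing was collected at or before `step`.
    """
    if step not in available_steps:
        # Snap to the most recent collected step that is not after `step`.
        step = max((s for s in available_steps if s <= step), default=0)
    try:
        return available_steps.index(step)
    except ValueError:
        # Reached when step 0 itself was never collected (or the list is empty).
        return 0


available_steps = [0, 5, 10]
for requested in (0, 3, 5, 7, 10, 12):
    print(requested, "->", collection_index_for(requested, available_steps))
```

For a sorted list, max(bisect.bisect_right(available_steps, step) - 1, 0) from the standard library yields the same index without the linear scans. That is an alternative, not what the PR uses.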

Testing

Added test_batch_run_sparse_collection() to verify the fix works with models that collect data every N steps.

Test output:

tests/test_batch_run.py::test_batch_run_sparse_collection PASSED [100%]

Changes

  • mesa/batchrunner.py: Fixed the IndexError in _collect_data()
  • tests/test_batch_run.py: Added regression test for sparse data collection

@github-actions

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| BoltzmannWealth | small | 🔵 +0.8% [+0.4%, +1.3%] | 🔵 -1.0% [-1.2%, -0.9%] |
| BoltzmannWealth | large | 🔵 +1.3% [-0.6%, +4.6%] | 🔵 -1.0% [-2.9%, +0.5%] |
| Schelling | small | 🔵 -0.3% [-1.3%, +0.7%] | 🔵 -1.1% [-1.7%, -0.4%] |
| Schelling | large | 🔵 -0.3% [-1.7%, +1.8%] | 🔵 -1.4% [-2.3%, -0.6%] |
| WolfSheep | small | 🔵 +0.2% [-0.6%, +0.8%] | 🔵 -0.4% [-0.8%, +0.0%] |
| WolfSheep | large | 🔵 -1.1% [-4.6%, +1.2%] | 🔵 -0.2% [-0.9%, +0.5%] |
| BoidFlockers | small | 🔵 -0.7% [-1.0%, -0.5%] | 🔵 +0.0% [-0.2%, +0.3%] |
| BoidFlockers | large | 🔵 -1.2% [-1.9%, -0.5%] | 🔵 -0.3% [-0.7%, +0.1%] |

@Nithin9585 Nithin9585 force-pushed the fix-batchrunner-sparse-collection branch from ee4f9b5 to 43b87aa December 22, 2025 11:40
@falloficarus22
Contributor

falloficarus22 commented Dec 22, 2025

Hey, wouldn't it be better to simply use values[-1] to get the latest collected data instead of trying to map steps to indices? That would be safer and work for all cases.

@Nithin9585
Contributor Author

@falloficarus22 Good point! But if data exists at steps [0, 5, 10] and we request step 3, wouldn't values[-1] return step 10's data when we actually want step 0's?

How would your approach handle different requested steps?
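The difference between the two approaches for that exact case can be checked with a few lines. This is an illustrative sketch only: the string placeholders stand in for collected values, and none of the names come from batchrunner.py.

```python
collected_steps = [0, 5, 10]
values = ["data@0", "data@5", "data@10"]  # one entry per collection

requested_step = 3

# Option A: always take the most recent entry in the list.
latest = values[-1]

# Option B: take the most recent entry collected at or before the request.
eligible = [i for i, s in enumerate(collected_steps) if s <= requested_step]
mapped = values[eligible[-1]] if eligible else values[0]

print(latest)  # data@10
print(mapped)  # data@0
```

values[-1] matches the mapped result only when the requested step is at or beyond the last collection.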

@falloficarus22
Contributor

> @falloficarus22 Good point! But if data exists at steps [0, 5, 10] and we request step 3, wouldn't values[-1] return step 10's data when we actually want step 0's?
>
> How would your approach handle different requested steps?

Well, I thought it would be possible with values[-1], but when I tried to apply the change locally I ran into multiple errors.
So I guess this approach is fine.

Your pre-commit is failing (maybe you didn't run ruff check and ruff format on the code).

@Nithin9585 Nithin9585 force-pushed the fix-batchrunner-sparse-collection branch from e69551a to dae37fe December 23, 2025 10:25
@Nithin9585
Contributor Author

@quaquel @EwoutH could you please review this PR?

@quaquel
Member

quaquel commented Jan 5, 2026

I tried to resolve the conflicts with main, but failed. Tests are failing at the moment. Do you have time to investigate?

Nithin9585 and others added 4 commits January 6, 2026 18:57
Resolves test failures after merge with main by properly integrating
two fixes that were conflicting:

1. Sparse collection fix (PR mesa#2988 for issue mesa#2987): Handle models
   that collect data irregularly (e.g., only at steps 0, 5, 10)

2. Time dilation fix (PR mesa#3058): Handle DataCollector.collect()
   called multiple times per step

Changes:
- Prioritize _collection_steps when available (modern DataCollector)
- Fall back to sparse collection logic when step not found
- Support legacy DataCollectors without _collection_steps

All 12 batch_run tests now pass, including both
test_batch_run_sparse_collection and test_batch_run_time_dilation.

Adds test_batch_run_empty_collection_edge_case to cover the scenario
where data is requested before any collection has occurred. This tests
the IndexError exception handler in _collect_data when idx is out of
bounds and model_vars is empty (lines 277-282 in batchrunner.py).

This improves patch coverage for the merge conflict fix.
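
The lookup order described in these commit messages might look roughly like the sketch below. It is a hypothetical reconstruction, not the merged code: resolve_model_data is an invented name, the collectors are SimpleNamespace stand-ins instead of real DataCollector objects, returning None for an empty collection is an assumption, and repeated collect() calls within one step (the time dilation case) are not modelled.

```python
from types import SimpleNamespace


def resolve_model_data(dc, step):
    """Hypothetical lookup order; not the code merged in the PR."""
    # Prefer an explicit record of collection steps when the collector has one.
    steps = getattr(dc, "_collection_steps", None)
    if steps is None:
        # Legacy collectors: derive the steps from the agent records instead.
        steps = sorted(dc._agent_records.keys())

    if step in steps:
        idx = steps.index(step)
    else:
        # Sparse fallback: latest collection at or before the requested step.
        earlier = [i for i, s in enumerate(steps) if s <= step]
        idx = earlier[-1] if earlier else 0

    try:
        return {name: values[idx] for name, values in dc.model_vars.items()}
    except IndexError:
        # Nothing collected yet: report missing values rather than crashing
        # (the None placeholder is an assumption of this sketch).
        return {name: None for name in dc.model_vars}


modern = SimpleNamespace(
    _collection_steps=[0, 5, 10], _agent_records={}, model_vars={"x": [1, 2, 3]}
)
legacy = SimpleNamespace(_agent_records={0: [], 5: []}, model_vars={"x": [1, 2]})
empty = SimpleNamespace(_collection_steps=[], _agent_records={}, model_vars={"x": []})

print(resolve_model_data(modern, 7))  # {'x': 2}
print(resolve_model_data(legacy, 5))  # {'x': 2}
print(resolve_model_data(empty, 3))   # {'x': None}
```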
@Nithin9585
Contributor Author

@quaquel I've resolved the merge conflict!

@quaquel
Member

quaquel commented Jan 6, 2026

> @quaquel I've resolved the merge conflict!

Thanks! I'll try to look at it later today.

@quaquel quaquel closed this Jan 6, 2026
@quaquel quaquel reopened this Jan 6, 2026
@quaquel quaquel merged commit e1eafb3 into mesa:main Jan 6, 2026
22 of 25 checks passed
@quaquel quaquel added the enhancement Release notes label label Jan 6, 2026
@EwoutH
Member

EwoutH commented Jan 7, 2026

Thanks for the PR, and for reviewing.

I think this is a bug fix, right? If so, it should be labeled as such.

@quaquel
Member

quaquel commented Jan 7, 2026

It's not so much a bug fix as an enhancement of what used to be an IndexError (at least in my understanding).

@Nithin9585 Nithin9585 deleted the fix-batchrunner-sparse-collection branch January 7, 2026 08:51
@EwoutH EwoutH added bug Release notes label and removed enhancement Release notes label labels Jan 9, 2026
@EwoutH EwoutH changed the title Fix #2987: IndexError in batch_run with sparse data collection Fix: IndexError in batch_run with sparse data collection Jan 9, 2026
Labels

bug Release notes label

Development

Successfully merging this pull request may close these issues.

BatchRunner crashes with sparse data collection (IndexError)

4 participants