Perf: Optimize Cell memory usage by removing dynamic dict by falloficarus22 · Pull Request #3108 · mesa/mesa

falloficarus22 · 2026-01-10T14:23:06Z

Summary

This PR optimizes the memory efficiency of the Cell class and its subclasses by enforcing strict usage of __slots__ and removing instance __dict__. It also introduces a manual caching mechanism for spatial queries to maintain performance without the overhead of dynamic dictionaries.

Motive

Mesa environments often consist of millions of cells. Previously, the Cell class included __dict__ in its __slots__, which resulted in an overhead of approximately 300 bytes per instance (due to the presence of an empty dictionary). For high-resolution grids (e.g., 1000x1000), this contributed ~300MB of unnecessary RAM usage. This bottleneck limited the scalability of complex spatial models.

Implementation

The implementation focused on three main areas:

Strict Slot Enforcement: Removed __dict__ from Cell.__slots__ and ensured the dynamically created GridCell subclass in Grid defines __slots__ = () to prevent the re-introduction of dictionaries.
Manual Caching: Since functools.cache and functools.cached_property require __dict__ to store results, I implemented a manual_neighborhood_cache slot and logic for neighborhood and get_neighborhood methods.
Property Management: To allow Grid to continue using high-performance NumPy-backed
PropertyLayers for the empty attribute without triggering slot inheritance conflicts, I refactored empty into a property/setter backed by an internal _empty slot.
Robust Clash Detection: Updated the HasPropertyLayers mixin to correctly detect inherited slots and properties, ensuring the PropertyLayer system remains safe while supporting the new memory-efficient structure.

Usage Examples

The changes are transparent to the user but allow for significantly larger environments. For example, a model that previously crashed due to memory limits at a certain resolution can now scale further:

# Before this PR: 1,000,000 cells occupied ~300MB of overhead RAM
# After this PR: The overhead is eliminated, saving ~300MB per million cells
grid = OrthogonalMooreGrid((1000, 1000))

No changes to user code are required, as the Cell API (.agents, .empty, .neighborhood) remains identical.

Additional Notes

Backward Compatibility: All existing tests in tests/discrete_space/ passed, confirming that the internal refactoring did not change the public API or behavior.
Performance: Benchmark tests show that neighborhood query performance is preserved through the manual caching implementation.
Dependencies: No new dependencies were added.

Closes #3107

The Cell class currently includes in its , which retains a dynamic dictionary for every instance, adding significant memory overhead (~300 bytes per cell). For large-scale models, this overhead prevents scaling to millions of cells. Remove from Cell slots and transition to a fully slotted architecture. Since and rely on the existence of a , implement manual caching for spatial neighborhood queries using a dedicated slot. Update the dynamic GridCell creation to specify empty slots, ensuring subclasses do not re-introduce dictionaries. Refactor the [empty](cci:1://file://wsl.localhost/Ubuntu/root/mesa/mesa/discrete_space/cell.py:136:4-138:27) attribute into a managed property to avoid conflicts with the optimized PropertyLayer system used in Grids. Adjust the property layer clash detection to correctly identify slotted attributes and properties while exempting the standard [empty](cci:1://file://wsl.localhost/Ubuntu/root/mesa/mesa/discrete_space/cell.py:136:4-138:27) optimization. This change reduces RAM usage by approximately 300MB per million cells without impacting spatial query performance.

for more information, see https://pre-commit.ci

github-actions · 2026-01-10T14:30:43Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🟢 -15.0% [-16.0%, -14.2%]	🔴 +10.1% [+9.9%, +10.2%]
BoltzmannWealth	large	🟢 -22.3% [-23.0%, -21.5%]	🔴 +10.9% [+7.0%, +15.4%]
Schelling	small	🟢 -22.4% [-23.1%, -22.0%]	🟢 -3.5% [-3.9%, -3.1%]
Schelling	large	🟢 -19.1% [-19.5%, -18.9%]	🔵 +2.6% [+0.3%, +5.2%]
WolfSheep	small	🟢 -10.5% [-10.7%, -10.3%]	🔴 +4.8% [+4.5%, +5.1%]
WolfSheep	large	🟢 -13.3% [-16.0%, -11.7%]	🔴 +4.8% [+3.2%, +6.3%]
BoidFlockers	small	🔵 +1.8% [+1.2%, +2.4%]	🔵 +1.4% [+1.1%, +1.7%]
BoidFlockers	large	🔵 +1.1% [+0.5%, +1.6%]	🔵 -0.3% [-0.8%, +0.3%]

falloficarus22 · 2026-01-10T14:54:47Z

@quaquel Performance benchmarks provide a very clear picture of the trade-offs I've made. Here is my analysis of what’s happening:

The Win: Initialization Time (-10% to -22%)
The Initalization time improved significantly across almost all grid-based models.
Why? By removing __dict__ from millions of Cell instances, Python’s memory allocator has much less work to do. Allocation of a small, fixed-size object (slotted) is much faster than allocating an object that might eventually need a dynamic hash table.
Verdict: This is a massive win for users setting up large simulations.
The Concern: Run Time Regression (+5% to +10%)
We are seeing a noticeable regression in Run time, particularly in BoltzmannWealth (+10%) and WolfSheep (+4.8%).
Why? The culprit is almost certainly the manual caching logic I implemented for neighborhood.
The Bottleneck:

Python's functools.cache is implemented in C and is extremely fast at handling arguments. My replacement creates a tuple ("get_neighborhood", radius, include_center) on every call and performs a manual dictionary lookup in pure Python.
In models like Boltzmann Wealth, agents are constantly asking for their neighbors. The overhead of creating that tuple and the Python-level dict lookup is overriding the small speed gain provided by __slots__ attribute access.

To fix the regression, we should optimize the "hot path." In most Mesa models, radius=1 is called the vast majority of the time.

My Suggestion: Instead of a general dictionary for all radii, we can use a dedicated slot for the most common case:

Add a _neighborhood_1 slot specifically for the radius-1 neighborhood.
Store the CellCollection directly there.
Only fall back to the dictionary-based cache for radii > 1.

Why this would work:

It avoids tuple creation for radius=1.
It keeps the memory savings of __slots__.
It should bring the Run time back down (likely even faster than the original, since we'd be doing a simple attribute check instead of a functools.cache call).

I haven't implemented it yet but would like to give it a try. What are your thoughts?

quaquel · 2026-01-10T16:07:32Z

intialization is nanoseconds, so not a major issue. The 5%-10% runtime increase is more relevant because that is in milliseconds to seconds. So, runtime is orders of magnitude more relevant.

To be clear: I am open to exploring getting rid of __dict__. However, I find the memory premise underpinning this PR not convincing. Because of this, I am also skeptical about adding more code complications just to regain the lost performance. In short, I am inclined to prefer a simple @cache and __dict__ solution, with the additional memory footprint, preferable over more code complexity and lost performance. But again: if there is an elegant solution, I am open to considering it.

quaquel · 2026-01-11T12:42:05Z

Thanks for this PR. I am closing it in favor of #3113, which seems to offer a more promising direction for achieving the main aim of this PR, while also resolving interactions with #3080.

falloficarus22 and others added 2 commits January 10, 2026 14:12

[pre-commit.ci] auto fixes from pre-commit.com hooks

9c7efe1

for more information, see https://pre-commit.ci

quaquel added the performance Release notes label label Jan 10, 2026

quaquel mentioned this pull request Jan 11, 2026

pathfinding for __dict__ in Cell #3113

Closed

quaquel closed this Jan 11, 2026

falloficarus22 deleted the optimize-cell-memory branch January 11, 2026 16:26

quaquel mentioned this pull request Jan 12, 2026

Ensure Cell only uses __slots__ #3121

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Perf: Optimize Cell memory usage by removing dynamic dict#3108

Perf: Optimize Cell memory usage by removing dynamic dict#3108
falloficarus22 wants to merge 2 commits intomesa:mainfrom
falloficarus22:optimize-cell-memory

falloficarus22 commented Jan 10, 2026

Uh oh!

github-actions bot commented Jan 10, 2026

Uh oh!

falloficarus22 commented Jan 10, 2026 •

edited

Loading

Uh oh!

quaquel commented Jan 10, 2026 •

edited

Loading

Uh oh!

quaquel commented Jan 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

falloficarus22 commented Jan 10, 2026

Summary

Motive

Implementation

Usage Examples

Additional Notes

Uh oh!

github-actions bot commented Jan 10, 2026

Uh oh!

falloficarus22 commented Jan 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

quaquel commented Jan 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

quaquel commented Jan 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

falloficarus22 commented Jan 10, 2026 •

edited

Loading

quaquel commented Jan 10, 2026 •

edited

Loading