Fix for Memory leak by quaquel · Pull Request #3180 · mesa/mesa

quaquel · 2026-01-19T21:11:56Z

This is a bugfix for the memory leak identified in #3179.

The problem is that Agent._ids is a defaultdict that stores references to model instances. This was done to ensure that unique_id is unique relative to a given model. However, since it is a class attribute, this reference persists for the lifetime of the Python process, preventing the entire model blob from being garbage-collected.
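To make the mechanism concrete, here is a minimal sketch of the leak. The `Model` and `LeakyAgent` classes are simplified stand-ins written for illustration, not Mesa's actual code; only the idea of a class-level `_ids` defaultdict keyed by the model is taken from the description above.

```python
import gc
import itertools
import weakref
from collections import defaultdict


class Model:
    """Hypothetical stand-in for a Mesa model."""

    def __init__(self):
        self.agents = []


class LeakyAgent:
    # Class-level registry with one counter per model instance.
    # Each dict key is a strong reference to a model.
    _ids = defaultdict(lambda: itertools.count(1))

    def __init__(self, model):
        self.model = model
        self.unique_id = next(self._ids[model])
        model.agents.append(self)


def run_once():
    model = Model()
    for _ in range(3):
        LeakyAgent(model)
    # Only a weak reference leaves this function.
    return weakref.ref(model)


model_ref = run_once()
gc.collect()
leaked = model_ref() is not None  # the class attribute still holds the model

LeakyAgent._ids.clear()  # drop the class-level references by hand
gc.collect()
freed = model_ref() is None
```

In this sketch the model only becomes collectable once the entry in the class-level dictionary is removed, which matches the behaviour described above.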

There are various solutions:

1. Use a weakref of the model
2. Use the hash of the model
3. Move the assignment of unique_id into register_agent
4. Add some method to Model to clean up (not yet explored), including removing the ref in Agent._ids
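For comparison, the weakref option could look roughly like the sketch below. This is not what the PR implements; it assumes a `weakref.WeakKeyDictionary` in place of the defaultdict and uses hypothetical stand-in classes.

```python
import gc
import itertools
import weakref


class Model:
    """Hypothetical stand-in for a Mesa model."""

    def __init__(self):
        self.agents = []


class Agent:
    # Keys are held weakly, so an entry disappears once its model is freed.
    _ids = weakref.WeakKeyDictionary()

    def __init__(self, model):
        self.model = model
        counter = self._ids.setdefault(model, itertools.count(1))
        self.unique_id = next(counter)
        model.agents.append(self)


def build():
    model = Model()
    ids = [Agent(model).unique_id for _ in range(3)]
    return weakref.ref(model), ids


model_ref, ids = build()
gc.collect()  # the model/agent reference cycle is collected here
```

The extra dictionary lookup on every agent creation remains in this variant, which may be one reason it was not the fastest option.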

Here, I implement option 3 because, among the options tested, it was the fastest locally. I also moved away from itertools.count and instead just use an index that is being incremented. The main reason is that itertools.count will not be pickleable in Python 3.14, and count is overkill for the simple integer increments needed here anyway.
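A minimal sketch of option 3, under the assumption that the model keeps a plain integer counter and assigns the ID during registration. The classes are simplified stand-ins, and the attribute names `_agents` and `agent_id_counter` follow the names used elsewhere in this thread rather than a verified copy of the merged code.

```python
import gc
import weakref


class Model:
    """Hypothetical stand-in for a Mesa model."""

    def __init__(self):
        self._agents = {}
        self.agent_id_counter = 1  # plain int, first ID is 1

    def register_agent(self, agent):
        # The ID is assigned per model instance, so no class-level state is needed.
        agent.unique_id = self.agent_id_counter
        self.agent_id_counter += 1
        self._agents[agent] = None


class Agent:
    def __init__(self, model):
        self.model = model
        self.unique_id = None
        model.register_agent(self)


def build():
    model = Model()
    ids = [Agent(model).unique_id for _ in range(3)]
    return weakref.ref(model), ids


first_ref, first_ids = build()
second_ref, second_ids = build()
gc.collect()  # nothing outside the model/agent cycle refers to either model
```

Each model numbers its agents independently, and both models can be collected once they go out of scope.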

For reasons that escape me at present, it is still necessary to remove all agents from the model before it can be garbage-collected, at least in the updated test_examples. But when I try a minimal version of Boltzmann, this seems unnecessary. So, there might be some other memory issue remaining.


github-actions · 2026-01-19T21:34:49Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔵 -1.8% [-2.2%, -1.4%]	🔵 -1.2% [-1.4%, -1.1%]
BoltzmannWealth	large	🔵 +3.2% [-2.0%, +9.6%]	🔴 +8.8% [+5.8%, +11.7%]
Schelling	small	🟢 -4.1% [-4.4%, -3.8%]	🔵 -1.1% [-1.2%, -1.0%]
Schelling	large	🔵 +2.0% [-0.8%, +4.9%]	🔵 -0.6% [-1.7%, +0.4%]
WolfSheep	small	🔵 -1.6% [-2.8%, -0.0%]	🔵 -1.2% [-1.5%, -0.9%]
WolfSheep	large	🔴 +24.5% [+13.5%, +35.4%]	🔵 -0.3% [-1.0%, +0.5%]
BoidFlockers	small	🔵 -2.3% [-2.9%, -1.7%]	🔵 -0.3% [-0.5%, -0.0%]
BoidFlockers	large	🔵 -3.0% [-3.5%, -2.4%]	🔵 -0.6% [-0.9%, -0.4%]

quaquel · 2026-01-19T21:37:09Z

Ok, these results are better than when testing locally, so this seems a reasonable solution.

Now we just need to add support for pickling the model because itertools.count won't be pickleable anymore (not that difficult to achieve I think), and we need to figure out why the alliance formation model breaks.

commit 4b71cbe
Author: Jan Kwakkel
Date: Tue Jan 20 07:57:16 2026 +0100
    Update meta_agent.py

commit 702944a
Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Tue Jan 20 06:56:05 2026 +0000
    [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci

commit 0f2c81a
Author: Jan Kwakkel
Date: Tue Jan 20 07:54:25 2026 +0100
    Update meta_agent.py

commit 1820d89
Author: Jan Kwakkel
Date: Tue Jan 20 07:53:24 2026 +0100
    Update meta_agent.py

github-actions · 2026-01-20T08:37:55Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🟢 -13.9% [-14.4%, -13.4%]	🔵 -1.5% [-1.8%, -1.3%]
BoltzmannWealth	large	🔵 +0.6% [-4.6%, +6.9%]	🟢 -7.3% [-9.6%, -4.7%]
Schelling	small	🟢 -4.7% [-5.1%, -4.2%]	🔵 -1.2% [-1.4%, -0.9%]
Schelling	large	🔵 +3.0% [-0.2%, +6.6%]	🟢 -8.2% [-9.9%, -6.4%]
WolfSheep	small	🔵 -1.6% [-3.1%, +0.2%]	🔵 +0.7% [+0.5%, +1.0%]
WolfSheep	large	🔴 +27.2% [+14.0%, +41.3%]	🔵 +0.7% [-1.3%, +2.4%]
BoidFlockers	small	🔵 -2.6% [-3.1%, -2.1%]	🔵 +0.2% [+0.0%, +0.4%]
BoidFlockers	large	🟢 -5.3% [-6.1%, -4.3%]	🔵 -0.1% [-0.3%, +0.1%]

codebreaker32 · 2026-01-20T08:45:29Z

Hi @quaquel

I was also working on a fix for the example error. Modifying register_agent as follows made all the tests pass:

# In model.py

def register_agent(self, agent: Agent) -> None:
    # Check if the agent already has a valid ID
    if agent.unique_id is not None:
        # It's already registered! Don't touch the ID.
        # Just ensure it's in the internal list if needed.
        self._agents[agent] = None 
        return

    # Only generate a NEW ID if it completely lacks one
    agent.unique_id = next(self.agent_id_counter)
    self._agents[agent] = None

The reason I can think of (I might be wrong) is this:

# meta_agent.py
def add_constituting_agents(self, new_agents: set[Agent]):
    for agent in new_agents:
        self._constituting_set.add(agent)
        agent.meta_agent = self
        self.model.register_agent(agent)  # <--- Culprit

Suppose Agent A's unique_id, 5, is used as a key to store data in various dictionaries. Suddenly, the agent's unique_id attribute is overwritten with 105. The Python dictionaries are not corrupted, but they are now out of sync: they still hold the data under the old key (5), while the code tries to retrieve it using the new key (105). This mismatch leads to a KeyError or to logic errors, because the system looks for an ID that isn't there.
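The mismatch can be reproduced in a few lines. This is an illustrative sketch only: the `Agent` class and the `wealth_by_id` dictionary are hypothetical, and the values 5 and 105 are the ones used in the explanation above.

```python
class Agent:
    """Hypothetical stand-in; only the unique_id attribute matters here."""

    def __init__(self, unique_id):
        self.unique_id = unique_id


wealth_by_id = {}
agent = Agent(unique_id=5)
wealth_by_id[agent.unique_id] = 10  # data stored under the original ID

# A second registration overwrites the ID after the data was stored.
agent.unique_id = 105

stale_key_present = 5 in wealth_by_id
lookup_fails = agent.unique_id not in wealth_by_id

try:
    wealth_by_id[agent.unique_id]
    raised_key_error = False
except KeyError:
    raised_key_error = True
```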

quaquel · 2026-01-20T09:54:21Z

@codebreaker32, yes, this is indeed the source of the problem. But in my view, agents should never call register_agent twice. In fact, a Mesa user should never have to call it if they use super properly in their custom agents. So, I see this as a bug in create_meta_agents, not in register_agent (See #3183 for my proposed fix.)
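As a rough illustration of that intent, the sketch below uses simplified stand-in classes (not Mesa's real `Model` and `Agent`) and a hypothetical `Trader` subclass. Registration happens once, inside the base class `__init__`, and a redundant explicit call to `register_agent` reassigns the ID.

```python
class Model:
    """Hypothetical stand-in for a Mesa model."""

    def __init__(self):
        self._agents = {}
        self.agent_id_counter = 1

    def register_agent(self, agent):
        agent.unique_id = self.agent_id_counter
        self.agent_id_counter += 1
        self._agents[agent] = None


class Agent:
    def __init__(self, model):
        self.model = model
        model.register_agent(self)  # the only place registration should happen


class Trader(Agent):
    def __init__(self, model, wealth):
        super().__init__(model)  # registers the agent and assigns unique_id
        self.wealth = wealth


model = Model()
trader = Trader(model, wealth=10)
id_after_init = trader.unique_id

# A redundant explicit call, comparable to the meta-agent path discussed above.
model.register_agent(trader)
id_after_second_call = trader.unique_id
```

With this design, user code never needs to call `register_agent` itself as long as custom agents call `super().__init__`.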

commit e5b3a09
Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Tue Jan 20 08:15:05 2026 +0000
    [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci

commit 228b8b5
Author: Jan Kwakkel
Date: Tue Jan 20 09:14:54 2026 +0100
    Update meta_agent.py

github-actions · 2026-01-20T10:20:59Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔵 +3.0% [+1.9%, +4.1%]	🔵 -1.0% [-1.1%, -0.8%]
BoltzmannWealth	large	🔵 +6.1% [+0.3%, +13.3%]	🔴 +25.5% [+20.7%, +30.4%]
Schelling	small	🔵 -2.1% [-2.6%, -1.7%]	🔵 +0.9% [+0.6%, +1.2%]
Schelling	large	🔵 +3.3% [+0.5%, +6.5%]	🔵 +3.8% [+0.9%, +7.1%]
WolfSheep	small	🔵 +2.8% [+1.1%, +4.7%]	🔵 +0.4% [+0.0%, +0.8%]
WolfSheep	large	🔴 +32.7% [+18.1%, +48.5%]	🔵 +2.6% [+0.8%, +4.5%]
BoidFlockers	small	🟢 -7.6% [-8.0%, -7.1%]	🔵 -1.2% [-1.5%, -0.9%]
BoidFlockers	large	🟢 -9.3% [-9.7%, -8.9%]	🔵 -1.2% [-1.4%, -0.9%]

quaquel · 2026-01-20T10:23:07Z

The benchmarks remain strange. For example, Boltzmann Wealth adds no agents during the run, so nothing in this PR should change its run time.

github-actions · 2026-01-20T10:29:51Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🟢 -3.7% [-4.0%, -3.3%]	🔵 -1.1% [-1.2%, -0.9%]
BoltzmannWealth	large	🔵 +4.2% [-1.5%, +10.8%]	🔵 +6.1% [+2.9%, +9.5%]
Schelling	small	🟢 -4.1% [-4.4%, -3.8%]	🔵 -1.3% [-1.4%, -1.2%]
Schelling	large	🔵 +0.1% [-1.9%, +2.1%]	🔵 -1.6% [-2.7%, -0.4%]
WolfSheep	small	🔵 -0.6% [-1.6%, +0.5%]	🔵 +0.9% [+0.6%, +1.2%]
WolfSheep	large	🔴 +20.5% [+10.0%, +31.6%]	🔵 +1.1% [+0.2%, +2.0%]
BoidFlockers	small	🟢 -5.5% [-5.9%, -5.1%]	🔵 -0.4% [-0.6%, -0.2%]
BoidFlockers	large	🟢 -6.8% [-7.2%, -6.4%]	🔵 +0.1% [-0.3%, +0.5%]

EwoutH

I'm going to pre-approve this to unblock you.

Please document the problem, the solution, the rejected alternatives and the reasoning behind all those well. If this ever bites us back we can trace this back.

quaquel · 2026-01-20T12:09:08Z

> Please document the problem, the solution, the rejected alternatives and the reasoning behind all those well. If this ever bites us back we can trace this back.

It was included in the start post. In short, moving the assignment of agent.unique_id into the model ensures it remains unique within a model instance, while being faster than the alternatives I tested.

The main issue encountered next is that, ideally, model.register_agent and model.deregister_agent should never be called by the user directly, but always via super on the agent. It might be good to document this (and hard refs in general) for Mesa 4.

A memory leak was discovered in Mesa where model instances could never be garbage collected after agents were created. The root cause was the `Agent._ids` class attribute, a `defaultdict` that stored references to model instances to ensure `unique_id` values were unique on a per-model basis. Because `_ids` was a class-level attribute that persisted for the lifetime of the Python process, the dictionary kept a hard reference to every model instance used as a key, preventing the garbage collector from cleaning up the model and all its associated objects (agents, grids, etc.) even after the model went out of scope or was explicitly deleted.

This bug had significant practical consequences for Mesa users, particularly those running multiple simulations or batch experiments. Each time a model was instantiated and run within a function, the model objects would accumulate in RAM rather than being cleaned up when the function exited. Running many model instances, which is common in parameter sweeps, sensitivity analyses, or optimization workflows, would therefore cause unbounded memory growth, eventually exhausting available RAM. The issue was especially problematic because it was invisible to users: simply letting a model go out of scope or calling `del model` appeared to work but silently retained all the memory, and even explicitly removing agents with `model.remove_all_agents()` only partially addressed the problem, depending on the space types used.

The fix moved the `unique_id` assignment logic from the `Agent` class into the `Model.register_agent()` method, eliminating the problematic class-level `_ids` defaultdict entirely. Instead of tracking IDs across all model instances in a shared dictionary, each model now maintains its own `agent_id_counter` instance attribute that starts at 1 and increments with each registered agent. This approach ensures that `unique_id` remains unique within each model instance while allowing the garbage collector to properly clean up model objects when they go out of scope, since there are no longer any persistent class-level references to model instances. The fix also replaced `itertools.count` with simple integer incrementation, which avoids upcoming pickle compatibility issues in Python 3.14.

quaquel and others added 4 commits January 19, 2026 21:15

test all examples for memory leak

d4e8fc5

Update test_examples.py

708d2ea

updates

f02ab8c

[pre-commit.ci] auto fixes from pre-commit.com hooks

2361c63

for more information, see https://pre-commit.ci

EwoutH marked this pull request as draft January 19, 2026 21:13

quaquel added the bug (Release notes label) and trigger-benchmarks (Special label that triggers the benchmarking CI) labels Jan 19, 2026

Update compare_timings.py

d4456cb

quaquel added and removed the trigger-benchmarks label Jan 19, 2026

mesa deleted a comment from github-actions bot Jan 19, 2026

quaquel added and removed the trigger-benchmarks label Jan 19, 2026

quaquel added 2 commits January 20, 2026 07:57

Update agent.py

465bc2a

quaquel added and removed the example label (Changes the examples or adds to them.) Jan 20, 2026

quaquel and others added 5 commits January 20, 2026 10:57

[pre-commit.ci] auto fixes from pre-commit.com hooks

ac56ae0


replace count with just increments

b6be1a3

[pre-commit.ci] auto fixes from pre-commit.com hooks

42522ef


properly start from 1

2da1945

quaquel removed the trigger-benchmarks label Jan 20, 2026

quaquel added the trigger-benchmarks label Jan 20, 2026

quaquel marked this pull request as ready for review January 20, 2026 10:23

Merge branch 'main' into memory_leak

4e51a4a

EwoutH approved these changes Jan 20, 2026


quaquel merged commit b6e96d1 into mesa:main Jan 20, 2026
14 checks passed

quaquel deleted the memory_leak branch January 21, 2026 07:04

quaquel merged 13 commits into mesa:main from quaquel:memory_leak
