Optimise select_random_empty_cell() in grid.py#3087
Conversation
mesa/discrete_space/grid.py
Outdated
| empty_coords = self.select_cells(only_empty=True) | ||
| random_coord = self.random.choice(empty_coords) | ||
| return self._cells[random_coord] |
There was a problem hiding this comment.
would it not be a lot faster to do something like
coords = np.argwhere(self.empty.data)
random_coord = self.random.choice(empty_coords)There was a problem hiding this comment.
You are right.
There was a problem hiding this comment.
but if that works, I am wondering whether its not just allways faster to do this instead of the random trying.
I know that there has been some research on this before (see. mesa.space for details). I am just not sure how those findings translate to the new discrete_space.
There was a problem hiding this comment.
I looked into the existing mesa.space implementation(which led me to #1052), and it actually uses a specific cutoff formula (cutoff_empties) to switch between random sampling and list generation. However, I believe numpy definitely shifts the cutoff point so we might reduce the N
There was a problem hiding this comment.
Yeah, my thinking exactly. I suppose some of the testing will need to be rerun to determine the cutoff in this case.
This comment was marked as outdated.
This comment was marked as outdated.
|
So this crashed: Basically, the numpy array for the coordinates needs to be cast to a tuple I guess. Also, it highlights the need for test coverage. update: I fixed it for you after confirming that it was indeed working. Main thing now is to add test coverage. |
|
Performance benchmarks:
|
|
I will add tests for it today only. |
cd8d18c to
2b4581e
Compare
ed76b23 to
5d763d7
Compare
|
Hi @quaquel def ufunc_requires_additional_input(ufunc): # noqa: D103
# NumPy ufuncs have a 'nargs' attribute indicating the number of input arguments
# For binary ufuncs (like np.add), nargs is 2
return ufunc.nargs > 1Isn't it misleading then? |
I guess so. I just copied this from |
|
|
What did you change exactly? I guess the problem triggers on the comment, not the actual code? |
Both i.e method and comments. Yes the problem was triggered by comment Update: Replaced |
4e207e8 to
5d763d7
Compare
cf9ed0f to
bc4cee8
Compare
|
I think its ready to be merged now. |
|
Performance benchmarks:
|
Summary
This PR optimizes the fallback mechanism in
Grid.select_random_empty_cellto use vectorized NumPy operations, replacing the slow O(N) Python iteration. It also fixes a critical bug inHasPropertyLayers.select_cellspreventing the correct masking of empty cells.Motive
1. Performance on Dense Grids:
The
select_random_empty_cellmethod uses a fast random heuristic (O(1)). However, when the grid is nearly full (dense), this heuristic fails, and the code previously fell back to iterating over all cells:return self.random.choice(list(self.empties))This triggered the descriptor protocol for every single cell, causing a massive performance penalty. By switching to
self.select_cells(only_empty=True), we bypass theCellobjects entirely and scan the underlying NumPy array directly.2. Bug Fix:
The
select_cells(only_empty=True)method inproperty_layer.pywas previously broken. It attempted to apply a logical mask to thePropertyLayerobject instead of its underlying.dataarray, causing operations to fail or return incorrect results.( I missed it in #3074 )Implementation
mesa/discrete_space/grid.py:select_random_empty_cellto callself.select_cells(only_empty=True)in the fallback block.mesa/discrete_space/property_layer.py:select_cells: Changedself._mesa_property_layers["empty"]toself._mesa_property_layers["empty"].data.Usage Examples
There is no change to the public API.
Additional Notes
PropertyLayerarchitecture being present.