Enforce read-only safety for 'empty' layer#3080
Enforce read-only safety for 'empty' layer#3080codebreaker32 wants to merge 6 commits intomesa:mainfrom
Conversation
|
Performance benchmarks:
|
|
Please review @quaquel |
|
Thanks for this PR. I hope to take a closer look somewhere this week. However, I also have a few ideas of my own that I want to test, and since |
|
As indicated, empty is performance critical. So, I did a quick comparison of an attribute, a property, and a descriptor. Setting and getting a normal attribute is evidently the fastest. Setting a property is an order of magnitude slower. Setting via a descriptor is double the time of a property. Getting a property is about a factor of 5 more expensive, while via descriptor is an order of magnitude slower compared to a normal attribute. So what does this mean? To make So, what about turning empty simply into a numpy array instead of using the property layer indirection? This is, for now, a discussion question, so please refrain from modifying this PR just yet. |
|
I agree that descriptors introduce overhead compared to raw attributes. However, I believe replacing The "Bottleneck" is the Descriptor, not the Layer Object
Agree. whether
The difference is a single Python attribute lookup ( The Cost of Dropping
A raw array would also need manual state management which |
|
@quaquel def select_random_empty_cell(self) -> T:
# 1. Fast Heuristic
if self._try_random:
for _ in range(50):
# ... tries random cells ...
# 2. Slow Fallback (Iterate EVERYTHING)
# This calls DiscreteSpace.select_random_empty_cell
# which runs: return self.random.choice(list(self.empties))
# 'self.empties' runs: return self.all_cells.select(lambda cell: cell.is_empty)
return super().select_random_empty_cell()When the grid is full (e.g., >90% density), the heuristic fails. The fallback then forces Python to check We can improve it with numpy def select_random_empty_cell(self) -> T:
if self._try_random:
for _ in range(50):
cell = self.all_cells.select_random_cell()
if cell.is_empty:
return cell
# 2. Vectorized Fallback
# Instead of iterating objects, ask NumPy for the coordinates of all empty cells.
# self.select_cells is provided by HasPropertyLayers
empty_coords = self.select_cells(only_empty=True) #
# 3. Pick one coordinate and retrieve the cell
random_coord = self.random.choice(empty_coords)
return self._cells[random_coord]I believe |
|
My longer-term thinking is what you call the raw array path. That is, getting rid of the I also agree that there is a lot more that could be done with However, Since raising the idea of read-only property layers, I have also been wondering about other examples where one might need the protection of read-only access. For |
|
Thanks for sharing! I completely agree that flattening the I am very curious about your point on "custom properties" halving the overhead compared to descriptors. By this, do you mean generating specific I am curious to see how it turns out and I will surely follow the discussions :) |
I ran tests with the following code. So I am comparing class MyClass_A:
def __init__(self):
self.a =1
class MyClass_B:
@property
def a(self):
return self._a
@a.setter
def a(self, value):
self._a = value
def __init__(self):
self._a =1
class Property:
def __set_name__(self, owner, name):
self.private_name = f"_{name}"
def __set__(self, obj, value):
setattr(obj, self.private_name, value)
def __get__(self, obj, objtype=None):
return getattr(obj, self.private_name)
class MyClass_C:
a = Property()
def __init__(self):
self.a =1 |
|
@quaquel Thanks for sharing your long-term vision. After careful reading( It confirms that the overhead comes from the Python-level function calls ( Instead of attaching a It also led me to thought: If we go down this path, we don't need the If the Cell communicates directly with the array via the property closure,
I plan to spend more time researching this architecture and will try to come up with a faster solution with flattened hierarchy. If you feel I’m heading in the wrong direction or there’s a better approach, I’d really appreciate your guidance :) |
|
Feel free to submit any pathfinding style draft PR so I can review it as well. For this, don't worry too much about backward compatibility. Its highly likely that we'll open a Mesa 4 branch in the coming days to weeks anyway. |
|
@codebreaker32, in case you are curious about your idea of a property factory pattern, please check #3113, where I am doing some pathfinding and have included this. |
|
Thanks again for this PR! Some of the ideas have already been implemented. However, the current plan is to first make property layers work for all discrete spaces by explicitly separating their x,y position from their logical index. Next, we can revisit this PR. All this will be part of the mesa 4 development. So I am closing this PR but will revisit the idea hopefully soon. |
Summary
This PR implements a robust mechanism for Read-Only Property Layers, primarily to secure the
cell.emptystate. It utilizes a Polymorphism pattern to ensure that thePropertyLayerremains perfectly synchronized withCellagent lists, while maintaining strict decoupling between theCellclass and theGrid.Motive
See #3072
Implementation
mesa/discrete_space/property_layer.py:ReadOnlyPropertyDescriptor: A specific descriptor class that raisesAttributeErroron__set__.HasPropertyLayers: Added a closure(_make_setter) that captures the specificlayerinstance, allowing theCellto write to the NumPy array without holding a reference to it.mesa/discrete_space/cell.py:self.empty = val) with a protected method call (self._set_empty(val))._set_emptythat updates a local variable, ensuring theCellworks correctly in non-grid contexts (e.g.,Network).mesa/discrete_space/grid.py:emptylayer withread_only=True.Attempting to manually modify the state now raises an
AttributeErrorAdditional Notes
Networkinstances (which lack property layers) fallback gracefully to the default local storage.