Core: Speed up CollectionState sweeping#3688
Core: Speed up CollectionState sweeping#3688Mysteryem wants to merge 52 commits intoArchipelagoMW:mainfrom
Conversation
The loop conditions of `while reachable_events:` could result in looping once more than necessary when there were reachable events, but none of the items at the reachable events had any logical impact.
The filters may be incorrect, see ArchipelagoMW#3680.
As mentioned in comments, `self.multiworld.player_ids` does not include the player IDs used for item links, but the initial `players_to_check` was using `self.multiworld.player_ids`. This has been updated to use `self.multiworld.regions.location_cache.keys()`.
alwaysintreble
left a comment
There was a problem hiding this comment.
i think my biggest comment here is that i'm a bit confused why this is using sets. i don't see anything in particular that seems like it needs to be a set (instead of a list). sets are more expensive to create and slower to iterate over. the only bit in the old code as far as i can see that justifies using a set is locations -= reachable_events. it also used to not be guaranteed for locations to have unique names, but that's enforced nowadays so we don't have to worry about duplicates
BaseClasses.py
Outdated
| # links, so iterate the keys of the location_cache instead. | ||
| events_per_player = {player: set() for player in self.multiworld.regions.location_cache.keys()} | ||
| for location in locations: | ||
| if event_filter(location): |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
The original implementation filters the passed in locations, so I have not modified it. If the filter should be removed, that may be more suitable to a separate PR.
BaseClasses.py
Outdated
| events_per_player[location.player].add(location) | ||
|
|
||
| # Remove empty sets to reduce the number of iterations required to iterate through `events_per_player.items()`. | ||
| for player, events in list(events_per_player.items()): |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
It's to do with the length of the dictionary itself rather than the sets because the while loop iterates events_per_player.items() many times, so reducing the length of events_per_player should make it faster to iterate.
It should be possible to have a list of locations, where inaccessible locations are iterated into a new list that replaces the old list after the iteration. Another alternative could be to iterate the list in reverse along with a reversed enumeration, that way, reachable locations could be popped by index. I'll investigate some list-based options. |
Reduces generation time by about 4% for a multiworld with 1 of each template yaml. This can probably be improved further, but is a good starting point.
This should allow additional items to be collected in each sweep. Reduces generation time by about 5% for a multiworld with 1 of each template yaml.
BaseClasses.py
Outdated
| self.collect(event.item, True, event) | ||
| item = event.item | ||
| assert isinstance(item, Item), "tried to collect Event with no Item" | ||
| changed = self.collect(item, True, event) |
There was a problem hiding this comment.
It turns out that self.collect(item, True, event) always returns True and always adds the item into self.prog_items (ignoring the possibility of an overridden World.collect() that can return True without adding the item into self.prog_items).
However, the only non-advancement items that can be placed at this point are from a world's pre-fill, so checking each collected item for if item.advancement: to determine if item.player's locations should be included in the next iteration might actually be slower than just assuming that every item collected in the sweep is advancement.
There was a problem hiding this comment.
that's because you're filtering to location.advancement. if an item is progression it should alter state and return True (i say should because the PR to enforce this isn't merged). assert changed would probably be good to have
There was a problem hiding this comment.
self.collect(item=item, event=True, location=event) always returns True because it has
if not changed and event:
self.prog_items[item.player][item.name] += 1
changed = True
[...]
return changedLocations with non-advancement items are only sneaking into the locations for the key_only=True case, which appears to be a symptom of #3680 and I think makes it clear that it is a bug.
There was a problem hiding this comment.
right, but i'm of the opinion we should remove the event arg from state.collect. i don't think there's any reason we should want/allow items to be added to state if they aren't progression, and for custom state modification we already have the overridable world.collect
Region accessibility is cached, so is fast, and the CollectionState should not become stale when only checking accessibility, so the cache will not need to be rebuilt. Reduces generation time by about 12% for a multiworld with 1 of each template yaml.
I think it's clear now that the change to location filtering semantics was a bug because it was allowing the `key_only=True` case to have locations which did not contain an advancement item.
Quite often, `sweep_for_events()` is called with locations that all belong to the same player, so creating a full dict with empty lists for every player is wasteful. The temporary dictionary is now also deleted after use to prevent using it by accident.
Adds `World.player_dependencies`, a `set[int]` of player numbers that
the world's logic depends on. Automatically set to `{self.player}` by
default.
It is quite likely there are better ways to do this.
One alternative would be to assume all worlds are dependent on their own
world's items/locations/entrances/regions and then
`World.player_dependencies` could become an `Optional[set[int]]` that
gets set to only player numbers other than its own.
I tried precalculating a dict mapping from player to the worlds
dependent on that player in `Main.main()`, but worlds can call
`CollectionState.sweep_for_events()` very early, so this seems like it
may be difficult to do. Out of the supported games, Lingo calls
`sweep_for_events()` the earliest, in its `create_regions()`.
|
I think there's some more places where performance can be improved, but some good performance improvements have been made since I learned more about the cache for region accessibility, and I have updated the table. This has also increased the scope of the PR to a more general performance improvement, hopefully it will not increase much further because I should probably start thinking about splitting up the PR if it does, since I don't want to make it too difficult for whoever has to review this once its ready. Given the growing size of the Replacing the sets with lists was a good call and reduced generation time by about 4%. |
The performance difference was negligible and this makes the code more readable.
Use a defaultdict(list). Check for `in worlds` rather than `in groups`. Use list.append for item link group IDs in-case a world could decide to depend on an item link location. Remove check for dependencies not being None. This was a holdover from an in-development version that had player_dependencies as None by default, where it would be assumed that all worlds would depend on their own world for logic.
collection_state.collect(event=True) always returns True anyway.
…checks the wrong state With no starting state, the state checked should be an empty state.
…tself Reduces generation duration with --skip_output and no progression balancing by about 2%.
If a player was skipped on the last iteration due to the sole dependent optimization, then can_beat_game could skip checking if that player can now beat their game and then return False because it was the last iteration.
The name of the variable is no longer true because the 'sole dependents' optimization allows for skipping worlds that were logically affected by the last sweep, but were not logically affected after their locations were checked.
Caching the result for each player could be done, but worlds with logic depending on other worlds are going to be so rare that it's probably not worth the extra code.
|
The decrease in generation duration for the 2-of-each-yaml case seems a bit high, so I want to try comparing some other seeds, in-case there was something about that seed that the original code particularly struggled with. Maybe I'll profile generating with 4 of each template yaml again too. Aside from that, I think the implementation is all here now. If there's no fast movement on this PR, I'll probably check in in a week or two and do some more self-review once I've lost familiarity with some of the changes. |
…goMW#3723) Collecting an item with prevent_sweep=True (previously event=True), no longer always returns True, so a player should only be considered to have received advancement when CollectionState.collect() returns True.
|
I'm closing this PR because I'm breaking it up into smaller parts. |
What is this fixing or adding? Increases the performance of sweeping state in CollectionState.sweep_for_events() and MultiWorld.can_beat_game, without changing the generated output (at least for supported games with their template yamls). CollectionState.sweep_for_events() takes up a significant amount of generation time, mostly from filling progression items, and the majority when not generating a playthrough, so improvements to its performance can have a noticeable effect on generation duration. Similarly, MultiWorld.can_beat_game() takes up a significant amount of playthrough calculation time from culling spheres. This patch reduces the duration of both and refactors them to use the same sweeping code. This patch has been split from ArchipelagoMW#3688 to focus only on the sweep itself, and has been simplified to not require any additions to the existing world API. Make better use of region accessibility caching Region accessibility is cached per-player and the cache for that player becomes stale whenever an item is collected or removed for that player. The original sweeping implementations make use of this by only collecting items once all reachable locations have been found for all players. This patch changes to the sweep to find reachable locations per-player, collect the items from those locations and then move onto the next player. This keeps region accessibility cached as before, but has the advantage that if a player's reachable locations contain an item that gives a later player access to more locations, it will take fewer iterations for all reachable locations to be collected from. Skip checking location accessibility for locations belonging to a player which did not receive any advancement in the previous iteration If no advancement items belonging to a player were collected in the previous sweep iteration (after the point at which their reachable locations were found), then it is assumed that player should have zero newly reachable locations, so checking for their reachable locations can be skipped. While almost all worlds only logically depend on only their own items/locations etc., it is possible for a world to logically depend on a part of another world. To account for this, once the sweep runs out of accessible locations, the locations of all players are checked to confirm that the sweep is indeed finished. My previous idea to account for worlds logically depending on other worlds was to require all worlds to specify which worlds they logically depended on. Performing extra sweeps to confirm will be slower than this previous idea, but it is simpler and does not require any changes to the existing world API. How was this tested? Fixed seeds were generated, with and without this patch, with different sets of template yamls in the Players directory, using games known to have deterministic generation. The output from each set of template yamls was seen to match. Playthrough calculation is non-deterministic, so was not considered here. Seeds using item links have also been tested. To test the case of worlds depending on other worlds, I modified the A Hat in Time world to find all other AHiT worlds in the multiworld and then added additional rules to a few locations to require specific progression items belonging to one of those other AHiT worlds. I added some extra print statements to make sure that it would sometimes check if the sweep was finished (by checking every player), but then find newly reachable locations and then continue sweeping. I also tried adding additional rules to a few entrances to require specific progression items belonging to a different world, but, unsurprisingly, this does not work well because entrance accessibility is cached per-player and only staled when the world the entrance belongs to collects an item, so the required item belonging to a different world could be collected, but without the entrance accessibility ever being staled to result in checking if the entrance is now accessible. This would often result in an error at the end of playthrough calculation because the end of playthrough calculation builds up locations by sphere, but MultiWorld.can_beat_game(), used to cull the spheres, no longer sweeps by spheres, so the two methods of sweeping could stale the cached entrance accessibility at different times, leading to the per-sphere sweep at the end of playthrough calculation to sometimes not be able to reach every required location. Values in the table are rounded to 1 decimal place. Note that generation times with every game included do have a small amount of variance in generation time because maybe 1 or 2 games have non-deterministic generation issues. These times were recorded based on the main branch at f5218fa (before Kingdom Hearts was merged). As the logic in the main branch changes and is merged into this PR, the same seed may take longer or shorter to generate, so these times may not match the current state of main and this PR. Generated with python -O .\Generate.py --skip_output --seed 1 with a meta.yaml to disable progression balancing. Yamls Before Patch/s After Patch/s Decrease in duration 1 of each template YAML (67), averaged over 5 individual generations 122.5 74.9 38.9% 2 of each template YAML (134), averaged over 5 individual generations* 371.6 192.0 48.3% *generated with --seed 2 because --seed 1 produced two sets of very different results, indicating a non-deterministic generation issue that caused two sets of very different seeds to be generated Generated with python -O .\Generate.py --seed 1 with a meta.yaml to disable progression balancing, and spoiler: 2 in host.yaml. Yamls Before Patch/s Only 'Make better use of region accessibility caching' After Patch/s Decrease in duration before -> after 1 of each non-rom template YAML (53), averaged over 5 individual generations 504.5 436.6 376.4 25.4% Above, but playthrough calculation only 446.9 390.4 335.4 25.0% Above, but without playthrough calculation 57.6 46.2 41.0 28.7%
Increases the performance of sweeping state in `CollectionState.sweep_for_events()` and `MultiWorld.can_beat_game`, without changing the generated output (at least for supported games with their template yamls). `CollectionState.sweep_for_events()` takes up a significant amount of generation time, mostly from filling progression items, and the majority when not generating a playthrough, so improvements to its performance can have a noticeable effect on generation duration. Similarly, `MultiWorld.can_beat_game()` takes up a significant amount of playthrough calculation time from culling spheres. This patch reduces the duration of both and refactors them to use the same sweeping code. This patch has been split from ArchipelagoMW#3688 to focus only on the sweep itself, and has been simplified to not require any additions to the existing world API. ### Make better use of region accessibility caching Region accessibility is cached per-player and the cache for that player becomes stale whenever an item is collected or removed for that player. The original sweeping implementations make use of this by only collecting items once all reachable locations have been found for all players. This patch changes to the sweep to find reachable locations per-player, collect the items from those locations and then move onto the next player. This keeps region accessibility cached as before, but has the advantage that if a player's reachable locations contain an item that gives a later player access to more locations, it will take fewer iterations for all reachable locations to be collected from. ### Skip checking location accessibility for locations belonging to a player which did not receive any advancement in the previous iteration If no advancement items belonging to a player were collected in the previous sweep iteration (after the point at which their reachable locations were found), then it is assumed that player should have zero newly reachable locations, so checking for their reachable locations can be skipped. While almost all worlds only logically depend on only their own items/locations etc., it is possible for a world to logically depend on a part of another world. To account for this, once the sweep runs out of accessible locations, the locations of all players are checked to confirm that the sweep is indeed finished. My previous idea to account for worlds logically depending on other worlds was to require all worlds to specify which worlds they logically depended on. Performing extra sweeps to confirm will be slower than this previous idea, but it is simpler and does not require any changes to the existing world API.
Increases the performance of sweeping state in `CollectionState.sweep_for_events()` and `MultiWorld.can_beat_game`, without changing the generated output (at least for supported games with their template yamls). `CollectionState.sweep_for_events()` takes up a significant amount of generation time, mostly from filling progression items, and the majority when not generating a playthrough, so improvements to its performance can have a noticeable effect on generation duration. Similarly, `MultiWorld.can_beat_game()` takes up a significant amount of playthrough calculation time from culling spheres. This patch reduces the duration of both and refactors them to use the same sweeping code. This patch has been split from ArchipelagoMW#3688 to focus only on the sweep itself, and has been simplified to not require any additions to the existing world API. ### Make better use of region accessibility caching Region accessibility is cached per-player and the cache for that player becomes stale whenever an item is collected or removed for that player. The original sweeping implementations make use of this by only collecting items once all reachable locations have been found for all players. This patch changes to the sweep to find reachable locations per-player, collect the items from those locations and then move onto the next player. This keeps region accessibility cached as before, but has the advantage that if a player's reachable locations contain an item that gives a later player access to more locations, it will take fewer iterations for all reachable locations to be collected from. ### Skip checking location accessibility for locations belonging to a player which did not receive any advancement in the previous iteration If no advancement items belonging to a player were collected in the previous sweep iteration (after the point at which their reachable locations were found), then it is assumed that player should have zero newly reachable locations, so checking for their reachable locations can be skipped. While almost all worlds only logically depend on only their own items/locations etc., it is possible for a world to logically depend on a part of another world. To account for this, once the sweep runs out of accessible locations, the locations of all players are checked to confirm that the sweep is indeed finished. My previous idea to account for worlds logically depending on other worlds was to require all worlds to specify which worlds they logically depended on. Performing extra sweeps to confirm will be slower than this previous idea, but it is simpler and does not require any changes to the existing world API.
Increases the performance of sweeping state in `CollectionState.sweep_for_events()` and `MultiWorld.can_beat_game`, without changing the generated output (at least for supported games with their template yamls). `CollectionState.sweep_for_events()` takes up a significant amount of generation time, mostly from filling progression items, and the majority when not generating a playthrough, so improvements to its performance can have a noticeable effect on generation duration. Similarly, `MultiWorld.can_beat_game()` takes up a significant amount of playthrough calculation time from culling spheres. This patch reduces the duration of both and refactors them to use the same sweeping code. This patch has been split from ArchipelagoMW#3688 to focus only on the sweep itself, and has been simplified to not require any additions to the existing world API. Region accessibility is cached per-player and the cache for that player becomes stale whenever an item is collected or removed for that player. The original sweeping implementations make use of this by only collecting items once all reachable locations have been found for all players. This patch changes to the sweep to find reachable locations per-player, collect the items from those locations and then move onto the next player. This keeps region accessibility cached as before, but has the advantage that if a player's reachable locations contain an item that gives a later player access to more locations, it will take fewer iterations for all reachable locations to be collected from. player which did not receive any advancement in the previous iteration If no advancement items belonging to a player were collected in the previous sweep iteration (after the point at which their reachable locations were found), then it is assumed that player should have zero newly reachable locations, so checking for their reachable locations can be skipped. While almost all worlds only logically depend on only their own items/locations etc., it is possible for a world to logically depend on a part of another world. To account for this, once the sweep runs out of accessible locations, the locations of all players are checked to confirm that the sweep is indeed finished. My previous idea to account for worlds logically depending on other worlds was to require all worlds to specify which worlds they logically depended on. Performing extra sweeps to confirm will be slower than this previous idea, but it is simpler and does not require any changes to the existing world API.
Increases the performance of sweeping state in `CollectionState.sweep_for_events()` and `MultiWorld.can_beat_game`, without changing the generated output (at least for supported games with their template yamls). `CollectionState.sweep_for_events()` takes up a significant amount of generation time, mostly from filling progression items, and the majority when not generating a playthrough, so improvements to its performance can have a noticeable effect on generation duration. Similarly, `MultiWorld.can_beat_game()` takes up a significant amount of playthrough calculation time from culling spheres. This patch reduces the duration of both and refactors them to use the same sweeping code. This patch has been split from ArchipelagoMW#3688 to focus only on the sweep itself, and has been simplified to not require any additions to the existing world API. ### Make better use of region accessibility caching Region accessibility is cached per-player and the cache for that player becomes stale whenever an item is collected or removed for that player. The original sweeping implementations make use of this by only collecting items once all reachable locations have been found for all players. This patch changes to the sweep to find reachable locations per-player, collect the items from those locations and then move onto the next player. This keeps region accessibility cached as before, but has the advantage that if a player's reachable locations contain an item that gives a later player access to more locations, it will take fewer iterations for all reachable locations to be collected from. ### Skip checking location accessibility for locations belonging to a player which did not receive any advancement in the previous iteration If no advancement items belonging to a player were collected in the previous sweep iteration (after the point at which their reachable locations were found), then it is assumed that player should have zero newly reachable locations, so checking for their reachable locations can be skipped. While almost all worlds only logically depend on only their own items/locations etc., it is possible for a world to logically depend on a part of another world. To account for this, once the sweep runs out of accessible locations, the locations of all players are checked to confirm that the sweep is indeed finished. My previous idea to account for worlds logically depending on other worlds was to require all worlds to specify which worlds they logically depended on. Performing extra sweeps to confirm will be slower than this previous idea, but it is simpler and does not require any changes to the existing world API.
Increases the performance of sweeping state in `CollectionState.sweep_for_events()` and `MultiWorld.can_beat_game`, without changing the generated output (at least for supported games with their template yamls). `CollectionState.sweep_for_events()` takes up a significant amount of generation time, mostly from filling progression items, and the majority when not generating a playthrough, so improvements to its performance can have a noticeable effect on generation duration. Similarly, `MultiWorld.can_beat_game()` takes up a significant amount of playthrough calculation time from culling spheres. This patch reduces the duration of both and refactors them to use the same sweeping code. This patch has been split from ArchipelagoMW#3688 to focus only on the sweep itself, and has been simplified to not require any additions to the existing world API. ### Make better use of region accessibility caching Region accessibility is cached per-player and the cache for that player becomes stale whenever an item is collected or removed for that player. The original sweeping implementations make use of this by only collecting items once all reachable locations have been found for all players. This patch changes to the sweep to find reachable locations per-player, collect the items from those locations and then move onto the next player. This keeps region accessibility cached as before, but has the advantage that if a player's reachable locations contain an item that gives a later player access to more locations, it will take fewer iterations for all reachable locations to be collected from. ### Skip checking location accessibility for locations belonging to a player which did not receive any advancement in the previous iteration If no advancement items belonging to a player were collected in the previous sweep iteration (after the point at which their reachable locations were found), then it is assumed that player should have zero newly reachable locations, so checking for their reachable locations can be skipped. While almost all worlds only logically depend on only their own items/locations etc., it is possible for a world to logically depend on a part of another world. To account for this, once the sweep runs out of accessible locations, the locations of all players are checked to confirm that the sweep is indeed finished. My previous idea to account for worlds logically depending on other worlds was to require all worlds to specify which worlds they logically depended on. Performing extra sweeps to confirm will be slower than this previous idea, but it is simpler and does not require any changes to the existing world API.
What is this fixing or adding?
Increases the performance of sweeping state in
CollectionState.sweep_for_events()andMultiWorld.can_beat_game, without changing the generated output (at least for supported games with their template yamls).CollectionState.sweep_for_events()takes up a significant amount of generation time, mostly from filling progression items, and the majority when not generating a playthrough, so improvements to its performance can have a noticeable effect on generation duration.Similarly,
MultiWorld.can_beat_game()takes up a significant amount of playthrough calculation time from culling spheres.This patch reduces the duration of both by about 30-40% and refactors them to use the same sweeping code.
Skip checking location accessibility for locations belonging to a player which did not receive any advancement in the previous iteration
If a world did not receive any advancement from collected items in the previous sweep iteration, then it should have zero newly accessible locations, so checking for its accessible locations can be skipped.
This decreases generation time by not having to check accessibility of some players' locations in each sweep iteration.
A large amount of generation time is usually taken up by checking location accessibility, so reducing the number of locations that need to be checked can reduce generation time.
There is an issue, however, because it is possible for a world to gain accessible locations, while not receiving any advancement itself, if it has access rules that depend on a different world or a different world's items. Normally, a world only has access rules that depend on its own world/items, but some custom worlds could connect multiple worlds together using access rules.
To support worlds with logic that depend on other worlds, all World instances must now specify the player numbers of the World instances that their logic depends on. By default, each world is now set to depend on its itself. This is a breaking change for any worlds that already have logic that depends on other worlds.
Make better use of region accessibility caching
Region accessibility is cached per-player and the cache for that player becomes stale whenever an item is collected or removed for that player. The original sweeping implementations make use of this by only collecting items once all accessible locations have been found, but this can be improved upon.
Check accessible locations and collect from them per-player
Because the region accessibility caches are stored and staled per-player, all reachable locations can be found for a single player and then collected before repeating for the next player. This keeps region accessibility cached as before, but has the advantage that if a player's accessible locations contain an item that gives a later player access to more locations, it will take fewer iterations for all reachable locations to be collected.
As an additional optimization for the common case of a player's world only logically depending on itself and no other world's logically depending on it, if that player receives no advancement after finding their accessible locations, they are skipped in the next sweep iteration because they should not have any reachable locations at the start of the next sweep iteration.
Skip re-checking if a player can beat their game if they already could do so in a previous sweep
Specific to
MultiWorld.can_beat_game(), each sphere sweep iteration would re-check if every player could beat their games, this has been changed to only check players that were not already able to beat their games.The start of
MultiWorld.can_beat_game()has been rewritten to add the players than can already beat their games into a set and fixes #3742 as part of the rewrite.Recap of important changes/assumptions
With this patch:
World.player_dependenciesSet[int]that is set to{self.player}by default for all worlds. A section providing example usage of this has been added toworld api.md.How was this tested?
Fixed seeds were generated, with and without this patch, with different sets of template yamls in the Players directory, using games known to have deterministic generation. The output from each set of template yamls was seen to match. Playthrough calculation is non-deterministic, so was not considered here.
Seeds using item links have also been tested.
Values in the table are rounded to 1 decimal place.
Note that generation times with every game included do have a small amount of variance in generation time because maybe 1 or 2 games have non-deterministic generation issues.
These times were recorded based on the main branch at c66a860. As the logic in the main branch changes and is merged into this PR, the same seed may take longer or shorter to generate, so these times may not match the current state of main and this PR.
Generated with
python -O .\Generate.py --skip_output --seed 1with a meta.yaml to disable progression balancing.Generated with
python -O .\Generate.py --seed 1with a meta.yaml to disable progression balancing, andspoiler: 2inhost.yaml.