-
Notifications
You must be signed in to change notification settings - Fork 6k
[Impeller] removed collisions of dead command pools between tests. #41490
Conversation
|
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!). If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix? Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
b6036f7 to
35c283e
Compare
35c283e to
aec065b
Compare
| thread_local uint64_t tls_context_count = 0; | ||
| uint64_t CalculateHash(void* ptr) { | ||
| // You could make a context once per nanosecond for 584 years on one thread | ||
| // before this overflows. | ||
| return ++tls_context_count; | ||
| } | ||
| } // namespace | ||
|
|
||
| ContextVK::ContextVK() : hash_(CalculateHash(this)) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be simpler to just use the address of the object instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(really I think we want a weak_ptr instead)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be simpler to just use the address of the object instead?
That's what it was before and that doesn't work because eventually you'll get a duplicate address and everything will come crashing down (details in the linked issue).
(really I think we want a weak_ptr instead)
It can't be a weak_ptr because the way the vulkan backend was designed is that this global variable is supposed to keep the pool alive until the thread is destroyed. One of the first things I did was try to make it be a weak_ptr but that would lead to a crash on the worker thread. I'm still seeing racy crashes so this may have to be changed eventually. I'm just trying to make the original design not crash.
| } | ||
| CommandPoolMap& pool_map = *tls_command_pool.get(); | ||
| auto found = pool_map.find(context); | ||
| auto found = pool_map.find(context->GetHash()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so here, just static_cast<uintptr_t>(context)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
|
|
||
| // TODO(tbd): This is storing tons of CommandPoolVK's in the test runner since | ||
| // many contexts are created on the same thread. We need to come up | ||
| // with a different way to clean these up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option that I'm not 100% sure would work - make the key to this a std::weak_ptr<ContextVk>, and whenever the call is made to GetThreadLocal iterate through the map and delete dead weak_ptr keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should work if we change ContextVK to std::enable_shared_from_this and update ContextVK::Create to return shared_from_this() instead of creating a shared pointer, and then call GetThreadLocal(std::weak_from_this()).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted above, this is how it was designed to work (the global variable keeps the pool alive long enough for the worker thread to use it, then destroys it when the thread is killed). I think we'll probably need to fix this design at some point and is a good reminder why the C++ style guide disallows nontrivially deleted global variables.
The alternative design would be something like make this variable a raw pointer and have the context flush and synchronize the worker thread before deleting the pool.
I'm just removing the crash and keeping the same design for now.
dnfield
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM to try to make this less flaky
|
Ahh shoot, didn't update the TODO with an issue, i'll follow up real quick with another pr. |
followup from #41490 ## Pre-launch Checklist - [x] I read the [Contributor Guide] and followed the process outlined there for submitting PRs. - [x] I read the [Tree Hygiene] wiki page, which explains my responsibilities. - [x] I read and followed the [Flutter Style Guide] and the [C++, Objective-C, Java style guides]. - [ ] I listed at least one issue that this PR fixes in the description above. - [ ] I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See [testing the engine] for instructions on writing and running engine tests. - [x] I updated/added relevant documentation (doc comments with `///`). - [x] I signed the [CLA]. - [x] All existing and new tests are passing. If you need help, consider asking for advice on the #hackers-new channel on [Discord]. <!-- Links --> [Contributor Guide]: https://github.com/flutter/flutter/wiki/Tree-hygiene#overview [Tree Hygiene]: https://github.com/flutter/flutter/wiki/Tree-hygiene [Flutter Style Guide]: https://github.com/flutter/flutter/wiki/Style-guide-for-Flutter-repo [C++, Objective-C, Java style guides]: https://github.com/flutter/engine/blob/main/CONTRIBUTING.md#style [testing the engine]: https://github.com/flutter/flutter/wiki/Testing-the-engine [CLA]: https://cla.developers.google.com/ [flutter/tests]: https://github.com/flutter/tests [breaking change policy]: https://github.com/flutter/flutter/wiki/Tree-hygiene#handling-breaking-changes [Discord]: https://github.com/flutter/flutter/wiki/Chat
…125578) flutter/engine@f2882af...cf97541 2023-04-26 [email protected] Migrate Windows AOT Engine to Engine V2. (flutter/engine#41515) 2023-04-26 [email protected] [Impeller] Store the root stencil on iOS (flutter/engine#41509) 2023-04-26 [email protected] [Impeller] iOS/macOS: Only wait for command scheduling prior to present (redux) (flutter/engine#41501) 2023-04-26 [email protected] [Impeller] removed collisions of dead command pools between tests. (flutter/engine#41490) If this roll has caused a breakage, revert this CL and stop the roller using the controls here: https://autoroll.skia.org/r/flutter-engine-flutter-autoroll Please CC [email protected],[email protected],[email protected] on the revert to ensure that a human is aware of the problem. To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose To report a problem with the AutoRoller itself, please file a bug: https://bugs.chromium.org/p/skia/issues/entry?template=Autoroller+Bug Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
fixes flutter/flutter#125321
Tests coverage from existing tests (see issue).
Pre-launch Checklist
///).If you need help, consider asking for advice on the #hackers-new channel on Discord.