Respect NoSync region in FabArray::Copy and setBndry #4955
WeiqunZhang merged 1 commit into AMReX-Codes:development from
I've not seen it, but I only found this because it didn't work :) Then I spent 2 hours debugging my opt build and why it didn't go async because I edited the _deps file.
I am sure there are still places we need to fix. I started making these kinds of changes a while ago, but then lost steam. I also started something in this branch, https://github.com/WeiqunZhang/amrex/tree/more_no_sync_region, that adds a helper function Gpu::SyncAtExitOnly. Even if we are not in a nosync region and users expect an amrex function to be synchronous, we can still sync only once, at the end of that function.
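To make the "sync only once at the end" idea concrete, here is a minimal sketch. It does not use AMReX: everything in the `mock` namespace is a stand-in, and the `SyncAtExitOnly` class below is only a guess at the idea, not the interface from the linked branch. It assumes the helper is an RAII guard that suppresses guarded syncs while it is alive and issues one sync on scope exit, unless the caller was already in a nosync region.

```cpp
#include <cassert>

namespace mock {

int sync_count = 0;     // how many real synchronizations happened
int no_sync_depth = 0;  // > 0 means we are inside a no-sync region

bool inNoSyncRegion() { return no_sync_depth > 0; }

// Stand-in for a real stream synchronization; it always syncs.
void streamSynchronize() { ++sync_count; }

// The guarded call-site pattern discussed in this thread.
void sync_if_allowed() {
    if (!inNoSyncRegion()) { streamSynchronize(); }
}

// Hypothetical RAII guard: suppress guarded syncs inside the scope and
// sync once on exit, but only if the caller was not already in a
// no-sync region when the guard was created.
class SyncAtExitOnly {
public:
    SyncAtExitOnly() : m_sync_at_exit(!inNoSyncRegion()) { ++no_sync_depth; }
    ~SyncAtExitOnly() {
        --no_sync_depth;
        if (m_sync_at_exit) { streamSynchronize(); }
    }
    SyncAtExitOnly(const SyncAtExitOnly&) = delete;
    SyncAtExitOnly& operator=(const SyncAtExitOnly&) = delete;
private:
    bool m_sync_at_exit;
};

// A function that users expect to be synchronous when it returns.
void fill_boundary_like() {
    SyncAtExitOnly guard;
    sync_if_allowed();  // suppressed by the guard
    sync_if_allowed();  // suppressed by the guard
}                       // at most one sync happens here

} // namespace mock
```

With this sketch, calling `mock::fill_boundary_like()` outside any no-sync region costs one sync instead of two, and calling it from inside a caller's no-sync region costs none.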
Sorry to hear that, @bathmatt. Here is some guidance on local cross-repo compiles :)
finds about 300 of them. One simple example is

In case you're impressed by my Python skills, don't be:

"in the code base at https://github.com/AMReX-Codes/amrex.git can you find calls to Gpu::streamSynchronize(); which are not protected by if (!Gpu::inNoSyncRegion()) {"
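For illustration, the kind of search described in that prompt could look roughly like the sketch below. It is not the script used above. The function name `find_unguarded_syncs` is hypothetical, and the heuristic is an assumption: a call counts as guarded only if `Gpu::inNoSyncRegion()` appears on the same line or on the nearest preceding non-blank line.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Heuristic scan: return the 1-based line numbers of
// Gpu::streamSynchronize() calls that are not visibly guarded by
// Gpu::inNoSyncRegion() on the same line or on the nearest preceding
// non-blank line. This is a text match, not a real C++ parse.
std::vector<std::size_t> find_unguarded_syncs(const std::string& source)
{
    const std::string call  = "Gpu::streamSynchronize()";
    const std::string guard = "Gpu::inNoSyncRegion()";

    std::vector<std::size_t> hits;
    std::istringstream in(source);
    std::string line;
    std::string prev;  // nearest preceding non-blank line
    std::size_t lineno = 0;

    while (std::getline(in, line)) {
        ++lineno;
        if (line.find_first_not_of(" \t\r") == std::string::npos) {
            continue;  // skip blank lines, keep prev unchanged
        }
        if (line.find(call) != std::string::npos) {
            const bool guarded =
                line.find(guard) != std::string::npos ||
                prev.find(guard) != std::string::npos;
            if (!guarded) { hits.push_back(lineno); }
        }
        prev = line;
    }
    return hits;
}
```

A text match like this will produce false positives, so each hit still needs a manual look before it is wrapped in a guard.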
Let's see if I have enough tokens left on Cursor to combine it with that find call to auto-update 300 at a time... while sitting in 6+ hrs of meetings today.
After device-to-host memory copies, there has to be a stream sync before the host can use the data. These syncs should not be put in if conditions. Note that this can happen a bit more subtly without a call to dtoh_memcpy_async etc. if the kernel directly writes to pinned memory. Another thing is syncs before deallocating temporary memory. We can remove them if we change the arena of the memory to The_Async_Arena. |
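A small model of the device-to-host case, as a sketch only: the `mock` namespace below imitates an asynchronous stream with a queue of deferred tasks and does not call AMReX or any GPU runtime. The function names mirror the ones mentioned in this thread, but their signatures here are simplified assumptions. It shows why a sync that follows a device-to-host copy must not be skipped in a no-sync region when the host reads the data right away.

```cpp
#include <cassert>
#include <functional>
#include <vector>

namespace mock {

// Pending "stream" work; nothing runs until streamSynchronize().
std::vector<std::function<void()>> stream;
bool in_no_sync_region = false;

bool inNoSyncRegion() { return in_no_sync_region; }

// Stand-in for an asynchronous device-to-host copy: the copy is only
// queued here. The caller owns both variables and keeps them alive.
void dtoh_memcpy_async(int& host, const int& device) {
    stream.push_back([&host, &device] { host = device; });
}

// Stand-in for a stream sync: run all queued work, then clear the queue.
void streamSynchronize() {
    for (auto& task : stream) { task(); }
    stream.clear();
}

// Copy to the host and read the value immediately.
// always_sync == false models the (incorrect here) guarded sync.
int read_after_copy(int& host, const int& device, bool always_sync) {
    dtoh_memcpy_async(host, device);
    if (always_sync || !inNoSyncRegion()) { streamSynchronize(); }
    return host;  // stale if the copy has not run yet
}

} // namespace mock
```

Inside a no-sync region, `read_after_copy(host, device, false)` returns the old host value because the queued copy has not run, while `read_after_copy(host, device, true)` returns the device value.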
## Summary

Follow-up to #4955 as discussed

- [x] vibe code
- [x] vibe review
- [x] coarse manual review (1)
- [x] vibe fix lifetimes by example
- [x] manual self-review (2)
- [x] run HPSF GPU CI
- [ ] peer review

cc @bathmatt @roelof-groenewald

---------

Co-authored-by: Weiqun Zhang <[email protected]>