Respect NoSync region in FabArray::Copy and setBndry #4955

Merged
WeiqunZhang merged 1 commit into AMReX-Codes:development from WeiqunZhang:fabarray_no_sync
Feb 19, 2026

@WeiqunZhang
Member

No description provided.

Member

@ax3l ax3l left a comment

Thanks, let's see if @bathmatt spotted more locations.

@bathmatt

I haven't seen any others; I only found this one because it didn't work :) Then I spent 2 hours debugging my opt build and why it didn't go async because I edited the _deps file

@WeiqunZhang
Member Author

I am sure there are still places we need to fix. I started making these kinds of changes a while ago, but then lost steam.

I also started something in this branch, https://github.com/WeiqunZhang/amrex/tree/more_no_sync_region, which adds a helper function Gpu::SyncAtExitOnly. Even if we are not in a nosync region and users expect an AMReX function to be synchronous, we can still sync only once, at the end of that function.
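
The sync-once-at-exit idea can be illustrated with a small RAII guard. This is only a sketch and not the `Gpu::SyncAtExitOnly` helper from the linked branch, whose interface is not shown in this thread. `CountingStream`, `SyncAtExit`, `helper_step`, and the two driver functions are hypothetical names, and the "stream" is a plain counter rather than a real GPU stream.

```cpp
#include <cassert>

// Hypothetical stand-in for a GPU stream: it only counts synchronizations.
struct CountingStream {
    int sync_count = 0;   // number of synchronizations actually performed
    int defer_depth = 0;  // > 0 while at least one SyncAtExit guard is alive

    // Helpers call this wherever they would normally block the host.
    void synchronize () {
        if (defer_depth == 0) { ++sync_count; }  // suppressed while deferred
    }
};

// Hypothetical RAII guard: suppresses intermediate syncs in its scope and
// performs a single sync when the outermost guard is destroyed.
class SyncAtExit {
public:
    explicit SyncAtExit (CountingStream& s) : m_stream(s) { ++m_stream.defer_depth; }
    ~SyncAtExit () {
        assert(m_stream.defer_depth > 0);
        if (--m_stream.defer_depth == 0) { ++m_stream.sync_count; }
    }
    SyncAtExit (SyncAtExit const&) = delete;
    SyncAtExit& operator= (SyncAtExit const&) = delete;
private:
    CountingStream& m_stream;
};

// A helper that "launches a kernel" and then syncs, as many helpers do.
inline void helper_step (CountingStream& s) { s.synchronize(); }

// Without the guard, every helper call blocks the host.
inline int steps_with_immediate_sync (CountingStream& s) {
    int const before = s.sync_count;
    helper_step(s);
    helper_step(s);
    helper_step(s);
    return s.sync_count - before;
}

// With the guard, the function is still synchronous for its caller,
// but the host blocks only once, when the guard goes out of scope.
inline int steps_with_sync_at_exit (CountingStream& s) {
    int const before = s.sync_count;
    {
        SyncAtExit guard(s);
        helper_step(s);
        helper_step(s);
        helper_step(s);
    }
    return s.sync_count - before;
}
```

In this mock, three helper calls cost three synchronizations without the guard and one with it. Nested guards also collapse to a single sync, because only the outermost destructor brings the depth back to zero.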

@ax3l
Member

ax3l commented Feb 19, 2026

> Then I spent 2 hours debugging my opt build and why it didn't go async because I edited the _deps file

Sorry to hear that, @bathmatt. Here is some guidance on local cross-repo compiles :)
https://warpx.readthedocs.io/en/latest/developers/how_to_compile_locally.html

@bathmatt

bathmatt commented Feb 19, 2026

find.py

finds about 300 of them. One simple example is

⚠️ Potential unprotected call in: ./Src/Base/AMReX_BaseFab.H (Line 2186)
Context:
2183:             }
2184:         });
2185:         Gpu::dtoh_memcpy_async(ha.data(), p, sizeof(int)*ha.size());
2186:         Gpu::streamSynchronize();

In case you're impressed by my Python skills, don't be. The prompt was:

> In the code base at https://github.com/AMReX-Codes/amrex.git can you find calls to `Gpu::streamSynchronize();` which are not protected by `if (!Gpu::inNoSyncRegion()) {`?

@ax3l
Member

ax3l commented Feb 19, 2026

Let's see if I have enough tokens left on Cursor to combine it with that find call and auto-update 300 at a time... while sitting in 6+ hours of meetings today.

@AlexanderSinn
Member

After device-to-host memory copies, there has to be a stream sync before the host can use the data. These syncs should not be put in if conditions. Note that this can happen a bit more subtly without a call to dtoh_memcpy_async etc. if the kernel directly writes to pinned memory. Another thing is syncs before deallocating temporary memory. We can remove them if we change the arena of the memory to The_Async_Arena.
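
The point about device-to-host copies can be shown with a mock in which asynchronous copies only take effect at synchronization. This is a sketch under stated assumptions, not AMReX code: `QueueingStream` and both `read_back_*` functions are hypothetical, and its `dtoh_memcpy_async` merely imitates the shape of `Gpu::dtoh_memcpy_async` by queuing the copy on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stream: queued work runs only when synchronize() is called.
struct QueueingStream {
    bool in_no_sync_region = false;
    std::vector<std::function<void()>> pending;

    // Queue a copy; nothing is written to dst until synchronize().
    void dtoh_memcpy_async (int* dst, int const* src, std::size_t n) {
        pending.push_back([=] {
            for (std::size_t i = 0; i < n; ++i) { dst[i] = src[i]; }
        });
    }

    void synchronize () {
        for (auto& work : pending) { work(); }
        pending.clear();
    }
};

// Wrong for read-backs: the sync is skipped inside a no-sync region,
// so the host reads the buffer before the copy has landed.
inline int read_back_guarded (QueueingStream& s, std::vector<int> const& device) {
    assert(!device.empty());
    std::vector<int> host(device.size(), -1);  // -1 marks "not yet copied"
    s.dtoh_memcpy_async(host.data(), device.data(), device.size());
    if (!s.in_no_sync_region) { s.synchronize(); }
    int const result = host[0];  // stale if the sync was skipped
    s.synchronize();  // drain the queue so the mock never touches `host` after it is destroyed
    return result;
}

// Correct: the host is about to use the data, so the sync is unconditional.
inline int read_back_unconditional (QueueingStream& s, std::vector<int> const& device) {
    assert(!device.empty());
    std::vector<int> host(device.size(), -1);
    s.dtoh_memcpy_async(host.data(), device.data(), device.size());
    s.synchronize();
    return host[0];
}
```

Inside a no-sync region, the guarded version returns the `-1` placeholder, while the unconditional version returns the copied value. On real hardware the stale read would be nondeterministic rather than a fixed placeholder, which is why a mechanical rewrite of every `Gpu::streamSynchronize()` call needs case-by-case review.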

@WeiqunZhang WeiqunZhang merged commit 2e4b667 into AMReX-Codes:development Feb 19, 2026
74 checks passed
@WeiqunZhang WeiqunZhang deleted the fabarray_no_sync branch February 19, 2026 17:54
ax3l added a commit to ax3l/amrex that referenced this pull request Feb 19, 2026
@ax3l ax3l mentioned this pull request Feb 19, 2026
12 tasks
WeiqunZhang added a commit that referenced this pull request Mar 2, 2026
## Summary

Follow-up to #4955 as discussed

- [x] vibe code
- [x] vibe review
- [x] coarse manual review (1)
- [x] vibe fix lifetimes by example
- [x] manual self-review (2)
- [x] run HPSF GPU CI
- [ ] peer review 

cc @bathmatt @roelof-groenewald 

---------

Co-authored-by: Weiqun Zhang <[email protected]>