Clean up RedistributeCPU #4529

Merged
atmyers merged 20 commits into AMReX-Codes:development from AlexanderSinn:clean_up_RedistributeCPU on Mar 13, 2026

AlexanderSinn commented Jun 27, 2025

Summary

This PR simplifies RedistributeCPU to be independent of particle layout in preparation for #4404.
To do this, push_back is replaced by a resize with a geometric growth strategy. push_back is error-prone because, if used incorrectly, it can desynchronize the sizes of the individual component vectors.
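
As a rough illustration of the idea (not the actual AMReX code), the sketch below uses a hypothetical structure-of-arrays type `SoATile` with made-up components `x`, `w`, and `id`, and a hypothetical helper `append_particles`. A single `resize` call changes every component vector together, so their sizes cannot drift apart the way they can with per-component `push_back`. The growth factor of 1.5 is an assumption chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays tile; NOT the AMReX ParticleTile API.
struct SoATile {
    std::vector<double> x;   // position component (illustrative)
    std::vector<double> w;   // weight component (illustrative)
    std::vector<int>    id;  // particle id (illustrative)

    std::size_t size() const { return id.size(); }

    // Resize all components in one place so their lengths stay equal.
    // When more room is needed, reserve geometrically (assumed factor 1.5)
    // so that many small extensions do not each trigger a reallocation.
    void resize(std::size_t new_size) {
        if (new_size > id.capacity()) {
            const std::size_t grown = id.capacity() + id.capacity() / 2;
            const std::size_t new_cap = std::max(new_size, grown);
            x.reserve(new_cap);
            w.reserve(new_cap);
            id.reserve(new_cap);
        }
        x.resize(new_size);
        w.resize(new_size);
        id.resize(new_size);
    }
};

// Append n particles of src, starting at src_begin, to the end of dst.
// The destination is resized once, then filled by index.
inline void append_particles(SoATile& dst, const SoATile& src,
                             std::size_t src_begin, std::size_t n) {
    assert(src_begin + n <= src.size());
    const std::size_t old_size = dst.size();
    dst.resize(old_size + n);
    for (std::size_t j = 0; j < n; ++j) {
        dst.x[old_size + j]  = src.x[src_begin + j];
        dst.w[old_size + j]  = src.w[src_begin + j];
        dst.id[old_size + j] = src.id[src_begin + j];
    }
}
```

Because the copy loop only addresses particles by index, it does not need to know how the components are laid out in memory, which is the property the description refers to as being independent of particle layout.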

Additional background

Checklist

The proposed changes:

  • fix a bug or incorrect behavior in AMReX
  • add new capabilities to AMReX
  • changes answers in the test suite to more than roundoff level
  • are likely to significantly affect the results of downstream AMReX users
  • include documentation in the code and/or rst files, if appropriate

AlexanderSinn requested a review from atmyers on July 8, 2025 16:53

AlexanderSinn commented Jul 26, 2025

Performance test on a 64-core CPU with 8 MPI ranks and 4 threads per rank, running a warm electron plasma with WarpX:

amr.n_cell = 128 128 128
amr.max_grid_size = 16
amr.blocking_factor = 16

Dev:

Name                                                       NCalls  Incl. Min  Incl. Avg  Incl. Max   Max %
----------------------------------------------------------------------------------------------------------
ParticleContainer::RedistributeCPU()                          101      1.774      1.782      1.786   7.76%
ParticleContainer::RedistributeCPU()                          101      1.775      1.784      1.787   7.71%
ParticleContainer::RedistributeCPU()                          101      1.823       1.83      1.835   7.95%

PR:

Name                                                       NCalls  Incl. Min  Incl. Avg  Incl. Max   Max %
----------------------------------------------------------------------------------------------------------
ParticleContainer::RedistributeCPU()                          101      1.748      1.787      1.797   7.76%
ParticleContainer::RedistributeCPU()                          101       1.74      1.771      1.778   7.70%
ParticleContainer::RedistributeCPU()                          101       1.73      1.739      1.744   7.56%

@AlexanderSinn

WarpX CI passes with this change: BLAST-WarpX/warpx#6042

auto dst_ptd = dst_ptile.getParticleTileData();
auto src_ptd = src_ptile[i].getParticleTileData();

for (Long j = 0; j < to_copy; ++j) {

I would have guessed that this would be slower than the bulk insert operations it replaces when there are a lot of particles to move locally. But your benchmark seems to cover this case.

atmyers commented Mar 13, 2026

At any rate, the ultimate plan is to merge this with RedistributeGPU, so it's not an important point.

atmyers merged commit 8778455 into AMReX-Codes:development on Mar 13, 2026
74 checks passed