Reuse Gpu tags in MLCellLinOp::applyBC() #4899

Merged
WeiqunZhang merged 3 commits into AMReX-Codes:development from
ankithadas:MLCellLinOp-applyBC-reuse-tags
Jan 19, 2026
@ankithadas
Contributor

Summary

Applies the same optimization as implemented in #4882, this time to MLCellLinOp::applyBC().

Additional background

Checklist

The proposed changes:

  • fix a bug or incorrect behavior in AMReX
  • add new capabilities to AMReX
  • change answers in the test suite to more than roundoff level
  • are likely to significantly affect the results of downstream AMReX users
  • include documentation in the code and/or rst files, if appropriate

@ankithadas
Contributor Author

ankithadas commented Jan 16, 2026

This gives a small performance improvement when MLCellLinOp::applyBC() is called repeatedly.
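
For readers unfamiliar with the pattern, here is a minimal sketch of the general "build once, reuse on later calls" idea in plain C++. It is not the AMReX implementation: `BoundaryTag`, `TagCache`, and `builds_after_calls` are hypothetical names invented for illustration, and the sketch assumes the tag list depends only on a fixed box layout, so it never needs to be rebuilt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one unit of boundary work (a "tag").
struct BoundaryTag {
    int box_id;  // which box the face belongs to
    int face;    // 0..5 for the six faces of a 3D box
};

// Hypothetical cache: builds the tag list on first use, then reuses it.
class TagCache {
public:
    explicit TagCache(int num_boxes) : m_num_boxes(num_boxes) {
        assert(num_boxes >= 0);
    }

    // Returns the cached tags, building them only on the first call.
    const std::vector<BoundaryTag>& get() {
        if (!m_built) {
            m_tags.reserve(static_cast<std::size_t>(m_num_boxes) * 6);
            for (int b = 0; b < m_num_boxes; ++b) {
                for (int f = 0; f < 6; ++f) {
                    m_tags.push_back(BoundaryTag{b, f});
                }
            }
            m_built = true;
            ++m_build_count;
        }
        return m_tags;
    }

    // Number of times the tag list was actually constructed.
    int build_count() const { return m_build_count; }

private:
    int m_num_boxes;
    bool m_built = false;
    int m_build_count = 0;
    std::vector<BoundaryTag> m_tags;
};

// Simulates n_calls invocations of a boundary routine sharing one cache and
// returns how many times the tag list had to be built.
int builds_after_calls(int num_boxes, int n_calls) {
    TagCache cache(num_boxes);
    std::size_t visited = 0;
    for (int i = 0; i < n_calls; ++i) {
        // A real code would launch a kernel over the tags here.
        visited += cache.get().size();
    }
    (void)visited;
    return cache.build_count();
}
```

In this sketch the setup cost is paid once no matter how many times the routine runs, which is why the pattern matters most for functions that are called thousands of times per solve.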

Test case: LinearSolvers/ABecLaplacian_C/3d

Run with n_cell=320, amrex.use_gpu_aware_mpi=1 and tiny_profiler.device_synchronize_around_region=1.

After:

MLMG: Final Iter. 6 resid, resid/resid0 = 1.941288332e-05, 7.152966126e-11
MLMG: Timers: Solve = 0.597130692 Iter = 0.585457216 Bottom = 0.109930995
Level 0 max-norm error: 4.496977726e-05 1-norm error: 1.176279099e-05 2-norm error: 1.591549109e-05
Level 1 max-norm error: 3.196255245e-05 1-norm error: 6.879450408e-07 2-norm error: 3.154006476e-06
Level 2 max-norm error: 2.148941828e-06 1-norm error: 1.121850939e-08 2-norm error: 1.178754315e-07


TinyProfiler total time across processes [min...avg...max]: 2.185 ... 2.185 ... 2.185

--------------------------------------------------------------------------------------------
Name                                         NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------
MLPoisson::Fsmooth()                           2904     0.9028     0.9028     0.9028  41.31%
MLCellLinOp::applyBC()                         5878     0.2398     0.2398     0.2398  10.98%
MLPoisson::Fapply()                            2974     0.1544     0.1544     0.1544   7.06%
amrex::Dot()                                   4948     0.1241     0.1241     0.1241   5.68%

Before:

MLMG: Final Iter. 6 resid, resid/resid0 = 1.941288332e-05, 7.152966126e-11
MLMG: Timers: Solve = 0.627737748 Iter = 0.616312779 Bottom = 0.118744412
Level 0 max-norm error: 4.496977726e-05 1-norm error: 1.176279099e-05 2-norm error: 1.591549109e-05
Level 1 max-norm error: 3.196255245e-05 1-norm error: 6.879450408e-07 2-norm error: 3.154006476e-06
Level 2 max-norm error: 2.148941828e-06 1-norm error: 1.121850939e-08 2-norm error: 1.178754315e-07


TinyProfiler total time across processes [min...avg...max]: 2.294 ... 2.294 ... 2.294

--------------------------------------------------------------------------------------------
Name                                         NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------
MLPoisson::Fsmooth()                           2904     0.9031     0.9031     0.9031  39.37%
MLCellLinOp::applyBC()                         5878     0.3404     0.3404     0.3404  14.84%
MLPoisson::Fapply()                            2974     0.1543     0.1543     0.1543   6.73%
amrex::Dot()                                   4948     0.1263     0.1263     0.1263   5.51%
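
To put the two profiles side by side, the relative changes can be worked out from the numbers quoted above. The sketch below is only a convenience calculation, not part of the PR; the helper names `percent_reduction` and `per_call_microseconds` are made up here, and the timings are treated as seconds. With the quoted values, the exclusive time in MLCellLinOp::applyBC() drops by roughly 30%, while the MLMG Solve timer and the TinyProfiler total each drop by roughly 5%.

```cpp
#include <cassert>
#include <cstdio>

// Percentage reduction going from `before` to `after` (same units).
double percent_reduction(double before, double after) {
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

// Average cost per call in microseconds, given a total time in seconds.
double per_call_microseconds(double total_seconds, long ncalls) {
    assert(ncalls > 0);
    return 1.0e6 * total_seconds / static_cast<double>(ncalls);
}

// Prints the comparisons for the timings quoted in this comment.
void print_comparison() {
    std::printf("applyBC() exclusive time: %.1f%% lower\n",
                percent_reduction(0.3404, 0.2398));
    std::printf("MLMG Solve timer:         %.1f%% lower\n",
                percent_reduction(0.627737748, 0.597130692));
    std::printf("TinyProfiler total:       %.1f%% lower\n",
                percent_reduction(2.294, 2.185));
    std::printf("applyBC() per call:       %.1f us -> %.1f us\n",
                per_call_microseconds(0.3404, 5878),
                per_call_microseconds(0.2398, 5878));
}
```

The other listed routines (Fsmooth, Fapply, Dot) are essentially unchanged between the two runs, so the difference in total time comes almost entirely from applyBC().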

@ankithadas marked this pull request as ready for review January 16, 2026 11:27
@WeiqunZhang
Member

/run-hpsf-gitlab-ci

@amrex-gitlab-ci-reporter

GitLab CI 1392856 finished with status: success. See details at https://gitlab.spack.io/amrex/amrex/-/pipelines/1392856.

@WeiqunZhang merged commit e585140 into AMReX-Codes:development Jan 19, 2026
74 checks passed