Reuse GPU tags in `MLEBABecLap::applyBC()` by ankithadas · Pull Request #4882 · AMReX-Codes/amrex

ankithadas · 2026-01-08T10:47:01Z

Summary

Additional background

Checklist

The proposed changes:

fix a bug or incorrect behavior in AMReX
add new capabilities to AMReX
changes answers in the test suite to more than roundoff level
are likely to significantly affect the results of downstream AMReX users
include documentation in the code and/or rst files, if appropriate

ankithadas · 2026-01-08T12:28:13Z

This is currently broken. Using more than one AMR level causes convergence issues.

ankithadas · 2026-01-08T13:24:08Z

Slight performance improvement when reusing GPU tags in MLEBABecLap::applyBC()

GPU Device: NVIDIA A5000

Test: amrex/Tests/LinearSolvers/CellEB
Inputs

n_cell = 320
#verbose = 10
#use_petsc = true
eb2.geom_type = sphere
eb2.sphere_center = 0.5  0.5  0.5
eb2.sphere_radius = 0.25
eb2.sphere_has_fluid_inside = 0

After

Initializing AMReX (51bd06566543-dirty)...
MPI initialized with 1 MPI processes
MPI initialized with thread support level 0
Initializing CUDA...
CUDA initialized with 1 device.
AMReX (51bd06566543-dirty) initialized
vfrc min = 2.267635963e-07
Initial max, 1 and 2-norm residuals at level 0 = 1157920.353 1.18632832e+11 172310653.1
MLMG: # of AMR levels: 1
      # of MG levels on the coarsest AMR level: 7
MLMG: Initial rhs               = 0
MLMG: Initial residual (resid0) = 1157920.353
MLCGSolver_BiCGStab: Initial error (error0) =        0.187443743
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 8.127541048e-05
MLMG: Iteration   1 Fine resid/resid0 = 0.01655981191
MLCGSolver_BiCGStab: Initial error (error0) =        0.07363543285
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 6.34748546e-05
MLMG: Iteration   2 Fine resid/resid0 = 0.00136280975
MLCGSolver_BiCGStab: Initial error (error0) =        0.009381498681
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 6.824751003e-05
MLMG: Iteration   3 Fine resid/resid0 = 4.951131385e-05
MLCGSolver_BiCGStab: Initial error (error0) =        0.0005124706739
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 4.49532358e-05
MLMG: Iteration   4 Fine resid/resid0 = 2.410026621e-06
MLCGSolver_BiCGStab: Initial error (error0) =        1.978430222e-05
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 4.455584488e-05
MLMG: Iteration   5 Fine resid/resid0 = 1.007094382e-07
MLCGSolver_BiCGStab: Initial error (error0) =        1.049685501e-06
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 5.420743823e-05
MLMG: Iteration   6 Fine resid/resid0 = 4.673451236e-09
MLCGSolver_BiCGStab: Initial error (error0) =        3.884517412e-08
MLCGSolver_BiCGStab: Final: Iteration    7 rel. err. 3.635062157e-05
MLMG: Iteration   7 Fine resid/resid0 = 2.012107808e-10
MLCGSolver_BiCGStab: Initial error (error0) =        1.865120721e-09
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 5.896060112e-05
MLMG: Iteration   8 Fine resid/resid0 = 9.116323764e-12
MLCGSolver_BiCGStab: Initial error (error0) =        7.506198938e-11
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 7.780028184e-05
MLMG: Iteration   9 Fine resid/resid0 = 3.83532538e-13
MLMG: Final Iter. 9 resid, resid/resid0 = 4.441001316e-07, 3.83532538e-13
MLMG: Timers: Solve = 1.209650513 Iter = 1.186354926 Bottom = 0.012291025
Final max, 1 and 2-norm residuals at level 0 = 3.701518851e-05 0.8073304517 0.0003706030728


TinyProfiler total time across processes [min...avg...max]: 1.451 ... 1.451 ... 1.451

--------------------------------------------------------------------------------------------
Name                                         NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------
MLEBABecLap::Fsmooth()                          432     0.9391     0.9391     0.9391  64.71%
MLEBABecLap::Fapply()                           175     0.1428     0.1428     0.1428   9.84%
main                                              1    0.04345    0.04345    0.04345   2.99%
EB2::GShopLevel()-fine                            1    0.03669    0.03669    0.03669   2.53%
FillBoundary_nowait()                           637    0.03089    0.03089    0.03089   2.13%
FabArray::setVal()                              288    0.03001    0.03001    0.03001   2.07%
FabArray::Xpay()                                 66     0.0267     0.0267     0.0267   1.84%
MLLinOp::defineGrids()                            1    0.02224    0.02224    0.02224   1.53%
MLCellLinOp::defineAuxData()                      1    0.01503    0.01503    0.01503   1.04%
FabArray::ParallelCopy_nowait()                 273    0.01293    0.01293    0.01293   0.89%
EB2::GShopLevel()-coarse                          6    0.01173    0.01173    0.01173   0.81%
amrex::Add()                                      9    0.01091    0.01091    0.01091   0.75%
MLMG::ResNormInf()                               10    0.01074    0.01074    0.01074   0.74%
MLEBABecLap::interpolation()                     54    0.01059    0.01059    0.01059   0.73%
MLMG::prepareForSolve()                           1   0.009654   0.009654   0.009654   0.67%
FabArrayBase::CPC::define()                     104   0.008786   0.008786   0.008786   0.61%
MLMG::mgVcycle_down::1                            9   0.008778   0.008778   0.008778   0.60%
MLEBABecLap::define()                             1    0.00851    0.00851    0.00851   0.59%
MLABecLaplacian::prepareForSolve()                1   0.007636   0.007636   0.007636   0.53%
MLMG::addInterpCorrection()                      54   0.007334   0.007334   0.007334   0.51%
BndryData::define()                               1   0.005956   0.005956   0.005956   0.41%
MLMG::mgVcycle_down::0                            9   0.005507   0.005507   0.005507   0.38%
FabArrayBase::FB::FB()                           67   0.005479   0.005479   0.005479   0.38%
amrex::Copy()                                    33   0.005037   0.005037   0.005037   0.35%
MLEBABecLap::applyBC()                          607   0.004428   0.004428   0.004428   0.31%
FabArray::norminf()                             120   0.003285   0.003285   0.003285   0.23%
amrex::Dot()                                    218   0.003103   0.003103   0.003103   0.21%
MLCGSolver::bicgstab                              9   0.002477   0.002477   0.002477   0.17%
MultiFab::Subtract()                              2   0.002356   0.002356   0.002356   0.16%
MLCellLinOp::prepareForSolve()                    1   0.002229   0.002229   0.002229   0.15%
MLMG::mgVcycle_down::2                            9    0.00197    0.00197    0.00197   0.14%
MLMG::apply()                                     2   0.001962   0.001962   0.001962   0.14%
FabArray::BuildMask()                             7   0.001088   0.001088   0.001088   0.07%
MLMG::mgVcycle_down::3                            9  0.0006272  0.0006272  0.0006272   0.04%
FabArrayBase::getCPC()                          195  0.0001845  0.0001845  0.0001845   0.01%
MLMG::mgVcycle_down::4                            9  0.0001448  0.0001448  0.0001448   0.01%
FabArray::FillBoundary()                        637  0.0001352  0.0001352  0.0001352   0.01%
MLMG::mgVcycle_down::5                            9  0.0001299  0.0001299  0.0001299   0.01%
MLMG::solve()                                     1  0.0001273  0.0001273  0.0001273   0.01%
FabArrayBase::getFB()                           644  0.0001254  0.0001254  0.0001254   0.01%
MLCellLinOp::smooth()                           117  8.841e-05  8.841e-05  8.841e-05   0.01%
FabArray::ParallelCopy()                        273  7.435e-05  7.435e-05  7.435e-05   0.01%
MLEBABecLap::apply()                            175  4.818e-05  4.818e-05  4.818e-05   0.00%
EB2::Initialize()                                 1  4.557e-05  4.557e-05  4.557e-05   0.00%
MLCellLinOp::defineBC()                           1  3.304e-05  3.304e-05  3.304e-05   0.00%
MLMG::actualBottomSolve()                         9   2.36e-05   2.36e-05   2.36e-05   0.00%
MLCellLinOp::correctionResidual()                54  2.327e-05  2.327e-05  2.327e-05   0.00%
MLMG::mgVcycle()                                  9   2.23e-05   2.23e-05   2.23e-05   0.00%
MLMG::oneIter()                                   9  1.775e-05  1.775e-05  1.775e-05   0.00%
MLMG:computeResOfCorrection()                    54  1.768e-05  1.768e-05  1.768e-05   0.00%
MLCellLinOp::solutionResidual()                  12  1.181e-05  1.181e-05  1.181e-05   0.00%
MLMG::computeResidual()                           9  5.926e-06  5.926e-06  5.926e-06   0.00%
MLMG::mgVcycle_up::0                              9  5.535e-06  5.535e-06  5.535e-06   0.00%
MLLinOp::define()                                 1    5.5e-06    5.5e-06    5.5e-06   0.00%
MLMG::mgVcycle_bottom                             9  4.803e-06  4.803e-06  4.803e-06   0.00%
MLMG::mgVcycle_up::5                              9  4.496e-06  4.496e-06  4.496e-06   0.00%
MLMG::mgVcycle_up::1                              9  3.287e-06  3.287e-06  3.287e-06   0.00%
MLMG::mgVcycle_up::4                              9  2.472e-06  2.472e-06  2.472e-06   0.00%
MLMG::mgVcycle_up::3                              9  2.462e-06  2.462e-06  2.462e-06   0.00%
MLMG::mgVcycle_up::2                              9  2.446e-06  2.446e-06  2.446e-06   0.00%
MLMG::computeMLResidual()                         1   7.52e-07   7.52e-07   7.52e-07   0.00%
Other                                          1448   0.009917   0.009917   0.009917   0.68%
--------------------------------------------------------------------------------------------

Before

Initializing AMReX (51bd06566543-dirty)...
MPI initialized with 1 MPI processes
MPI initialized with thread support level 0
Initializing CUDA...
CUDA initialized with 1 device.
AMReX (51bd06566543-dirty) initialized
vfrc min = 2.267635963e-07
Initial max, 1 and 2-norm residuals at level 0 = 1157920.353 1.18632832e+11 172310653.1
MLMG: # of AMR levels: 1
      # of MG levels on the coarsest AMR level: 7
MLMG: Initial rhs               = 0
MLMG: Initial residual (resid0) = 1157920.353
MLCGSolver_BiCGStab: Initial error (error0) =        0.1874375103
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 8.15655458e-05
MLMG: Iteration   1 Fine resid/resid0 = 0.01655981191
MLCGSolver_BiCGStab: Initial error (error0) =        0.07363980466
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 6.582736103e-05
MLMG: Iteration   2 Fine resid/resid0 = 0.001362809017
MLCGSolver_BiCGStab: Initial error (error0) =        0.009383728425
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 6.662239496e-05
MLMG: Iteration   3 Fine resid/resid0 = 4.951338737e-05
MLCGSolver_BiCGStab: Initial error (error0) =        0.0005125537207
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 4.707445947e-05
MLMG: Iteration   4 Fine resid/resid0 = 2.409647954e-06
MLCGSolver_BiCGStab: Initial error (error0) =        1.98425658e-05
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 4.635330225e-05
MLMG: Iteration   5 Fine resid/resid0 = 1.007922962e-07
MLCGSolver_BiCGStab: Initial error (error0) =        1.049047774e-06
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 5.403871333e-05
MLMG: Iteration   6 Fine resid/resid0 = 4.67256324e-09
MLCGSolver_BiCGStab: Initial error (error0) =        3.888555559e-08
MLCGSolver_BiCGStab: Final: Iteration    7 rel. err. 3.442317299e-05
MLMG: Iteration   7 Fine resid/resid0 = 2.013351971e-10
MLCGSolver_BiCGStab: Initial error (error0) =        1.865300846e-09
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 5.686638177e-05
MLMG: Iteration   8 Fine resid/resid0 = 9.116421951e-12
MLCGSolver_BiCGStab: Initial error (error0) =        7.540114335e-11
MLCGSolver_BiCGStab: Final: Iteration    6 rel. err. 7.540490835e-05
MLMG: Iteration   9 Fine resid/resid0 = 3.838467205e-13
MLMG: Final Iter. 9 resid, resid/resid0 = 4.444639299e-07, 3.838467205e-13
MLMG: Timers: Solve = 1.269669695 Iter = 1.24614023 Bottom = 0.013362239
Final max, 1 and 2-norm residuals at level 0 = 3.626202169e-05 0.808276678 0.0003671337233


TinyProfiler total time across processes [min...avg...max]: 1.509 ... 1.509 ... 1.509

--------------------------------------------------------------------------------------------
Name                                         NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------
MLEBABecLap::Fsmooth()                          432     0.8934     0.8934     0.8934  59.21%
MLEBABecLap::Fapply()                           175     0.1263     0.1263     0.1263   8.37%
MLEBABecLap::applyBC()                          607     0.1257     0.1257     0.1257   8.33%
main                                              1    0.04373    0.04373    0.04373   2.90%
EB2::GShopLevel()-fine                            1    0.03685    0.03685    0.03685   2.44%
FillBoundary_nowait()                           637    0.03095    0.03095    0.03095   2.05%
FabArray::setVal()                              288    0.02994    0.02994    0.02994   1.98%
FabArray::Xpay()                                 66    0.02665    0.02665    0.02665   1.77%
MLLinOp::defineGrids()                            1    0.02096    0.02096    0.02096   1.39%
MLCellLinOp::defineAuxData()                      1    0.01413    0.01413    0.01413   0.94%
FabArray::ParallelCopy_nowait()                 273    0.01274    0.01274    0.01274   0.84%
EB2::GShopLevel()-coarse                          6    0.01178    0.01178    0.01178   0.78%
amrex::Add()                                      9    0.01095    0.01095    0.01095   0.73%
MLMG::ResNormInf()                               10    0.01079    0.01079    0.01079   0.72%
MLEBABecLap::interpolation()                     54    0.01067    0.01067    0.01067   0.71%
MLMG::prepareForSolve()                           1   0.009721   0.009721   0.009721   0.64%
MLMG::mgVcycle_down::1                            9    0.00894    0.00894    0.00894   0.59%
FabArrayBase::CPC::define()                     104   0.008569   0.008569   0.008569   0.57%
MLEBABecLap::define()                             1   0.008326   0.008326   0.008326   0.55%
MLABecLaplacian::prepareForSolve()                1   0.007643   0.007643   0.007643   0.51%
MLMG::addInterpCorrection()                      54   0.007416   0.007416   0.007416   0.49%
BndryData::define()                               1   0.005828   0.005828   0.005828   0.39%
MLMG::mgVcycle_down::0                            9    0.00559    0.00559    0.00559   0.37%
FabArrayBase::FB::FB()                           67   0.005561   0.005561   0.005561   0.37%
amrex::Copy()                                    33   0.005037   0.005037   0.005037   0.33%
FabArray::norminf()                             120   0.003387   0.003387   0.003387   0.22%
amrex::Dot()                                    218   0.003202   0.003202   0.003202   0.21%
MLCGSolver::bicgstab                              9   0.002576   0.002576   0.002576   0.17%
MultiFab::Subtract()                              2    0.00236    0.00236    0.00236   0.16%
MLMG::mgVcycle_down::2                            9   0.002014   0.002014   0.002014   0.13%
MLMG::apply()                                     2   0.001973   0.001973   0.001973   0.13%
FabArray::BuildMask()                             7   0.000967   0.000967   0.000967   0.06%
MLMG::mgVcycle_down::3                            9  0.0006366  0.0006366  0.0006366   0.04%
FabArrayBase::getCPC()                          195    0.00018    0.00018    0.00018   0.01%
MLMG::solve()                                     1  0.0001654  0.0001654  0.0001654   0.01%
MLMG::mgVcycle_down::4                            9  0.0001463  0.0001463  0.0001463   0.01%
FabArrayBase::getFB()                           644  0.0001388  0.0001388  0.0001388   0.01%
FabArray::FillBoundary()                        637  0.0001362  0.0001362  0.0001362   0.01%
MLMG::mgVcycle_down::5                            9  0.0001339  0.0001339  0.0001339   0.01%
MLCellLinOp::smooth()                           117  9.462e-05  9.462e-05  9.462e-05   0.01%
FabArray::ParallelCopy()                        273  6.667e-05  6.667e-05  6.667e-05   0.00%
MLEBABecLap::apply()                            175  4.681e-05  4.681e-05  4.681e-05   0.00%
EB2::Initialize()                                 1  4.624e-05  4.624e-05  4.624e-05   0.00%
MLCellLinOp::defineBC()                           1   3.14e-05   3.14e-05   3.14e-05   0.00%
MLMG::mgVcycle()                                  9  2.969e-05  2.969e-05  2.969e-05   0.00%
MLCellLinOp::correctionResidual()                54  2.852e-05  2.852e-05  2.852e-05   0.00%
MLMG::actualBottomSolve()                         9  2.748e-05  2.748e-05  2.748e-05   0.00%
MLMG:computeResOfCorrection()                    54  2.373e-05  2.373e-05  2.373e-05   0.00%
MLCellLinOp::solutionResidual()                  12  1.617e-05  1.617e-05  1.617e-05   0.00%
MLMG::oneIter()                                   9  1.596e-05  1.596e-05  1.596e-05   0.00%
MLMG::computeResidual()                           9  8.129e-06  8.129e-06  8.129e-06   0.00%
MLMG::mgVcycle_up::0                              9  7.414e-06  7.414e-06  7.414e-06   0.00%
MLMG::mgVcycle_bottom                             9  5.569e-06  5.569e-06  5.569e-06   0.00%
MLLinOp::define()                                 1  4.885e-06  4.885e-06  4.885e-06   0.00%
MLMG::mgVcycle_up::1                              9  4.365e-06  4.365e-06  4.365e-06   0.00%
MLMG::mgVcycle_up::5                              9  4.163e-06  4.163e-06  4.163e-06   0.00%
MLMG::mgVcycle_up::3                              9   2.71e-06   2.71e-06   2.71e-06   0.00%
MLMG::mgVcycle_up::2                              9  2.572e-06  2.572e-06  2.572e-06   0.00%
MLMG::mgVcycle_up::4                              9  2.483e-06  2.483e-06  2.483e-06   0.00%
MLMG::computeMLResidual()                         1  1.267e-06  1.267e-06  1.267e-06   0.00%
Other                                          1449    0.01214    0.01214    0.01214   0.80%
--------------------------------------------------------------------------------------------

WeiqunZhang · 2026-01-08T19:10:02Z

/run-hpsf-gitlab-ci

github-actions · 2026-01-08T19:10:11Z

GitLab CI has started at https://gitlab.spack.io/amrex/amrex/-/pipelines/1380239.

amrex-gitlab-ci-reporter · 2026-01-08T19:48:11Z

GitLab CI 1380239 finished with status: success. See details at https://gitlab.spack.io/amrex/amrex/-/pipelines/1380239.

ankithadas · 2026-01-08T23:17:46Z

It looks like this might be a very small performance improvement at best. Adding Gpu::streamSynchronize() after amrex::ParallelFor(tags... shows the actual cost of applyBC().

MLMG: Iteration   9 Fine resid/resid0 = 3.839056296e-13
MLMG: Final Iter. 9 resid, resid/resid0 = 4.44532142e-07, 3.839056296e-13
MLMG: Timers: Solve = 1.223781896 Iter = 1.200463613 Bottom = 0.013117793
Final max, 1 and 2-norm residuals at level 0 = 4.136757882e-05 0.8063068762 0.0003748244268


TinyProfiler total time across processes [min...avg...max]: 1.462 ... 1.462 ... 1.462

--------------------------------------------------------------------------------------------
Name                                         NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------
MLEBABecLap::Fsmooth()                          432     0.8718     0.8718     0.8718  59.64%
MLEBABecLap::Fapply()                           175     0.1231     0.1231     0.1231   8.42%
MLEBABecLap::applyBC()                          607     0.1049     0.1049     0.1049   7.18%
main                                              1    0.04349    0.04349    0.04349   2.97%
EB2::GShopLevel()-fine                            1    0.03701    0.03701    0.03701   2.53%

AlexanderSinn · 2026-01-08T23:43:45Z

The total runtime is faster by 3.7%. I think that's quite decent if repeatable.

AlexanderSinn · 2026-01-08T23:46:47Z

You can benchmark both versions with tiny_profiler.device_synchronize_around_region = 1, which adds stream synchronization around all profiler sections.

ankithadas · 2026-01-11T22:51:11Z

I could rewrite this without using goto statements.

WeiqunZhang · 2026-01-11T22:52:52Z

Sure. That would be better. Thanks!

WeiqunZhang · 2026-01-11T23:51:07Z

This is great! Thanks!

I did a test using 4 AMD GPUs on a Frontier node with 512^3 cells. The applyBC time went down from 0.157 to 0.118. The total MLMG::solve time went down from 0.679 to 0.639, consistent with the performance improvement of applyBC. I used tiny_profiler.device_synchronize_around_region = 1.
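As a quick check of the "consistent with" statement, the absolute savings implied by the numbers quoted above are nearly equal:

```latex
\begin{align*}
\Delta_{\texttt{applyBC}}     &= 0.157 - 0.118 = 0.039 \\
\Delta_{\texttt{MLMG::solve}} &= 0.679 - 0.639 = 0.040
\end{align*}
```

In relative terms that is roughly a 25% reduction in applyBC (0.039 / 0.157) and roughly a 6% reduction in MLMG::solve (0.040 / 0.679).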

ankithadas · 2026-01-12T00:00:45Z

Oh wow that's great. In that case, I will update MLCellLinOp::applyBC() as well. That said, it does raise a concern about the cost of setting up tags. I had assumed this overhead would be quite small.

WeiqunZhang · 2026-01-12T00:09:14Z

Please do MLCellLinOp::applyBC() in a different PR.

ankithadas · 2026-01-13T03:12:07Z

@WeiqunZhang Is there a specific reason for iterating over idim and explicitly setting low and high sides, instead of directly looping over the Orientations?

WeiqunZhang · 2026-01-13T19:11:58Z

I don't think there are any reasons other than personal taste.

WeiqunZhang · 2026-01-13T19:18:27Z

> I could rewrite this without using goto statements.

Do you plan to rewrite it? Either way is fine with me.


ankithadas added 4 commits January 8, 2026 21:36

Testing reusing tags

241b6f5

Attempting to fix tags

a84457c

Cleanup

05e78ac

Fix warnings

c7b780a


ankithadas marked this pull request as ready for review January 8, 2026 10:56

ankithadas changed the title ~~Reuse GPU tags in MLEBABecLap:: applyBC()~~ Reuse GPU tags in MLEBABecLap::applyBC() Jan 8, 2026

ankithadas marked this pull request as draft January 8, 2026 12:26

ankithadas added 2 commits January 8, 2026 23:49

bndryVals cannot be included in tags. Testing new implementation

9c15e6b

Removed bcval from tags array

40e97f0

ankithadas marked this pull request as ready for review January 8, 2026 13:24

ankithadas and others added 3 commits January 9, 2026 14:27

Removed #include "AMReX_Orientation.H"

86e4c1a

Return earlier if not run_on_gpu

e5204f3

Fix MFItInfo initialization error introduced in my last commit

41ea9be

Removed goto

8eb39bf

WeiqunZhang approved these changes Jan 14, 2026


WeiqunZhang merged commit 443bf9e into AMReX-Codes:development Jan 14, 2026
88 of 89 checks passed

ankithadas deleted the MLEBABecLap-Reuse-Tags branch January 14, 2026 04:13

ankithadas mentioned this pull request Jan 16, 2026

Reuse Gpu tags in MLCellLinOp::applyBC() #4899

Merged

5 tasks

WeiqunZhang added a commit that referenced this pull request Jan 19, 2026

Reuse Gpu tags in MLCellLinOp::applyBC() (#4899)

e585140

Same optimization as implemented in #4882.

Co-authored-by: Ankith A Das
Co-authored-by: Weiqun Zhang
