Controls for plotting residuals when MLMG fails #598

Merged
baperry2 merged 15 commits into AMReX-Combustion:development from
d-montgomery:mlmg_fail_dump_residuals
Dec 12, 2025

@d-montgomery
Contributor

When MLMG linear solvers fail to converge, it can be difficult to diagnose why. This PR helps users visualize the state and MLMG residuals after a given solve fails by providing:

  • Automatic residual and state dumping to plot files named pltMLMGResidual_<solver>_<step>_<iters> when MLMG solvers fail. This is accomplished by wrapping MLMG calls in try-catch blocks when peleLM.mlmg_fail_plt_residuals = true.
  • Per-solver iteration controls so users can configure when to dump residuals and limit MLMG iterations based on the number of SDC and, when appropriate, deltaT iterations.
  • Enhanced solver identification when verbosity is enabled for a given solver. The output now shows which specific solver is running (species/temperature/velocity diffusion, MAC/nodal projection).
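
As a rough illustration of the try-catch wrapping described in the first bullet, here is a minimal, self-contained sketch. It is not the PeleLMeX implementation: `residualPlotName`, `solveWithResidualDump`, the `solve` and `dump` callbacks, and the use of `std::runtime_error` as the failure type are all hypothetical stand-ins, and the plain, unpadded formatting of `<step>` and `<iters>` is an assumption.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds a name following the
// pltMLMGResidual_<solver>_<step>_<iters> pattern from the PR description.
// Assumption: step and iters are written as plain, unpadded integers.
std::string residualPlotName(const std::string& solver, int step, int iters)
{
    return "pltMLMGResidual_" + solver + "_" + std::to_string(step) + "_" +
           std::to_string(iters);
}

// Hypothetical wrapper: runs `solve`; if it throws while `dump_on_fail` is
// set, calls `dump` with the plot file name and then rethrows.
void solveWithResidualDump(const std::function<void()>& solve,
                           const std::function<void(const std::string&)>& dump,
                           bool dump_on_fail,
                           const std::string& solver, int step, int iters)
{
    if (!dump_on_fail) {
        solve(); // no wrapping: a failure propagates unchanged
        return;
    }
    try {
        solve();
    } catch (const std::runtime_error&) {
        dump(residualPlotName(solver, step, iters));
        throw; // the caller still sees the original failure
    }
}
```

Rethrowing after the dump keeps the failure visible to the caller, so the run still stops on a failed solve instead of continuing with an unconverged solution.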

This PR also contains a minor unrelated LaTeX math formatting correction in the Smagorinsky model documentation.

Input parameters for dumping residuals to plot files:

peleLM.mlmg_fail_plt_residuals = true                         # [OPT, DEF=false] Dump MLMG residuals plotfiles on MLMG failure

# MAC projection controls
mac_proj.mlmg_fail_sdc_miniter = 2                             # [OPT, DEF=-1] Minimum SDC iterations before dumping residuals on MLMG failure
mac_proj.mlmg_fail_maxiter_after_sdc_miniter = 3               # [OPT, DEF=-1] Maximum MLMG solver iters after minimum SDC iters have been reached 

# Species/Temperature diffusion controls
diffusion.mlmg_fail_sdc_miniter = 1                            # [OPT, DEF=-1] Minimum SDC iterations before dumping residuals on MLMG failure
diffusion.mlmg_fail_deltaT_miniter = 4                         # [OPT, DEF=-1] Minimum deltaT iterations before dumping residuals on MLMG failure (only for temperature diffusion)
diffusion.mlmg_fail_species_maxiter_after_sdc_miniter = 5      # [OPT, DEF=-1] Maximum species MLMG solver iters after minimum SDC iters have been reached 
diffusion.mlmg_fail_temp_maxiter_after_sdc_deltaT_miniter = 3  # [OPT, DEF=-1] Maximum temp MLMG solver iters after minimum SDC and deltaT iters have been reached 

# Velocity diffusion controls
tensor_diffusion.mlmg_fail_sdc_miniter = 1                     # [OPT, DEF=-1] Minimum SDC iterations before dumping residuals on MLMG failure
tensor_diffusion.mlmg_maxiter_after_sdc_miniter = 6            # [OPT, DEF=-1] Maximum MLMG solver iters after minimum SDC iters have been reached

Note that there are no solver-specific parameters for the Nodal projection solve because it is the last solve of a given time step and the maximum number of iterations can be controlled through nodal_proj.maxiter.

As an example, consider setting diffusion.mlmg_fail_sdc_miniter = 2, diffusion.mlmg_fail_deltaT_miniter = 1, and diffusion.mlmg_fail_temp_maxiter_after_sdc_deltaT_miniter = 3. Once at least 2 SDC iterations and 1 deltaT iteration have been reached, the temperature diffusion MLMG solve is limited to 3 iterations. If it fails to converge within that limit, the temperature diffusion residuals are dumped for debugging, as shown below:

SDC iter [2] 
   - oneSDC()::Update t^{n+1,k}  --> Time: 1.160221
      Before SDC 2: max relative P mismatch is 0.005735648384
MLMG: MAC Projection
   - oneSDC()::MACProjection()   --> Time: 0.755772
   - oneSDC()::ScalarAdvection() --> Time: 0.994092
MLMG: Species Diffusion
MLMG: Initial rhs               = 0.2607270861
MLMG: Initial residual (resid0) = 0.002231022801
MLMG: Final Iter. 3 resid, resid/bnorm = 2.450817327e-13, 9.399933716e-13
MLMG: Timers: Solve = 0.992347541 Iter = 0.897934042 Bottom = 0.000242834
 Iterative solve for deltaT 
MLMG: DeltaT solve [0]
MLMG: Initial rhs               = 5106.275976
MLMG: Initial residual (resid0) = 5106.275976
MLMG: Final Iter. 4 resid, resid/bnorm = 2.874003258e-10, 5.62837432e-14
MLMG: Timers: Solve = 0.171877708 Iter = 0.159280209 Bottom = 9.5459e-05
   DeltaT solve norm [0] = 10.9453211
MLMG: DeltaT solve [1]
      Limiting temperature diffusion MLMG max_iter to 3 (SDC iter [2] >= 2, deltaT solve [1] >= 1)
MLMG: Initial rhs               = 5101.242622
MLMG: Initial residual (resid0) = 5101.242622
MLMG: Failed to converge after 3 iterations. resid, resid/bnorm = 5.650636012e-07, 1.107697953e-10

  *** Temperature diffusion MLMG solve failed (non-EB)! ***
  Error: MLMG failed to converge.
  Dumping residuals for debugging...

amrex::Abort::0::MLMG solve for scalar diffusion failed !!!
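
The gating seen in the log above (the "Limiting temperature diffusion MLMG max_iter to 3" line) can be sketched as follows. This is an illustration only: the `TempFailControls` struct, the `tempMaxIter` function, and the `default_max_iter` argument are hypothetical, and treating a negative cap (the -1 default) as "no limit" is an assumption rather than documented behavior.

```cpp
// Hypothetical container for the temperature-diffusion controls. The comments
// name the input parameters each field mirrors; the struct is illustrative.
struct TempFailControls {
    int sdc_miniter = -1;    // diffusion.mlmg_fail_sdc_miniter
    int deltaT_miniter = -1; // diffusion.mlmg_fail_deltaT_miniter
    int maxiter_after = -1;  // diffusion.mlmg_fail_temp_maxiter_after_sdc_deltaT_miniter
};

// Returns the MLMG max_iter to use for a temperature solve: the cap once both
// minimums have been reached, otherwise the solver's usual limit.
// Assumption: a non-positive cap leaves the limit unchanged.
int tempMaxIter(const TempFailControls& c, int sdc_iter, int deltaT_iter,
                int default_max_iter)
{
    const bool reached =
        sdc_iter >= c.sdc_miniter && deltaT_iter >= c.deltaT_miniter;
    return (c.maxiter_after > 0 && reached) ? c.maxiter_after
                                            : default_max_iter;
}
```

With the settings from the example (2, 1, 3), deltaT solve [0] in SDC iter [2] keeps the usual limit, while deltaT solve [1] is capped at 3 iterations, which matches the point in the log where the solve fails and the residuals are dumped.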

Collaborator

@baperry2 baperry2 left a comment

Just a couple minor things to clean up.

Moved the plotting function to PeleLMeX_Plot.cpp and made the function a member of PeleLM like the other plotting functions. The function was renamed for consistency. Also, made use of GetVecOfPtrs and GetVecOfConstPtrs in the MAC projection.
@d-montgomery d-montgomery force-pushed the mlmg_fail_dump_residuals branch from 004b4cf to 542bfc4 on December 12, 2025 04:50
@baperry2 baperry2 merged commit f461c96 into AMReX-Combustion:development Dec 12, 2025
25 checks passed
@d-montgomery d-montgomery deleted the mlmg_fail_dump_residuals branch December 17, 2025 20:24

3 participants