chemistry reordering to increase cache performance by hsitaram · Pull Request #10 · AMReX-Combustion/PelePhysics

hsitaram · 2019-04-03T19:18:56Z

Some new functions are added in fuego to reorder the chemistry so that memory performance can be improved. Reactions are clustered so that the production rate calculation reuses the same or similar species data. The species are reordered so that reactions tend to use species that are stored contiguously in memory.

We achieve this by constructing a reaction-species matrix and using a traveling salesman reordering of the rows and columns.
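As a rough illustration of the idea (not the code added to fuego in this PR), the sketch below builds a 0/1 reaction-species matrix and orders its rows and then its columns with a greedy nearest-neighbour heuristic for the traveling salesman path. The function names, the Hamming distance between rows, and the toy four-reaction mechanism are all assumptions made for the example.

```python
def build_matrix(reactions, species):
    """Return a 0/1 matrix: one row per reaction, one column per species."""
    col = {name: j for j, name in enumerate(species)}
    matrix = [[0] * len(species) for _ in reactions]
    for i, participants in enumerate(reactions):
        for name in participants:
            matrix[i][col[name]] = 1
    return matrix


def hamming(a, b):
    """Number of positions where two 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_tour(vectors):
    """Nearest-neighbour heuristic for an open traveling salesman path.

    Starts at vector 0 and repeatedly visits the closest unvisited vector.
    This is a cheap approximation, not an optimal tour.
    """
    if not vectors:
        return []
    order = [0]
    remaining = set(range(1, len(vectors)))
    while remaining:
        last = vectors[order[-1]]
        nxt = min(remaining, key=lambda k: (hamming(last, vectors[k]), k))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(reactions, species):
    """Return (row_order, col_order) as permutations of the original indices."""
    matrix = build_matrix(reactions, species)
    row_order = greedy_tour(matrix)
    rows = [matrix[i] for i in row_order]
    columns = [list(c) for c in zip(*rows)]  # transpose the reordered rows
    col_order = greedy_tour(columns)
    return row_order, col_order


# Toy mechanism (hypothetical): each reaction is the set of species it touches,
# third bodies included.
species = ["H2", "O2", "H", "O", "OH", "H2O"]
reactions = [
    {"H", "O2", "O", "OH"},
    {"H2", "OH", "H2O", "H"},
    {"O", "H2", "H", "OH"},
    {"OH", "H2O", "O"},
]
rows, cols = reorder(reactions, species)
```

Both results are permutations of the original indices, so the reordering can always be undone by applying the inverse permutation.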

Right now, this reordering is turned off by default in the cpickler (there is a flag that can be used to turn it on), and the fuego-generated files in Mechanism/models have not been changed.

Also adding a picture to explain this; see the PDF below:
rmat_all.pdf

This shows a matrix representation of the LuDME chemistry (39 species, 169 reactions). Each row is a reaction and the non-zero (colored) entries are the corresponding species. I have also counted every third body involved as a participating species in a reaction.

Fig 1 from left: the matrix straight out of chemkin
Fig 2 from left: after fuego sorts by reaction type (Troe, SRI, etc.); this is the DEFAULT
Fig 3 from left: after the rows are reordered
Fig 4 from left: after the columns are reordered
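One way to put numbers on what the four panels show is sketched below. The two metrics (species shared between consecutive reactions, and the column span each reaction touches) are assumptions chosen for illustration; the PR itself reports only the figure and VTune measurements.

```python
def row_span(row):
    """Width of the column range touched by one reaction (0 if the row is empty)."""
    idx = [j for j, v in enumerate(row) if v]
    return idx[-1] - idx[0] + 1 if idx else 0


def shared_with_previous(matrix):
    """Total number of species that each reaction shares with the one before it."""
    return sum(
        sum(1 for a, b in zip(prev, cur) if a and b)
        for prev, cur in zip(matrix, matrix[1:])
    )


def apply_order(matrix, row_order, col_order):
    """Permute the rows and columns of a reaction-species matrix."""
    return [[matrix[i][j] for j in col_order] for i in row_order]


# Toy 4x4 matrix (hypothetical): reactions alternate between two species groups.
original = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
reordered = apply_order(original, row_order=[0, 2, 1, 3], col_order=[0, 3, 1, 2])

reuse_before = shared_with_previous(original)
reuse_after = shared_with_previous(reordered)
span_before = sum(row_span(r) for r in original)
span_after = sum(row_span(r) for r in reordered)
```

Higher reuse between consecutive rows and a smaller total span both suggest that a production rate loop would touch a tighter region of the species array.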

drummerdoc · 2019-04-03T19:25:09Z

Any idea how this interacts with the vectorization? Are there conditions where this is good and conditions where it is a bad thing to do?

hsitaram · 2019-04-03T19:41:08Z

I saw fewer L2 cache misses when I ran the reactEval test case with the grimech mechanism; I used Intel VTune to measure it. But the performance improvement I got was marginal. I am hoping this will have implications on GPUs or with larger chemistries.

So far, in all the cases I have run, it has not decreased performance, so it's not a bad thing to do. There is also, of course, a limit on how much one can reorder the chemistry.

AnneFelden · 2019-04-03T19:41:27Z

I might have to take a look at what the routine looks like after this, but I believe it won't affect the vectorization. However, it will affect all the pmfs, and it just stresses once again that we should implement automatic checking of the species order.
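A minimal sketch of what such an automatic species-order check could look like is given below. The function name and the idea of comparing two plain lists of species names are assumptions; how the names would be read from a mechanism or from a data file is not covered here.

```python
def species_order_mismatches(expected, found):
    """Return (position, expected_name, found_name) for every disagreement.

    A missing entry on either side is reported as None, so lists of
    different lengths are handled as well.
    """
    mismatches = []
    for k in range(max(len(expected), len(found))):
        e = expected[k] if k < len(expected) else None
        f = found[k] if k < len(found) else None
        if e != f:
            mismatches.append((k, e, f))
    return mismatches


# Hypothetical example: the data file was written with H and O swapped.
mechanism_order = ["H2", "O2", "H", "O", "OH"]
data_file_order = ["H2", "O2", "O", "H", "OH"]
problems = species_order_mismatches(mechanism_order, data_file_order)
```

An empty result means the two orderings agree; anything else pinpoints where a reordered mechanism and an older data file have drifted apart.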

hsitaram · 2019-04-03T19:42:36Z

Anne - I have not changed the .cpp files that you had put in. So the test cases should pass.

drummerdoc · 2019-04-03T20:39:32Z

Since this defaults to False, seems reasonable to merge in. I'm a little worried about how this might impact future work (in that I may need to understand more of this than I do to make sure I don't break it later). But a good thing to add soon is a series of tests to ensure that the reordering doesn't change the answers. Then if we do something to FUEGO later, we can run those tests to see if the mods need to account for your stuff. Apparently Jon is doing something that will make this "easy", but a test could be based on the ReactEval in the Testing folder of this repo.

hsitaram · 2019-04-03T20:43:01Z

Thank you, Marc. I should also mention that I tested the reordered chemistry and it gives exactly the same results.

drummerdoc · 2019-04-03T20:44:20Z

You probably mentioned that. I'm thinking of things that someone else might do in the future, though. Having an easy way to test would be a good thing.

hsitaram · 2019-04-03T20:45:42Z

Yes, I agree, although "diff"ing the gold regression files will be tricky if the chemistry is reordered.
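One possible way around the diffing problem is to compare results by species name instead of by position. The sketch below assumes each result set is available as a list of species names plus a matching list of values; the function name and the tolerance defaults are illustrative, not part of the existing regression tooling.

```python
import math


def compare_by_name(ref_names, ref_values, new_names, new_values,
                    rel_tol=1e-12, abs_tol=0.0):
    """Return the species whose values differ, ignoring species order.

    Raises ValueError if the two results do not cover the same species.
    """
    if sorted(ref_names) != sorted(new_names):
        raise ValueError("species sets differ")
    new = dict(zip(new_names, new_values))
    return [
        name
        for name, value in zip(ref_names, ref_values)
        if not math.isclose(value, new[name], rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Hypothetical production rates: same values, different species order.
gold_names = ["H2", "O2", "H2O"]
gold_rates = [1.0, -0.5, 2.0]
reordered_names = ["H2O", "H2", "O2"]
reordered_rates = [2.0, 1.0, -0.5]
differing = compare_by_name(gold_names, gold_rates, reordered_names, reordered_rates)
```

With this kind of comparison the gold files could stay in their original order while the reordered build is checked against them.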

Hariswaran Sitaraman and others added 3 commits March 25, 2019 17:33

6237c88 adding functions to reorder reactions and species for cache performance

8e6da2b added agglomerative clustering as an option to reorder reactions

c9ee2b6 turning reordering to false

hsitaram requested review from EnnaDelfen, drummerdoc and rgrout April 3, 2019 19:18

drummerdoc merged commit fb65aa8 into development Apr 3, 2019

hsitaram deleted the hari/chemreorder branch April 3, 2019 20:47
