chemistry reordering to increase cache performance #10

Merged
drummerdoc merged 3 commits into development from hari/chemreorder on Apr 3, 2019

Conversation

@hsitaram
Contributor

@hsitaram hsitaram commented Apr 3, 2019

Some new functions are added in fuego to reorder the chemistry so that memory performance can be improved. Reactions are clustered so that consecutive production rate calculations reuse the same or similar species data. The species are reordered so that each reaction tends to use species stored contiguously in memory.

We achieve this by constructing a reaction-species matrix and applying a traveling salesman reordering to its rows and columns.

Right now, this reordering is turned off by default in the cpickler (there is a flag that can be used to turn it on), and the fuego-generated files in Mechanism/models have not been changed.
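To illustrate the idea, here is a minimal sketch in Python. It is not the code added to fuego: the function names (`incidence_matrix`, `greedy_tour`, `reorder`) are hypothetical, and a greedy nearest-neighbour pass with Hamming distance stands in for whatever traveling salesman heuristic the PR actually uses. Rows (reactions) that share species end up adjacent, and then columns (species) that appear in the same reactions end up adjacent.

```python
import numpy as np


def incidence_matrix(reactions, n_species):
    """Rows are reactions, columns are species; 1 marks participation."""
    mat = np.zeros((len(reactions), n_species), dtype=int)
    for i, species in enumerate(reactions):
        mat[i, list(species)] = 1
    return mat


def greedy_tour(vectors):
    """Nearest-neighbour tour (an assumed stand-in for a TSP solver).

    Starts at vector 0 and repeatedly visits the unvisited vector with the
    smallest Hamming distance to the last one; ties go to the lower index.
    """
    order = [0]
    remaining = set(range(1, len(vectors)))
    while remaining:
        last = vectors[order[-1]]
        nxt = min(remaining, key=lambda j: (int(np.sum(vectors[j] != last)), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(mat):
    """Reorder rows first, then columns of the row-reordered matrix."""
    row_order = greedy_tour(mat)
    col_order = greedy_tour(mat[row_order].T)
    return row_order, col_order, mat[np.ix_(row_order, col_order)]


# Toy mechanism: 4 reactions over 6 species, given as sets of species indices.
reactions = [{0, 1}, {3, 4}, {1, 2}, {4, 5}]
row_order, col_order, reordered = reorder(incidence_matrix(reactions, 6))
```

In this toy case the reactions that touch species 0 to 2 are grouped ahead of those that touch species 3 to 5, so consecutive rows of `reordered` overlap in the columns they use.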

I am also attaching a picture to explain this; see the PDF below:
rmat_all.pdf

This shows a matrix representation of the LuDME chemistry (39 species, 169 reactions). Each row is a reaction and the non-zero (colored) entries are the corresponding species. I have also counted every third body involved in a reaction as a participating species.

Fig 1 from left: the matrix straight out of chemkin
Fig 2 from left: after fuego sorts by reaction type (troe, sri, etc.); this is the DEFAULT
Fig 3 from left: after the rows are reordered
Fig 4 from left: after the columns are reordered

@drummerdoc
Contributor

Any idea how this interacts with the vectorization? Are there conditions where this is good and conditions where it is a bad thing to do?

@hsitaram
Contributor Author

hsitaram commented Apr 3, 2019

I have seen reduced L2 cache misses when I ran the reactEval test case with the grimech; I used Intel VTune to measure it. However, the performance improvement I got was marginal. I am hoping this will have a bigger effect on GPUs or with larger chemistries.

So far, in all the cases I have run, it has not decreased performance, so it's not a bad thing to do. There is also, of course, a limit on how much one can reorder the chemistry.

@AnneFelden
Contributor

I might have to take a look at what the routine looks like after this, but I believe it won't affect the vectorization. However, it will affect all the pmf's, and it stresses once again that we should implement an automatic check of the species order.

@hsitaram
Contributor Author

hsitaram commented Apr 3, 2019

Anne - I have not changed the .cpp files that you had put in. So the test cases should pass.

@drummerdoc
Contributor

Since this defaults to False, seems reasonable to merge in. I'm a little worried about how this might impact future work (in that I may need to understand more of this than I do to make sure I don't break it later). But a good thing to add soon is a series of tests to ensure that the reordering doesn't change the answers. Then if we do something to FUEGO later, we can run those tests to see if the mods need to account for your stuff. Apparently Jon is doing something that will make this "easy", but a test could be based on the ReactEval in the Testing folder of this repo.
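A minimal sketch of what such a check could look like, assuming the production rates and species names from the original and reordered mechanisms are available as plain arrays and lists. The function name `rates_match`, its arguments, and the tolerance are hypothetical, and this is not the ReactEval test itself. The rates are compared by species name rather than by position, so a change in species order alone does not cause a failure.

```python
import numpy as np


def rates_match(wdot_ref, names_ref, wdot_new, names_new, rtol=1e-12):
    """Compare per-species production rates from two species orderings.

    Species are matched by name, so the check is insensitive to ordering.
    The relative tolerance is an assumed value that allows for round-off
    when the reactions are summed in a different order.
    """
    if sorted(names_ref) != sorted(names_new):
        return False  # the two mechanisms do not contain the same species
    index_new = {name: i for i, name in enumerate(names_new)}
    # perm[k] is the position in the new ordering of reference species k
    perm = [index_new[name] for name in names_ref]
    reordered = np.asarray(wdot_new, dtype=float)[perm]
    return bool(np.allclose(reordered, wdot_ref, rtol=rtol, atol=0.0))


# Toy data: the same rates, stored in two different species orders.
ok = rates_match([1.0, -2.0, 0.5], ["H2", "O2", "H2O"],
                 [0.5, 1.0, -2.0], ["H2O", "H2", "O2"])
```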

@hsitaram
Contributor Author

hsitaram commented Apr 3, 2019

Thank you, Marc. I should also mention that I have tested the reordered chemistry and it gives exactly the same results.

@drummerdoc
Contributor

You probably mentioned that. I'm thinking of changes that someone else might make in the future, though. Having an easy way to test would be a good thing.

@drummerdoc drummerdoc merged commit fb65aa8 into development Apr 3, 2019
@hsitaram
Contributor Author

hsitaram commented Apr 3, 2019

Yes, I agree. However, "diff"ing the gold regression files will be tricky if the chemistry is reordered.

@hsitaram hsitaram deleted the hari/chemreorder branch April 3, 2019 20:47