Enable Restarting RMG Jobs from a Seed Mechanism#1641
Conversation
51cdc99 to
bf43881
Compare
Codecov Report
@@ Coverage Diff @@
## master #1641 +/- ##
==========================================
- Coverage 41.79% 41.66% -0.14%
==========================================
Files 177 176 -1
Lines 30207 30215 +8
Branches 6252 6256 +4
==========================================
- Hits 12625 12588 -37
- Misses 16659 16703 +44
- Partials 923 924 +1
Continue to review full report at Codecov.
|
47aee7b to
4f73f35
Compare
4f73f35 to
f25abe5
Compare
63a0590 to
47cf386
Compare
amarkpayne
left a comment
There was a problem hiding this comment.
Thanks for the helpful comments @mliu49 ! There are a couple of things I need to change depending on what we want to do with the old style restart
d980302 to
27fd30b
Compare
mjohnson541
left a comment
There was a problem hiding this comment.
In general I think this looks like it avoids a lot of the bugs that could easily occur with this feature. I've brought up a few issues I noticed below.
27fd30b to
89290b0
Compare
mliu49
left a comment
There was a problem hiding this comment.
Are the changes to seed paths done as well?
mjohnson541
left a comment
There was a problem hiding this comment.
Couple small additional things.
89290b0 to
325dc8b
Compare
|
I have force pushed the requested changes and confirmed that the (updated, the species map changed from a dictionary to a list) minimal restart example works. |
rmgpy/rmg/main.py
Outdated
| reordered_core_species = [] | ||
| for spc in restartSpeciesList: | ||
| for j, oldCoreSpc in enumerate(self.reactionModel.core.species): | ||
| if oldCoreSpc.isIsomorphic(spc): |
There was a problem hiding this comment.
I think we should use the strict=False argument since we don't generate resonance structures for the restart species.
rmgpy/rmg/main.py
Outdated
|
|
||
| try: | ||
| restart = kwargs['restart'] | ||
| self.restart_seed_path = kwargs['restart'] |
There was a problem hiding this comment.
I think this doesn't do anything. It should set the coreSeedPath, edgeSeedPath, filtersPath, and speciesMapPath like you do when loading the input file.
Alternatively, if you don't think we should support both methods, you can remove this and set the -r flag to simply print a warning that the old restart method has been removed and the flag does not do anything.
There was a problem hiding this comment.
Fixed, thanks for the catch!
325dc8b to
0075b11
Compare
|
@amarkpayne was correct, and the problem was in fact the extra indentation, which was messing up the family name parsing and causing the reaction to be loaded as a library reaction with I added a few commits to resolve this and some related issues:
|
We now perform this check when/if these edge species are moved to the core
Has been deprecated for a while, and is now completely replaced by the restart from seed mechanism feature.
We will still use the job name when outputting to the database, but the name should be fixed in the output directory
More robust methods to determine template and family name Make sure to strip family name
Otherwise, indents are retained over each read/write operation
3a00ee3 to
9bd9589
Compare
|
I have also confirmed that with @mliu49 fixes enable unlimited successive restarts. I have also addressed the last remaining comment that I need to fixup, and I have squashed all of the fixups and force pushed. This should be ready for final approval/merging |
mliu49
left a comment
There was a problem hiding this comment.
I think this is good to go now. @mjohnson541, do you have any comments on the fixes that I added?
|
Took a look now, looks fine to me! |
Motivation or Problem
There are many instances where it is desirable to restart an RMG job from the last iteration instead of having to start the mechanism generation from the beginning. RMG currently has a restart file, but it has not been maintained and will be deprecated starting in the next release. Furthermore, these restart files are extremely slow.
The agreed upon solution to this problem is to restart RMG jobs from the seed mechanism files that RMG already outputs. These files allow for the complete reconstruction of the model core and edge at the time of the restart. The only missing piece of information missing is the filter tensors, which help RMG decide which species to react, which species have already reacted, and which species not to react because their expected rate is well below what would be included based on the tolerance. This PR saves these filter tensors and allows for them to be loaded so that RMG jobs can be restarted using this technique
Description of Changes
Filterssub folder of the seed mechanismrestart_from_seed.pythat can be run to restart the jobTesting
I have restarted the superminimal and minimal examples using this technique, and the same mechanism was obtained if the job was run to completion or if the job was restarted from a seed mechanism from an intermediate iteration seed mechanism.
I have not added anything in the way of unit testing yet, but I am happy to add tests for the crucial features of this PR.