Skip to content

Allow RMG to load thermo or kinetic libraries from a file#2340

Merged
xiaoruiDong merged 8 commits intomainfrom
ext_lib
Nov 2, 2022
Merged

Allow RMG to load thermo or kinetic libraries from a file#2340
xiaoruiDong merged 8 commits intomainfrom
ext_lib

Conversation

@alongd
Copy link
Copy Markdown
Member

@alongd alongd commented Oct 23, 2022

Motivation or Problem

Users who execute RMG on a server where it is installed in a read-only partition cannot add new libraries for their runs.

Description of Changes

Added functionality to load thermo and kinetic libraries from a user specified location
Tests added

@codecov
Copy link
Copy Markdown

codecov bot commented Oct 23, 2022

Codecov Report

Merging #2340 (f01ae49) into main (38fdbc5) will increase coverage by 0.04%.
The diff coverage is 90.47%.

❗ Current head f01ae49 differs from pull request most recent head 6ea0a92. Consider uploading reports for the commit 6ea0a92 to get more accurate results

@@            Coverage Diff             @@
##             main    #2340      +/-   ##
==========================================
+ Coverage   47.72%   47.77%   +0.04%     
==========================================
  Files          98       98              
  Lines       27678    27688      +10     
  Branches     7317     7317              
==========================================
+ Hits        13210    13227      +17     
+ Misses      13073    13066       -7     
  Partials     1395     1395              
Impacted Files Coverage Δ
rmgpy/data/kinetics/database.py 51.44% <81.81%> (+1.07%) ⬆️
rmgpy/data/thermo.py 66.37% <100.00%> (+0.26%) ⬆️
arkane/kinetics.py 12.24% <0.00%> (ø)
rmgpy/molecule/fragment.py 73.63% <0.00%> (+0.29%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Copy Markdown
Contributor

@xiaoruiDong xiaoruiDong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alongd Thank you for adding this feature, and I agree that this feature is handy! I just had one comment for you to address. Otherwise, it looks great.

library = KineticsLibrary(label=short_library_name)
library.load(library_file, self.local_context, self.global_context)
self.libraries[library.label] = library
elif os.path.exists(library_file):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check if there is a need to update library.library_order?

At rmgpy/data/rmg.py line 165, items in the kinetics_libraries will be assigned as (library, 'Reaction Library')). Therefore, if a file path is provided, this will be (PATH, 'Reaction Library').
At rmgpy/data/kinetics/database.py line 438, the script iterates through library_order and uses the first elements (the PATH) as the label to query the library (self.libraries[label]), but the change here redefines label as short_library_name. So, it is possible to cause a KeyError.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing up these important points, I did not think of them originally.

I think we can either take an approach of converting an external library name (which is a path at first) to the actual label when loading the object, and only working with labels and not paths downstream, or we can retain the full path to an external library as its name within RMG.
The first approach is cleaner or at least has shorter names, but the second might be more robust if users make an external library that has the same name as an internal one (e.g., testing an updated existing library).

If we take the second approach, which I believe is currently implemented in this PR, then I think that the current implementation is OK as is. Here's the train of calls, as I see it, I list it in detail just to help me sort it out :)

1. We load the database, e.g., as in rmgpy/rmg/main.py line 374
2. We call load_kinetics() in rmgpy/data/rmg.py in line 99, passing the database path for the kinetics folder, and a list of library names as reaction_libraries.
3. We append (library, 'Reaction Library') tuples in rmgpy/data/rmg.py line 165
4. Using this correct library order, we then turn to load the libraries on rmgpt/data/kinetics/database.py line 230, which is the block this PR modifies.

I think that the modifications to process an external library we did here are downstream to setting the library order, hence it is OK and the order set by the user will still be preserved.

I see the concern that we carry both the database PATH as well as the specific path of an external library as the library name. The load_libraries() function modified here is in charge of loading the specific library correctly if it represents a complete external path, and in this case the db PATH is ignored.

This branch runs as expected: I copied an existing RMG library (e.g., FFCM1) into my run folder, here's part of the RMG.log file:

Loading kinetics library GRI-Mech3.0-N from /Data/Code/RMG-database/input/kinetics/libraries/GRI-Mech3.0-N/reactions.py...
Loading kinetics library FFCM1(-) from /Data/Code/runs/debug/RMG_ext_lib/FFCM1(-)...

I tried loading an external seed mechanism, and it also works as expected.

I'll add another commit of an example that shows potential users how to use this feature.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation and the addition of the example. I double check with the workflow, and it makes sense with the second approach.

xiaoruiDong
xiaoruiDong previously approved these changes Oct 31, 2022
Copy link
Copy Markdown
Contributor

@xiaoruiDong xiaoruiDong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Please rebase the branch and I will merge it!

This message was meant for a time when these were very well-known and wide use RMG thermo and kinetic libraries. They were superseded by the BurkeH2O2 libraries, and now also by PrimaryH2O2 for thermo. No need to keep these messages, users are not aware of this old library name anymore.
@xiaoruiDong xiaoruiDong merged commit 0df21b9 into main Nov 2, 2022
@xiaoruiDong xiaoruiDong deleted the ext_lib branch November 2, 2022 15:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants