Add yaml writer test to compare yaml file generated by RMG and the ya…#2770
Add yaml writer test to compare yaml file generated by RMG and the ya…#2770LekiaAnonim wants to merge 6 commits intoReactionMechanismGenerator:yaml_writer_additionfrom
Conversation
…lder. So they don't over-write the ones that are being written directly.
This allows the Cantera yaml writer to live alongside it
Nora did most of the work. Richard did some cleanup. Nick fixed species_to_dict function to use correct parameter as rxn species. Co-authored-by: Nora Khalil <[email protected]> Co-authored-by: Richard West <[email protected]> Co-authored-by: Nicholas Tietje <[email protected]>
Although Atom is a subclass of Vertex, it adds an element attribute. This leads to more efficient Cython code (and quietens a warning in vscode).
Basing it on the Chemkin version. For now only evaluate it once, and include everything.
…ml file converted from chemkin files
There was a problem hiding this comment.
Pull Request Overview
This PR adds a suite of tests and supporting utilities to compare YAML files generated by two different methods (direct Cantera generation versus Chemkin conversion) to ensure consistency in reaction and species data.
- Introduces new tests under test/rmgpy/yaml_writer/test_yaml.py for validating species counts, species names, phase-specific species counts, and reaction details.
- Implements helper classes (YamlAnalyst and CompareYaml) in test/rmgpy/yaml_writer/compare_yaml_outputs.py to load, analyze, and compare the YAML files.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/rmgpy/yaml_writer/test_yaml.py | Adds test cases for comparing YAML outputs. |
| test/rmgpy/yaml_writer/compare_yaml_outputs.py | Introduces classes to load, parse, and compare YAML file content. |
| for key, values in self.load_yaml_file().items(): | ||
| if key in [f"{phase['name']}-reactions" for phase in self.load_yaml_file()['phases']]: | ||
| reactions_dict[key] = self.load_yaml_file()[key] | ||
| elif key == 'gas_reactions': | ||
| reactions_dict['gas_reactions'] = self.load_yaml_file()['gas_reactions'] | ||
| elif key == 'surface_reactions': | ||
| reactions_dict['surface_reactions'] = self.load_yaml_file()['surface_reactions'] |
There was a problem hiding this comment.
Repeatedly calling self.load_yaml_file() within the loop may lead to unnecessary file reads. Consider loading the YAML content once and caching it to improve performance.
| for key, values in self.load_yaml_file().items(): | |
| if key in [f"{phase['name']}-reactions" for phase in self.load_yaml_file()['phases']]: | |
| reactions_dict[key] = self.load_yaml_file()[key] | |
| elif key == 'gas_reactions': | |
| reactions_dict['gas_reactions'] = self.load_yaml_file()['gas_reactions'] | |
| elif key == 'surface_reactions': | |
| reactions_dict['surface_reactions'] = self.load_yaml_file()['surface_reactions'] | |
| yaml_content = self.load_yaml_file() # Cache the YAML content | |
| for key, values in yaml_content.items(): | |
| if key in [f"{phase['name']}-reactions" for phase in yaml_content['phases']]: | |
| reactions_dict[key] = yaml_content[key] | |
| elif key == 'gas_reactions': | |
| reactions_dict['gas_reactions'] = yaml_content['gas_reactions'] | |
| elif key == 'surface_reactions': | |
| reactions_dict['surface_reactions'] = yaml_content['surface_reactions'] |
| phase_names1 = [phase['name'] for phase in self.yaml1.load_yaml_file()['phases']] | ||
| phase_names2 = [phase['name'] for phase in self.yaml2.load_yaml_file()['phases']] | ||
| all_phase_names = set(phase_names1).union(set(phase_names2)) | ||
| count_diff = {'gas': count_per_phase1[f"specie_count_{phase_names1[0]}"] - count_per_phase2[f"specie_count_{phase_names2[0]}"], | ||
| 'surface': count_per_phase1[f"specie_count_{phase_names1[1]}"] - count_per_phase2[f"specie_count_{phase_names2[1]}"] | ||
| } |
There was a problem hiding this comment.
Assuming that the gas and surface phases always appear as the first two elements in the phases list can be brittle. Consider matching phases by name to ensure robustness if the order changes.
| phase_names1 = [phase['name'] for phase in self.yaml1.load_yaml_file()['phases']] | |
| phase_names2 = [phase['name'] for phase in self.yaml2.load_yaml_file()['phases']] | |
| all_phase_names = set(phase_names1).union(set(phase_names2)) | |
| count_diff = {'gas': count_per_phase1[f"specie_count_{phase_names1[0]}"] - count_per_phase2[f"specie_count_{phase_names2[0]}"], | |
| 'surface': count_per_phase1[f"specie_count_{phase_names1[1]}"] - count_per_phase2[f"specie_count_{phase_names2[1]}"] | |
| } | |
| phase_names1 = {phase['name']: idx for idx, phase in enumerate(self.yaml1.load_yaml_file()['phases'])} | |
| phase_names2 = {phase['name']: idx for idx, phase in enumerate(self.yaml2.load_yaml_file()['phases'])} | |
| count_diff = {} | |
| for phase_name in ['gas', 'surface']: | |
| if phase_name in phase_names1 and phase_name in phase_names2: | |
| count_diff[phase_name] = count_per_phase1[f"specie_count_{phase_name}"] - count_per_phase2[f"specie_count_{phase_name}"] | |
| else: | |
| count_diff[phase_name] = None # Handle missing phases gracefully |
133a100 to
5e45b13
Compare
Motivation or Problem
Disparities were observed in the Ea, b, and A values for gas and surface reactions when comparing Cantera YAML files generated directly versus those converted from Chemkin files using the
cantera.ck2yamlcommand. A script and corresponding tests have been written to facilitate this comparison.Description of Changes
yaml_writer_datadirectory./test/rmgpy/test_data/yaml_writer_data/cantera/./test/rmgpy/test_data/yaml_writer_data/chemkin/.Testing
A test script was written to compare species names, the number of species, reaction parameters, and the number of reactions in the YAML files generated by both methods.
Reviewer Tips
To review this PR, run the tests in
/RMG-Py/test/rmgpy/yaml_writer/test_yaml.py. Note: You may need to adjust the file paths intest_yaml.pyto match your local setup.