Conversation
… systems of amino acids
… systems of nucleic acids
…denosine, Phosphate Ester, Zinc Complex, and Perchlorate Anion
…ybridization by deriving Error trait
…ith TyperError enum
… molecular representation
…ange representation to u8
…or enhanced molecular representation
…molecular representation
…Atom and BondEdge
…roved error handling
…r aromatic systems
…edMolecule structures
…processing functions
…ions and their arguments
…nd their arguments
…e and its functions
…adding detailed field descriptions
…ds, adding detailed descriptions and usage notes
…hancing clarity on rule parsing and application engines
…unctions, adding detailed descriptions and usage notes
…etailed usage examples and descriptions for topology assignment functions
…properDihedral structs in the final topology
…ing clarity on roles and transformations within the dreid-typer library
…y and structure, enhancing the overview and detailing each processing pass
8 tasks
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR introduces a comprehensive refactoring of the chemical perception engine to properly separate aromaticity detection from resonance modeling. The previous implementation conflated these two concepts, causing incorrect atom typing for non-aromatic resonant systems like amides and carboxylates. The new architecture implements a six-stage perception pipeline that provides more accurate hybridization and DREIDING atom type assignments.
Key Changes:
- Restructured perception into six distinct, ordered stages: Rings → Kekulize → Electrons → Aromaticity → Resonance → Hybridization
- Integrated the
paulingcrate for robust resonance system detection - Decoupled aromaticity (Hückel rule-based) from general resonance (conjugated π-systems)
- Introduced expert system for formal charge and lone pair perception
- Updated DREIDING rules to use
hybridization = "Resonant"instead ofis_aromatic = truefor_Rtypes - Removed ZINC_COMPLEX test case
- Updated hundreds of test expectations to reflect more accurate
_Rtype assignments
Reviewed Changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tests/integration_tests.rs | Removed ZINC_COMPLEX test case |
| tests/cases/nucleic_acids.rs | Updated expected types from O_2/N_3 to O_R/N_R for resonant systems in nucleic acid bases |
| tests/cases/dreiding_paper.rs | Corrected expected types for carboxylates, nitro groups, amides, thiourea, sulfonamides, and other resonant systems; removed ZINC_COMPLEX test |
| tests/cases/amino_acids.rs | Updated carboxylate and amide oxygens/nitrogens to use _R types across all amino acid zwitterions |
| src/typing/rules.rs | New module defining rule schema, parsing, and default ruleset management |
| src/typing/mod.rs | New module organizing typing engine and rules |
| src/typing/engine.rs | New priority-based iterative typing engine implementation |
| src/perception/rings.rs | New ring detection using minimal cycle basis and Gaussian elimination |
| src/perception/resonance.rs | New resonance detection integrating pauling crate with local heuristics |
| src/perception/model.rs | New annotated molecule model with pauling trait implementations |
| src/perception/mod.rs | New six-stage perception pipeline coordinator |
| src/perception/kekulize.rs | New Kekulé solver for aromatic bond resolution |
| src/perception/hybridization.rs | New VSEPR-based hybridization assignment |
| src/perception/electrons.rs | New expert system for charge and lone pair perception |
| docs/ARCHITECTURE.md | Updated to reflect new three-stage pipeline with detailed perception stages |
| Cargo.toml | Added thiserror 2.0.17 and pauling 0.1.0 dependencies; bumped version to 0.2.1 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary:
Introduces a major refactoring of the chemical perception engine to decouple aromaticity detection from the more general concept of resonance. The previous implementation incorrectly conflated the two, leading to mistyping of non-aromatic but resonant systems (e.g., amides, carboxylates). The new, more sophisticated pipeline now consists of six ordered stages, including explicit Kekulé expansion, charge/lone-pair perception via an expert template system, and a dedicated resonance detection pass. This results in significantly more accurate hybridization and DREIDING atom type assignments, particularly for
_Rtypes, across a broader range of chemical structures.Changes:
Implemented a Six-Stage Perception Pipeline:
processormodule into a formal, six-stage pipeline:Rings->Kekulize->Electrons->Aromaticity->Resonance->Hybridization.AnnotatedMoleculedata structure, ensuring a deterministic and chemically correct flow of information.paulingcrate for robust resonance system detection.Decoupled Aromaticity from Resonance:
Hybridizationenum now has a distinctResonantvariant, which is used as the primary condition for assigning_Rtypes.Enhanced Charge and Lone Pair Perception:
formal_chargefrom the publicMolecularGraph::add_atomAPI.electronsperception pass that automatically infers formal charges and lone pairs.Refined DREIDING Rule Set:
default.rules.toml) has been overhauled to leverage the new, more precise perception data._Rtypes (e.g.,C_R,N_R,O_R) are now correctly keyed onhybridization = "Resonant"instead of the overly restrictiveis_aromatic = true.Expanded and Refined Integration Test Suite:
chargeparameters from all test case definitions to validate the new charge perception logic._Rassignments produced by the new engine.Improved CI and Project Documentation: