refactor(perception): Overhaul Chemical Perception for Full Resonance Modeling#12

Merged
TKanX merged 149 commits into main from
bugfix/11-core-typing-logic-incorrectly-equates-_r-type-with-aromaticity-failing-to-type-non-aromatic-resonance-systems
Nov 17, 2025

@TKanX TKanX commented Nov 17, 2025

Summary:

Introduces a major refactoring of the chemical perception engine to decouple aromaticity detection from the more general concept of resonance. The previous implementation incorrectly conflated the two, leading to mistyping of non-aromatic but resonant systems (e.g., amides, carboxylates). The new, more sophisticated pipeline now consists of six ordered stages, including explicit Kekulé expansion, charge/lone-pair perception via an expert template system, and a dedicated resonance detection pass. This results in significantly more accurate hybridization and DREIDING atom type assignments, particularly for _R types, across a broader range of chemical structures.
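The distinction described above can be sketched in Rust as two predicates over a perceived atom. This is only an illustration, not the crate's actual code: the `PerceivedAtom` struct, its fields, and all three function names are assumptions made for the example. Only the `Resonant` variant and the `is_aromatic` flag are named in this PR.

```rust
/// Simplified stand-in for the crate's hybridization enum. Only `Resonant`
/// is named in this PR; the other variants are assumed for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hybridization {
    Sp,
    Sp2,
    Sp3,
    Resonant,
}

/// Hypothetical per-atom perception result.
#[derive(Debug, Clone, Copy)]
pub struct PerceivedAtom {
    pub element: &'static str,
    pub is_aromatic: bool,
    pub hybridization: Hybridization,
}

/// Old condition: an atom gets an `_R` type only if it is aromatic.
pub fn is_r_type_old(atom: &PerceivedAtom) -> bool {
    atom.is_aromatic
}

/// New condition: any atom perceived as resonant gets an `_R` type.
pub fn is_r_type_new(atom: &PerceivedAtom) -> bool {
    atom.hybridization == Hybridization::Resonant
}

/// Hypothetical helper that builds a DREIDING-style label such as `N_R`.
pub fn label(atom: &PerceivedAtom) -> String {
    let suffix = match atom.hybridization {
        Hybridization::Resonant => "R",
        Hybridization::Sp => "1",
        Hybridization::Sp2 => "2",
        Hybridization::Sp3 => "3",
    };
    format!("{}_{}", atom.element, suffix)
}
```

For an amide nitrogen (`is_aromatic: false`, `hybridization: Hybridization::Resonant`), `is_r_type_old` returns `false` while `is_r_type_new` returns `true`, and `label` yields `N_R`.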

Changes:

  • Implemented a Six-Stage Perception Pipeline:

    • Re-architected the processor module into a formal, six-stage pipeline: Rings -> Kekulize -> Electrons -> Aromaticity -> Resonance -> Hybridization.
    • Each stage is now a distinct, well-defined pass that enriches an AnnotatedMolecule data structure, ensuring a deterministic and chemically correct flow of information.
    • Introduced the pauling crate for robust resonance system detection.
  • Decoupled Aromaticity from Resonance:

    • Aromaticity perception is now a dedicated pass that only flags atoms in rings satisfying Hückel's rule.
    • A new, separate resonance perception pass identifies all atoms in any conjugated π-system, whether cyclic or linear.
    • The Hybridization enum now has a distinct Resonant variant, which is used as the primary condition for assigning _R types.
  • Enhanced Charge and Lone Pair Perception:

    • Removed the formal_charge parameter from the public MolecularGraph::add_atom API.
    • Implemented a new electrons perception pass that automatically infers formal charges and lone pairs.
    • The pass uses an expert system of functional group templates to correctly handle complex cases like nitro groups, sulfoxides, carboxylates, and zwitterionic amino acids.
  • Refined DREIDING Rule Set:

    • The default ruleset (default.rules.toml) has been overhauled to leverage the new, more precise perception data.
    • Rules for _R types (e.g., C_R, N_R, O_R) are now correctly keyed on hybridization = "Resonant" instead of the overly restrictive is_aromatic = true.
    • This corrects the typing for numerous functional groups (amides, carboxylates, guanidinium, etc.) that are resonant but not aromatic.
  • Expanded and Refined Integration Test Suite:

    • Removed explicit charge parameters from all test case definitions to validate the new charge perception logic.
    • Corrected the expected types for dozens of atoms in the test suite to match the more accurate _R assignments produced by the new engine.
    • Added new, complex test cases for nucleic acids and challenging structures from the original DREIDING paper to validate the enhanced perception capabilities.
  • Improved CI and Project Documentation:

    • Strengthened the CI pipeline with additional checks for formatting, linting, and documentation.
    • Updated all architectural and user-facing documentation to reflect the new perception pipeline and simplified API.
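As a rough illustration of the ordered pipeline described in the first item above, the following Rust sketch runs six stages in a fixed order and rejects any stage that runs out of turn. It is not the crate's implementation: the `Stage` enum, the `completed` field, and the `run_stage` and `perceive` functions are hypothetical, and the real passes would annotate rings, bonds, charges, and hybridization instead of only recording their own completion.

```rust
/// The six perception stages, in the order given in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Rings,
    Kekulize,
    Electrons,
    Aromaticity,
    Resonance,
    Hybridization,
}

impl Stage {
    /// Fixed execution order of the pipeline.
    pub const ORDER: [Stage; 6] = [
        Stage::Rings,
        Stage::Kekulize,
        Stage::Electrons,
        Stage::Aromaticity,
        Stage::Resonance,
        Stage::Hybridization,
    ];
}

/// Hypothetical stand-in for the annotated molecule: it only records
/// which stages have already enriched it.
#[derive(Debug, Default)]
pub struct AnnotatedMolecule {
    pub completed: Vec<Stage>,
}

/// Runs one stage, failing if it is not the next stage in `Stage::ORDER`.
pub fn run_stage(mol: &mut AnnotatedMolecule, stage: Stage) -> Result<(), String> {
    let expected = Stage::ORDER.get(mol.completed.len()).copied();
    if expected != Some(stage) {
        return Err(format!("{:?} ran out of order", stage));
    }
    // A real pass would add its annotations to the molecule here.
    mol.completed.push(stage);
    Ok(())
}

/// Runs all six stages in order and returns the enriched molecule.
pub fn perceive() -> Result<AnnotatedMolecule, String> {
    let mut mol = AnnotatedMolecule::default();
    for stage in Stage::ORDER.iter().copied() {
        run_stage(&mut mol, stage)?;
    }
    Ok(mol)
}
```

Making the order an explicit constant keeps the flow deterministic: in this sketch, a call such as `run_stage(&mut mol, Stage::Resonance)` on a fresh molecule returns an error because the earlier stages have not run yet.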

TKanX added 30 commits October 27, 2025 17:08
…denosine, Phosphate Ester, Zinc Complex, and Perchlorate Anion
TKanX added 20 commits November 16, 2025 09:32
…ds, adding detailed descriptions and usage notes
…hancing clarity on rule parsing and application engines
…unctions, adding detailed descriptions and usage notes
…etailed usage examples and descriptions for topology assignment functions
…properDihedral structs in the final topology
…ing clarity on roles and transformations within the dreid-typer library
…y and structure, enhancing the overview and detailing each processing pass
@TKanX TKanX self-assigned this Nov 17, 2025
Copilot AI review requested due to automatic review settings November 17, 2025 03:40
@TKanX TKanX added the bug 🐛 (Something isn't working) and enhancement ✨ (New feature or request) labels Nov 17, 2025
Copilot AI left a comment

Pull Request Overview

This PR introduces a comprehensive refactoring of the chemical perception engine to properly separate aromaticity detection from resonance modeling. The previous implementation conflated these two concepts, causing incorrect atom typing for non-aromatic resonant systems like amides and carboxylates. The new architecture implements a six-stage perception pipeline that provides more accurate hybridization and DREIDING atom type assignments.

Key Changes:

  • Restructured perception into six distinct, ordered stages: Rings → Kekulize → Electrons → Aromaticity → Resonance → Hybridization
  • Integrated the pauling crate for robust resonance system detection
  • Decoupled aromaticity (Hückel rule-based) from general resonance (conjugated π-systems)
  • Introduced expert system for formal charge and lone pair perception
  • Updated DREIDING rules to use hybridization = "Resonant" instead of is_aromatic = true for _R types
  • Removed ZINC_COMPLEX test case
  • Updated hundreds of test expectations to reflect more accurate _R type assignments
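The charge perception mentioned in the list above ultimately has to satisfy ordinary electron bookkeeping. The Rust sketch below shows only the textbook formal-charge formula (valence electrons minus nonbonding electrons minus the sum of bond orders). It is not the template-based expert system from this PR, and the function names and the small element table are assumptions made for illustration.

```rust
/// Textbook formal charge: valence electrons, minus nonbonding electrons
/// (two per lone pair), minus the sum of bond orders on the atom.
pub fn formal_charge(valence_electrons: i32, lone_pairs: i32, bond_order_sum: i32) -> i32 {
    valence_electrons - 2 * lone_pairs - bond_order_sum
}

/// Hypothetical lookup of valence electron counts for a few main-group
/// elements; returns `None` for anything outside this small table.
pub fn valence_electrons(element: &str) -> Option<i32> {
    match element {
        "H" => Some(1),
        "C" => Some(4),
        "N" => Some(5),
        "O" => Some(6),
        "S" => Some(6),
        _ => None,
    }
}
```

For a nitro group drawn with one N=O and one N-O bond, the nitrogen has no lone pairs and a bond order sum of 4, so `formal_charge(5, 0, 4)` gives +1, while the singly bonded oxygen with three lone pairs gives `formal_charge(6, 3, 1)`, which is -1.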

Reviewed Changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/integration_tests.rs | Removed ZINC_COMPLEX test case |
| tests/cases/nucleic_acids.rs | Updated expected types from O_2/N_3 to O_R/N_R for resonant systems in nucleic acid bases |
| tests/cases/dreiding_paper.rs | Corrected expected types for carboxylates, nitro groups, amides, thiourea, sulfonamides, and other resonant systems; removed ZINC_COMPLEX test |
| tests/cases/amino_acids.rs | Updated carboxylate and amide oxygens/nitrogens to use _R types across all amino acid zwitterions |
| src/typing/rules.rs | New module defining rule schema, parsing, and default ruleset management |
| src/typing/mod.rs | New module organizing typing engine and rules |
| src/typing/engine.rs | New priority-based iterative typing engine implementation |
| src/perception/rings.rs | New ring detection using minimal cycle basis and Gaussian elimination |
| src/perception/resonance.rs | New resonance detection integrating pauling crate with local heuristics |
| src/perception/model.rs | New annotated molecule model with pauling trait implementations |
| src/perception/mod.rs | New six-stage perception pipeline coordinator |
| src/perception/kekulize.rs | New Kekulé solver for aromatic bond resolution |
| src/perception/hybridization.rs | New VSEPR-based hybridization assignment |
| src/perception/electrons.rs | New expert system for charge and lone pair perception |
| docs/ARCHITECTURE.md | Updated to reflect new three-stage pipeline with detailed perception stages |
| Cargo.toml | Added thiserror 2.0.17 and pauling 0.1.0 dependencies; bumped version to 0.2.1 |

@TKanX TKanX merged commit dd01bc3 into main Nov 17, 2025
8 checks passed
@TKanX TKanX deleted the bugfix/11-core-typing-logic-incorrectly-equates-_r-type-with-aromaticity-failing-to-type-non-aromatic-resonance-systems branch November 17, 2025 03:42
Development

Successfully merging this pull request may close these issues.

Core Typing Logic Incorrectly Equates _R Type with Aromaticity, Failing to Type Non-Aromatic Resonance Systems