Merged
Looks good. Just a note: you could use the attachment point semantics instead of "R", which really should be reserved for Markush structures. It is OK as a '*', though.
This was referenced Sep 16, 2025
This new functionality, the SugarDetectionUtility, can be used to algorithmically separate glycosides into their aglycone and sugar moieties. Its main feature is the ability to create copies of both the aglycone and the individual sugar fragments of a given molecule, with proper handling of attachment points and stereochemistry, plus optional post-processing. The new class extends the SugarRemovalUtility and uses the same algorithm (described here) to generate the aglycone (with slight deviations, see below).
Background:
The SugarRemovalUtility is focused on generating a sensible aglycone and extracts the removed sugars only as Tetrahydrofuran/Tetrahydropyran (CNP IDs refer to compounds in the COCONUT natural products database):
This shortcoming is now resolved by the Sugar Detection Utility:
The Sugar Detection Utility supports:
- Duplication of connecting heteroatoms in glycosidic bonds, to produce more sensible educts (see above)
- Preservation of stereochemistry at connection points (see above)
- Detection and extraction of both circular and linear sugar moieties (see the Sugar Removal Utility documentation for more details on this)
- Saturation of broken bonds with either implicit hydrogen atoms or pseudo ("R") atoms
- Post-processing of sugar fragments including bond splitting (O-glycosidic, ether, ester, peroxide) to separate the individual sugars
- Limiting this post-processing by size, so that small modifications remain attached to the sugars
- Optional mapping of atoms and bonds from the original molecule to their copies in the aglycone and sugar fragments (to retrieve the atom indices of an extracted fragment or to get group indices for all atoms in the original molecule)
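To make the first, fourth, and last points above more concrete, here is a minimal, self-contained sketch on a toy graph. It does not use CDK and is not the SugarDetectionUtility implementation: the class `ToyGlycosideSplit` and all of its methods are hypothetical names invented for illustration, and the placement of the "R" pseudo atoms on both oxygen copies is an assumption. It shows a glycosidic bond being broken while the connecting heteroatom is duplicated, the open valences being either left for implicit hydrogens or capped with "R", and a group index being assigned to every atom.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy molecular graph; a hypothetical illustration, not a CDK class. */
class ToyGlycosideSplit {
    final List<String> atoms = new ArrayList<>(); // element symbols, "R" = pseudo atom
    final List<int[]> bonds = new ArrayList<>();  // pairs of atom indices

    int addAtom(String symbol) {
        atoms.add(symbol);
        return atoms.size() - 1;
    }

    void addBond(int a, int b) {
        bonds.add(new int[] {a, b});
    }

    /** Builds C(aglycone)-O-C(sugar)-O as atoms 0..3 (a drastically reduced glycoside). */
    static ToyGlycosideSplit example() {
        ToyGlycosideSplit m = new ToyGlycosideSplit();
        for (String s : new String[] {"C", "O", "C", "O"}) {
            m.addAtom(s);
        }
        m.addBond(0, 1);
        m.addBond(1, 2); // the glycosidic bond in this toy model
        m.addBond(2, 3);
        return m;
    }

    /**
     * Breaks the bond between the sugar carbon and the connecting heteroatom.
     * The heteroatom stays on the aglycone and a copy is attached to the sugar.
     * With markWithR, both copies are capped with an "R" pseudo atom (assumption);
     * otherwise the open valences stand for implicit hydrogens.
     */
    void splitGlycosidicBond(int sugarCarbon, int heteroAtom, boolean markWithR) {
        boolean removed = bonds.removeIf(b ->
                (b[0] == sugarCarbon && b[1] == heteroAtom)
                        || (b[0] == heteroAtom && b[1] == sugarCarbon));
        if (!removed) {
            throw new IllegalArgumentException("No bond between the given atoms");
        }
        int copy = addAtom(atoms.get(heteroAtom));
        addBond(sugarCarbon, copy);
        if (markWithR) {
            addBond(heteroAtom, addAtom("R"));
            addBond(copy, addAtom("R"));
        }
    }

    /** Assigns a fragment (group) index to every atom via breadth-first search. */
    int[] groupIndices() {
        int[] group = new int[atoms.size()];
        Arrays.fill(group, -1);
        int next = 0;
        for (int start = 0; start < group.length; start++) {
            if (group[start] != -1) {
                continue; // already reached from an earlier start atom
            }
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(start);
            group[start] = next;
            while (!queue.isEmpty()) {
                int cur = queue.poll();
                for (int[] b : bonds) {
                    int other = (b[0] == cur) ? b[1] : ((b[1] == cur) ? b[0] : -1);
                    if (other != -1 && group[other] == -1) {
                        group[other] = next;
                        queue.add(other);
                    }
                }
            }
            next++;
        }
        return group;
    }

    public static void main(String[] args) {
        ToyGlycosideSplit m = example();
        m.splitGlycosidicBond(2, 1, true);
        System.out.println(m.atoms);
        System.out.println(Arrays.toString(m.groupIndices()));
    }
}
```

In this toy model the original atoms keep their indices and new atoms are appended, so the group index array doubles as a mapping from original atoms to fragments. The real utility works on copies of the input molecule, which is why it offers explicit atom and bond maps.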
All sugar detection and removal operations respect the settings inherited from the parent SugarRemovalUtility class, including terminal vs. non-terminal sugar removal, preservation mode settings, various detection thresholds, etc. (for documentation, see the code or the paper linked above).
In two cases, the initial SugarRemovalUtility results are corrected for extraction:
Basic Usage Example:
The code is documented and extensively tested.
If you want to see the SugarDetectionUtility in action, check out our MOlecule fRagmenTAtion fRamework (MORTAR) application; the SDU will be part of the next release.
A paper describing the algorithm is coming up.
Looking forward to your feedback!