…osition enums with AtomResidueInfo updates
…e and field correctness
… pipeline utilities
…larity and efficiency
…improved flexibility
…improved readability
…attern for improved clarity
… for improved clarity
…ions for Clean, Protonation, Solvate, and Topology
…ations and topology building
…ateConfig conversions
…ff from TopologyConfig
…system equivalence
…onversions for CTfile and MOL2 formats
…ort for Mol2 and SDF
…ort for Mol2 and SDF
44 tasks
Pull request overview
This PR implements a comprehensive I/O subsystem that establishes the project's capability to read and write molecular structure data across multiple file formats. The implementation introduces a clean separation between chemical formats (small molecules) and biological formats (macromolecules), with a unified error handling system and robust bio-forge integration for structure preparation workflows.
Key Changes:
- Introduced builder pattern for `AtomResidueInfo` with new biological metadata fields (`StandardResidue`, `ResidueCategory`, `ResiduePosition` enums)
- Implemented `ChemReader`/`ChemWriter` for SDF and Mol2 formats with native parsers
- Implemented `BioReader`/`BioWriter` for PDB and mmCIF with configurable preparation pipelines (cleaning, protonation, solvation)
- Created sophisticated LAMMPS and BGF writers supporting complex forcefield parameters and system topologies
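For readers unfamiliar with the builder pattern mentioned in the first item, the idea can be sketched roughly as follows. This is not the PR's code: the field names, the enum variants, and the `builder`/`category`/`position`/`build` method names are all assumptions made for illustration, and `StandardResidue` is left out to keep the sketch short.

```rust
/// Hypothetical stand-in for the PR's `ResidueCategory`; the real variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResidueCategory {
    Standard,
    Hetero,
    Ion,
}

/// Hypothetical stand-in for the PR's `ResiduePosition`; the real variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResiduePosition {
    Internal,
    NTerminal,
    CTerminal,
}

/// Assumed shape of the metadata record; the actual fields are not shown in the PR text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AtomResidueInfo {
    pub atom_name: String,
    pub residue_name: String,
    pub residue_id: i32,
    pub chain_id: char,
    pub category: ResidueCategory,
    pub position: ResiduePosition,
}

/// Builder that starts from the required fields and lets callers override defaults.
pub struct AtomResidueInfoBuilder {
    info: AtomResidueInfo,
}

impl AtomResidueInfo {
    /// Required fields are taken up front; optional metadata gets defaults.
    pub fn builder(
        atom_name: &str,
        residue_name: &str,
        residue_id: i32,
        chain_id: char,
    ) -> AtomResidueInfoBuilder {
        AtomResidueInfoBuilder {
            info: AtomResidueInfo {
                atom_name: atom_name.to_string(),
                residue_name: residue_name.to_string(),
                residue_id,
                chain_id,
                category: ResidueCategory::Standard,
                position: ResiduePosition::Internal,
            },
        }
    }
}

impl AtomResidueInfoBuilder {
    /// Overrides the default residue category.
    pub fn category(mut self, category: ResidueCategory) -> Self {
        self.info.category = category;
        self
    }

    /// Overrides the default residue position.
    pub fn position(mut self, position: ResiduePosition) -> Self {
        self.info.position = position;
        self
    }

    /// Consumes the builder and returns the finished record.
    pub fn build(self) -> AtomResidueInfo {
        self.info
    }
}
```

A call site would then read `AtomResidueInfo::builder("CA", "ALA", 1, 'A').position(ResiduePosition::NTerminal).build()`, so new optional metadata can be added later without touching existing callers.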
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/lib.rs` | Exports new io module |
| `src/model/metadata.rs` | Refactored to builder pattern, added biological metadata enums |
| `src/io/mod.rs` | Public API facade with reader/writer structs and configuration types |
| `src/io/util.rs` | Bio-forge integration with bidirectional model conversions |
| `src/io/error.rs` | Centralized error handling with thiserror and contextual information |
| `src/io/sdf/*` | V2000-compliant SDF reader/writer with element inference |
| `src/io/mol2/*` | TRIPOS Mol2 parser with section-based parsing |
| `src/io/pdb/*` | PDB reader/writer with structure preparation pipeline |
| `src/io/mmcif/*` | mmCIF reader/writer mirroring PDB functionality |
| `src/io/bgf/*` | BGF writer with chain/residue sorting and CONECT generation |
| `src/io/lammps/*` | LAMMPS Data/Settings file writer with hybrid style support |
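To make the "centralized error handling with contextual information" entry for `src/io/error.rs` more concrete, here is a minimal sketch of what such an error type could look like. The PR uses the `thiserror` crate; this version writes the `Display` and `From` impls by hand so it compiles with the standard library alone. The variant names, the fields, and the `parse_count` helper are assumptions for illustration, not the PR's actual definitions.

```rust
use std::fmt;
use std::io;

/// Hypothetical unified error type covering I/O, parsing, and conversion failures.
#[derive(Debug)]
pub enum Error {
    /// Underlying filesystem or stream failure.
    Io(io::Error),
    /// Malformed input, tagged with the format name and line number for context.
    Parse {
        format: &'static str,
        line: usize,
        message: String,
    },
    /// Failure while converting between external and internal models.
    Conversion(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Io(e) => write!(f, "I/O error: {e}"),
            Error::Parse { format, line, message } => {
                write!(f, "{format} parse error at line {line}: {message}")
            }
            Error::Conversion(msg) => write!(f, "conversion error: {msg}"),
        }
    }
}

impl std::error::Error for Error {
    /// Exposes the wrapped I/O error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            Error::Io(e) => Some(e),
            _ => None,
        }
    }
}

impl From<io::Error> for Error {
    /// Lets `?` convert `io::Error` into the unified type automatically.
    fn from(e: io::Error) -> Self {
        Error::Io(e)
    }
}

/// Hypothetical helper: parses a count field and attaches line context on failure.
pub fn parse_count(field: &str, line: usize) -> Result<usize, Error> {
    field.trim().parse::<usize>().map_err(|e| Error::Parse {
        format: "SDF",
        line,
        message: format!("invalid count {field:?}: {e}"),
    })
}
```

With `thiserror`, the `Display` and `From` impls above would typically be generated from `#[error(...)]` and `#[from]` attributes, which is presumably what the PR relies on.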
Co-authored-by: Copilot <[email protected]>
Summary:
Implemented a comprehensive Input/Output subsystem that serves as the interface between external file formats and the internal `System` model. It introduces a unified API for handling both macromolecular data (via an encapsulated `bio-forge` pipeline) and small molecule informatics. It includes native parsers for chemical formats and robust writers for simulation engines, establishing the project's capability to ingest raw structure data and export simulation-ready configurations.

Changes:
- Designed Unified I/O Architecture:
  - … `ChemReader`/`ChemWriter` for small molecule formats and `BioReader`/`BioWriter` for macromolecular structures.
  - … `Error` handling system covering I/O, parsing, and conversion failures.
  - … `src/io/mod.rs` as the public facade, exporting configuration structs for topology, cleaning, protonation, and solvation.
- Integrated `bio-forge` Pipeline:
  - … `src/io/util.rs` to adapt `bio-forge` models (Structure, Topology) to local `System` models.
- Refactored Metadata System:
  - … `AtomResidueInfo` in `src/model/metadata.rs` to use a Builder Pattern, improving ergonomics for complex residue data.
  - … `StandardResidue`, `ResidueCategory`, and `ResiduePosition` enums for richer biological context preservation.
- Implemented Native Chemical Format Support:
  - … `@<TRIPOS>ATOM`, `BOND`, and `MOLECULE` sections.
- Developed Simulation Output Formats:
  - … `CONECT` record generation.
  - … `periodic` and `non-periodic` boundary conditions.
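The Mol2 item above refers to parsing by `@<TRIPOS>` sections. A rough sketch of that approach, assuming a single-molecule file, could look like the following. The names `split_sections`, `Mol2Atom`, and `parse_atom` are hypothetical and not taken from the PR. The sketch also simplifies the format: blank lines and `#` comment lines are skipped, and only the first six whitespace-separated fields of an ATOM record are read.

```rust
use std::collections::HashMap;

/// Groups the lines of a Mol2 file by the `@<TRIPOS>` record they belong to.
/// Assumption: one molecule per file, so repeated section names would be merged.
pub fn split_sections(text: &str) -> HashMap<String, Vec<String>> {
    let mut sections: HashMap<String, Vec<String>> = HashMap::new();
    let mut current: Option<String> = None;
    for line in text.lines() {
        let trimmed = line.trim();
        // Simplification: drop blank lines and comment lines.
        if trimmed.is_empty() || trimmed.starts_with('#') {
            continue;
        }
        if let Some(name) = trimmed.strip_prefix("@<TRIPOS>") {
            // A new section header: register it and make it the active section.
            let name = name.trim().to_string();
            sections.entry(name.clone()).or_default();
            current = Some(name);
        } else if let Some(name) = &current {
            // A data line: append it to the active section.
            sections
                .get_mut(name)
                .expect("section was inserted when its header was seen")
                .push(trimmed.to_string());
        }
    }
    sections
}

/// Minimal view of an ATOM record (hypothetical type, not the PR's).
#[derive(Debug, Clone, PartialEq)]
pub struct Mol2Atom {
    pub id: usize,
    pub name: String,
    pub position: [f64; 3],
    pub atom_type: String,
}

/// Parses the leading fields of an ATOM line: id, name, x, y, z, atom type.
/// Returns `None` if a field is missing or not numeric where a number is expected.
pub fn parse_atom(line: &str) -> Option<Mol2Atom> {
    let mut fields = line.split_whitespace();
    let id = fields.next()?.parse().ok()?;
    let name = fields.next()?.to_string();
    let x = fields.next()?.parse().ok()?;
    let y = fields.next()?.parse().ok()?;
    let z = fields.next()?.parse().ok()?;
    let atom_type = fields.next()?.to_string();
    Some(Mol2Atom {
        id,
        name,
        position: [x, y, z],
        atom_type,
    })
}
```

Splitting first and parsing each section afterwards keeps the per-record parsers independent of section order, which is one plausible reason to prefer section-based parsing over a single line-by-line state machine.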