Implement Modular I/O Subsystem with Encapsulated `bio-forge` Pipeline #3

### Description:

This task builds the complete Input/Output (`io`) subsystem for `dreid-forge`. The architecture follows a modular, Facade-based design: each file format is handled by its own dedicated submodule, split into `reader` and `writer` components. A key requirement is the complete encapsulation of the `bio-forge` library, which is used internally to handle complex biological file formats (PDB, mmCIF) and their preparation pipeline (repair, protonation, topology generation). For standard chemical formats (SDF, MOL2), the readers parse the topology directly from the file. The entire subsystem is exposed through a high-level API in `src/io/mod.rs`, which provides a unified interface for all data ingestion and serialization tasks.

### Tasks:

- [x] **Phase 1: Establish I/O Module Architecture**
  - [x] Create the full module layout under `src/io/`: `error.rs`, `util.rs`, and subdirectories for `pdb`, `mmcif`, `sdf`, `mol2`, `lammps`, and `bgf`, each with `reader.rs` and/or `writer.rs`.
  - [x] **In `src/io/error.rs`**: Define the `io::Error` enum using `thiserror` to handle file parsing, I/O operations, missing metadata, and errors propagated from the internal `bio-forge` library.
  - [x] **In `src/io/mod.rs`**:
    - [x] Define the public-facing configuration structs: `BioReadConfig` and `ProtonationConfig`.
    - [x] Implement the top-level API functions: `read_structure`, `write_structure`, `read_template`, and `write_lammps_package`.
    - [x] Define the `WritableStructure` trait to allow `write_structure` to accept both `System` and `ForgedSystem`.
    - [x] Re-export public types and functions for a clean user-facing module.
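
One possible shape for the `WritableStructure` trait is sketched below. The fields of `System` and `ForgedSystem`, the `as_system` method, and the `atom_count` helper are placeholders invented for illustration; the real definitions live elsewhere in the crate and may differ.

```rust
/// Placeholder for `model::System`; the real struct has more fields.
#[derive(Debug, Default)]
pub struct System {
    pub atom_names: Vec<String>,
}

/// Placeholder for the parameterized system produced by the forge step.
#[derive(Debug, Default)]
pub struct ForgedSystem {
    pub system: System,
    pub atom_type_ids: Vec<usize>,
}

/// Anything that can expose an underlying `System` can be written.
/// The method name `as_system` is an assumption, not the crate's API.
pub trait WritableStructure {
    fn as_system(&self) -> &System;
}

impl WritableStructure for System {
    fn as_system(&self) -> &System {
        self
    }
}

impl WritableStructure for ForgedSystem {
    fn as_system(&self) -> &System {
        &self.system
    }
}

/// Hypothetical stand-in for `write_structure`: it only reports the atom
/// count, but it accepts both input types through the trait bound.
pub fn atom_count<S: WritableStructure>(structure: &S) -> usize {
    structure.as_system().atom_names.len()
}
```

With a bound like this, a single generic writer function can serve both a raw `System` and a `ForgedSystem` without overloads or an enum wrapper.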

- [x] **Phase 2: Implement Core Conversion Layer**
  - [x] **In `src/io/util.rs`**:
    - [x] Implement `from_bio_topology` function to convert a fully processed `bio_forge::Topology` into our `model::System`. This is the primary bridge *from* `bio-forge`.
    - [x] Implement `to_bio_topology` function to convert our `model::System` (with `BioMetadata`) back into a `bio_forge::Topology`. This is the primary bridge *to* `bio-forge` for writing.
    - [x] Implement necessary helper functions for converting enums (`Element`, `BondOrder`) between the two crates to ensure type safety.
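
The enum helpers in this phase could be expressed as `From` implementations with exhaustive matches. The sketch below uses two local modules as stand-ins: `external` plays the role of `bio-forge` and `model` plays the role of our own types. The variant names are assumptions and do not reflect the actual `bio-forge` API.

```rust
/// Stand-in for the external crate's type (hypothetical variants).
pub mod external {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum BondOrder {
        Single,
        Double,
        Triple,
        Aromatic,
    }
}

/// Stand-in for our own model type (hypothetical variants).
pub mod model {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum BondOrder {
        Single,
        Double,
        Triple,
        Aromatic,
    }
}

impl From<external::BondOrder> for model::BondOrder {
    fn from(order: external::BondOrder) -> Self {
        // No wildcard arm: a new variant on the source enum fails to
        // compile here until it is handled explicitly.
        match order {
            external::BondOrder::Single => model::BondOrder::Single,
            external::BondOrder::Double => model::BondOrder::Double,
            external::BondOrder::Triple => model::BondOrder::Triple,
            external::BondOrder::Aromatic => model::BondOrder::Aromatic,
        }
    }
}

impl From<model::BondOrder> for external::BondOrder {
    fn from(order: model::BondOrder) -> Self {
        match order {
            model::BondOrder::Single => external::BondOrder::Single,
            model::BondOrder::Double => external::BondOrder::Double,
            model::BondOrder::Triple => external::BondOrder::Triple,
            model::BondOrder::Aromatic => external::BondOrder::Aromatic,
        }
    }
}
```

In the real crate one side of each conversion is a foreign type, so the orphan rule still permits these impls as long as the other side is defined locally. If the two enums ever stop being one-to-one, `TryFrom` with an `io::Error` variant would be the natural replacement.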

- [x] **Phase 3: Implement Biological Format Readers (PDB & mmCIF)**
  - [x] **In `src/io/pdb/reader.rs`**:
    - [x] Implement the `read` function that orchestrates the full `bio-forge` pipeline:
      - [x] Call `bio_forge::io::read_pdb_structure`.
      - [x] Conditionally apply repair and protonation based on `BioReadConfig`.
      - [x] Build the topology using `bio_forge::ops::TopologyBuilder` (a mandatory step to get bonds).
      - [x] Convert the final `bio_forge::Topology` to `model::System` using the `util` layer.
  - [x] **In `src/io/mmcif/reader.rs`**:
    - [x] Implement the `read` function following the same pipeline as the PDB reader.
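
The control flow of the reader pipeline above can be sketched without touching `bio-forge` at all. Everything here is a stand-in: the `BioReadConfig` and `ProtonationConfig` field names are assumptions, `Structure` merely records which stages ran, and the `Error` enum is written by hand only to keep the example dependency-free (the issue specifies `thiserror` for the real one).

```rust
use std::fmt;

/// Hypothetical protonation settings; the field name is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct ProtonationConfig {
    pub target_ph: f64,
}

/// Hypothetical read configuration; the field names are assumptions.
#[derive(Debug, Clone, Default)]
pub struct BioReadConfig {
    pub repair: bool,
    pub protonation: Option<ProtonationConfig>,
}

/// Minimal error type for the sketch.
#[derive(Debug, PartialEq)]
pub enum Error {
    Parse(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Parse(msg) => write!(f, "parse error: {}", msg),
        }
    }
}

impl std::error::Error for Error {}

/// Stand-in for the intermediate structure; it only logs the stages.
#[derive(Debug, Default)]
pub struct Structure {
    pub stages: Vec<&'static str>,
}

/// Stand-in for the format-specific parsing call.
fn parse(text: &str) -> Result<Structure, Error> {
    if text.trim().is_empty() {
        return Err(Error::Parse("empty input".to_string()));
    }
    Ok(Structure { stages: vec!["parse"] })
}

/// Orchestrates the stages in the order listed in the task.
pub fn read(text: &str, config: &BioReadConfig) -> Result<Structure, Error> {
    let mut structure = parse(text)?;
    if config.repair {
        structure.stages.push("repair");
    }
    if config.protonation.is_some() {
        structure.stages.push("protonate");
    }
    // Topology building is unconditional: without it there are no bonds.
    structure.stages.push("build_topology");
    structure.stages.push("convert");
    Ok(structure)
}
```

Because the PDB and mmCIF readers share every stage except the initial parse, a helper of this shape could be reused by both, with only the parsing call swapped.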

- [x] **Phase 4: Implement Chemical Format Readers (SDF & MOL2)**
  - [x] **In `src/io/sdf/reader.rs`**:
    - [x] Implement a direct-to-`System` parser for SDF/MOL format. It should read atom elements, coordinates, and the connection table (CT block) to populate `System.atoms` and `System.bonds`. `BioMetadata` will be `None`.
  - [x] **In `src/io/mol2/reader.rs`**:
    - [x] Implement a direct parser for MOL2 format molecules.
  - [x] **In `src/io/mod.rs`**:
    - [x] Implement the `read_template` wrapper around `bio_forge::io::read_mol2_template` as specified.
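
As an illustration of the direct SDF parsing described in this phase, here is a minimal sketch of a V2000 connection-table reader. It relies on the fixed-column layout of the molfile format (three header lines, a counts line, then atom and bond blocks). The `Atom` and `Bond` structs and the `parse_molblock` name are hypothetical, errors are collapsed to `None` for brevity, and V3000 files, charges, and property lines are not handled.

```rust
#[derive(Debug, PartialEq)]
pub struct Atom {
    pub element: String,
    pub position: [f64; 3],
}

#[derive(Debug, PartialEq)]
pub struct Bond {
    pub i: usize, // 0-based atom index
    pub j: usize, // 0-based atom index
    pub order: u8,
}

/// Parses the fixed-width field `line[start..end]`, ignoring padding.
fn field<T: std::str::FromStr>(line: &str, start: usize, end: usize) -> Option<T> {
    line.get(start..end)?.trim().parse().ok()
}

/// Reads one V2000 mol block; returns `None` on any malformed line.
pub fn parse_molblock(text: &str) -> Option<(Vec<Atom>, Vec<Bond>)> {
    // Skip the three header lines: title, program/timestamp, comment.
    let mut lines = text.lines().skip(3);

    // Counts line: atom count in columns 1-3, bond count in columns 4-6.
    let counts = lines.next()?;
    let n_atoms: usize = field(counts, 0, 3)?;
    let n_bonds: usize = field(counts, 3, 6)?;

    let mut atoms = Vec::with_capacity(n_atoms);
    for _ in 0..n_atoms {
        let line = lines.next()?;
        atoms.push(Atom {
            // Element symbol occupies columns 32-34.
            element: line.get(31..34)?.trim().to_string(),
            // x, y, z are three 10-character fields.
            position: [field(line, 0, 10)?, field(line, 10, 20)?, field(line, 20, 30)?],
        });
    }

    let mut bonds = Vec::with_capacity(n_bonds);
    for _ in 0..n_bonds {
        let line = lines.next()?;
        let first: usize = field(line, 0, 3)?;
        let second: usize = field(line, 3, 6)?;
        bonds.push(Bond {
            // The file uses 1-based atom numbers; store 0-based indices.
            i: first.checked_sub(1)?,
            j: second.checked_sub(1)?,
            order: field(line, 6, 9)?,
        });
    }
    Some((atoms, bonds))
}
```

A production reader would return `io::Error` variants with line numbers instead of `None`, and would loop over `$$$$` record separators to handle multi-molecule SDF files.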

- [x] **Phase 5: Implement All Writers**
  - [x] **In `src/io/pdb/writer.rs` and `src/io/mmcif/writer.rs`**:
    - [x] Implement `write` functions that check for `BioMetadata`, convert the `System` to `bio_forge::Topology`, and call the corresponding `bio-forge` writer.
  - [x] **In `src/io/bgf/writer.rs`**:
    - [x] Implement a `write` function for the BGF format, leveraging the `bio-forge` writer.
  - [x] **In `src/io/sdf/writer.rs` and `src/io/mol2/writer.rs`**:
    - [x] Implement writers for standard chemical formats.
  - [x] **In `src/io/lammps/writer.rs`**:
    - [x] Implement the `write` function to generate the `*.data` and `*.in.settings` file pair.
    - [x] The settings writer must implement the "smart" `if/else` logic to adapt to user-defined boundary conditions.
    - [x] The data writer must correctly map `ForgedSystem` to all required LAMMPS sections, including `Masses`, `Atoms` (with molecule IDs), and topology sections with type IDs.
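
To make the data-file mapping concrete, the sketch below emits the header, `Masses`, and `Atoms` sections of a LAMMPS data file using the `full` atom style (atom ID, molecule ID, type ID, charge, x, y, z). The `AtomRecord` struct and `write_data` signature are hypothetical simplifications of `ForgedSystem`, and the topology sections (`Bonds`, `Angles`, and so on) are omitted.

```rust
use std::fmt::Write as _;

/// Hypothetical per-atom record; the real data comes from `ForgedSystem`.
pub struct AtomRecord {
    pub molecule_id: usize,
    pub type_id: usize, // 1-based index into `masses`
    pub charge: f64,
    pub position: [f64; 3],
}

/// Renders a minimal data file as a string.
/// `masses[k]` is the mass of atom type `k + 1`.
pub fn write_data(masses: &[f64], atoms: &[AtomRecord], bounds: [(f64, f64); 3]) -> String {
    let mut out = String::new();

    // The first line of a data file is a free-form title.
    out.push_str("LAMMPS data file (illustrative sketch)\n\n");

    // Header: counts, then box bounds. Writing to a String cannot fail.
    writeln!(out, "{} atoms", atoms.len()).unwrap();
    writeln!(out, "{} atom types\n", masses.len()).unwrap();
    for (axis, (lo, hi)) in ["x", "y", "z"].iter().zip(bounds.iter()) {
        writeln!(out, "{:.6} {:.6} {}lo {}hi", lo, hi, axis, axis).unwrap();
    }

    // Masses section: one line per atom type, 1-based type IDs.
    out.push_str("\nMasses\n\n");
    for (index, mass) in masses.iter().enumerate() {
        writeln!(out, "{} {:.4}", index + 1, mass).unwrap();
    }

    // Atoms section in `full` style: id mol type q x y z.
    out.push_str("\nAtoms # full\n\n");
    for (index, atom) in atoms.iter().enumerate() {
        let [x, y, z] = atom.position;
        writeln!(
            out,
            "{} {} {} {:.6} {:.6} {:.6} {:.6}",
            index + 1,
            atom.molecule_id,
            atom.type_id,
            atom.charge,
            x,
            y,
            z
        )
        .unwrap();
    }
    out
}
```

Building the whole file in memory keeps the function easy to unit-test; the real writer could stream to any `std::io::Write` sink instead.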

- [x] **Phase 6: Verification**
  - [x] Add unit tests for each reader and writer to ensure format correctness.
  - [x] Create integration tests that read a file, process it through a mock `forge` pipeline, and write it out, ensuring data integrity.
  - [x] Verify that the LAMMPS output can successfully run the water molecule test case without manual modification.
