Copilot AI commented Nov 17, 2025

Consolidates four different CRC libraries (crc32c, crc32fast, crc64fast-nvme, and crc-fast) into a single dependency on crc-fast.

Changes

Dependency cleanup

  • Removed crc32c, crc32fast, crc64fast-nvme from workspace dependencies
  • Unified on crc-fast = "1.6.0" across all crates

Algorithm mappings

  • CRC32 IEEE: crc32fast::Hasher → crc_fast::Digest::new(Crc32IsoHdlc)
  • CRC32C: crc32c::crc32c_append → crc_fast::Digest::new(Crc32Iscsi)
  • CRC64-NVME: crc64fast_nvme::Digest → crc_fast::Digest::new(Crc64Nvme)
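
To check that each mapping above selects the intended algorithm, it can help to compare against a slow, dependency-free reference. The sketch below is not part of this PR and does not use crc-fast; the function and constant names are illustrative assumptions. It implements the three algorithms bit by bit from their published parameters (reflected polynomial, all-ones initial value, all-ones final XOR), so its output can serve as an oracle in migration tests.

```rust
/// Bitwise reflected CRC-32: slow reference only, not for production use.
/// `poly` is the bit-reversed polynomial; init and final XOR are all ones.
fn crc32_reflected(poly: u32, data: &[u8]) -> u32 {
    let mut crc = u32::MAX;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // Shift out one bit; fold in the polynomial if that bit was set.
            crc = if crc & 1 == 1 { (crc >> 1) ^ poly } else { crc >> 1 };
        }
    }
    !crc
}

/// Same construction at 64-bit width.
fn crc64_reflected(poly: u64, data: &[u8]) -> u64 {
    let mut crc = u64::MAX;
    for &byte in data {
        crc ^= u64::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 == 1 { (crc >> 1) ^ poly } else { crc >> 1 };
        }
    }
    !crc
}

// Bit-reversed polynomials for the three algorithms named in the mappings.
const CRC32_ISO_HDLC_POLY: u32 = 0xEDB8_8320; // CRC32 IEEE
const CRC32_ISCSI_POLY: u32 = 0x82F6_3B78; // CRC32C (Castagnoli)
const CRC64_NVME_POLY: u64 = 0x9A6C_9329_AC4B_C9B5; // CRC64-NVME

fn crc32_iso_hdlc(data: &[u8]) -> u32 {
    crc32_reflected(CRC32_ISO_HDLC_POLY, data)
}

fn crc32_iscsi(data: &[u8]) -> u32 {
    crc32_reflected(CRC32_ISCSI_POLY, data)
}

fn crc64_nvme(data: &[u8]) -> u64 {
    crc64_reflected(CRC64_NVME_POLY, data)
}
```

Each function can be checked against the standard test input `b"123456789"`; a migrated call site should produce the same value for the same input.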

Code migrations

  • crates/rio/src/checksum.rs: Migrated hasher implementations for all three CRC variants
  • crates/rio/src/{encrypt_reader,compress_reader}.rs: Replaced one-shot hash() calls with inline digest usage
  • crates/filemeta/src/fileinfo.rs: Migrated hash computation for erasure coding index calculation
  • crates/utils/src/hash.rs: Updated CRC hash function for bucket distribution
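
The last two migrations matter because a CRC used for placement has to stay bit-for-bit identical: if the value changes, existing keys map to different buckets. The sketch below illustrates that dependency under stated assumptions. `bucket_index` is a hypothetical helper, not the function in crates/utils/src/hash.rs, and the CRC is a slow std-only reference rather than crc-fast.

```rust
/// Slow bitwise CRC-32 (IEEE / ISO-HDLC) reference, used here only so the
/// example has no external dependencies.
fn crc32_ieee(data: &[u8]) -> u32 {
    let mut crc = u32::MAX;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 == 1 { (crc >> 1) ^ 0xEDB8_8320 } else { crc >> 1 };
        }
    }
    !crc
}

/// Hypothetical placement helper: maps a key to one of `cardinality` buckets.
/// Returns `None` when there are no buckets to choose from.
fn bucket_index(key: &str, cardinality: usize) -> Option<usize> {
    if cardinality == 0 {
        return None;
    }
    // Any change in the CRC value would move the key to a different bucket.
    Some(crc32_ieee(key.as_bytes()) as usize % cardinality)
}
```

A regression test that pins a few known keys to their expected indices would catch an accidental algorithm change during this kind of migration.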

Example

Before:

let crc = crc32fast::hash(data);

After:

let crc = {
    let mut hasher = crc_fast::Digest::new(crc_fast::CrcAlgorithm::Crc32IsoHdlc);
    hasher.update(data);
    hasher.finalize() as u32
};
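
Replacing a one-shot `hash(data)` with new/update/finalize is safe because a streaming CRC gives the same result however the input is split. The sketch below demonstrates this with a hypothetical std-only stand-in, `RefCrc32`, that mirrors the new/update/finalize shape of the example above. It is not crc-fast itself, and it assumes `finalize` returns a `u64`, as the `as u32` cast in the example suggests.

```rust
/// Hypothetical stand-in for a streaming CRC-32 (IEEE) digest.
/// Bitwise and slow; meant only to illustrate the call pattern.
struct RefCrc32 {
    state: u32,
}

impl RefCrc32 {
    fn new() -> Self {
        Self { state: u32::MAX }
    }

    /// Feeds more bytes into the running checksum.
    fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.state ^= u32::from(byte);
            for _ in 0..8 {
                self.state = if self.state & 1 == 1 {
                    (self.state >> 1) ^ 0xEDB8_8320
                } else {
                    self.state >> 1
                };
            }
        }
    }

    /// Returns the checksum widened to u64 (assumption: mirrors a digest
    /// type that serves both 32-bit and 64-bit algorithms).
    fn finalize(self) -> u64 {
        u64::from(!self.state)
    }
}

/// One-shot wrapper with the same shape as the "After" snippet.
fn one_shot(data: &[u8]) -> u32 {
    let mut hasher = RefCrc32::new();
    hasher.update(data);
    // For a 32-bit algorithm the upper 32 bits are zero, so the cast is lossless.
    hasher.finalize() as u32
}
```

If the inline pattern recurs at many call sites, a small wrapper like `one_shot` keeps the cast in one place.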

Note: The old CRC libraries remain in the dependency tree as transitive dependencies of aws-smithy-eventstream, google-cloud-storage, and s3s.

Original prompt

This section details the original issue you should resolve

<issue_title>Unify CRC Implementation - Replace with crc-fast</issue_title>
<issue_description>Unify CRC Implementation - Replace with crc-fast

Problem Description

The project currently uses multiple different CRC calculation libraries:

  • crc-fast = "1.6.0"
  • crc32c = "0.6.8"
  • crc32fast = "1.5.0"
  • crc64fast-nvme = "1.2.1"

This multi-library approach causes:

  1. Code redundancy - Multiple implementations of the same functionality
  2. Maintenance complexity - Developers must learn several library APIs
  3. Performance inconsistency - Different libraries may have performance variations
  4. Dependency bloat - Increases final binary size

Goal

Migrate all CRC calculations to use the crc-fast library exclusively and remove dependencies on other CRC libraries.

Migration Requirements

1. Dependency Management

# Remove
crc32c = "0.6.8"
crc32fast = "1.5.0"
crc64fast-nvme = "1.2.1"

# Keep and potentially upgrade
crc-fast = "1.6.0"  # or consider upgrading to latest version

2. API Migration Guide

From crc32c

// Original code
use crc32c::crc32c;
let checksum = crc32c(&data);

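// New code (crc-fast exposes a Digest keyed by CrcAlgorithm, not a Crc<u32> type)
use crc_fast::{CrcAlgorithm, Digest};
let mut digest = Digest::new(CrcAlgorithm::Crc32Iscsi); // CRC-32C (iSCSI)
digest.update(&data);
let checksum = digest.finalize() as u32; // cast the result down to 32 bits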

From crc32fast

// Original code
use crc32fast::Hasher;
let mut hasher = Hasher::new();
hasher.update(&data);
let checksum = hasher.finalize();

// New code
use crc_fast::{CrcAlgorithm, Digest};
let mut digest = Digest::new(CrcAlgorithm::Crc32IsoHdlc); // CRC-32 IEEE
digest.update(&data);
let checksum = digest.finalize() as u32; // cast the result down to 32 bits

From crc64fast-nvme

// Original code
use crc64fast_nvme::Crc64;
let checksum = Crc64::new().checksum(&data);

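// New code (use the NVME variant; CRC-64/ECMA-182 is a different algorithm)
use crc_fast::{CrcAlgorithm, Digest};
let mut digest = Digest::new(CrcAlgorithm::Crc64Nvme);
digest.update(&data);
let checksum = digest.finalize();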

Implementation Steps

Step 1: Analyze Existing Code Usage

  • Use grep to analyze all CRC library usage locations
  • Document CRC algorithm types for each usage scenario
  • Confirm migration feasibility

Step 2: Create Unified CRC Adapter Layer

  • Create unified interface in src/crc/mod.rs
  • Implement all required CRC algorithm variants
  • Maintain compatibility with original APIs

Step 3: Gradually Replace Calling Code

  • First replace crc32c usage
  • Then replace crc32fast usage
  • Finally replace crc64fast-nvme usage
  • Run tests to verify after each replacement

Step 4: Clean Up Dependencies and Code

  • Remove unnecessary dependencies from Cargo.toml
  • Remove adapter layer (if no longer needed)
  • Run full test suite

Step 5: Performance Verification and Optimization

  • Compare performance before and after migration
  • Enable crc-fast SIMD features
  • Conduct stress testing

Acceptance Criteria

  • All CRC calculation functions work correctly
  • Performance is not worse than original implementation
  • All test cases pass
  • Code review approved
  • Dependency cleanup completed

Risk Control

  1. Rollback plan: Keep git history of original code for easy rollback if needed
  2. Phased implementation: Replace gradually to avoid large-scale changes at once
  3. Performance testing: Compare performance before and after migration
  4. Compatibility verification: Ensure CRC results match in all usage scenarios
    </issue_description>

<agent_instructions>Analyze the requirements in depth and implement them step by step</agent_instructions>

Comments on the Issue (you are @copilot in this section)



Copilot AI changed the title from "[WIP] Unify CRC implementation by replacing with crc-fast" to "Unify CRC implementations to crc-fast" Nov 17, 2025
Copilot AI requested a review from houseme November 17, 2025 09:09
@houseme houseme changed the base branch from main to copilot/fix-data-recovery-during-disconnection November 17, 2025 09:30
@houseme houseme marked this pull request as ready for review November 17, 2025 11:00
@houseme houseme merged commit 2c73e22 into copilot/fix-data-recovery-during-disconnection Nov 17, 2025
1 check passed
@houseme houseme deleted the copilot/unify-crc-implementation branch November 17, 2025 14:53
houseme added a commit that referenced this pull request Nov 17, 2025
#871)

* Initial plan

* Fix: Trigger self-heal when missing shards detected during read

- Added proactive heal detection in get_object_with_fileinfo
- When reading an object, now checks if any shards are missing even if read succeeds
- Sends low-priority heal request to reconstruct missing shards on rejoined nodes
- This fixes the issue where data written during node outage is not healed when node rejoins

Co-authored-by: houseme <[email protected]>

* fix

* Unify CRC implementations to crc-fast (#873)

* Initial plan

* Replace CRC libraries with unified crc-fast implementation

Co-authored-by: houseme <[email protected]>

* fix

* fix: replace low to Normal

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: houseme <[email protected]>
Co-authored-by: houseme <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: houseme <[email protected]>
Co-authored-by: houseme <[email protected]>
houseme added a commit that referenced this pull request Nov 17, 2025
* Initial plan

* Fix large file upload freeze by increasing StreamReader buffer size

Co-authored-by: houseme <[email protected]>

* Add comprehensive documentation for large file upload freeze fix

Co-authored-by: houseme <[email protected]>

* upgrade s3s version

* Fix compilation error: use BufReader instead of non-existent StreamReader::with_capacity

Co-authored-by: houseme <[email protected]>

* Update documentation with correct BufReader implementation

Co-authored-by: houseme <[email protected]>

* add tokio feature `io-util`

* Implement adaptive buffer sizing based on file size

Co-authored-by: houseme <[email protected]>

* Constants are managed uniformly and fmt code

* fix

* Fix: Trigger self-heal on read when shards missing from rejoined nodes (#871)

* Initial plan

* Fix: Trigger self-heal when missing shards detected during read

- Added proactive heal detection in get_object_with_fileinfo
- When reading an object, now checks if any shards are missing even if read succeeds
- Sends low-priority heal request to reconstruct missing shards on rejoined nodes
- This fixes the issue where data written during node outage is not healed when node rejoins

Co-authored-by: houseme <[email protected]>

* fix

* Unify CRC implementations to crc-fast (#873)

* Initial plan

* Replace CRC libraries with unified crc-fast implementation

Co-authored-by: houseme <[email protected]>

* fix

* fix: replace low to Normal

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: houseme <[email protected]>
Co-authored-by: houseme <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: houseme <[email protected]>
Co-authored-by: houseme <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: houseme <[email protected]>
Co-authored-by: houseme <[email protected]>
Development

Successfully merging this pull request may close these issues.

Unify CRC Implementation - Replace with crc-fast
