Block-level replication primitives for DuckDB. Encodes dirty 256KB blocks into checksummed segments, ships them to S3, and applies them on followers.
Part of the hadb ecosystem. Used by haduck (the DuckDB extension) for its replication layer.
duckblock is a Rust library that provides:
- Segment format: wraps hadb-changeset physical changesets with DuckDB defaults (u64 page IDs, 256KB block size)
- Storage: S3 key layout, upload/download, discovery of incrementals and snapshots
- Apply: pwrite blocks at offsets with checksum chain verification (fail-fast)
- Pull: discover new segments from S3, download, apply sequentially
- Replicator: `hadb::Replicator` implementation for DuckDB (add/pull/remove/sync + push_checkpoint)
- FollowerBehavior: `hadb::FollowerBehavior` for follower poll loops and catchup-on-promotion
## Architecture

```text
haduck (C++ DuckDB extension)
 |-- FileSync() triggers push_checkpoint()
 v
duckblock (this crate)
 |-- DuckblockReplicator: bundles dirty blocks into segments, uploads
 |-- DuckblockFollowerBehavior: polls S3, downloads + applies segments
 |-- pull_incremental: discover -> download -> apply with chain verification
 v
hadb-changeset
 |-- .hadbp binary format (header + sorted pages + SHA-256 checksum)
 |-- S3 key layout: {prefix}{db}/{gen:04x}/{seq:016x}.hadbp
 v
hadb-io (ObjectStore trait, S3 backend, retry)
```
## Segment format

Each segment is a `.hadbp` file containing:
- Sorted 256KB blocks with u64 block IDs
- SHA-256 checksum chain (each segment chains from the previous)
- Sequence number (monotonic, +1 per checkpoint)
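The chaining idea can be illustrated with a toy, std-only sketch. `chain_checksum` is a hypothetical stand-in using `DefaultHasher` instead of SHA-256; the point is that each segment's checksum covers the previous segment's checksum, so a follower replaying in sequence order detects gaps, reorders, or corruption and can fail fast:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for the SHA-256 chain: hash the previous segment's checksum
/// together with this segment's payload.
fn chain_checksum(prev: u64, payload: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    prev.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

fn main() {
    let segments: Vec<&[u8]> = vec![b"seg0", b"seg1", b"seg2"];

    // Leader side: compute the chain while encoding segments.
    let mut chain = Vec::new();
    let mut prev = 0u64;
    for payload in &segments {
        prev = chain_checksum(prev, payload);
        chain.push(prev);
    }

    // Follower side: recompute and verify, failing fast on the first mismatch.
    let mut prev = 0u64;
    for (payload, expected) in segments.iter().zip(&chain) {
        prev = chain_checksum(prev, payload);
        assert_eq!(prev, *expected, "checksum chain broken: fail fast");
    }
    println!("chain verified across {} segments", segments.len());
}
```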
Segments are thin wrappers over `hadb-changeset::physical::PhysicalChangeset`:

```rust
use duckblock::segment::{new_segment, block, encode, decode};

let seg = new_segment(1, 0, vec![
    block(0, dirty_block_0_bytes),
    block(5, dirty_block_5_bytes),
]);
let bytes = encode(&seg);
let decoded = decode(&bytes).unwrap();
```

## Replicator

`DuckblockReplicator` implements `hadb::Replicator` for the HA coordinator:
```rust
use duckblock::replicator::DuckblockReplicator;

let replicator = DuckblockReplicator::new(s3_storage, "prefix/");

// Leader: register DB and take initial snapshot
replicator.add("mydb", Path::new("/data/my.duckdb")).await?;

// Leader: push dirty blocks after CHECKPOINT
replicator.push_checkpoint("mydb", dirty_blocks).await?;

// Follower: restore from snapshot + apply incrementals
replicator.pull("mydb", Path::new("/data/follower.duckdb")).await?;
```

## Follower behavior

`DuckblockFollowerBehavior` implements `hadb::FollowerBehavior` for the coordinator's follower loop:
```rust
use duckblock::follower_behavior::DuckblockFollowerBehavior;

let behavior = DuckblockFollowerBehavior::new(s3_storage);
// Used by hadb::Coordinator internally
```

## S3 key layout

```text
{prefix}{db_name}/0001/{seq:016x}.hadbp -- snapshots (generation 1)
{prefix}{db_name}/0000/{seq:016x}.hadbp -- incrementals (generation 0)
```
Same layout as walrust (SQLite) and all hadb-changeset consumers.
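The layout maps directly onto Rust format strings. A minimal sketch (`segment_key` is a hypothetical helper, not the crate's API) showing how a key is built from the documented pattern:

```rust
/// Build the S3 key for a segment per the layout above:
/// {prefix}{db}/{gen:04x}/{seq:016x}.hadbp
/// Generation 1 holds snapshots, generation 0 holds incrementals.
fn segment_key(prefix: &str, db: &str, gen: u32, seq: u64) -> String {
    format!("{prefix}{db}/{gen:04x}/{seq:016x}.hadbp")
}

fn main() {
    // Incremental #1 for "mydb" under "prefix/"
    assert_eq!(
        segment_key("prefix/", "mydb", 0, 1),
        "prefix/mydb/0000/0000000000000001.hadbp"
    );
    // Snapshot #2 lands in generation 1
    assert_eq!(
        segment_key("prefix/", "mydb", 1, 2),
        "prefix/mydb/0001/0000000000000002.hadbp"
    );
    println!("key layout ok");
}
```

The zero-padded hex sequence number means a plain lexicographic S3 listing returns segments in apply order, which is what the pull path relies on for sequential discovery.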
## Testing

```shell
cargo test
```

S3 integration tests require credentials:

```shell
# Set S3 credentials, then:
cargo test --test s3_integration
```

## License

Apache-2.0