Skip to content

V1 metadata parsing incorrectly accepts snapshots with both manifest-list and manifests. #1957

@yshcz

Description

@yshcz

Apache Iceberg Rust version

None

Describe the bug

For V1 snapshots, the spec requires that the manifests field must be omitted if manifest-list is present.

But the current V1 snapshot parsing incorrectly accepts this, and worse, stores an error message as the manifest-list value.

It looks like this bug may have been introduced during https://github.com/apache/iceberg-rust/pull/201/changes#diff-eee8bc669cc946bb5e92cd9d4759e49b89f5cbf4a6c0ab6741ba987b7bf8596eR287

I think updating that match arm to return an error instead of a string should fix the bug.

To Reproduce

#[test]
fn test_v1_snapshot_with_both_manifest_list_and_manifests_is_spec_violation() {
    let metadata = r#"
    {
        "format-version": 1,
        "table-uuid": "d20125c8-7284-442c-9aea-15fee620737c",
        "location": "s3://bucket/test/location",
        "last-updated-ms": 1700000000000,
        "last-column-id": 1,
        "schema": {
            "type": "struct",
            "fields": [
                {"id": 1, "name": "x", "required": true, "type": "long"}
            ]
        },
        "partition-spec": [],
        "properties": {},
        "current-snapshot-id": 111111111,
        "snapshots": [
            {
                "snapshot-id": 111111111,
                "timestamp-ms": 1600000000000,
                "summary": {"operation": "append"},
                "manifest-list": "s3://bucket/metadata/snap-123.avro",
                "manifests": ["s3://bucket/metadata/manifest-1.avro"]
            }
        ]
    }
    "#;

    let result = serde_json::from_str::<TableMetadata>(metadata);
    assert!(
        result.is_ok(),
        "Parsing should fail for spec violation but currently succeeds"
    );
    // Expected behavior
    // assert!(result.is_err());

    let table_metadata = result.unwrap();
    let snapshot = table_metadata
        .snapshot_by_id(111111111)
        .expect("snapshot exists due to a bug");

    let manifest_list = snapshot.manifest_list();

    // The manifest_list incorrectly contains an error message
    assert_eq!(
        manifest_list,
        "Invalid v1 snapshot, when manifest list provided, manifest files should be omitted"
    );
}

Expected behavior

Spec-violating metadata JSON should be rejected.

Willingness to contribute

None

Metadata

Metadata

Assignees

Labels

bugSomething isn't workinggood first issueGood for newcomers

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions