Description
What happened
Hi team,
We are trying to expire old snapshots and delete orphan files using the Nessie GC feature. Our Iceberg tables are stored in an S3 bucket, so after running Nessie GC we expect the expired files (data as well as metadata) to be deleted from S3. However, the physical files are still present after running GC and after issuing explicit delete commands. We also encountered the following error while running Nessie GC:
Caused by: java.lang.RuntimeException: Failed to get paths from manifest file location.avro
    at org.projectnessie.gc.iceberg.IcebergContentToFiles.allDataAndDeleteFiles(IcebergContentToFiles.java:225)
    at org.projectnessie.gc.iceberg.IcebergContentToFiles.lambda$allManifestsAndDataFiles$2(IcebergContentToFiles.java:202)
Caused by: java.lang.IllegalArgumentException: Cannot parse partition spec fields, not an array: {"spec-id":0,"fields":[]}
Can you please help us understand this issue?
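Our reading of the second "Caused by" line is that the value handed to the parser was a complete partition spec object, while the parser wanted only an array of fields. The Python sketch below shows that shape mismatch. It is not Nessie or Iceberg code: `parse_partition_spec_fields` is a hypothetical stand-in that mimics only the array check implied by the error message, and the "fields only" form is our assumption about what the parser would accept.

```python
import json


def parse_partition_spec_fields(raw: str) -> list:
    """Hypothetical stand-in: accept only a JSON array of partition fields."""
    node = json.loads(raw)
    if not isinstance(node, list):
        # Same wording as the message in the stack trace above.
        raise ValueError(f"Cannot parse partition spec fields, not an array: {raw}")
    return node


# Value quoted in the error message: a full spec object, not a bare array.
full_spec = '{"spec-id":0,"fields":[]}'

# Assumed acceptable shape: only the "fields" array of that object.
fields_only = json.dumps(json.loads(full_spec)["fields"])

print(parse_partition_spec_fields(fields_only))

try:
    parse_partition_spec_fields(full_spec)
except ValueError as err:
    print(err)
```

If this reading is right, the manifest may have been written with a different layout than the GC tool's reader expects, but we have not confirmed that.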
How to reproduce it
- Create an Iceberg table on S3 using Iceberg REST via Nessie (hosted in a Docker container).
- Add some records to the table.
- Get the current snapshot ID.
- Delete records from the table.
- Get the current snapshot ID again.
- Run Nessie GC with the following Docker Compose service:
nessie-gc:
  image: ghcr.io/projectnessie/nessie-gc:0.91.2
  ports:
    - "5435:5435"
  depends_on:
    nessie:
      condition: service_healthy
  command: gc --uri http://nessie:19120/api/v2 --iceberg=s3.access-key-id=${S3_KEY} --iceberg=s3.secret-access-key=${S3_SECRET} --jdbc --jdbc-url=${JDBC_URL} --jdbc-user=${JDBC_USERNAME} --jdbc-password=${JDBC_PASSWORD}
- Check the GC log: docker logs container-id
- Check the Postgres tables to see whether the references and live sets have been deleted.
- Check the S3 location to see whether the expired files, or the files marked for deletion, have been deleted.
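For the last check, one way to make the comparison explicit is to capture the object keys under the table location before and after the GC run and diff the two listings. The Python sketch below assumes the keys have already been collected by some other means (for example, an S3 listing tool). The function name `diff_listings` and all keys shown are hypothetical and only illustrate the comparison.

```python
def diff_listings(before: set[str], after: set[str]) -> dict[str, set[str]]:
    """Classify object keys by comparing a listing taken before GC with one taken after."""
    return {
        "deleted": before - after,  # keys that GC (or a delete command) removed
        "added": after - before,    # keys that appeared between the two listings
        "kept": before & after,     # keys still present after GC
    }


# Hypothetical keys; real listings would come from the table's S3 location.
before = {
    "warehouse/tbl/data/file-a.parquet",
    "warehouse/tbl/data/file-b.parquet",
    "warehouse/tbl/metadata/snap-1.avro",
}
# In our case the listing after GC looks the same as before.
after = set(before)

result = diff_listings(before, after)
for state in ("deleted", "added", "kept"):
    print(state, sorted(result[state]))
```

An empty "deleted" set after a GC run that should have expired files corresponds to the behavior described in this issue.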
Nessie server type (docker/uber-jar/built from source) and version
ghcr.io/projectnessie/nessie:0.90.4
Client type (Ex: UI/Spark/pynessie ...) and version
No response
Additional information
No response