Skip to content

fix s3 delete#56

Merged
tdas merged 1 commit intounitycatalog:mainfrom
XiaoHongbo-Hope:fix_s3_delete
Jul 4, 2024
Merged

fix s3 delete#56
tdas merged 1 commit intounitycatalog:mainfrom
XiaoHongbo-Hope:fix_s3_delete

Conversation

@XiaoHongbo-Hope
Copy link
Contributor

@XiaoHongbo-Hope XiaoHongbo-Hope commented Jun 17, 2024

PR Checklist

  • A description of the changes is added to the description of this PR.
  • If there is a related issue, make sure it is linked to this PR.
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added or modified a feature, documentation in docs is updated

Description of changes

@rxin
Copy link
Collaborator

rxin commented Jun 17, 2024

Can you describe more clearly what the issue was and what the fix addresses? Thanks.

@tdas
Copy link
Contributor

tdas commented Jun 20, 2024

@Xiaobenbenben any thoughts on what @rxin asked? What is the issue that you are seeing that this fix addresses?

@akalotkin
Copy link

@rxin, @tdas, the obvious problem with original code is that it doesn't take into account huge S3 prefixes (more than 1000 objects). In this case original code will delete only first 1000 objects.

AWS S3 client has isTruncated method which indicates if there are more objects to list. The proposed fix repeats deletion in a loop until there are no unlisted objects.

AWS S3 client also exposes method to delete multiple objects public DeleteObjectsResult deleteObjects(DeleteObjectsRequest deleteObjectsRequest). It will improve performance and reduce S3 throttling probability.

@tdas
Copy link
Contributor

tdas commented Jun 28, 2024

@Xiaobenbenben thank you for the explanation. you are right. have you tested this with actual S3? Our automated tests don't have any actual S3 tests... so we have to manually test this.

@tdas
Copy link
Contributor

tdas commented Jun 28, 2024

@ravivj-db @vikrantpuppala can one of you test this out?

@tdas tdas requested a review from ravivj-db June 28, 2024 04:59
@ravivj-db
Copy link
Collaborator

Looking into this.

@ravivj-db
Copy link
Collaborator

@tdas Have tested out this code, and it's working fine. You can go ahead and merge it. @Xiaobenbenben thanks for handling pagination and ensuring that all objects inside the s3 folder are deleted.

@tdas tdas merged commit 4fe4802 into unitycatalog:main Jul 4, 2024
@tdas
Copy link
Contributor

tdas commented Jul 4, 2024

Thank you @Xiaobenbenben for contributing this.

rtyler pushed a commit to rtyler/unitycatalog that referenced this pull request Sep 5, 2024
@XiaoHongbo-Hope
Copy link
Contributor Author

Thank you @Xiaobenbenben for contributing this.

I am very sorry to miss your discussion here, very glad to contribute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants