refactor(puffin): Move puffin crate contents inside iceberg crate#789
Conversation
Xuanwo
left a comment
There was a problem hiding this comment.
A concern is whether all Iceberg users will require Puffin support. It's important to note that users who only access the catalog will also need to pull the Iceberg core. However, I don't have a strong opinion on this.
I'll approve this to unblock the PR from merging. This decision can be left to @Fokko and @liurenjie1024.
|
I can see value in both. It could be that folks just want to have logic to read the Puffin files, and then they could just use the crate (for example PyIceberg) without having to pull in core. Not super strong on this either, if we feel like we should just move it into the core crate, than that's okay with me as well 👍 |
|
Thank you, @Fokko, for the feedback. Since neither of us has a strong opinion on this and the suggestion comes from @liurenjie1024 with agreement from @fqaiser94, I believe there’s no need to wait for another round of review. This change can easily be reverted at any time if we find it to be incorrect. So, let’s move forward. |
|
The reason why I suggest moving puffin into core crate is to avoid circulate dependency problem, thinking about the case we put it outside:
Usually I'm leaning toward to smaller crates so that user could choose a minimum set of dependencies. But if we really want to do this, we need to split core crate into smaller crates, for example file io, data reader/writer. |
|
That makes sense @liurenjie1024 thanks for the explanation 👍 Coming from Java/Python, everything in Rust is small anyway, so I think we're good by moving Puffin into Core. |
Part of #744
Summary
Context
@liurenjie1024 raised a good question about why we needed a separate puffin crate at all.
On reflection, I realized that there's no particularly strong reason to have a separate puffin crate.
Hence this PR.
I don't have a super strong opinion on this change; happy to leave things as-is if people object.
I was however persuaded to make this change when I noticed that in the Java implementation, the Puffin functionality is packaged as part of the iceberg-core module (rather than as a separate module).