fix iceberg namespace for rest catalog#1086
Conversation
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7f9573fa8c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| auto table_name = stream_name.substr(pos + 1); | ||
| auto namespace_name = fmt::format("{}.{}", getDatabaseName(), stream_name.substr(0, stream_name.size() - table_name.size() - 1)); | ||
| return {std::move(namespace_name), std::move(table_name)}; |
There was a problem hiding this comment.
Account for nested namespaces when checking emptiness
The new mapping splits dotted stream names into sub-namespaces (e.g., db.a.b + table c). However DatabaseApacheIceberg::empty() still calls getCatalog()->empty(getDatabaseName()), which only checks the base namespace and won’t see tables in db.* sub-namespaces. This means DROP/DETACH DATABASE (see DatabaseCatalog.cpp’s check_empty path) can succeed even when all tables live under nested namespaces, leaving Iceberg tables orphaned and the database state inconsistent. Consider updating empty() (and any related drop guards) to enumerate sub-namespaces or otherwise account for the new namespace mapping.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Not support substreams currently as REST API limits. Show tables command has the same problem. May improve this in future. Drop database does not rely on this so have no impact
Fix #1083
Fix the inconsistency in mapping proton stream name to Iceberg namespace name. And the issue in encoding multipart namespace ('.' -> "%1F")
For example, Proton stream `a`.`b.c.d` is with namespace `a.b.c` and table name `d` in Iceberg.
It is a little strange namespace is part from database name and part from the prefix of stream name. In ClickHouse, it put all the namespace in table name prefix (`a`.`b.c.d` is to namespace `b.c` and table name `d`. database name `a` is just to identify catalog)