Introduce async prefetch and staleness for Iceberg metadata#96191
Introduce async prefetch and staleness for Iceberg metadata#96191
Conversation
ecff9f4 to
f2590fd
Compare
daffef9 to
6490f8f
Compare
8509521 to
313625c
Compare
b644b4e to
fdf66a2
Compare
8fced2c to
0da9152
Compare
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp
Outdated
Show resolved
Hide resolved
24aa78d to
655f83f
Compare
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadataFilesCache.h
Outdated
Show resolved
Hide resolved
655f83f to
9c56b43
Compare
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp
Outdated
Show resolved
Hide resolved
9c56b43 to
2310269
Compare
| struct LatestMetadataVersion | ||
| { | ||
| /// time when it's been received from the remote catalog and cached | ||
| std::chrono::time_point<std::chrono::system_clock> cached_at; |
There was a problem hiding this comment.
Using system_clock to enforce iceberg_metadata_staleness_seconds is unsafe when wall clock moves backward/forward (NTP step, VM resume). This can make stale metadata appear fresh for much longer (or expire too early), violating the staleness contract. Please switch this age check to steady_clock (store cached_at with steady_clock::time_point and compare against steady_clock::now()).
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp
Outdated
Show resolved
Hide resolved
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp
Outdated
Show resolved
Hide resolved
| /// and after parsing it, we fetch manifest lists, parse and cache them | ||
| auto ctx = Context::getGlobalContextInstance()->getBackgroundContext(); | ||
| auto [actual_data_snapshot, actual_table_state_snapshot] = getRelevantState(ctx, true); | ||
| if (actual_data_snapshot) |
There was a problem hiding this comment.
backgroundMetadataRefresherThread dereferences Context::getGlobalContextInstance() without checking for nullptr. During shutdown, this can race with global context teardown and turn a recoverable refresh failure into an exception/segfault in a background task.
Please guard this path before calling getBackgroundContext (e.g. if (auto * global = Context::getGlobalContextInstance()) ... else return;) and skip refresh when global context is unavailable.
…sh' to 'prefetch'
1cf881f to
85f9e32
Compare
…for global context
LLVM Coverage Report
PR changed lines: PR changed-lines coverage: 71.31% (256/359, 0 noise lines excluded) |
|
Hi @arsenmuk, |
Thank you @alexbakharew, it's done now |
…ache-preheat-and-staleness-for-iceberg Introduce async prefetch and staleness for Iceberg metadata
| assert 0 < int(s3_read) | ||
| assert 0 < int(s3_get) | ||
| assert 0 < int(s3_head) | ||
| assert 0 < int(s3_list) | ||
| assert 0 < int(cache_hit) # old manifest lists & files are found in cache | ||
| assert 0 < int(cache_miss) # new manifest lists & files are not found in local cache | ||
| assert 0 < int(cache_stale_miss) # the cached metadata has been considered stale |
There was a problem hiding this comment.
Can we compare new values with ones which we got from previous check? (line 98).
There was a problem hiding this comment.
It may happen that the values from ProfileEvents are not changed here and we will not be able to see it
There was a problem hiding this comment.
(we have discussed that separately) not an issue because the metrics used are query-specific and not system-wide
Antalya 26.1 Backport of ClickHouse#96191 - Introduce async prefetch and staleness for Iceberg metadata
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
iceberg_metadata_async_prefetch_period_msat the table creation. E.g.:iceberg_metadata_staleness_msparameter, which would allow ClickHouse to rely on the cache version of the metadata if it's fresher than the specified staleness. Otherwise, the remote Iceberg catalog will be queried for the latest metadata in order to process the request (how it worked before). With this change, we're able to eliminate calls to Iceberg catalog down to 0 during request processing, which is expected to bring a visible performance gain. Example:Similar functionality is available at:
Bechmarks:
Closes #90387
Documentation entry for user-facing changes
Note
Medium Risk
Touches Iceberg metadata resolution and adds background scheduling threads, which can affect correctness (stale reads) and load (background remote fetches) if misconfigured; changes are scoped to Iceberg and guarded by new settings.
Overview
Adds staleness-aware Iceberg metadata resolution: new query setting
iceberg_metadata_staleness_secondsallows using a recently refreshed cached “latest metadata” pointer, otherwise forces a remote catalog fetch; introducesIcebergMetadataFilesCacheStaleMissesto observe stale-cache fallbacks.Implements optional async cache preheating for Iceberg tables via new table/storage setting
iceberg_metadata_async_refresh_period_msand a dedicatedContext::getIcebergSchedulePool()(configurable with server settingiceberg_background_schedule_pool_sizeplus newCurrentMetrics/thread name), periodically fetching latest metadata and manifest files; Iceberg writes now invalidate cached “latest” entries after committing updates. Integration tests cover stale vs latest reads, background refresh behavior, and cache behavior across inserts.Written by Cursor Bugbot for commit 2310269. This will update automatically on new commits. Configure here.