
Introduce async prefetch and staleness for Iceberg metadata #96191

Merged
arsenmuk merged 9 commits into master from arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg on Mar 18, 2026

@arsenmuk
Member

@arsenmuk arsenmuk commented Feb 6, 2026

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

  1. Iceberg tables can now asynchronously prefetch metadata into the cache. Enable this by setting iceberg_metadata_async_prefetch_period_ms at table creation, e.g.:
CREATE TABLE X (...)
ENGINE = Iceberg*(...)
SETTINGS iceberg_metadata_async_prefetch_period_ms = 60_000
  2. SELECT queries on Iceberg tables can now specify the iceberg_metadata_staleness_ms setting, which lets ClickHouse rely on the cached version of the metadata if it is fresher than the specified staleness. Otherwise, the remote Iceberg catalog is queried for the latest metadata in order to process the request (the previous behavior). With this change, calls to the Iceberg catalog during request processing can be reduced to zero, which is expected to bring a visible performance gain. Example:
SELECT count() FROM {TABLE_NAME} SETTINGS iceberg_metadata_staleness_ms=600000
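
The staleness rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: CachedPointer, ResolveStats and resolveLatestMetadata are hypothetical names, the remote catalog is replaced by a callback, and the clock value is passed in so the behavior is easy to reason about.

```cpp
#include <chrono>
#include <functional>
#include <optional>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical cached "latest metadata" pointer (not the PR's actual type).
struct CachedPointer
{
    std::string metadata_path;
    Clock::time_point cached_at;
};

// Hypothetical counters, loosely mirroring the idea of a stale-miss event.
struct ResolveStats
{
    int remote_fetches = 0;
    int stale_misses = 0;
};

// Use the cached pointer if it is fresher than `staleness`;
// otherwise ask the (simulated) remote catalog and re-cache the answer.
inline std::string resolveLatestMetadata(
    std::optional<CachedPointer> & cache,
    std::chrono::milliseconds staleness,
    Clock::time_point now,
    const std::function<std::string()> & fetch_from_catalog,
    ResolveStats & stats)
{
    if (cache && now - cache->cached_at < staleness)
        return cache->metadata_path;   // fresh enough: no remote call

    if (cache)
        ++stats.stale_misses;          // an entry existed but was too old

    ++stats.remote_fetches;
    cache = CachedPointer{fetch_from_catalog(), now};
    return cache->metadata_path;
}
```

With a staleness of zero the comparison is never true, so every call goes to the catalog, which matches the behavior before this change.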

Similar functionality is available at:

Benchmarks:

Closes #90387

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Note

Medium Risk
Touches Iceberg metadata resolution and adds background scheduling threads, which can affect correctness (stale reads) and load (background remote fetches) if misconfigured; changes are scoped to Iceberg and guarded by new settings.

Overview
Adds staleness-aware Iceberg metadata resolution: new query setting iceberg_metadata_staleness_seconds allows using a recently refreshed cached “latest metadata” pointer, otherwise forces a remote catalog fetch; introduces IcebergMetadataFilesCacheStaleMisses to observe stale-cache fallbacks.

Implements optional async cache preheating for Iceberg tables via new table/storage setting iceberg_metadata_async_refresh_period_ms and a dedicated Context::getIcebergSchedulePool() (configurable with server setting iceberg_background_schedule_pool_size plus new CurrentMetrics/thread name), periodically fetching latest metadata and manifest files; Iceberg writes now invalidate cached “latest” entries after committing updates. Integration tests cover stale vs latest reads, background refresh behavior, and cache behavior across inserts.
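
The "writes invalidate cached latest entries after committing" behavior might look roughly like the sketch below. It is only an illustration: LatestPointerCache and commitWrite are hypothetical names, and the remote catalog is reduced to a single string holding the latest metadata path.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical per-table cache of the "latest metadata" pointer.
class LatestPointerCache
{
public:
    void put(const std::string & table, const std::string & path) { entries[table] = path; }

    std::optional<std::string> get(const std::string & table) const
    {
        auto it = entries.find(table);
        if (it == entries.end())
            return std::nullopt;
        return it->second;
    }

    void invalidate(const std::string & table) { entries.erase(table); }

private:
    std::map<std::string, std::string> entries;
};

// Commit first, then drop the cached "latest" entry so that the next read
// cannot be served a pointer that predates this write.
inline void commitWrite(
    LatestPointerCache & cache,
    const std::string & table,
    std::string & remote_latest,      // stands in for the remote catalog state
    const std::string & new_metadata_path)
{
    remote_latest = new_metadata_path; // placeholder for the real commit
    cache.invalidate(table);
}
```

Invalidating after the commit, rather than before it, avoids a window in which a concurrent reader could re-cache the old pointer between invalidation and commit.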

Written by Cursor Bugbot for commit 2310269. This will update automatically on new commits. Configure here.

@clickhouse-gh
Contributor

clickhouse-gh bot commented Feb 6, 2026

Workflow [PR], commit [ea7d994]

Summary:

@clickhouse-gh clickhouse-gh bot added the pr-performance Pull request with some performance improvements label Feb 6, 2026
@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch 6 times, most recently from ecff9f4 to f2590fd Compare February 11, 2026 14:22
@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch 2 times, most recently from daffef9 to 6490f8f Compare February 24, 2026 16:18
@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch 4 times, most recently from 8509521 to 313625c Compare March 5, 2026 14:40
@divanik divanik self-assigned this Mar 9, 2026
@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch from b644b4e to fdf66a2 Compare March 10, 2026 08:18
@divanik divanik self-requested a review March 10, 2026 11:29
@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch 2 times, most recently from 8fced2c to 0da9152 Compare March 11, 2026 08:46
@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch 2 times, most recently from 24aa78d to 655f83f Compare March 11, 2026 10:30
@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch from 655f83f to 9c56b43 Compare March 11, 2026 11:00
@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch from 9c56b43 to 2310269 Compare March 11, 2026 15:05
@arsenmuk arsenmuk marked this pull request as ready for review March 11, 2026 15:08
struct LatestMetadataVersion
{
/// time when it's been received from the remote catalog and cached
std::chrono::time_point<std::chrono::system_clock> cached_at;
Contributor


Using system_clock to enforce iceberg_metadata_staleness_seconds is unsafe when wall clock moves backward/forward (NTP step, VM resume). This can make stale metadata appear fresh for much longer (or expire too early), violating the staleness contract. Please switch this age check to steady_clock (store cached_at with steady_clock::time_point and compare against steady_clock::now()).
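
A minimal sketch of the suggested change, assuming the entry only needs its age checked. LatestMetadataVersionSketch, its metadata_path field and isFreshEnough are hypothetical names for illustration and are not taken from the PR.

```cpp
#include <chrono>
#include <string>

// Hypothetical variant of the struct quoted above: the timestamp is taken
// from a monotonic clock, so NTP steps or VM resume cannot shift it.
struct LatestMetadataVersionSketch
{
    std::string metadata_path;  // assumed field, for illustration only
    /// monotonic time at which the entry was received from the catalog and cached
    std::chrono::steady_clock::time_point cached_at;
};

// Returns true when the entry's age is strictly below the allowed staleness.
// `now` is a parameter so the check can be exercised deterministically.
inline bool isFreshEnough(
    const LatestMetadataVersionSketch & entry,
    std::chrono::milliseconds max_staleness,
    std::chrono::steady_clock::time_point now = std::chrono::steady_clock::now())
{
    return now - entry.cached_at < max_staleness;
}
```

One trade-off: a steady_clock time point has no relation to wall-clock time, so if the cache time is also shown to users or written to logs, a separate system_clock timestamp would be needed for display.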


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

/// and after parsing it, we fetch manifest lists, parse and cache them
auto ctx = Context::getGlobalContextInstance()->getBackgroundContext();
auto [actual_data_snapshot, actual_table_state_snapshot] = getRelevantState(ctx, true);
if (actual_data_snapshot)
Contributor


backgroundMetadataRefresherThread dereferences Context::getGlobalContextInstance() without checking for nullptr. During shutdown, this can race with global context teardown and turn a recoverable refresh failure into an exception/segfault in a background task.

Please guard this path before calling getBackgroundContext (e.g. if (auto * global = Context::getGlobalContextInstance()) ... else return;) and skip refresh when global context is unavailable.
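
The guard requested here could be shaped roughly like this. The sketch uses hypothetical stand-in types (FakeContext, FakeGlobalHolder, refreshOnce) rather than ClickHouse's real Context classes, and assumes the global context is exposed as a shared pointer that becomes null during shutdown.

```cpp
#include <memory>

// Hypothetical stand-ins; these are not ClickHouse's real Context classes.
struct FakeContext
{
    std::shared_ptr<FakeContext> getBackgroundContext() const
    {
        return std::make_shared<FakeContext>();
    }
};

struct FakeGlobalHolder
{
    std::shared_ptr<FakeContext> instance;  // null once the server shuts down
    std::shared_ptr<FakeContext> getGlobalContextInstance() const { return instance; }
};

// One iteration of the background refresh.
// Returns false when the refresh was skipped because no context was available.
inline bool refreshOnce(const FakeGlobalHolder & holder, int & refresh_count)
{
    auto global = holder.getGlobalContextInstance();
    if (!global)
        return false;  // global context unavailable: skip instead of dereferencing

    auto ctx = global->getBackgroundContext();
    if (!ctx)
        return false;

    ++refresh_count;   // placeholder for fetching and caching metadata with ctx
    return true;
}
```

Holding the result in a local shared pointer also keeps the context alive for the duration of the iteration, assuming ownership is shared in this way.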

Member Author


Reasonable, done.

@arsenmuk arsenmuk force-pushed the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch from 1cf881f to 85f9e32 Compare March 18, 2026 09:39
@arsenmuk arsenmuk changed the title from "Introduce cache preheat and staleness for Iceberg" to "Introduce and staleness for Iceberg" Mar 18, 2026
@arsenmuk arsenmuk changed the title from "Introduce and staleness for Iceberg" to "Introduce async prefetch and staleness for Iceberg metadata" Mar 18, 2026
@clickhouse-gh
Contributor

clickhouse-gh bot commented Mar 18, 2026

LLVM Coverage Report

| Metric    | Baseline | Current | Δ      |
|-----------|----------|---------|--------|
| Lines     | 83.80%   | 83.70%  | -0.10% |
| Functions | 23.90%   | 23.90%  | +0.00% |
| Branches  | 76.30%   | 76.30%  | +0.00% |

PR changed-lines coverage: 71.31% (256/359, 0 noise lines excluded)
Diff coverage report
Uncovered code

@arsenmuk arsenmuk added this pull request to the merge queue Mar 18, 2026
Merged via the queue into master with commit ee017a6 Mar 18, 2026
163 checks passed
@arsenmuk arsenmuk deleted the arsenmuk/i90387-cache-preheat-and-staleness-for-iceberg branch March 18, 2026 18:53
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 18, 2026
@alexbakharew
Contributor

Hi @arsenmuk,
it looks like it is a New Feature rather than performance improvement. Could you please change the label?

@arsenmuk arsenmuk added pr-feature Pull request with new product feature and removed pr-performance Pull request with some performance improvements labels Mar 23, 2026
@arsenmuk
Member Author

> Hi @arsenmuk, it looks like it is a New Feature rather than performance improvement. Could you please change the label?

Thank you @alexbakharew, it's done now

mkmkme pushed a commit to Altinity/ClickHouse that referenced this pull request Mar 25, 2026
…ache-preheat-and-staleness-for-iceberg

Introduce async prefetch and staleness for Iceberg metadata
Comment on lines +167 to +173
assert 0 < int(s3_read)
assert 0 < int(s3_get)
assert 0 < int(s3_head)
assert 0 < int(s3_list)
assert 0 < int(cache_hit) # old manifest lists & files are found in cache
assert 0 < int(cache_miss) # new manifest lists & files are not found in local cache
assert 0 < int(cache_stale_miss) # the cached metadata has been considered stale
Contributor


Can we compare new values with ones which we got from previous check? (line 98).

Contributor


It may happen that the values from ProfileEvents are not changed here and we will not be able to see it

Member Author


(we have discussed that separately) not an issue because the metrics used are query-specific and not system-wide

zvonand added a commit to Altinity/ClickHouse that referenced this pull request Mar 26, 2026
Antalya 26.1 Backport of ClickHouse#96191 - Introduce async prefetch and staleness for Iceberg metadata

Labels

pr-feature: Pull request with new product feature
pr-synced-to-cloud: The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Introduce STALENESS for Iceberg engine

6 participants