Introduce STALENESS for Iceberg engine #90387

@alesapin

An Iceberg table consists of metadata files (.json and .avro) and data files (.parquet in 99% of cases). All files in an Iceberg table are immutable. This property makes them very convenient to cache, which ClickHouse already does extensively:

  1. Cache for parsed metadata files: Support Iceberg Metadata Files Cache #77156
  2. Cache for parquet footers: Parquet footer cache #89750
  3. Filesystem on-disk cache for all objects in Object Storage
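
To illustrate why immutability makes caching simple, here is a minimal sketch (not ClickHouse code). Because a file at a given path never changes, a cache keyed by path never needs invalidation, only eviction. The class `ImmutableFileCache` and the loader `fake_object_storage` are hypothetical names used for illustration only.

```python
from typing import Callable, Dict


class ImmutableFileCache:
    """Cache keyed by object path. Entries are never invalidated,
    because an immutable file at a given path never changes."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch              # hypothetical loader that reads object storage
        self._entries: Dict[str, bytes] = {}
        self.fetch_count = 0             # how many times object storage was touched

    def get(self, path: str) -> bytes:
        if path not in self._entries:
            self.fetch_count += 1
            self._entries[path] = self._fetch(path)
        return self._entries[path]


def fake_object_storage(path: str) -> bytes:
    # Stand-in for a real object storage read (assumption for the example).
    return f"contents of {path}".encode()


cache = ImmutableFileCache(fake_object_storage)
first = cache.get("metadata/v3.metadata.json")
second = cache.get("metadata/v3.metadata.json")  # served from the cache
```

The second `get` never reaches the loader. What such a cache cannot answer is whether a newer metadata.json has appeared.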

However, when we execute a new SELECT query we still cannot avoid touching object storage (or the catalog), because we need to check for new metadata.json files. So even if 100% of the data is already cached, we still spend a lot of time calling an external service. Yet it is quite common that querying the most recent data is not required and some staleness is acceptable. In fact, for the ReplicatedMergeTree table engine, some unpredictable staleness (replication lag) is the default mode of SELECT query execution.

The idea is to introduce a setting iceberg_read_staleness_seconds=xxx for the Iceberg table engine or table function. If this setting is specified, the table will have background thread(s) that periodically and proactively check the table state and put metadata files into the cache. The time of the last check is recorded, and if the time elapsed since that check is less than the value specified in the setting, we serve the query entirely from the latest cached metadata.json.
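
The staleness check described above could look roughly like the following sketch. It is only an illustration under stated assumptions, not the proposed implementation: the names `StalenessAwareMetadata`, `refresh`, `metadata_for_query`, and `resolve_latest` are hypothetical, the background thread is replaced by an explicit `refresh()` call, and an injectable clock stands in for real time so the behavior is easy to follow.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CachedTableState:
    metadata_path: str   # latest metadata.json found by the last check
    checked_at: float    # clock value at the time of the last check


class StalenessAwareMetadata:
    def __init__(
        self,
        resolve_latest: Callable[[], str],          # hypothetical call to storage/catalog
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve_latest = resolve_latest
        self._clock = clock
        self._state: Optional[CachedTableState] = None
        self.remote_checks = 0                      # counts trips to the external service

    def refresh(self) -> CachedTableState:
        """What a background thread would call periodically."""
        self.remote_checks += 1
        self._state = CachedTableState(self._resolve_latest(), self._clock())
        return self._state

    def metadata_for_query(self, staleness_seconds: Optional[float]) -> str:
        state = self._state
        if (
            staleness_seconds is not None
            and state is not None
            and self._clock() - state.checked_at < staleness_seconds
        ):
            # Last check is recent enough: no remote call at all.
            return state.metadata_path
        # Setting not specified or last check too old: check the table state now.
        return self.refresh().metadata_path


# Usage with a fake clock and a fake list of metadata versions.
now = [0.0]
versions = ["v1.metadata.json"]
table = StalenessAwareMetadata(lambda: versions[-1], clock=lambda: now[0])

table.refresh()                        # background check at t=0
versions.append("v2.metadata.json")    # a writer commits a new snapshot

now[0] = 5.0
within = table.metadata_for_query(staleness_seconds=10)   # served from cached state
checks_after_within = table.remote_checks

now[0] = 15.0
after = table.metadata_for_query(staleness_seconds=10)    # last check too old
```

At t=5 the query sees the older `v1.metadata.json` without any remote call, which is exactly the accepted staleness. At t=15 the last check is older than the limit, so the sketch falls back to a synchronous check and picks up `v2.metadata.json`.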

Reference: https://www.firebolt.io/blog/querying-apache-iceberg-with-sub-second-performance

Labels: comp-datalake (Data lake table formats (Iceberg/Delta/Hudi) integration), feature
