docs/en/engines/table-engines/integrations/iceberg.md (+46 −18)
@@ -10,7 +10,7 @@ doc_type: 'reference'
# Iceberg table engine {#iceberg-table-engine}

:::warning
We recommend using the [Iceberg Table Function](/sql-reference/table-functions/iceberg.md) for working with Iceberg data in ClickHouse. The Iceberg Table Function currently provides sufficient functionality, offering a partial read-only interface for Iceberg tables.

The Iceberg Table Engine is available but may have limitations. ClickHouse wasn't originally designed to support tables with externally changing schemas, which can affect the functionality of the Iceberg Table Engine. As a result, some features that work with regular tables may be unavailable or may not function correctly, especially when using the old analyzer.
@@ -78,7 +78,7 @@ Table engine `Iceberg` is an alias to `IcebergS3` now.
At the moment, ClickHouse can read Iceberg tables whose schema has changed over time. We currently support reading tables where columns have been added and removed, and their order has changed. You can also change a column where a value is required to one where NULL is allowed. Additionally, we support permitted type casting for simple types, namely:

* int -> long
* float -> double
* decimal(P, S) -> decimal(P', S) where P' > P

Currently, it is not possible to change nested structures or the types of elements within arrays and maps.
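The promotion rules above can be sketched as a small predicate. This is an illustrative Python sketch of the listed rules only (the function name and the string-based type encoding are my own, not a ClickHouse API):

```python
import re

# Permitted primitive promotions, per the rules listed above:
#   int -> long, float -> double,
#   decimal(P, S) -> decimal(P', S) where P' > P (scale unchanged).
def can_promote(src: str, dst: str) -> bool:
    if (src, dst) in {("int", "long"), ("float", "double")}:
        return True
    ms = re.fullmatch(r"decimal\((\d+),\s*(\d+)\)", src)
    md = re.fullmatch(r"decimal\((\d+),\s*(\d+)\)", dst)
    if ms and md:
        # precision must grow, scale must stay the same
        return int(md.group(1)) > int(ms.group(1)) and ms.group(2) == md.group(2)
    return False

print(can_promote("decimal(10, 2)", "decimal(12, 2)"))  # True
print(can_promote("long", "int"))                       # False
```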
@@ -94,20 +94,20 @@ ClickHouse supports time travel for Iceberg tables, allowing you to query histor
## Processing of tables with deleted rows {#deleted-rows}

Currently, only Iceberg tables with [position deletes](https://iceberg.apache.org/spec/#position-delete-files) are supported.

The following deletion methods are **not supported**:
- [Deletion vectors](https://iceberg.apache.org/spec/#deletion-vectors) (introduced in v3)
### Basic usage {#basic-usage}
```sql
SELECT * FROM example_table ORDER BY 1
SETTINGS iceberg_timestamp_ms = 1714636800000
```

```sql
SELECT * FROM example_table ORDER BY 1
SETTINGS iceberg_snapshot_id = 3547395809148285433
```
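For reference, `iceberg_timestamp_ms` is a Unix timestamp in milliseconds (UTC). A quick way to derive such a value, sketched in Python (the helper name is hypothetical, not part of any ClickHouse tooling):

```python
from datetime import datetime, timezone

# Convert a UTC instant to the millisecond epoch value expected
# by iceberg_timestamp_ms. Helper name is illustrative only.
def to_iceberg_timestamp_ms(dt: datetime) -> int:
    return int(dt.timestamp() * 1000)

ts = to_iceberg_timestamp_ms(datetime(2024, 5, 2, 8, 0, 0, tzinfo=timezone.utc))
print(ts)  # 1714636800000
```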
@@ -132,21 +132,21 @@ Consider this sequence of operations:
```sql
-- Create a table with two columns
CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example (
```
@@ -192,10 +192,10 @@ A time travel query at a current moment might show a different schema than the c
```sql
-- Create a table
CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example_2 (
    order_number int,
    product_code string
)
USING iceberg
OPTIONS ('format-version'='2')

-- Insert initial data into the table
```
@@ -234,10 +234,10 @@ The second one is that while doing time travel you can't get state of table befo
```sql
-- Create a table
CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example_3 (
    order_number int,
    product_code string
)
USING iceberg
OPTIONS ('format-version'='2');

ts = now();
```
@@ -275,9 +275,9 @@ After identifying candidate files using the above rules, the system determines w
* The file with the highest version number is selected
* (Version appears as `V` in filenames formatted as `V.metadata.json` or `V-uuid.metadata.json`)
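The selection rule above can be illustrated with a short Python sketch (illustrative only; the regex and function are assumptions, not ClickHouse's actual implementation):

```python
import re

# Pick the candidate metadata file with the highest version V from names
# shaped like "V.metadata.json" or "V-uuid.metadata.json" (assumed naming).
def pick_latest_metadata(filenames):
    pattern = re.compile(r"^(\d+)(?:-[0-9a-f-]+)?\.metadata\.json$")
    best, best_version = None, -1
    for name in filenames:
        m = pattern.match(name)
        if m and int(m.group(1)) > best_version:
            best_version, best = int(m.group(1)), name
    return best

print(pick_latest_metadata([
    "1.metadata.json",
    "3-9a1f2c3d-aaaa-bbbb-cccc-111122223333.metadata.json",
    "2.metadata.json",
]))  # 3-9a1f2c3d-aaaa-bbbb-cccc-111122223333.metadata.json
```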
**Note**: All mentioned settings (unless explicitly specified otherwise) are engine-level settings and must be specified during table creation as shown below:
The `Iceberg` table engine and table function support a metadata cache storing the information of manifest files, manifest lists, and metadata JSON. The cache is stored in memory. This feature is controlled by the setting `use_iceberg_metadata_files_cache`, which is enabled by default.

Asynchronous metadata prefetching can be enabled at `Iceberg` table creation by setting `iceberg_metadata_async_prefetch_period_ms`. If it is set to 0 (the default) or if metadata caching is not enabled, asynchronous prefetching is disabled.
To enable this feature, a non-zero value in milliseconds should be given; it represents the interval between prefetching cycles.

If enabled, the server runs a recurring background operation that lists the remote catalog to detect new metadata versions. It then parses the new metadata and recursively walks the snapshot, fetching active manifest list files and manifest files. Files already available in the metadata cache won't be downloaded again. At the end of each prefetching cycle, the latest metadata snapshot is available in the metadata cache.

```sql
CREATE TABLE example_table ENGINE = Iceberg(
    's3://bucket/path/to/iceberg_table'
) SETTINGS
    iceberg_metadata_async_prefetch_period_ms = 60000;
```

To make the most of asynchronous metadata prefetching during read operations, the `iceberg_metadata_staleness_ms` parameter should be specified as a query or session setting. By default (0, i.e. not specified), the server fetches the latest metadata from the remote catalog in the context of each query. By specifying a tolerance for metadata staleness, the server is allowed to use the cached version of the metadata snapshot without calling the remote catalog: if a metadata version is in the cache and has been downloaded within the given staleness window, it will be used to process the query. Otherwise, the latest version will be fetched from the remote catalog.

```sql
SELECT count() FROM icebench_table WHERE ...
SETTINGS iceberg_metadata_staleness_ms = 120000
```

**Note**: Asynchronous metadata prefetching runs on `ICEBERG_SCHEDULE_POOL`, a server-side thread pool for background operations on active `Iceberg` tables. The size of this thread pool is controlled by the `iceberg_background_schedule_pool_size` server configuration parameter (default is 10).

**Note**: The current expectation is that the metadata cache is large enough to hold the latest metadata snapshot in full for all active tables when asynchronous prefetching is enabled.
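The staleness decision described above can be summarized in a small Python sketch (an assumed model of the documented behavior, not the actual ClickHouse implementation):

```python
# Use the cached metadata snapshot only if it was downloaded within
# the iceberg_metadata_staleness_ms window; 0 means always refetch.
def use_cached_metadata(cached_at_ms: int, now_ms: int, staleness_ms: int) -> bool:
    if staleness_ms == 0:  # default: always fetch the latest metadata
        return False
    return now_ms - cached_at_ms <= staleness_ms

now = 1_000_000
print(use_cached_metadata(now - 90_000, now, 120_000))   # True: within window
print(use_cached_metadata(now - 150_000, now, 120_000))  # False: stale, refetch
```

A cache hit that falls outside the window would be counted as a stale miss (see the `IcebergMetadataFilesCacheStaleMisses` profile event added below).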
src/Common/ProfileEvents.cpp (+1 −0)
@@ -100,6 +100,7 @@
  M(PrimaryIndexCacheMisses, "Number of times an entry has not been found in the primary index cache, so we had to load an index file in memory, which is a costly operation, adding to query latency.", ValueType::Number) \
  M(IcebergMetadataFilesCacheHits, "Number of times iceberg metadata files have been found in the cache.", ValueType::Number) \
  M(IcebergMetadataFilesCacheMisses, "Number of times iceberg metadata files have not been found in the iceberg metadata cache and had to be read from (remote) disk.", ValueType::Number) \
+ M(IcebergMetadataFilesCacheStaleMisses, "Number of times iceberg metadata files have been found in the cache, but were considered stale and had to be read from (remote) disk.", ValueType::Number) \
  M(IcebergMetadataFilesCacheWeightLost, "Approximate number of bytes evicted from the iceberg metadata cache.", ValueType::Number) \
  M(IcebergMetadataReadWaitTimeMicroseconds, "Total time data readers spend waiting for iceberg metadata files to be read and parsed, summed across all reader threads.", ValueType::Microseconds) \
  M(IcebergIteratorInitializationMicroseconds, "Total time spent on synchronous initialization of iceberg data iterators.", ValueType::Microseconds) \
src/Core/ServerSettings.cpp (+1 −0)
@@ -1138,6 +1138,7 @@ The policy on how to perform a scheduling of CPU slots specified by `concurrent_
  DECLARE(UInt64, threadpool_writer_queue_size, 10000, R"(Number of tasks which is possible to push into background pool for write requests to object storages)", 0) \
  DECLARE(UInt64, iceberg_catalog_threadpool_pool_size, 50, R"(Size of background pool for iceberg catalog)", 0) \
  DECLARE(UInt64, iceberg_catalog_threadpool_queue_size, 10000, R"(Number of tasks which is possible to push into iceberg catalog pool)", 0) \
+ DECLARE(UInt64, iceberg_background_schedule_pool_size, 10, "Size of thread pool to asynchronously fetch the latest metadata from a remote iceberg catalog; the pool is shared by all the active tables.", 0) \
  DECLARE(UInt64, drop_distributed_cache_pool_size, 8, R"(The size of the threadpool used for dropping distributed cache.)", 0) \
  DECLARE(UInt64, drop_distributed_cache_queue_size, 1000, R"(The queue size of the threadpool used for dropping distributed cache.)", 0) \
  DECLARE(Bool, distributed_cache_apply_throttling_settings_from_client, true, R"(Whether cache server should apply throttling settings received from client.)", 0) \
If non-zero, skip fetching iceberg metadata from the remote catalog if there is a cached metadata snapshot more recent than the given staleness window. Zero means always fetch the latest metadata version from the remote catalog. Setting this to a non-zero value trades staleness for lower latency of read operations.
)", 0) \
  DECLARE(Bool, use_query_cache, false, R"(
If turned on, `SELECT` queries may utilize the [query cache](../query-cache.md). Parameters [enable_reads_from_query_cache](#enable_reads_from_query_cache)
and [enable_writes_to_query_cache](#enable_writes_to_query_cache) control in more detail how the cache is used.