Parquet metadata cache v2 #98140
Conversation
…dge case where we have the metadata cache turned on, but we are not using the native v3 reader
src/Formats/FormatFactory.cpp
Outdated
size_t max_parsing_threads = parser_shared_resources->getParsingThreadsPerReader();
bool parallel_parsing = max_parsing_threads > 1 && settings[Setting::input_format_parallel_parsing]
-    && creators.file_segmentation_engine_creator && !creators.random_access_input_creator && !need_only_count;
+    && creators.file_segmentation_engine_creator && !(creators.random_access_input_creator && creators.random_access_input_creator_with_metadata)
Let's replace creators.random_access_input_creator_with_metadata with (creators.random_access_input_creator_with_metadata && metadata.has_value())
I see, we could make this check stricter, like this:
bool parallel_parsing = max_parsing_threads > 1
&& settings[Setting::input_format_parallel_parsing]
&& creators.file_segmentation_engine_creator
&& !(creators.random_access_input_creator
&& creators.random_access_input_creator_with_metadata
&& metadata.has_value())
&& !need_only_count;
We are testing to see if we should use parallel parsing. I'm not sure that the presence or absence of object metadata makes that decision any clearer or more correct
LOG_TRACE(lambda_logger, "using arrow reader in ParquetBlockInputFormat without metadata cache");
return std::make_shared<ParquetBlockInputFormat>(
    buf,
    std::make_shared<const Block>(sample),
    settings,
    std::move(parser_shared_resources),
    std::move(format_filter_info),
-    min_bytes_for_seek);
+    min_bytes_for_seek
The problem is not resolved by removing the logical error. We still call the function that doesn't use object metadata in registerRandomAccessInputFormatWithMetadata. It is redundant; we have the same code below. We should still throw a logical error if we end up in this branch.
The problem is presumably that here you use settings.parquet.use_native_reader_v3, while in the code above you use context_->getSettingsRef()[Setting::input_format_parquet_use_native_reader_v3].
Revert #89750 (parquet footer cache), which caused a LOGICAL_ERROR exception when reading Parquet from S3 with the default settings (use_native_reader_v3 = false). The metadata-aware format creator only handled the use_native_reader_v3 path and threw a LOGICAL_ERROR on the default Arrow reader path, triggered whenever S3 storage provided blob metadata.
I thought Alexey reverted this because we need a usable format creator in the case that the parquet metadata cache is turned on, but we don't have object store metadata. That makes sense to me. If, for some reason, we don't get object store metadata in the response, we still need to read the parquet file, and thus, we need some kind of format creator for reading.
I suppose Alexey reverted it just because we encountered the logical error. Why can we have the metadata cache without object_metadata? It seems redundant, because the cache is unusable without metadata.
Yes, it was only due to a logical error.
There is a really strange diff in SettingsChangesHistory; my bad that I overlooked it, I will try to fix it.
…check more restrictive, and setting previous value for use_parquet_metadata_cache to false
size_t ParquetMetadataCacheKeyHash::operator()(const ParquetMetadataCacheKey & key) const
{
    return std::hash<String>{}(key.file_path) ^ std::hash<String>{}(key.etag);
}
While it's not too wrong for now, the code smells of bad practice.
See slide 11 here: https://presentations.clickhouse.com/2017-hash_tables/?full#11
Also, we shouldn't be using std::hash. Use methods from Hash.h
…and improving hash calculation of cache keys
    ProfileEvents::increment(ProfileEvents::ParquetMetadataCacheHits);
}
return result.first->metadata;
}
Cache returns large metadata struct by value
Medium Severity
getOrSetMetadata returns parquet::format::FileMetaData by value, causing a full deep copy of the thrift metadata struct on every cache hit. For files with many row groups and columns, this metadata can be substantial. The whole point of the cache is to avoid re-downloading, but the copy overhead on each access partially defeats the purpose.
I guess this is true, but it seems like a pretty large refactor. Maybe we could attempt this in another PR? The part that gives me pause is here https://github.com/ClickHouse/ClickHouse/pull/98140/changes/BASE..b8ba83a5537d6c235d33919156cee5868f93fba6#diff-0b5a953b45a537794f28b2c4c4822e1f884b496c68fc5d05697d512958dd2791R101 in the native reader's initializeIfNeeded:
{
std::lock_guard lock(reader_mutex);
reader.emplace();
reader->reader.prefetcher.init(in, read_options, parser_shared_resources);
reader->reader.file_metadata = getFileMetadata(reader->reader.prefetcher);
reader->reader.init(read_options, getPort().getHeader(), format_filter_info);
reader->init(parser_shared_resources, buckets_to_read ? std::optional(buckets_to_read->row_group_ids) : std::nullopt);
}
We are assigning the file_metadata from our cache lookup. I think we would need to refactor the reader to deal with a pointer as well.
    min_bytes_for_seek);
throw Exception(
    ErrorCodes::LOGICAL_ERROR,
    "Implementation of ParquetBlockInputFormat using arrow reader didn't require blob metadata for initialization");
Metadata-aware creator throws for non-v3 Parquet reader
Medium Severity
The random_access_input_creator_with_metadata lambda for Parquet throws LOGICAL_ERROR when use_native_reader_v3 is false. In getInputImpl, this creator is unconditionally preferred whenever object_with_metadata has a value, with no check on the reader version. The current single call site in StorageObjectStorageSource guards against this, but the public getInputWithMetadata API has no such protection, making any future caller that passes metadata for non-v3 Parquet reads crash.
if (settings.parquet.use_native_reader_v3)
{
    LOG_TRACE(lambda_logger, "using native reader v3 in ParquetBlockInputFormat with metadata cache");
    ParquetMetadataCachePtr metadata_cache = CurrentThread::getQueryContext()->getParquetMetadataCache();
It is not guaranteed that there will be a query context: https://pastila.nl/?001a755c/6964bb8d5180dc5249a29874fd0605f2#KuZwHaHJq+YWh3d8naQn0Q==GCM
And it crashes.
…t_metadata_cache_v2 Parquet metadata cache v2
…data_cache_261 Antalya 26.1 Backport of ClickHouse#98140, ClickHouse#99230, ClickHouse#99231 and ClickHouse#96545 - Parquet metadata cache (upstream impl) and arrow library version bump


Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Added a new SLRU cache for Parquet metadata to improve read performance by removing the need to re-download files just to read metadata.
Resolves #89102.
Documentation entry for user-facing changes
This pull request adds a new SLRU cache for parquet file metadata. When a file is read for the first time, we capture the file metadata, which includes the schema information. On subsequent reads, we can look the file up in the cache and access its metadata directly. We use the filename and etag from the cloud storage response as cache keys, so we can cache updated schemas for the same file.
The cache is enabled with use_parquet_metadata_cache=1 and can be dropped with SYSTEM CLEAR PARQUET METADATA CACHE. Dropping a table backed by a parquet file will not drop the corresponding cache entry. All cache invalidation is handled by SLRU in the parent class CacheBase.h. Even if the cache is enabled, it only works with the native v3 parquet reader.
Note
Medium Risk
Introduces a new global cache and threads it through Parquet read paths (especially object storage), which can affect memory usage and correctness of metadata invalidation (etag-keyed) as well as format factory selection logic.
Overview
Adds a new global Parquet metadata SLRU cache (defaults + server settings) and wires it into Parquet native v3 reads so metadata can be reused instead of re-reading file footers, keyed by object path + etag.
Extends FormatFactory with metadata-aware random-access creators (getInputWithMetadata/registerRandomAccessInputFormatWithMetadata) and updates object-storage reads to pass object metadata when use_parquet_metadata_cache=1 (and the v3 reader is enabled).
Adds observability/ops hooks: new ProfileEvents/CurrentMetrics, new privilege + parser support for SYSTEM DROP PARQUET METADATA CACHE, and documentation/tests covering cache hit/miss and disabling in existing S3 Parquet tests.
Written by Cursor Bugbot for commit 9ae55b9.