Skip to content

Avoid possible crash in Parquet with metadata cache enabled#99231

Merged
azat merged 2 commits intoClickHouse:masterfrom
azat:fix-parquet-cache-crash
Mar 11, 2026
Merged

Avoid possible crash in Parquet with metadata cache enabled#99231
azat merged 2 commits intoClickHouse:masterfrom
azat:fix-parquet-cache-crash

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Mar 10, 2026

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Note

Medium Risk
Changes FormatFactory’s metadata-aware random-access input creator signature to pass an explicit Context, which can affect any format registered with this creator and cause build/runtime issues if not updated. Behavior change is targeted but touches format creation plumbing used by background ingestion paths (e.g., S3Queue).

Overview
Fixes a crash when reading Parquet with metadata cache from background threads that don’t have a CurrentThread query context (e.g., S3Queue).

FormatFactory now passes the Context into RandomAccessInputCreatorWithMetadata, and Parquet’s metadata-aware creator switches from CurrentThread::getQueryContext()->getParquetMetadataCache() to context->getParquetMetadataCache().

Adds a stateless regression test (04034_parquet_v3_metadata_cache_no_query_context) that exercises S3Queue reading Parquet from S3 with native reader v3 + metadata cache enabled.

Written by Cursor Bugbot for commit 423ed0f. This will update automatically on new commits. Configure here.

@azat azat mentioned this pull request Mar 10, 2026
1 task
@azat azat requested a review from divanik March 10, 2026 23:08
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 10, 2026

Workflow [PR], commit [423ed0f]

Summary:


AI Review

1) Summary

This PR fixes a real correctness issue in background Parquet reads by switching metadata-cache access from CurrentThread::getQueryContext() to the explicit Context passed through FormatFactory, and adds a targeted regression test for S3Queue + Parquet v3 + metadata cache. High-level verdict: the change is small, focused, and correct; I found no blockers or majors.

2) Missing context (if any)

  • Full CI has not completed yet at review time (only early workflow artifacts/check states were available).
  • No dedicated performance benchmark results were provided (likely unnecessary for this small fix, but still absent).

3) Findings (by severity)

Blockers

  • None.

Majors

  • None.

The PR looks good.

4) Tests & Evidence

  • Positive coverage: new stateless test 04034_parquet_v3_metadata_cache_no_query_context.sh reproduces the background-thread path (S3Queue) and validates rows are fully ingested (count=100, sum=4950).
  • Regression relevance: test specifically exercises the previously failing context access pattern with input_format_parquet_use_native_reader_v3=1 and use_parquet_metadata_cache=1.
  • Negative/error-path coverage: no explicit negative assertions were added (e.g., cache disabled / missing metadata), but the added regression test is appropriately targeted to the fixed path.
  • Suggested optional follow-up tests: add one case with use_parquet_metadata_cache=0 for the same ingestion path to ensure non-cache behavior remains unchanged.

5) ClickHouse Compliance Checklist (Yes/No + short note)

  • Data deletions logged? Yes — no new deletion paths introduced.
  • Serialization formats versioned? Yes (N/A) — no serialization/protocol changes.
  • Experimental setting gate present? Yes (N/A) — no new feature introduced.
  • Settings exposed for constants/thresholds? Yes (N/A) — no new constants introduced in core logic.
  • Backward compatibility preserved? Yes — behavior change is a bug fix in context lookup only.
  • SettingsHistory.cpp updated for new/changed settings? Yes (N/A) — no settings added/changed.
  • Existing tests untouched (only additions)? Yes — only one new test + reference added.
  • Docs/user-facing notes updated? Yes (N/A) — internal bug fix with test; no user-facing syntax/behavior docs needed.
  • Core-area change got extra scrutiny? Yes — reviewed FormatFactory plumbing and Parquet input path.

6) Performance & Safety Notes

  • Hot-path impact is negligible: one pointer access source changed (CurrentThread query context -> explicit Context) without added loops/allocations.
  • Safety improves in background threads by removing dependence on thread-local query context availability.
  • No new concurrency primitives introduced; no additional memory/resource lifetime risks observed.
  • Benchmarking: not provided; a minimal sanity benchmark could compare Parquet read throughput with metadata cache on/off for unchanged performance expectations.

7) User-Lens Review

  • The fix is intuitive and robust: background consumers now use the correct context object instead of assuming query context exists.
  • Operator experience should improve because this removes an exception-prone path in asynchronous ingestion threads.
  • Logging/error behavior is unchanged and remains actionable.

8) Final Verdict

  • Status: Approve

@clickhouse-gh clickhouse-gh bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label Mar 10, 2026
Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No tests, no explanation, no link to CI?

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@alexey-milovidov alexey-milovidov self-assigned this Mar 10, 2026
@azat
Copy link
Copy Markdown
Member Author

azat commented Mar 11, 2026

@azat azat enabled auto-merge March 11, 2026 08:49
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 11, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 83.80% 83.80% +0.00%
Functions 23.80% 23.90% +0.10%
Branches 76.30% 76.40% +0.10%

PR changed lines: PR changed-lines coverage: 100.00% (21/21)
Diff coverage report
Uncovered code

@azat azat added this pull request to the merge queue Mar 11, 2026
Merged via the queue into ClickHouse:master with commit 1d77df5 Mar 11, 2026
475 of 482 checks passed
@azat azat deleted the fix-parquet-cache-crash branch March 11, 2026 20:03
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 11, 2026
@alexey-milovidov
Copy link
Copy Markdown
Member

Thanks for the test and the explanation!

@azat azat changed the title Avoid crash possible crash in Parquet with metadata cache enabled Avoid possible crash in Parquet with metadata cache enabled Mar 25, 2026
arthurpassos pushed a commit to Altinity/ClickHouse that referenced this pull request Mar 25, 2026
Avoid crash possible crash in Parquet with metadata cache enabled
zvonand added a commit to Altinity/ClickHouse that referenced this pull request Mar 31, 2026
…data_cache_261

Antalya 26.1 Backport of ClickHouse#98140, ClickHouse#99230, ClickHouse#99231 and ClickHouse#96545 - Parquet metadata cache (upstream impl) and arrow library version bump
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants