automatically create minmax + uniq statistics for new columns #101275

Draft
hanfei1991 wants to merge 3 commits into ClickHouse:master from hanfei1991:hanfei/enable-auto-stats

@hanfei1991
Member

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

  • The MergeTree setting auto_statistics_types now defaults to "minmax, uniq": minmax and uniq statistics are created automatically for all suitable columns in new tables.
  • materialize_statistics_on_insert now defaults to false: statistics are built during merges rather than at INSERT time, which reduces insert overhead. Use SET materialize_statistics_on_insert = 1 to restore the old behavior.
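
For illustration, the two settings named above might be used as in the following minimal sketch. The table name `example_events` and its columns are hypothetical and not part of this PR; the setting names and values are the ones given in the changelog entry. The table-level clause only spells out the new default explicitly, and the session-level `SET` is the opt-out quoted in the entry.

```sql
-- Hypothetical table, for illustration only.
-- The SETTINGS clause spells out the new default from the changelog entry;
-- with this PR applied, omitting it should give the same result.
CREATE TABLE example_events
(
    id UInt64,
    value Float64
)
ENGINE = MergeTree
ORDER BY id
SETTINGS auto_statistics_types = 'minmax, uniq';

-- Restore the previous behavior for this session:
-- build statistics at INSERT time instead of waiting for merges.
SET materialize_statistics_on_insert = 1;

INSERT INTO example_events VALUES (1, 0.5), (2, 1.5);
```

With the new defaults and without the `SET`, freshly inserted parts would carry no statistics until a merge builds them, which is the trade-off the entry describes.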

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@hanfei1991 hanfei1991 marked this pull request as draft March 30, 2026 21:31

clickhouse-gh bot commented Mar 30, 2026

Workflow [PR], commit [cfe0eb7]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Stress test (arm_release) | | failure | | |
| | Server died | FAIL | cidb | |
| | Logical error query (STID: 5066-5314) | FAIL | cidb | |
| Stress test (arm_debug) | | failure | | |
| | Server died | FAIL | cidb | |
| | Logical error: Shard number is greater than shard count: shard_num=A shard_count=B cluster=C (STID: 5066-457d) | FAIL | cidb | |
| Stress test (arm_asan_ubsan) | | failure | | |
| | Server died | FAIL | cidb | |
| | UndefinedBehaviorSanitizer: undefined behavior (STID: 6239-5b30) | FAIL | cidb | |
| Stress test (arm_msan) | | failure | | |
| | Server died | FAIL | cidb | |
| | MemorySanitizer: use-of-uninitialized-value (STID: 1003-358c) | FAIL | cidb, issue | |

AI Review

Summary

This PR changes defaults for statistics in MergeTree-family tables: auto_statistics_types now defaults to minmax, uniq, and materialize_statistics_on_insert now defaults to false (with compatibility history updated accordingly). The implementation is straightforward (default-value and compatibility-history updates) and test adjustments consistently pin prior behavior where needed. I did not find correctness, safety, or compatibility defects in the diff itself.

Missing context
  • ⚠️ No CI logs or benchmark artifacts were reviewed in this pass.

ClickHouse Rules

| Item | Status | Notes |
|---|---|---|
| Deletion logging | | |
| Serialization versioning | | |
| Core-area scrutiny | | |
| No test removal | | |
| Experimental gate | | |
| No magic constants | | |
| Backward compatibility | | |
| SettingsChangesHistory.cpp | | |
| PR metadata quality | | |
| Safe rollout | | |
| Compilation time | | |

Final Verdict
  • Status: ✅ Approve

@rschu1ze rschu1ze mentioned this pull request Mar 31, 2026
72 tasks

clickhouse-gh bot commented Apr 1, 2026

LLVM Coverage Report

| Metric | Baseline | Current | Δ |
|---|---|---|---|
| Lines | 84.00% | 84.10% | +0.10% |
| Functions | 90.90% | 90.90% | +0.00% |
| Branches | 76.50% | 76.60% | +0.10% |

Changed lines: 100.00% (39/39) · Uncovered code

Full report · Diff report

Labels

pr-improvement Pull request with some product improvements
