Skip to content

Minmax indices by default#76867

Draft
alexey-milovidov wants to merge 24 commits intomasterfrom
minmax-by-default
Draft

Minmax indices by default#76867
alexey-milovidov wants to merge 24 commits intomasterfrom
minmax-by-default

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov commented Feb 27, 2025

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

MergeTree tables will have add_minmax_index_for_numeric_columns by default. Resolves #70605

Details

Resolves #70605

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Feb 27, 2025

Workflow [PR], commit [3da8ca1]

@clickhouse-gh clickhouse-gh bot added the pr-performance Pull request with some performance improvements label Feb 27, 2025
@devcrafter devcrafter self-assigned this Feb 27, 2025
@devcrafter
Copy link
Copy Markdown
Member

devcrafter commented Feb 27, 2025

Interesting, just enabling the setting is failing many fast tests

@alexey-milovidov
Copy link
Copy Markdown
Member Author

Cannot apply mutation because it breaks skip index auto_minmax_index_x.

We will have to allow mutations of indexed columns first.

@SmitaRKulkarni
Copy link
Copy Markdown
Member

@alexey-milovidov yes, currently even explicitly created indexes have the same behaviour.
Maybe we can enable minmax index by default and add to the docs how to fix this issue & add more information to exception guiding user how to fix it. I can help with fixing the tests. what do you think?

cc @rschu1ze

@rschu1ze

This comment was marked as resolved.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jul 8, 2025

Dear @alexey-milovidov, this PR hasn't been updated for a while. Will you continue working on it? If not, please close it. Otherwise, ignore this message.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Oct 21, 2025

Dear @alexey-milovidov, this PR hasn't been updated for a while. Will you continue working on it? If not, please close it. Otherwise, ignore this message.

@alexey-milovidov

This comment was marked as resolved.

@alexey-milovidov

This comment was marked as resolved.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Nov 19, 2025

Workflow [PR], commit [487a318]

Summary:

job_name test_name status info comment
Stateless tests (arm_asan_ubsan, targeted) failure
02770_async_buffer_ignore FAIL cidb
03770_min_columns_to_activate_adaptive_write_buffer FAIL cidb
02980_s3_plain_DROP_TABLE_MergeTree FAIL cidb
00753_system_columns_and_system_tables_long FAIL cidb
02346_text_index_queries FAIL cidb
02980_s3_plain_DROP_TABLE_ReplicatedMergeTree FAIL cidb
03560_parallel_replicas_projection FAIL cidb
03100_lwu_deletes_2 FAIL cidb
01062_alter_on_mutataion_zookeeper_long FAIL cidb
03916_system_log_indexes FAIL cidb
4 more test cases not shown
Integration tests (amd_asan_ubsan, targeted) failure
test_http_limits/test_hard_limit.py::test_disk_hard_limit_hit FAIL cidb
test_merge_tree_s3/test.py::test_alter_table_columns[node] FAIL cidb
test_merge_tree_s3_failover/test.py::test_write_failover[1048576-11-0] FAIL cidb
Stateless tests (amd_asan_ubsan, distributed plan, parallel, 1/2) failure
02497_source_part_is_intact_when_mutation FAIL cidb
02980_s3_plain_DROP_TABLE_ReplicatedMergeTree FAIL cidb
03560_parallel_replicas_projection FAIL cidb
02346_text_index_queries FAIL cidb
02228_merge_tree_insert_memory_usage FAIL cidb
Stateless tests (amd_asan_ubsan, distributed plan, parallel, 2/2) failure
03770_min_columns_to_activate_adaptive_write_buffer FAIL cidb
02980_s3_plain_DROP_TABLE_MergeTree FAIL cidb
00753_system_columns_and_system_tables_long FAIL cidb
03916_system_log_indexes FAIL cidb
02770_async_buffer_ignore FAIL cidb
00816_long_concurrent_alter_column FAIL cidb
02352_lightweight_delete FAIL cidb
01062_alter_on_mutataion_zookeeper_long FAIL cidb
Stateless tests (amd_debug, parallel) failure
03560_parallel_replicas_projection FAIL cidb
02980_s3_plain_DROP_TABLE_MergeTree FAIL cidb
03916_system_log_indexes FAIL cidb
02346_text_index_queries FAIL cidb
00753_system_columns_and_system_tables_long FAIL cidb
02770_async_buffer_ignore FAIL cidb
02980_s3_plain_DROP_TABLE_ReplicatedMergeTree FAIL cidb
02597_column_update_tricky_expression_and_replication FAIL cidb
Stateless tests (amd_tsan, parallel, 1/2) failure
02980_s3_plain_DROP_TABLE_ReplicatedMergeTree FAIL cidb
03560_parallel_replicas_projection FAIL cidb
03100_lwu_deletes_2 FAIL cidb
02346_text_index_queries FAIL cidb
Stateless tests (amd_tsan, parallel, 2/2) failure
03770_min_columns_to_activate_adaptive_write_buffer FAIL cidb
03916_system_log_indexes FAIL cidb
01062_alter_on_mutataion_zookeeper_long FAIL cidb
02352_lightweight_delete FAIL cidb
02770_async_buffer_ignore FAIL cidb
02980_s3_plain_DROP_TABLE_MergeTree FAIL cidb
00753_system_columns_and_system_tables_long FAIL cidb
00816_long_concurrent_alter_column FAIL cidb
Stateless tests (arm_binary, parallel) failure
03770_min_columns_to_activate_adaptive_write_buffer FAIL cidb
02346_text_index_queries FAIL cidb
02770_async_buffer_ignore FAIL cidb
02980_s3_plain_DROP_TABLE_ReplicatedMergeTree FAIL cidb
03100_lwu_deletes_2 FAIL cidb
03916_system_log_indexes FAIL cidb
02980_s3_plain_DROP_TABLE_MergeTree FAIL cidb
03560_parallel_replicas_projection FAIL cidb
00753_system_columns_and_system_tables_long FAIL cidb
01062_alter_on_mutataion_zookeeper_long FAIL cidb
Stateless tests (amd_llvm_coverage, 1/3) failure
02770_async_buffer_ignore FAIL cidb
02980_s3_plain_DROP_TABLE_ReplicatedMergeTree FAIL cidb
02346_text_index_queries FAIL cidb
Stateless tests (amd_llvm_coverage, 2/3) failure
03164_s3_settings_for_queries_and_merges FAIL cidb

AI Review

Summary

This PR flips the default of MergeTree setting add_minmax_index_for_numeric_columns from false to true, adds compatibility history in getMergeTreeSettingsChangesHistory, updates affected stateless tests to explicitly opt out where needed, and adds a dedicated stateless test for the new default behavior. I did not find correctness, safety, concurrency, or compatibility issues in the current diff.

Missing context
  • ⚠️ No CI report/logs were provided in the review input, so this review is code/diff-based only.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@Algunenano
Copy link
Copy Markdown
Member

Algunenano commented Nov 20, 2025

Many issues in tests:

  • The most frequent is the change in the table schema that's covered by Fix create statement when using default minmax indicies #87723.
  • Others are changes in the behaviour due to less data being read, the output of some queries to metadata/system tables, and so on. They need to be reviewed case by case to either adjust them, or disable implicit indices in those tests.
  • Others appear to be bugs: some UNKNOWN_IDENTIFIER, NOT_FOUND_COLUMN_IN_BLOCK and so on.

@Algunenano
Copy link
Copy Markdown
Member

Note that there are a few places in the code that check whether add_minmax_index_for_numeric_columns / add_minmax_index_for_string_columns has changed via metadata.settings_changes. We'll need to expand tests to cover tables were the setting is not declared by enabled by default, and when we use compatibility

@Algunenano
Copy link
Copy Markdown
Member

Updating the PR after #91429, which fixes some of the issues found in the last CI run here

@Algunenano
Copy link
Copy Markdown
Member

  • Around 60 test fixed (148 -> 82)
  • Some bugs uncovered:
    • UTF-8 handling is broken (Indices with bad utf-8 names break parts #91417)
    • TTL merges seem broken too: 00933_ttl_simple
    • Some other issues with LWD: 02932_lwd_and_mutations
    • Bad implicit indices created 03327_alias_column_constant (I thought I had fixed this one)
    • Many tests need to be adjusted to not use the indices, so they keep testing what they were testing.

Not too bad, not great either

@Algunenano
Copy link
Copy Markdown
Member

FAIL	01162_strange_mutations cidb	5.40
FAIL	01387_clear_column_default_depends cidb	0.20
FAIL	02967_parallel_replicas_joins_and_analyzer cidb	2.50
FAIL	01269_alias_type_differs cidb	0.00
FAIL	03356_tables_with_binary_identifiers_invalid_utf8 cidb	0.40
FAIL	01855_jit_comparison_constant_result cidb	0.50
FAIL	02441_alter_delete_and_drop_column cidb	70.39

@Algunenano
Copy link
Copy Markdown
Member

5 fast test left: 3 are new test that need adjusting, 2 seem like hitting the same bug. Then I need to tackle some other needed changes in private (tests and code) to make enabling this setting possible. In the meantime once this is green I'll randomize it in CI so I don't keep chasing new tests to adjust its EXPLAIN / rows_count values.

Persists virtual column `_block_number` on merges.
)", 0) \
DECLARE(Bool, add_minmax_index_for_numeric_columns, false, R"(
DECLARE(Bool, add_minmax_index_for_numeric_columns, true, R"(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes user-visible default behavior for MergeTree tables, but the PR does not add coverage for the new default path. Please add a stateless test that verifies a numeric column gets an auto min-max skipping index by default, and a complementary case with add_minmax_index_for_numeric_columns=0 to prove the opt-out still works.

});
addSettingsChanges(merge_tree_settings_changes_history, "26.3",
{
{"add_minmax_index_for_numeric_columns", false, true, "Should give benefits."},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ This compatibility-history entry is placed under 26.3, but this default flip is being introduced now (current development cycle), not in the already released 26.3 line.

With the current placement, compatibility = 26.3 will not revert add_minmax_index_for_numeric_columns back to false (because the loop stops at version >= 26.3), so users requesting 26.3 behavior still get the new default.

Please move this entry to the version where the default actually changed (likely 26.4) so compatibility mode preserves old behavior correctly.

alexey-milovidov and others added 6 commits March 30, 2026 05:21
…add test

Move the compatibility history entry from 26.3 to 26.4 so that
`compatibility = 26.3` correctly reverts the setting to `false`.

Add a stateless test verifying the new default (auto minmax index
is created for numeric columns) and the opt-out path.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…led by default

Add `add_minmax_index_for_numeric_columns = 0` to CREATE TABLE statements
in tests that are not testing minmax index behavior, to prevent auto minmax
indices from appearing in EXPLAIN output and changing granule counts or
query plans.

Affected tests:
- 04029_multiply_monotonicity
- 04003_deterministic_filter_chain_partition_pruning_key_condition_explain
- 04003_text_index_search_mode
- 03711_top_k_by_skip_index_negative
- 04004_cast_nullable_monotonicity_key_condition
- 03788_statistics_part_pruning
- 03788_statistics_part_pruning_precision
- 04032_alter_version_column_nonexistent
- 03742_apply_row_policy_after_final

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…nmax_index_for_numeric_columns`

The auto minmax index on the `value` column was appearing in EXPLAIN output,
causing the test reference mismatch. Disable the setting in this test since
it tests statistics-based pruning, not auto minmax indices.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Min-max indices by default

5 participants