Skip to content

Add testing setting to force reverse primary keys#94017

Open
amosbird wants to merge 22 commits intoClickHouse:masterfrom
amosbird:fuzz-desc-key
Open

Add testing setting to force reverse primary keys#94017
amosbird wants to merge 22 commits intoClickHouse:masterfrom
amosbird:fuzz-desc-key

Conversation

@amosbird
Copy link
Copy Markdown
Collaborator

@amosbird amosbird commented Jan 13, 2026

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

This PR adds a testing-only setting force_primary_key_reverse_order. When enabled, all ORDER BY columns in CREATE TABLE are automatically forced to DESC, regardless of the original definition, and allow_experimental_reverse_key is enabled implicitly. This allows running the full functional test suite with reverse primary keys to stress-test correctness and stability without modifying individual tests. This resolves #93288 (comment)

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Note

Medium Risk
Although framed as a testing-only switch, this PR changes core MergeTree metadata/projection generation and ORDER BY read-in-order matching, which could affect query planning/results for tables with DESC sorting keys or during ALTERs.

Overview
Adds a new testing-only setting force_primary_key_reverse_order that, during CREATE TABLE, rewrites the ORDER BY AST to force all key columns to DESC and implicitly enables allow_experimental_reverse_key.

Updates MergeTree internals to properly account for reversed sorting keys: ReadInOrderOptimizer now matches required sort direction against the physical direction using sorting_key.reverse_flags, minmax-count projection generation now uses the sorting key’s reverse flag (and its API is extended to accept sorting_key), and ALTER ... MODIFY ORDER BY now rejects changing sort direction for existing key columns. Numerous functional tests are adjusted to explicitly disable the new setting where output/order assumptions would otherwise break.

Written by Cursor Bugbot for commit f0828c0. This will update automatically on new commits. Configure here.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jan 13, 2026

Workflow [PR], commit [dfa5a1f]

Summary:

job_name test_name status info comment
Stateless tests (arm_asan_ubsan, targeted) failure
01593_concurrent_alter_mutations_kill_many_replicas_long FAIL cidb
02863_delayed_source_with_totals_and_extremes FAIL cidb
03706_statistics_preserve_checksums_on_mutations FAIL cidb, issue
Integration tests (amd_asan_ubsan, targeted) failure
test_database_iceberg_nessie_catalog/test.py::test_select FAIL cidb
test_database_iceberg_nessie_catalog/test.py::test_drop_table FAIL cidb
test_database_iceberg_nessie_catalog/test.py::test_backup_database FAIL cidb
test_database_iceberg_nessie_catalog/test.py::test_insert FAIL cidb
test_database_iceberg_nessie_catalog/test.py::test_create FAIL cidb
test_database_iceberg_nessie_catalog/test.py::test_tables_with_same_location FAIL cidb
test_database_iceberg_nessie_catalog/test.py::test_hide_sensitive_info FAIL cidb
test_database_iceberg_nessie_catalog/test.py::test_list_tables FAIL cidb
test_database_iceberg_nessie_catalog/test.py::test_timestamps FAIL cidb
Stateless tests (amd_asan_ubsan, flaky check) failure
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
00072_compare_date_and_string_index FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
Stateless tests (amd_debug, flaky check) failure
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
Stateless tests (amd_binary, flaky check) failure
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
00071_merge_tree_optimize_aio FAIL cidb
Stateless tests (amd_asan_ubsan, distributed plan, parallel, 1/2) failure
03448_topk_merging FAIL cidb
Stateless tests (amd_asan_ubsan, distributed plan, parallel, 2/2) failure
01099_parallel_distributed_insert_select FAIL cidb
Stateless tests (amd_asan_ubsan, db disk, distributed plan, sequential) failure
01593_concurrent_alter_mutations_kill_many_replicas_long FAIL cidb
Stateless tests (amd_debug, parallel) failure
01099_parallel_distributed_insert_select FAIL cidb
Stateless tests (amd_tsan, parallel, 1/2) failure
01171_mv_select_insert_isolation_long FAIL cidb

AI Review

Summary

This PR introduces a testing setting force_primary_key_reverse_order that rewrites ORDER BY to descending and adjusts MergeTree-related logic for reverse keys (read-in-order matching, minmax-count projection generation, and ALTER validation). I found one blocker: in CREATE TABLE ... CLONE AS ..., the rewrite is applied after clone compatibility checks, which can produce metadata/order mismatches against reused parts.

Findings
  • ❌ Blockers
    • [src/Interpreters/InterpreterCreateQuery.cpp:1049] applyForceReverseOrder is applied after CLONE AS ordering compatibility checks. With force_primary_key_reverse_order=1, clone validation can pass using pre-rewrite keys, then metadata is rewritten to DESC while cloned parts remain in original physical order. This can violate read-order assumptions.
    • Suggested fix: skip forced rewrite for create.is_clone_as, or apply rewrite before clone compatibility checks so validation uses final ORDER BY.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility CLONE AS can pass compatibility checks before later ORDER BY rewrite, creating metadata/data-layout mismatch risk.
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout ⚠️ Current behavior is unsafe for CLONE AS when the testing setting is enabled.
Compilation time
Performance & Safety
  • The blocker above is a safety/correctness issue: metadata can claim reversed sort order while existing cloned parts are physically sorted by the original order.
Final Verdict
  • Status: ❌ Block
  • Minimum required actions:
    • Prevent force_primary_key_reverse_order from rewriting ORDER BY after CLONE AS compatibility checks (either skip for clones or move rewrite before checks).
    • Add/adjust test coverage for CLONE AS with force_primary_key_reverse_order=1 to ensure metadata and reused parts remain consistent.

@clickhouse-gh clickhouse-gh bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label Jan 13, 2026
@EmeraldShift
Copy link
Copy Markdown
Contributor

I saw a link to this on the roadmap, and GitHub claims this PR will close the roadmap ticket, maybe because of the word "resolves" in the reference to the comment?

amosbird added a commit to amosbird/ClickHouse that referenced this pull request Mar 8, 2026
…lickHouse#94017

Add force_primary_key_reverse_order setting that forces all ORDER BY columns
in CREATE TABLE to DESC for stress-testing the reverse key feature across
the entire functional test suite. Auto-enables allow_experimental_reverse_key.

Also fix style check by adding allow_experimental_reverse_key to the
experimental settings ignore list.
@amosbird amosbird force-pushed the fuzz-desc-key branch 2 times, most recently from 8ca9673 to 4b4a044 Compare March 9, 2026 03:34
amosbird added a commit to amosbird/ClickHouse that referenced this pull request Mar 10, 2026
…lickHouse#94017

Add force_primary_key_reverse_order setting that forces all ORDER BY columns
in CREATE TABLE to DESC for stress-testing the reverse key feature across
the entire functional test suite. Auto-enables allow_experimental_reverse_key.

Also fix style check by adding allow_experimental_reverse_key to the
experimental settings ignore list.
amosbird added a commit to amosbird/ClickHouse that referenced this pull request Mar 12, 2026
…lickHouse#94017

Add force_primary_key_reverse_order setting that forces all ORDER BY columns
in CREATE TABLE to DESC for stress-testing the reverse key feature across
the entire functional test suite. Auto-enables allow_experimental_reverse_key.

Also fix style check by adding allow_experimental_reverse_key to the
experimental settings ignore list.
if (!as_table_saved.empty())
create.is_create_empty = false;

if (!internal && mode <= LoadingStrictnessLevel::CREATE
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

force_primary_key_reverse_order introduces new CREATE TABLE behavior, but this PR only adds broad test exclusions/randomization changes and no focused positive test that verifies: (1) ORDER BY columns are rewritten to DESC, and (2) reversed keys still create successfully without explicitly setting allow_experimental_reverse_key. Please add a dedicated stateless test that asserts these two guarantees to prevent regressions.

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

"input_format_parquet_use_native_reader_v3": lambda: min(1, random.randint(0, 5)),
"enable_lazy_columns_replication": lambda: random.randint(0, 1),
"allow_special_serialization_kinds_in_output_formats": lambda: random.randint(0, 1),
"force_primary_key_reverse_order": lambda: random.randint(0, 1),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Randomized setting at 50% may cause widespread test failures

Medium Severity

The force_primary_key_reverse_order setting is randomized with 50% probability (random.randint(0, 1)). Several tests that disable it explicitly create only Memory engine tables (e.g., 00578_merge_table_sampling.sql, 00632_aggregation_window_funnel.sql), suggesting the set of affected tests was over-estimated in some places and potentially under-estimated in others. With such aggressive randomization, any missed test that depends on physical MergeTree scan order without an explicit ORDER BY in its SELECT will fail intermittently, and there's no mechanism to detect these missed tests except CI flakes.

Fix in Cursor Fix in Web

amosbird added a commit to amosbird/ClickHouse that referenced this pull request Mar 16, 2026
…lickHouse#94017

Add force_primary_key_reverse_order setting that forces all ORDER BY columns
in CREATE TABLE to DESC for stress-testing the reverse key feature across
the entire functional test suite. Auto-enables allow_experimental_reverse_key.

Also fix style check by adding allow_experimental_reverse_key to the
experimental settings ignore list.
amosbird added 13 commits March 23, 2026 18:50
1. ReadInOrderOptimizer (old analyzer path): matchSortDescriptionAndKey
   did not account for reverse_flags, causing wrong read direction when
   allow_experimental_analyzer = 0.

2. ALTER TABLE MODIFY ORDER BY: direction changes (ASC<->DESC) were not
   validated. Existing data parts are physically sorted in the original
   direction, so changing it silently corrupts sort order. Now throws
   BAD_ARGUMENTS.

3. MinMaxCount projection: used primary_key.reverse_flags which is
   always empty when PRIMARY KEY is explicitly specified (parsed without
   DESC support). Now uses sorting_key.reverse_flags instead.
…lickHouse#94017

Add force_primary_key_reverse_order setting that forces all ORDER BY columns
in CREATE TABLE to DESC for stress-testing the reverse key feature across
the entire functional test suite. Auto-enables allow_experimental_reverse_key.

Also fix style check by adding allow_experimental_reverse_key to the
experimental settings ignore list.
Add SET force_primary_key_reverse_order = 0 to tests that are incompatible
with the forced DESC sorting key feature. This includes tests that:
- Check specific ORDER BY / sorting key behavior
- Verify specific data layout or merge behavior
- Use features incompatible with reverse keys (e.g., specific projections)
- Check SHOW CREATE TABLE output

Also register force_primary_key_reverse_order in SettingsChangesHistory.cpp
for version 26.3.
System log tables (query_log, part_log, etc.) are created with
internal=true. Applying force_primary_key_reverse_order to them
produces ORDER BY with DESC keywords that break CI's
setup_log_cluster.sh sed-based ORDER BY modification, causing
syntax errors like 'tuple(event_date DESC, event_time DESC)'.

ATTACH queries load from existing metadata and must not be modified.

Guard the applyForceReverseOrder call with !internal and
mode <= LoadingStrictnessLevel::CREATE.
…m settings

The setting was defaulting to true, which caused ~168 integration test
failures since integration tests create MergeTree tables via normal
CREATE TABLE (not internal). Changing the default to false and adding
the setting to SettingsRandomizer enables random stress-testing in
stateless tests without breaking integration tests.

Tests that are incompatible with the setting already have explicit
SET force_primary_key_reverse_order = 0.
The SET force_primary_key_reverse_order = 0 was placed inside a
Jinja2 list literal within the {% for %} block, breaking template
parsing. Moved it to the top of the file before any Jinja2 syntax.
Three root causes:
- .sh tests: positional SQL args consumed by multitoken() Bool parser;
  fix by using --query flag instead of positional args
- .sql tests with echoOn: inserted SET broke contiguous comment block
  for getTestTagsLength; fix by placing SET after all -- comments
- query_log test: SET was logged; fix by filtering it from output
- 03733, 03533: Update reference files to include leaked comment lines
  (comments after SET break getTestTagsLength contiguity, echoOn echoes them)
- 03006: Move SET before CREATE TABLE so tables aren't affected by random settings
- 02572: Remove ilike filter, add SET entries to reference (SET is logged in query_log)
… tests

Add SET force_primary_key_reverse_order = 0 (or CLICKHOUSE_CLIENT override for
.sh tests) to 54 tests that fail when the setting is randomly enabled via
SettingsRandomizer. Each test has a per-test comment explaining why the setting
is incompatible.

Also fix positional SQL args in 03276 and 03278 backup tests to work around
the multitoken() Bool parsing bug in clickhouse-client.
Add SET force_primary_key_reverse_order = 0 (or CLICKHOUSE_CLIENT override
for .sh tests) to tests that are incompatible with reverse key ordering.
Each test has a one-line comment explaining the specific reason.
EXPLAIN PIPELINE output depends on in-order aggregation optimization
which is sensitive to MergeTree key direction.
- 02494_query_cache_totals_extremes: aggregation-in-order bug with reversed keys
- 01656_sequence_next_node_long: sequenceNextNode depends on physical key order
- 01231_distributed_aggregation_memory_efficient_mix_levels: distributed aggregation wrong counts
- 02561_temporary_table_sessions: add CLICKHOUSE_URL override for curl-based queries
- 03821_async_insert_same_param_names: async insert data cross-contamination
- 02177_issue_31009: partial_merge join produces duplicates with reversed keys
- 04004_alter_modify_column_ttl_without_type: SHOW CREATE TABLE direction sensitivity
if (!as_table_saved.empty())
create.is_create_empty = false;

if (!internal && mode <= LoadingStrictnessLevel::CREATE
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applying applyForceReverseOrder here is too late for CREATE TABLE ... CLONE AS ....

The clone compatibility checks above compare source and target ordering before this rewrite. With force_primary_key_reverse_order=1, a clone can pass those checks and then have its ORDER BY flipped to DESC afterwards. CLONE AS reuses existing parts, so this can leave metadata saying reversed order while parts are still laid out in the original order, which can break read-order assumptions.

Please either skip this rewrite for create.is_clone_as, or move the rewrite earlier so clone compatibility is validated against the final ORDER BY.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 26, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.10% +0.00%
Functions 24.50% 24.50% +0.00%
Branches 76.60% 76.70% +0.10%

Changed lines: 90.21% (129/143) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ClickHouse roadmap 2026

2 participants