Skip to content

Fix readByLayers using wrong ReadType for reverse-key tables#101153

Open
alexey-milovidov wants to merge 7 commits intomasterfrom
fix/readByLayers-reverse-key-read-type
Open

Fix readByLayers using wrong ReadType for reverse-key tables#101153
alexey-milovidov wants to merge 7 commits intomasterfrom
fix/readByLayers-reverse-key-read-type

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

readByLayers hardcoded ReadType::InOrder when reading data for the sharded join optimization. For tables with descending primary keys (allow_experimental_reverse_key = 1), the correct type should be ReadType::InReverseOrder when input_order_info->direction == -1.

This caused flaky failures in 03668_shard_join_in_reverse_order when randomized settings enabled both read_in_order_use_virtual_row and disabled enable_join_runtime_filters, producing wrong output order.

The fix mirrors the logic already used in spreadMarkRangesAmongStreamsWithOrder.

Closes #100870

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

`readByLayers` hardcoded `ReadType::InOrder` when reading data for the
sharded join optimization. For tables with descending primary keys
(`allow_experimental_reverse_key = 1`), the correct type should be
`ReadType::InReverseOrder` when `input_order_info->direction == -1`.

This caused flaky failures in `03668_shard_join_in_reverse_order` when
randomized settings enabled both `read_in_order_use_virtual_row` and
disabled `enable_join_runtime_filters`, producing wrong output order.

The fix mirrors the logic already used in
`spreadMarkRangesAmongStreamsWithOrder`.

Closes #100870

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@alexey-milovidov
Copy link
Copy Markdown
Member Author

Add a test.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 30, 2026

Workflow [PR], commit [70f8e23]

Summary:

job_name test_name status info comment
Stress test (amd_tsan) failure
Logical error: Shard number is greater than shard count: shard_num=A shard_count=B cluster=C (STID: 5066-457d) FAIL cidb
Stress test (arm_msan) failure
MemorySanitizer: use-of-uninitialized-value (STID: 1003-358c) FAIL cidb, issue
Finish Workflow failure
python3 ./ci/jobs/scripts/workflow_hooks/pr_body_check.py failure

AI Review

Summary

This PR fixes readByLayers to choose ReadType::InReverseOrder when input_order_info->direction == -1, matching behavior already used in spreadMarkRangesAmongStreamsWithOrder, and adds a focused stateless test for reverse-key tables. The code change is small and correct, and the test reproduces the reported flaky ordering problem. I found no code-level correctness or safety issues in the diff.

PR Metadata
  • ⚠️ Changelog category is correct (Bug Fix), but Changelog entry is required for this category and is currently missing (...).
  • Exact replacement text for the Changelog entry field:
    • Fix incorrect output order in some joins over reverse-key MergeTree tables when read-by-layers optimization is used.
Findings

⚠️ Majors

  • [PR description: Changelog entry section] Required Changelog entry is missing for selected Bug Fix category, which violates the PR template requirement and reduces user-facing release-note quality.
  • Suggested fix: replace ... with a concrete user-facing sentence (example above).
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality ⚠️ Bug Fix selected, but required Changelog entry is missing (...).
Safe rollout
Compilation time
Final Verdict
  • Status: ⚠️ Request changes
  • Minimum required actions:
    • Add a proper Changelog entry in the PR description for the selected Bug Fix category.

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Mar 30, 2026
sort_description.emplace_back(sorting_columns[i], input_order_info->direction);
}

const auto read_type = input_order_info->direction == 1 ? ReadType::InOrder : ReadType::InReverseOrder;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the required Changelog entry in the PR description for the selected Bug Fix category (the template requires it unless category is one of the explicitly exempt ones).

Suggested text:
Fix incorrect read order in readByLayers for reverse-key MergeTree tables, which could produce unstable result ordering in shard-join queries when reading in reverse order.

alexey-milovidov and others added 6 commits March 30, 2026 03:59
Adds a deterministic test that exercises the `readByLayers` code path
(via `query_plan_join_shard_by_pk_ranges`) on a table with a descending
primary key (`allow_experimental_reverse_key = 1`). This verifies that
the correct `ReadType` is used based on `input_order_info->direction`.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The number 04068 was taken by another test on master.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@alexey-milovidov alexey-milovidov marked this pull request as ready for review March 31, 2026 10:13
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 31, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.10% +0.00%
Functions 90.90% 90.90% +0.00%
Branches 76.70% 76.70% +0.00%

Changed lines: 92.86% (13/14) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky test: 03668_shard_join_in_reverse_order

2 participants