Skip to content

Fix index granularity calculation on block borders#17120

Merged
alesapin merged 3 commits intomasterfrom
fix_granularity_on_block_borders
Nov 19, 2020
Merged

Fix index granularity calculation on block borders#17120
alesapin merged 3 commits intomasterfrom
fix_granularity_on_block_borders

Conversation

@alesapin
Copy link
Copy Markdown
Member

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Improve adaptive index granularity calculation when incoming blocks of data differ in bytes size a lot.

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Nov 17, 2020
@alesapin alesapin requested a review from CurtizJ November 17, 2020 10:18
@CurtizJ CurtizJ self-assigned this Nov 17, 2020
@alesapin
Copy link
Copy Markdown
Member Author

Quite big changes for a very rare problem. So don't backport.

@alesapin
Copy link
Copy Markdown
Member Author

2020.11.18 17:02:07.429869 [ 1073 ] {78c0f24a-f484-4082-83fb-8980891033eb} <Error> executeQuery: Code: 57, e.displayText() = DB::Exception: Cannot attach table with UUID 061e98ed-d127-4ae4-a606-d5727ca4d5c5, because it was detached but still used by some query. Retry later. (version 20.12.1.5209) (from [::1]:36242) (in query: ATTACH TABLE concurrent_mutate_mt_1), Stack trace (when copying this message, always include the lines below):

@alesapin
Copy link
Copy Markdown
Member Author

Test failures seem unrelated but suspicious.

@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented Nov 19, 2020

01133_max_result_rows

SET max_block_size = 10;
SET max_result_rows = 20;
SET result_overflow_mode = 'throw';

SELECT DISTINCT intDiv(number, 10) FROM numbers(300); -- { serverError 396 }
SELECT DISTINCT intDiv(number, 10) FROM numbers(190);
SELECT DISTINCT intDiv(number, 10) FROM numbers(200);
SELECT DISTINCT intDiv(number, 10) FROM numbers(210); -- { serverError 396 }

SET result_overflow_mode = 'break';

SELECT DISTINCT intDiv(number, 10) FROM numbers(300);
SELECT DISTINCT intDiv(number, 10) FROM numbers(190);
SELECT DISTINCT intDiv(number, 10) FROM numbers(200);
SELECT DISTINCT intDiv(number, 10) FROM numbers(210);

SET max_block_size = 10;
SET max_result_rows = 1;
SELECT number FROM system.numbers;
SELECT count() FROM numbers(100);
-- subquery result is not the total result
SELECT count() FROM (SELECT * FROM numbers(100));
-- remote query result is not the total result
SELECT count() FROM remote('127.0.0.{1,2}', numbers(100));

How this test can be connected with changes in merge tree data serialization?

@alesapin
Copy link
Copy Markdown
Member Author

SELECT
    check_start_time,
    pull_request_number,
    check_name,
    test_name,
    test_status
FROM checks
WHERE (test_name = '01133_max_result_rows') AND (test_status = 'FAIL') AND (check_start_time >= (now() - toIntervalDay(7)))
ORDER BY test_status ASC

Query id: 6a4f37e0-8960-4a81-9b08-4dc9c027a107

┌────check_start_time─┬─pull_request_number─┬─check_name───────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-11-17 10:53:21 │               17120 │ Functional stateless tests (release) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─check_name───────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-11-18 21:51:15 │               17120 │ Functional stateless tests (release) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────┴───────────────────────┴─────────────┘

Fails in this PR only.

@alesapin
Copy link
Copy Markdown
Member Author

Actually no, I'm just unlucky https://clickhouse-test-reports.s3.yandex.net/0/40a964826917aac026f788c26e83ee75d5db5895/functional_stateless_tests_(release,_wide_parts_enabled)/test_run.txt.out.log.

SELECT
    check_start_time,
    pull_request_number,
    commit_sha,
    check_name,
    test_name,
    test_status
FROM checks
WHERE (test_name = '01133_max_result_rows') AND (test_status = 'FAIL') AND (check_start_time >= (now() - toIntervalDay(30)))
ORDER BY pull_request_number ASC

Query id: bb0d5e09-0302-4f2d-9f0f-bbd2f320a23c

┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name───────────────────────────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-11-05 09:46:36 │                   0 │ 40a964826917aac026f788c26e83ee75d5db5895 │ Functional stateless tests (release, wide parts enabled) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴──────────────────────────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name─────────────────────────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-10-31 12:59:20 │                   0 │ db146ee6154a3c46208f535c5effa0ed4cd6a9ed │ Functional stateless tests (release, DatabaseOrdinary) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴────────────────────────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name───────────────────────────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-10-29 14:54:32 │                   0 │ 2a66c17472387c09afc3ac0556b1c4b4a8c90564 │ Functional stateless tests (release, wide parts enabled) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴──────────────────────────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name───────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-10-21 19:31:57 │               15811 │ 340b0c913429f82308e793018a0bf19fe708ed6a │ Functional stateless tests (release) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴──────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name─────────────────────────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-10-20 14:54:42 │               15874 │ f5115287db903827cf3fcfd0063de2a7c7d2b48b │ Functional stateless tests (release, DatabaseOrdinary) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴────────────────────────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name──────────────────────────────────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-10-22 15:05:53 │               16262 │ 3d3b49d009a9613984e865b9535bfa9418e135e4 │ Functional stateless tests (release, polymorphic parts enabled) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴─────────────────────────────────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name─────────────────────────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-10-29 13:32:24 │               16509 │ d3f08b21de47065de5687401fe4e28fd72ad46f9 │ Functional stateless tests (release, DatabaseOrdinary) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴────────────────────────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name───────────────────────────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-11-07 13:02:14 │               16784 │ 81be32e4abb90e6f59a213f6f903c6df3b740356 │ Functional stateless tests (release, wide parts enabled) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴──────────────────────────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name───────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-11-18 21:51:15 │               17120 │ d11c547ec2f9074370e02d9128771ccacd87a6f8 │ Functional stateless tests (release) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴──────────────────────────────────────┴───────────────────────┴─────────────┘
┌────check_start_time─┬─pull_request_number─┬─commit_sha───────────────────────────────┬─check_name───────────────────────────┬─test_name─────────────┬─test_status─┐
│ 2020-11-17 10:53:21 │               17120 │ 389b88353489205278f22e9d34bac8983147dc93 │ Functional stateless tests (release) │ 01133_max_result_rows │ FAIL        │
└─────────────────────┴─────────────────────┴──────────────────────────────────────────┴──────────────────────────────────────┴───────────────────────┴─────────────┘

Copy link
Copy Markdown
Member

@CurtizJ CurtizJ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok.

@CurtizJ
Copy link
Copy Markdown
Member

CurtizJ commented Nov 19, 2020

But code in DataPartWriter becomes more and more complicated.

@alesapin
Copy link
Copy Markdown
Member Author

But code in DataPartWriter becomes more and more complicated.

I'm also sad about it. I think we should rewrite it. I spent about 3-4 hours understanding indices serialization.

@alesapin
Copy link
Copy Markdown
Member Author

Let's try to merge.

@alesapin alesapin merged commit 2623d35 into master Nov 19, 2020
@alesapin alesapin deleted the fix_granularity_on_block_borders branch November 19, 2020 15:36
azat added a commit to azat/ClickHouse that referenced this pull request Jan 5, 2021
In ClickHouse#17120 (that was reverted in ClickHouse#17918 already) marks adjustemnt was
introduced, but it may lead to corruption of marks files, that may cause
two things:
1) Incorrect marks files, which lead to reading more rows than there are
   (can be reproduced with `max_threads`>1 or/and `PREWHERE`, and
    `optimize_move_to_prewhere`)
2) Can't adjust last granule because it has X rows, but try to subtract Y
   rows (ClickHouse#9260)

And 1) is pretty hard to diagnosis, since it does not throw any error,
it just may return wrong result.

Fortunately both problems can be fixed with `OPTIMIZE TABLE ... [ FINAL]`.

Cc: @alesapin
Refs: ClickHouse#17943
Refs: ClickHouse#18223
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default pr-no-backport

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants