Skip to content

Fix on-disk format breakage for secondary indices over Nullable column#27197

Merged
alexey-milovidov merged 2 commits intoClickHouse:masterfrom
azat:nullable-index-fix
Aug 5, 2021
Merged

Fix on-disk format breakage for secondary indices over Nullable column#27197
alexey-milovidov merged 2 commits intoClickHouse:masterfrom
azat:nullable-index-fix

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Aug 4, 2021

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix on-disk format breakage for secondary indices over Nullable column (no stable release had been affected)

Detailed description / Documentation draft:
1 breaks on disk format.

I checked this patchset for compatibility after
reverting this patch 2 (use case: I've applied it manually, then
revert it, and data skipping indexes over Nullable column had been
broken)

But this patchset actually breaks compatibility with older versions of
clickhouse for Nullable data skipping indexes after simple upgrade:

Details

Here is a simple reproducer:

--
-- run this with 21.6 or similar (i.e. w/o this patch)
--

CREATE TABLE data
(
    `key` Int,
    `value` Nullable(Int),
    INDEX value_index value TYPE minmax GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY key;

INSERT INTO data SELECT
    number,
    number
FROM numbers(10000);

SELECT * FROM data WHERE value = 20000 SETTINGS force_data_skipping_indices = 'value_index' SETTINGS force_data_skipping_indices = 'value_index', max_rows_to_read=1;

Now upgrade and run the query again:

SELECT * FROM data WHERE value = 20000 SETTINGS force_data_skipping_indices = 'value_index' SETTINGS force_data_skipping_indices = 'value_index', max_rows_to_read=1;

And it will fail because of on disk format changes:

$ ll --time-style=+ data/*/data/all_1_1_0/skp*.idx
-rw-r----- 1 azat azat 36  data/with_nullable_patch/data/all_1_1_0/skp_idx_value_index.idx
-rw-r----- 1 azat azat 37  data/without_nullable_patch/data/all_1_1_0/skp_idx_value_index.idx

$ md5sum data/*/data/all_1_1_0/skp*.idx
a19c95c4a14506c65665a1e30ab404bf  data/with_nullable_patch/data/all_1_1_0/skp_idx_value_index.idx
e50e2fcfa873b232196623d56ab26105  data/without_nullable_patch/data/all_1_1_0/skp_idx_value_index.idx

Note, that there is no stable release with this patch included yet, so
no need to backport.

Also note that you may create data skipping indexes over Nullable
column even before 3.

Changelog:

Details
  • v2: break cases when granulas has Null in values due to backward compatibility

Cc: @alexey-milovidov (please add no-backport label)
Cc: @amosbird (please take a look)

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Aug 4, 2021
azat added 2 commits August 5, 2021 00:19
[1] breaks on disk format (and the relevant change in the:

  [1]: ClickHouse#12455 (comment)

Too bad that I checked this patchset only for compatibility after
reverting this patch [2] (use case: I've applied it manually, then
revert it, and data skipping indexes over Nullable column had been
broken)

  [2]: ClickHouse#12455 (comment)

But this patchset actually breaks compatibility with older versions of
clickhouse for Nullable data skipping indexes after simple upgrade:

Here is a simple reproducer:

    --
    -- run this with 21.6 or similar (i.e. w/o this patch)
    --

    CREATE TABLE data
    (
        `key` Int,
        `value` Nullable(Int),
        INDEX value_index value TYPE minmax GRANULARITY 1
    )
    ENGINE = MergeTree
    ORDER BY key;

    INSERT INTO data SELECT
        number,
        number
    FROM numbers(10000);

    SELECT * FROM data WHERE value = 20000 SETTINGS force_data_skipping_indices = 'value_index' SETTINGS force_data_skipping_indices = 'value_index', max_rows_to_read=1;

Now upgrade and run the query again:

    SELECT * FROM data WHERE value = 20000 SETTINGS force_data_skipping_indices = 'value_index' SETTINGS force_data_skipping_indices = 'value_index', max_rows_to_read=1;

And it will fail because of on disk format changes:

    $ ll --time-style=+ data/*/data/all_1_1_0/skp*.idx
    -rw-r----- 1 azat azat 36  data/with_nullable_patch/data/all_1_1_0/skp_idx_value_index.idx
    -rw-r----- 1 azat azat 37  data/without_nullable_patch/data/all_1_1_0/skp_idx_value_index.idx

    $ md5sum data/*/data/all_1_1_0/skp*.idx
    a19c95c4a14506c65665a1e30ab404bf  data/with_nullable_patch/data/all_1_1_0/skp_idx_value_index.idx
    e50e2fcfa873b232196623d56ab26105  data/without_nullable_patch/data/all_1_1_0/skp_idx_value_index.idx

Note, that there is no stable release with this patch included yet, so
no need to backport.

Also note that you may create data skipping indexes over Nullable
column even before [3].

  [3]: ClickHouse#12455

v2: break cases when granulas has Null in values due to backward
compatibility
@azat azat force-pushed the nullable-index-fix branch from 5668e1f to dee27fc Compare August 4, 2021 21:20
@azat
Copy link
Copy Markdown
Member Author

azat commented Aug 4, 2021

As a long-term solution I think new format for Nullable indexed should be added, with a new name, that will be able to skip granulas with Null in it (see modifications in tests).

@filimonov
Copy link
Copy Markdown
Contributor

Note, that there is no stable release with this patch included yet, so
no need to backport

21.8 was already forked.

@azat
Copy link
Copy Markdown
Member Author

azat commented Aug 5, 2021

21.8 was already forked.

21.8 does not includes #12455

@azat azat force-pushed the nullable-index-fix branch 2 times, most recently from 09c3762 to 76af33f Compare August 5, 2021 08:47
@filimonov
Copy link
Copy Markdown
Contributor

21.8 was already forked.

21.8 does not includes #12455

You're right (thought it is).

@azat azat force-pushed the nullable-index-fix branch 2 times, most recently from 39f8257 to 0ba6efc Compare August 5, 2021 13:42
@alexey-milovidov
Copy link
Copy Markdown
Member

@Mergifyio update

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Aug 5, 2021

Command update: success

Branch has been successfully updated

@azat azat force-pushed the nullable-index-fix branch from 6d76081 to dee27fc Compare August 5, 2021 14:20
@alexey-milovidov alexey-milovidov merged commit 1a3d8ce into ClickHouse:master Aug 5, 2021
@azat azat deleted the nullable-index-fix branch August 5, 2021 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default pr-no-backport

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants