Skip to content

Secondary indexes should be used for queries with FINAL #70292

@alexey-milovidov

Description

@alexey-milovidov

Describe the situation

Filtering by secondary indexes is disabled for queries with FINAL, because skipping a granule can change the presence or values of records in other granules for the same sorting key as for skipped records.

See #34243

How to solve

If a query contains FINAL, secondary indices should retain granules that could contain records with the same value of the sorting key as in the records in granules that are not going to be skipped.

As a result, they will skip less amount of granules because skipped ranges will be smaller.

Example

Let's suppose we have a table

CREATE TABLE table
(
id String, 
payload String, 
PRIMARY KEY (id), 
INDEX ix (payload) TYPE = set(100)
)
ENGINE = ReplacingMergeTree

containing the following granules:

granule 1:    primary key: A    ix: (abc, hello)
id  payload
---
A abc
A hello
B abc

granule 2:    primary key: B    ix: (world, xyz)
---
B world
B xyz

granule 3:    primary key: B    ix: (hello)
---
B hello

granule 4:    primary key: C    ix: (abc)
---
C abc

granule 5:    primary key: C    ix: (def)
---
C def

granule 6:    primary key: D    ix: (world)
---
D world

If we do SELECT * FROM table, it returns:

A abc
A hello
B abc
B world
B xyz
B hello
C abc
C def
D world

If we do SELECT * FROM table FINAL, it returns:

A hello
B hello
C def
D world

If we do SELECT * FROM table WHERE payload = 'world', it returns:

B world
D world

If we do SELECT * FROM table FINAL WHERE payload = 'world' with the setting use_skip_indexes_if_final, it returns:

B world
D world

while it should return

D world

because the index skipped the granule 3 which contained a replacement for key B, while it should not.

Let's look at the granules, under the filter payload = 'world':

SKIPPED granule 1:    primary key: A    possible values of PK: [A..B]    ix: (abc, hello)
PASSED  granule 2:    primary key: B    possible values of PK: [B..B]    ix: (world, xyz)
SKIPPED granule 3:    primary key: B    possible values of PK: [B..C]    ix: (hello)
SKIPPED granule 4:    primary key: C    possible values of PK: [C..C]    ix: (abc)
SKIPPED granule 5:    primary key: D    possible values of PK: [C..D]    ix: (def)
PASSED  granule 6:    primary key: D    possible values of PK: [D..?]    ix: (world)

Granule 1 is skipped. But it could contain the same values of PK (B) as in the passed granule 2 - then it should not be skipped.
Granule 3 is skipped. But it could contain the same values of PK (B) as in the passed granule 2 - then it should not be skipped.
Granule 4 is skipped. It is ok because the values of PK in this granule (C) don't intersect with any of the passed granules.
Granule 5 is skipped. But it could contain the same values of PK (D) as in the passed granule 6 - then it should not be skipped.

The correct application of the index should look like this:

PASSED  granule 1:    primary key: A    possible values of PK: [A..B]    ix: (abc, hello)
PASSED  granule 2:    primary key: B    possible values of PK: [B..B]    ix: (world, xyz)
PASSED  granule 3:    primary key: B    possible values of PK: [B..C]    ix: (hello)
SKIPPED granule 4:    primary key: C    possible values of PK: [C..C]    ix: (abc)
PASSED  granule 5:    primary key: D    possible values of PK: [C..D]    ix: (def)
PASSED  granule 6:    primary key: D    possible values of PK: [D..?]    ix: (world)

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions