Secondary indexes should be used for queries with FINAL

**Describe the situation**

Filtering by secondary indexes is disabled for queries with FINAL, because skipping a granule can change the presence or values of records in other granules for the same sorting key as for skipped records.

See https://github.com/ClickHouse/ClickHouse/pull/34243

**How to solve**

If a query contains FINAL, secondary indices should retain granules that could contain records with the same value of the sorting key as in the records in granules that are not going to be skipped.

As a result, they will skip less amount of granules because skipped ranges will be smaller.

**Example**

Let's suppose we have a table
```
CREATE TABLE table
(
id String, 
payload String, 
PRIMARY KEY (id), 
INDEX ix (payload) TYPE = set(100)
)
ENGINE = ReplacingMergeTree
```
containing the following granules:
```
granule 1:    primary key: A    ix: (abc, hello)
id  payload
---
A abc
A hello
B abc

granule 2:    primary key: B    ix: (world, xyz)
---
B world
B xyz

granule 3:    primary key: B    ix: (hello)
---
B hello

granule 4:    primary key: C    ix: (abc)
---
C abc

granule 5:    primary key: C    ix: (def)
---
C def

granule 6:    primary key: D    ix: (world)
---
D world
```

If we do `SELECT * FROM table`, it returns:
```
A abc
A hello
B abc
B world
B xyz
B hello
C abc
C def
D world
```

If we do `SELECT * FROM table FINAL`, it returns:
```
A hello
B hello
C def
D world
```

If we do `SELECT * FROM table WHERE payload = 'world'`, it returns:
```
B world
D world
```

If we do `SELECT * FROM table FINAL WHERE payload = 'world'` with the setting `use_skip_indexes_if_final`, it returns:
```
B world
D world
```
while it should return
```
D world
```
because the index skipped the granule 3 which contained a replacement for key B, while it should not.

Let's look at the granules, under the filter `payload = 'world'`:
```
SKIPPED granule 1:    primary key: A    possible values of PK: [A..B]    ix: (abc, hello)
PASSED  granule 2:    primary key: B    possible values of PK: [B..B]    ix: (world, xyz)
SKIPPED granule 3:    primary key: B    possible values of PK: [B..C]    ix: (hello)
SKIPPED granule 4:    primary key: C    possible values of PK: [C..C]    ix: (abc)
SKIPPED granule 5:    primary key: D    possible values of PK: [C..D]    ix: (def)
PASSED  granule 6:    primary key: D    possible values of PK: [D..?]    ix: (world)
```

Granule 1 is skipped. But it could contain the same values of PK (B) as in the passed granule 2 - then it should not be skipped.
Granule 3 is skipped. But it could contain the same values of PK (B) as in the passed granule 2 - then it should not be skipped.
Granule 4 is skipped. It is ok because the values of PK in this granule (C) don't intersect with any of the passed granules.
Granule 5 is skipped. But it could contain the same values of PK (D) as in the passed granule 6 - then it should not be skipped.

The correct application of the index should look like this:
```
PASSED  granule 1:    primary key: A    possible values of PK: [A..B]    ix: (abc, hello)
PASSED  granule 2:    primary key: B    possible values of PK: [B..B]    ix: (world, xyz)
PASSED  granule 3:    primary key: B    possible values of PK: [B..C]    ix: (hello)
SKIPPED granule 4:    primary key: C    possible values of PK: [C..C]    ix: (abc)
PASSED  granule 5:    primary key: D    possible values of PK: [C..D]    ix: (def)
PASSED  granule 6:    primary key: D    possible values of PK: [D..?]    ix: (world)
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Secondary indexes should be used for queries with FINAL #70292

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Secondary indexes should be used for queries with FINAL #70292

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions