Skip to content

Constraints for optimization#18787

Merged
CurtizJ merged 65 commits intoClickHouse:masterfrom
nikvas0:nikvas0/simple_optimizations
Nov 22, 2021
Merged

Constraints for optimization#18787
CurtizJ merged 65 commits intoClickHouse:masterfrom
nikvas0:nikvas0/simple_optimizations

Conversation

@nikvas0
Copy link
Copy Markdown
Contributor

@nikvas0 nikvas0 commented Jan 6, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

  • Added CONSTRAINT ... ASSUME ... (without checking during INSERT)
  • Added query transformation to CNF (Optimize boolean conditions using CNF / DNF #11749) for more convenient optimization
  • Added simple query rewriting using constraints (only simple matching now, will be improved to support <,=,>... later)
  • Added ability to replace heavy columns with light
  • Added ability to use the index in queries

Detailed description / Documentation draft:

TODO:

  • documentation
  • more tests for CNF
  • improve constraint matching (now it is a simple all-to-all matching)
  • refactoring
  • settings

@robot-clickhouse robot-clickhouse added doc-alert pr-feature Pull request with new product feature labels Jan 6, 2021
@nikvas0 nikvas0 marked this pull request as ready for review May 12, 2021 20:46
@nikvas0 nikvas0 changed the title [WIP] Constraints for optimization Constraints for optimization May 16, 2021
@UnamedRus
Copy link
Copy Markdown
Contributor

EXPLAIN SYNTAX
SELECT count()
FROM
(
    SELECT
        number AS key_a
    FROM numbers(20)
) AS a
WHERE ((key_a >= 1) AND (key_a <= 5)) OR ((key_a >= 5) AND (key_a <= 10)) OR ((key_a >= 10) AND (key_a <= 15))

┌─explain────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ SELECT count()                                                                                                         │
│ FROM                                                                                                                   │
│ (                                                                                                                      │
│     SELECT number AS key_a                                                                                             │
│     FROM numbers(20)                                                                                                   │
│     WHERE (((((key_a >= 10) AND (key_a >= 5)) AND (key_a <= 5)) AND (key_a <= 10)) AND (key_a >= 1)) AND (key_a <= 15) │
│ ) AS a                                                                                                                 │
│ WHERE (key_a <= 15) AND (key_a >= 1) AND (key_a <= 10) AND (key_a <= 5) AND (key_a >= 5) AND (key_a >= 10)             │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

 set convert_query_to_cnf=0;
 
 EXPLAIN SYNTAX
SELECT count()
FROM
(
    SELECT number AS key_a
    FROM numbers(20)
) AS a
WHERE ((key_a >= 1) AND (key_a <= 5)) OR ((key_a >= 5) AND (key_a <= 10)) OR ((key_a >= 10) AND (key_a <= 15))

Query id: d2d4275f-07b2-412b-9eca-bc40e0b5d180

┌─explain────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ SELECT count()                                                                                                     │
│ FROM                                                                                                               │
│ (                                                                                                                  │
│     SELECT number AS key_a                                                                                         │
│     FROM numbers(20)                                                                                               │
│     WHERE ((key_a >= 1) AND (key_a <= 5)) OR ((key_a >= 5) AND (key_a <= 10)) OR ((key_a >= 10) AND (key_a <= 15)) │
│ ) AS a                                                                                                             │
│ WHERE ((key_a >= 1) AND (key_a <= 5)) OR ((key_a >= 5) AND (key_a <= 10)) OR ((key_a >= 10) AND (key_a <= 15))     │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

But it looks like it working correctly if we have only 2 groups:

 set convert_query_to_cnf=1;
 
 EXPLAIN SYNTAX
SELECT count()
FROM
(
    SELECT number AS key_a
    FROM numbers(20)
) AS a
WHERE ((key_a >= 1) AND (key_a <= 5)) OR ((key_a >= 5) AND (key_a <= 10))

Query id: 966cb5fb-58a5-4bd2-b8aa-e39c8c7fc9e4

┌─explain─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ SELECT count()                                                                                                                                          │
│ FROM                                                                                                                                                    │
│ (                                                                                                                                                       │
│     SELECT number AS key_a                                                                                                                              │
│     FROM numbers(20)                                                                                                                                    │
│     WHERE ((((key_a <= 5) OR (key_a >= 5)) AND ((key_a <= 10) OR (key_a <= 5))) AND ((key_a >= 1) OR (key_a >= 5))) AND ((key_a >= 1) OR (key_a <= 10)) │
│ ) AS a                                                                                                                                                  │
│ WHERE ((key_a >= 1) OR (key_a <= 10)) AND ((key_a >= 1) OR (key_a >= 5)) AND ((key_a <= 10) OR (key_a <= 5)) AND ((key_a <= 5) OR (key_a >= 5))         │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

@nikvas0
Copy link
Copy Markdown
Contributor Author

nikvas0 commented May 19, 2021

But it looks like it working correctly if we have only 2 groups:

Fixed

@CurtizJ CurtizJ self-assigned this May 25, 2021
@UnamedRus
Copy link
Copy Markdown
Contributor

It's possible that #10685 going to be covered by this pr?

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Sep 28, 2021

CLA assistant check
All committers have signed the CLA.

@CurtizJ
Copy link
Copy Markdown
Member

CurtizJ commented Nov 18, 2021

@nikvas0 Please, can you sign the CLA? So, I'll be able to merge this PR.

CurtizJ added a commit that referenced this pull request Nov 22, 2021
@CurtizJ CurtizJ merged commit 833652e into ClickHouse:master Nov 22, 2021
@sevirov
Copy link
Copy Markdown
Contributor

sevirov commented Nov 22, 2021

Internal documentation ticket: DOCSUP-18735

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants