Added Analyzer, Planner #31796

Merged
kitaisreal merged 105 commits into ClickHouse:master from kitaisreal:identifier-resolver
Oct 25, 2022

Conversation

@kitaisreal
Contributor

@kitaisreal kitaisreal commented Nov 25, 2021

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added new infrastructure for query analysis and planning under the allow_experimental_analyzer setting.

@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label Nov 25, 2021
@UnamedRus
Contributor

EXPLAIN QUERYTREE
SELECT
    Carrier,
    sum(toFloat64(C3)) AS C1,
    sum(toFloat64(C1)) AS C2,
    sum(toFloat64(C2)) AS C3
FROM
(
    SELECT
        1 AS Carrier,
        count(CAST(1, 'Nullable(Int32)')) AS C1,
        max(rand()) AS C2,
        min(rand32()) AS C3
    FROM numbers(10)
    GROUP BY Carrier
) AS ITBL
GROUP BY Carrier

Query id: fdabff54-e6bc-4758-b216-7f028d11bfaf

[LAPTOP-] 2021.11.25 18:56:09.149166 [ 25283 ] <Fatal> BaseDaemon: ########################################
[LAPTOP-] 2021.11.25 18:56:09.149218 [ 25283 ] <Fatal> BaseDaemon: (version 21.12.1.0, build id: A6AA62D20769C3B3CEDB310A814E34FDF6510B8F) (from thread 25088) (query_id: fdabff54-e6bc-4758-b216-7f028d11bfaf) Received signal Segmentation fault (11)
[LAPTOP-] 2021.11.25 18:56:09.149234 [ 25283 ] <Fatal> BaseDaemon: Address: NULL pointer. Access: read. Address not mapped to object.
[LAPTOP-] 2021.11.25 18:56:09.149258 [ 25283 ] <Fatal> BaseDaemon: Stack trace: 0x12ed7504 0x12ed6a4b 0x12ed663b 0x12ed7535 0x12ed6a4b 0x12ed663b 0x12ed6453 0x12ed6206 0x12f609f4 0x12f604f6 0x1325b2c8 0x132591f5 0x13cfc7b1 0x13d0fbb9 0x16c566af 0x16c58b01 0x16d67889 0x16d64f80 0x7faced2b1609 0x7faced1d8293
[LAPTOP-] 2021.11.25 18:56:09.149309 [ 25283 ] <Fatal> BaseDaemon: 2. DB::QueryAnalyzer::getJoinTree(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::IScope>&) const @ 0x12ed7504 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149324 [ 25283 ] <Fatal> BaseDaemon: 3. DB::QueryAnalyzer::getSelectExpression(std::__1::shared_ptr<DB::IAST> const&, bool) const @ 0x12ed6a4b in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149333 [ 25283 ] <Fatal> BaseDaemon: 4. DB::QueryAnalyzer::getSelectWithUnionExpression(std::__1::shared_ptr<DB::IAST> const&, bool) const @ 0x12ed663b in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149345 [ 25283 ] <Fatal> BaseDaemon: 5. DB::QueryAnalyzer::getJoinTree(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::IScope>&) const @ 0x12ed7535 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149357 [ 25283 ] <Fatal> BaseDaemon: 6. DB::QueryAnalyzer::getSelectExpression(std::__1::shared_ptr<DB::IAST> const&, bool) const @ 0x12ed6a4b in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149370 [ 25283 ] <Fatal> BaseDaemon: 7. DB::QueryAnalyzer::getSelectWithUnionExpression(std::__1::shared_ptr<DB::IAST> const&, bool) const @ 0x12ed663b in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149382 [ 25283 ] <Fatal> BaseDaemon: 8. DB::QueryAnalyzer::initialize() @ 0x12ed6453 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149391 [ 25283 ] <Fatal> BaseDaemon: 9. DB::QueryAnalyzer::QueryAnalyzer(std::__1::shared_ptr<DB::IAST>, std::__1::shared_ptr<DB::Context const>) @ 0x12ed6206 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149406 [ 25283 ] <Fatal> BaseDaemon: 10. DB::InterpreterExplainQuery::executeImpl() @ 0x12f609f4 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149434 [ 25283 ] <Fatal> BaseDaemon: 11. DB::InterpreterExplainQuery::execute() @ 0x12f604f6 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149451 [ 25283 ] <Fatal> BaseDaemon: 12. ? @ 0x1325b2c8 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149465 [ 25283 ] <Fatal> BaseDaemon: 13. DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum) @ 0x132591f5 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149478 [ 25283 ] <Fatal> BaseDaemon: 14. DB::TCPHandler::runImpl() @ 0x13cfc7b1 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149490 [ 25283 ] <Fatal> BaseDaemon: 15. DB::TCPHandler::run() @ 0x13d0fbb9 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149506 [ 25283 ] <Fatal> BaseDaemon: 16. Poco::Net::TCPServerConnection::start() @ 0x16c566af in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149518 [ 25283 ] <Fatal> BaseDaemon: 17. Poco::Net::TCPServerDispatcher::run() @ 0x16c58b01 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149532 [ 25283 ] <Fatal> BaseDaemon: 18. Poco::PooledThread::run() @ 0x16d67889 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149540 [ 25283 ] <Fatal> BaseDaemon: 19. Poco::ThreadImpl::runnableEntry(void*) @ 0x16d64f80 in /usr/bin/clickhouse
[LAPTOP-] 2021.11.25 18:56:09.149557 [ 25283 ] <Fatal> BaseDaemon: 20. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
[LAPTOP-] 2021.11.25 18:56:09.149578 [ 25283 ] <Fatal> BaseDaemon: 21. __clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
[LAPTOP-] 2021.11.25 18:56:09.259074 [ 25283 ] <Fatal> BaseDaemon: Calculated checksum of the binary: 8813B5288386ACE95907CD2F14012888. There is no information about the reference checksum.
SELECT
    Carrier,
    sum(toFloat64(C3)) AS C1,
    sum(toFloat64(C1)) AS C2,
    sum(toFloat64(C2)) AS C3
FROM
(
    SELECT
        1 AS Carrier,
        count(CAST(1, 'Nullable(Int32)')) AS C1,
        max(rand()) AS C2,
        min(rand32()) AS C3
    FROM numbers(10)
    GROUP BY Carrier
) AS ITBL
GROUP BY Carrier

Query id: 104e20bf-4998-43f8-bdad-fc385ca68d66


0 rows in set. Elapsed: 0.003 sec.

Received exception from server (version 21.12.1):
Code: 174. DB::Exception: Received from localhost:9000. DB::Exception: Cyclic aliases. (CYCLIC_ALIASES)

Member

@novikd novikd left a comment

Identifier resolution looks overcomplicated. The IExpression and IScope interfaces are very complex.

  1. IExpression doesn't represent an expression; it is a part of the QueryTree, so in my opinion it's better to rename it to something like IQueryNode.
  2. The result of identifier resolution shouldn't be an IExpression. ColumnExpression seems completely unnatural. It's better to return something like ResolveEntity, which may have different implementations depending on what it references.
  3. Usually, things similar to a query tree do not know anything about how resolution works. Their purpose is to keep information about structure, cache the resolution result, and maybe run resolution.
  4. Usually, a scope's job is to keep an in-memory map of id -> resolve entity. The scope hierarchy is intended to help with identifier shadowing. This means resolution is performed as follows: you find the scope to start with, then you find the closest scope containing info about the identifier.
    Also, one query tree node may introduce several scopes. For example, WITH ... SELECT ... should probably create separate scopes for the WITH clause and the SELECT clause.
  5. It would be much easier to read and understand if resolution were implemented as a visitor of the query tree.
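
The scope model described in point 4 of the review can be sketched roughly as follows. This is a minimal, language-agnostic illustration written in Python, not code from this pull request: the names `Scope`, `ResolvedEntity`, `declare`, and `resolve` are hypothetical, and the "WITH" and "SELECT" scopes are set up by hand only to show shadowing and the walk to the closest enclosing scope.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ResolvedEntity:
    """Hypothetical stand-in for whatever an identifier refers to."""
    kind: str    # e.g. "column", "alias", "cte" (illustrative values)
    name: str
    origin: str  # label of the scope that declared it


@dataclass
class Scope:
    """One level of the hierarchy: a plain map id -> resolved entity."""
    label: str
    parent: Optional[Scope] = None
    entries: Dict[str, ResolvedEntity] = field(default_factory=dict)

    def declare(self, kind: str, name: str) -> ResolvedEntity:
        # Register an identifier in this scope only.
        entity = ResolvedEntity(kind, name, self.label)
        self.entries[name] = entity
        return entity

    def resolve(self, name: str) -> Optional[ResolvedEntity]:
        # Start at this scope and walk outward; the closest scope that
        # knows the identifier wins, which is what gives shadowing.
        scope: Optional[Scope] = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        return None


# Assumption for illustration: one query node introduces two scopes,
# one for the WITH clause and a nested one for the SELECT clause.
with_scope = Scope("WITH")
with_scope.declare("cte", "ITBL")
with_scope.declare("alias", "x")

select_scope = Scope("SELECT", parent=with_scope)
select_scope.declare("alias", "x")  # shadows the WITH-level "x"

shadowed = select_scope.resolve("x")
inherited = select_scope.resolve("ITBL")
missing = select_scope.resolve("no_such_identifier")
```

Here `shadowed` comes from the SELECT scope, `inherited` is found one level up in the WITH scope, and `missing` is `None`. A visitor over the query tree (point 5) would create or pick such scopes as it descends and call `resolve` for each identifier node it meets.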

@kitaisreal kitaisreal force-pushed the identifier-resolver branch from 5949875 to d3d4908 Compare July 14, 2022 11:20
@kitaisreal kitaisreal added the can be tested Allows running workflows for external contributors label Jul 16, 2022
@kitaisreal kitaisreal force-pushed the identifier-resolver branch 3 times, most recently from 0e6484c to 98b9ad5 Compare July 19, 2022 10:54
@kitaisreal kitaisreal force-pushed the identifier-resolver branch from a4bcabe to 3fadf32 Compare July 21, 2022 14:45
@kitaisreal kitaisreal force-pushed the identifier-resolver branch 2 times, most recently from 95250f4 to 3e049ac Compare August 15, 2022 16:34
@kitaisreal kitaisreal force-pushed the identifier-resolver branch 6 times, most recently from a34810a to f149a4e Compare September 10, 2022 20:06
@kitaisreal kitaisreal changed the title Added Analyzer Added Analyzer, Planner Oct 25, 2022
@kitaisreal kitaisreal merged commit 06fe6f3 into ClickHouse:master Oct 25, 2022

Labels

can be tested Allows running workflows for external contributors pr-improvement Pull request with some product improvements

8 participants