[CH] Support lazy projection #9141

@baibaichen

Description

We need to support ClickHouse/ClickHouse#55518. Currently, the following plan fails with a block structure mismatch:

2025-03-26 02:43:38.231 <Error> SerializedPlanParser: Invalid clickhouse plan(35) =>
Expression (Rename Output)
Header: n_nationkey#912 Int64
        n_name#913 String
        n_regionkey#914 Int64
Actions: INPUT : 0 -> n_nationkey Int64 : 0
         INPUT : 1 -> n_name String : 1
         INPUT : 2 -> n_regionkey Int64 : 2
         ALIAS n_nationkey :: 0 -> n_nationkey#912 Int64 : 3
         ALIAS n_name :: 1 -> n_name#913 String : 0
         ALIAS n_regionkey :: 2 -> n_regionkey#914 Int64 : 1
Positions: 3 0 1
  Expression (Project)
  Header: n_nationkey Int64
          n_name String
          n_regionkey Int64
  Actions: INPUT :: 0 -> n_nationkey Int64 : 0
           INPUT :: 1 -> n_name String : 1
           INPUT :: 2 -> n_regionkey Int64 : 2
           INPUT :: 3 -> plus(n_regionkey,1_7) Int64 : 3
  Positions: 0 1 2
    LazilyRead (Lazily Read)
    Header: n_nationkey Int64
            n_name String
            n_regionkey Int64
            plus(n_regionkey,1_7) Int64
    Lazily read columns: n_nationkey
      Limit (LIMIT)
      Header: n_nationkey Int64 (Lazy)
              n_name String
              n_regionkey Int64
              plus(n_regionkey,1_7) Int64
      Limit 3
      Offset 0
        Sorting (Sorting step)
        Header: n_nationkey Int64 (Lazy)
                n_name String
                n_regionkey Int64
                plus(n_regionkey,1_7) Int64
        Sort description: n_name ASC, plus(n_regionkey,1_7) ASC
        Limit 3
          Expression (Project)
          Header: n_nationkey Int64 (Lazy)
                  n_name String
                  n_regionkey Int64
                  plus(n_regionkey,1_7) Int64
          Actions: INPUT :: 0 -> n_nationkey Int64 : 0
                   INPUT :: 1 -> n_name String : 1
                   INPUT : 2 -> n_regionkey Int64 : 2
                   COLUMN Const(Int64) -> 1_7 Int64 : 3
                   FUNCTION plus(n_regionkey : 2, 1_7 :: 3) -> plus(n_regionkey,1_7) Int64 : 4
          Positions: 0 1 2 4
            ReadFromMergeTree (default.nation)
            Header: n_name String
                    n_regionkey Int64
                    n_nationkey Int64 (Lazy)
            ReadType: Default
            Parts: 1
            Granules: 1

2025-03-26 17:43:38.934 [442][Executor task launch worker for task 0.0 in stage 32.0 (TID 35)] ERROR org.apache.spark.task.TaskResources: Task 35 failed by error: 
org.apache.gluten.exception.GlutenException: Block structure mismatch in (columns with identical name must have identical structure) stream: different columns:
n_nationkey Int64 Int64(size = 0)
n_nationkey Int64 Lazy(size = 0)
0. /home/chang/SourceCode/backend1/contrib/llvm-project/libcxx/include/__exception/exception.h:113: std::exception::capture() @ 0x000000000c30f442
1. /home/chang/SourceCode/backend1/contrib/llvm-project/libcxx/include/__exception/exception.h:85: std::exception::exception[abi:se190107]() @ 0x000000000c30f40d
2. /home/chang/SourceCode/backend1/base/poco/Foundation/src/Exception.cpp:27: Poco::Exception::Exception(String const&, int) @ 0x00000000264efaa0
3. /home/chang/SourceCode/backend1/src/Common/Exception.cpp:108: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x0000000015eac4f1
4. /home/chang/SourceCode/backend1/src/Common/Exception.h:112: DB::Exception::Exception(String&&, int, bool) @ 0x000000000c30c7ea
5. /home/chang/SourceCode/backend1/src/Common/Exception.h:56: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000c30a9a9
6. /home/chang/SourceCode/backend1/src/Common/Exception.h:130: DB::Exception::Exception<std::basic_string_view<char, std::char_traits<char>>&, String, String>(int, FormatStringHelperImpl<std::type_identity<std::basic_string_view<char, std::char_traits<char>>&>::type, std::type_identity<String>::type, std::type_identity<String>::type>, std::basic_string_view<char, std::char_traits<char>>&, String&&, String&&) @ 0x000000001a8ff5bb
7. /home/chang/SourceCode/backend1/src/Core/Block.cpp:41: void DB::onError<void, std::basic_string_view<char, std::char_traits<char>>&, String, String>(int, FormatStringHelperImpl<std::type_identity<std::basic_string_view<char, std::char_traits<char>>&>::type, std::type_identity<String>::type, std::type_identity<String>::type>, std::basic_string_view<char, std::char_traits<char>>&, String&&, String&&) @ 0x000000001a8f430f
8. /home/chang/SourceCode/backend1/src/Core/Block.cpp:89: void DB::checkColumnStructure<void>(DB::ColumnWithTypeAndName const&, DB::ColumnWithTypeAndName const&, std::basic_string_view<char, std::char_traits<char>>, bool, int) @ 0x000000001a8ed154
9. /home/chang/SourceCode/backend1/src/Core/Block.cpp:202: DB::Block::insert(DB::ColumnWithTypeAndName) @ 0x000000001a8ed639
10. /home/chang/SourceCode/backend1/src/Storages/MergeTree/MergeTreeSelectProcessor.cpp:306: DB::MergeTreeSelectProcessor::injectLazilyReadColumns(unsigned long, DB::Block&, DB::MergeTreeReadTask*, std::shared_ptr<DB::LazilyReadInfo> const&) @ 0x000000001e8c3490
11. /home/chang/SourceCode/backend1/src/Storages/MergeTree/MergeTreeSelectProcessor.cpp:319: DB::MergeTreeSelectProcessor::transformHeader(DB::Block, std::shared_ptr<DB::LazilyReadInfo> const&, std::shared_ptr<DB::PrewhereInfo> const&) @ 0x000000001e8c1ef9
12. /home/chang/SourceCode/backend1/src/Storages/MergeTree/MergeTreeSelectProcessor.cpp:108: DB::MergeTreeSelectProcessor::MergeTreeSelectProcessor(std::shared_ptr<DB::IMergeTreeReadPool>, std::unique_ptr<DB::IMergeTreeSelectAlgorithm, std::default_delete<DB::IMergeTreeSelectAlgorithm>>, std::shared_ptr<DB::PrewhereInfo> const&, std::shared_ptr<DB::LazilyReadInfo> const&, DB::ExpressionActionsSettings const&, DB::MergeTreeReaderSettings const&) @ 0x000000001e8c06de
13. /home/chang/SourceCode/backend1/contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:634: std::__unique_if<DB::MergeTreeSelectProcessor>::__unique_single std::make_unique[abi:se190107]<DB::MergeTreeSelectProcessor, std::shared_ptr<DB::IMergeTreeReadPool>&, std::unique_ptr<DB::IMergeTreeSelectAlgorithm, std::default_delete<DB::IMergeTreeSelectAlgorithm>>, std::shared_ptr<DB::PrewhereInfo>&, std::shared_ptr<DB::LazilyReadInfo>&, DB::ExpressionActionsSettings&, DB::MergeTreeReaderSettings&>(std::shared_ptr<DB::IMergeTreeReadPool>&, std::unique_ptr<DB::IMergeTreeSelectAlgorithm, std::default_delete<DB::IMergeTreeSelectAlgorithm>>&&, std::shared_ptr<DB::PrewhereInfo>&, std::shared_ptr<DB::LazilyReadInfo>&, DB::ExpressionActionsSettings&, DB::MergeTreeReaderSettings&) @ 0x000000001f802e9e
14. /home/chang/SourceCode/backend1/src/Processors/QueryPlan/ReadFromMergeTree.cpp:662: DB::ReadFromMergeTree::readInOrder(DB::RangesInDataParts, std::vector<String, std::allocator<String>>, DB::MergeTreeReadPoolBase::PoolSettings, DB::MergeTreeReadType, unsigned long) @ 0x000000001f7e6324
15. /home/chang/SourceCode/backend1/src/Processors/QueryPlan/ReadFromMergeTree.cpp:740: DB::ReadFromMergeTree::read(DB::RangesInDataParts, std::vector<String, std::allocator<String>>, DB::MergeTreeReadType, unsigned long, unsigned long, bool) @ 0x000000001f7e6ecc
16. /home/chang/SourceCode/backend1/src/Processors/QueryPlan/ReadFromMergeTree.cpp:1035: DB::ReadFromMergeTree::spreadMarkRangesAmongStreams(DB::RangesInDataParts&&, unsigned long, std::vector<String, std::allocator<String>> const&) @ 0x000000001f7e9723
17. /home/chang/SourceCode/backend1/src/Processors/QueryPlan/ReadFromMergeTree.cpp:2181: DB::ReadFromMergeTree::spreadMarkRanges(DB::RangesInDataParts&&, unsigned long, DB::ReadFromMergeTree::AnalysisResult&, std::optional<DB::ActionsDAG>&) @ 0x000000001f7f4d1c
18. /home/chang/SourceCode/backend1/src/Processors/QueryPlan/ReadFromMergeTree.cpp:2295: DB::ReadFromMergeTree::initializePipeline(DB::QueryPipelineBuilder&, DB::BuildQueryPipelineSettings const&) @ 0x000000001f7f62e7
19. /home/chang/SourceCode/backend1/src/Processors/QueryPlan/ISourceStep.cpp:20: DB::ISourceStep::updatePipeline(std::vector<std::unique_ptr<DB::QueryPipelineBuilder, std::default_delete<DB::QueryPipelineBuilder>>, std::allocator<std::unique_ptr<DB::QueryPipelineBuilder, std::default_delete<DB::QueryPipelineBuilder>>>>, DB::BuildQueryPipelineSettings const&) @ 0x000000001f758649
20. /home/chang/SourceCode/backend1/src/Processors/QueryPlan/QueryPlan.cpp:201: DB::QueryPlan::buildQueryPipeline(DB::QueryPlanOptimizationSettings const&, DB::BuildQueryPipelineSettings const&) @ 0x000000001f79d91e
21. /home/chang/SourceCode/backend1/utils/extern-local-engine/Parser/SerializedPlanParser.cpp:323: local_engine::SerializedPlanParser::buildQueryPipeline(DB::QueryPlan&) const @ 0x000000001654e29c
22. /home/chang/SourceCode/backend1/utils/extern-local-engine/Parser/SerializedPlanParser.cpp:339: local_engine::SerializedPlanParser::createExecutor(std::unique_ptr<DB::QueryPlan, std::default_delete<DB::QueryPlan>>, substrait::Plan const&) const @ 0x000000001654c5da
23. /home/chang/SourceCode/backend1/utils/extern-local-engine/Parser/SerializedPlanParser.cpp:220: local_engine::SerializedPlanParser::createExecutor(substrait::Plan const&) @ 0x000000001654c556
24. /home/chang/SourceCode/backend1/utils/extern-local-engine/local_engine_jni.cpp:266: Java_org_apache_gluten_vectorized_ExpressionEvaluatorJniWrapper_nativeCreateKernelWithIterator @ 0x000000000c2ea29b
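
To make the failure mode easier to picture, here is a minimal Python sketch of the check that the trace suggests is firing inside `DB::Block::insert`: a column whose name already exists in the block must have the same structure as the existing one. This is a toy model, not ClickHouse code. The `Column`, `Block`, and `BlockStructureMismatch` names and the `kind` field are hypothetical and only mirror the two lines of the error message (`Int64(size = 0)` versus `Lazy(size = 0)`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a header column."""
    name: str        # column name, e.g. "n_nationkey"
    data_type: str   # logical type, e.g. "Int64"
    kind: str        # physical representation, e.g. "Int64" or "Lazy"


class BlockStructureMismatch(Exception):
    """Raised when two same-named columns differ in structure."""


def describe(column: Column) -> str:
    # Mimics the shape of the two lines printed in the error message.
    return f"{column.name} {column.data_type} {column.kind}(size = 0)"


class Block:
    """Toy header block: an ordered list of columns."""

    def __init__(self) -> None:
        self.columns: list[Column] = []

    def insert(self, column: Column) -> None:
        # Assumption: a same-named column must match the existing one exactly.
        for existing in self.columns:
            if existing.name == column.name and existing != column:
                raise BlockStructureMismatch(
                    "columns with identical name must have identical structure:\n"
                    f"{describe(existing)}\n{describe(column)}"
                )
        self.columns.append(column)


# The header already carries n_nationkey as a plain Int64 column ...
header = Block()
header.insert(Column("n_name", "String", "String"))
header.insert(Column("n_nationkey", "Int64", "Int64"))

# ... so injecting the lazily read variant under the same name is rejected.
try:
    header.insert(Column("n_nationkey", "Int64", "Lazy"))
except BlockStructureMismatch as error:
    print(error)
```

In this model the second insert of `n_nationkey` fails because only the physical representation differs, which matches the pair of columns reported in the exception. Which side of the Gluten plan produces the non-lazy header is not established by the log above.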
