Add _part_starting_offset virtual column and key condition support for offset-based querying #79417
|
Can you elaborate more on why this isn't stable across partitioned table inserts? Is it because each partition may have its own set of parts so the offsets aren't unique? Would it be sufficient and stable to instead use something like this for joining? EDIT: I guess it means my projection index now also has to store my partition columns, which is not ideal. It still sounds better than storing all primary columns to get the same index analysis effect. And since the partition key should be the same for all rows in a part, maybe there's a way to optimize storing it? |
|
Also, sorry for the basic question but I have very little understanding here: what ordering is the offset based on? What's a "preceding" part? Why is that ordering stable across merges? |
|
@EmeraldShift The current implementation of This can cause _part_starting_offset values to shift unexpectedly when parts are inserted into earlier partitions. For example: Initial computed Now a new part is inserted into Partition 1: Due to the way parts are ordered, the new global list may become: And the updated This shift breaks the assumption that The correctness of |
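To make the shift concrete, here is a minimal sketch, not the PR's implementation. The `Part` struct, the part names, and the row counts are all hypothetical; the only thing taken from the discussion is that a part's starting offset is the cumulative row count of the parts that precede it in the current part list.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a data part: just a name and a row count.
struct Part
{
    std::string name;
    uint64_t rows;
};

// Starting offset of each part = total rows of all parts before it
// in the (already ordered) part list.
std::vector<uint64_t> startingOffsets(const std::vector<Part> & parts)
{
    std::vector<uint64_t> offsets;
    offsets.reserve(parts.size());
    uint64_t running = 0;
    for (const auto & part : parts)
    {
        offsets.push_back(running);
        running += part.rows;
    }
    return offsets;
}
```

With the illustrative list `{p1_a: 100 rows, p2_a: 50 rows}` the offsets are `0, 100`. If a 10-row part `p1_b` is inserted into the earlier partition and the list becomes `{p1_a, p1_b, p2_a}`, the offsets are `0, 100, 110`: the untouched part `p2_a` moved from 100 to 110, which is the kind of shift described above.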
|
Thank you for the detailed explanation! |
11d5900
* [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250430) * Fix Build due to ClickHouse/ClickHouse#79067 * Fix build due to ClickHouse/ClickHouse#79417 --------- Co-authored-by: kyligence-git <[email protected]> Co-authored-by: Chang chen <[email protected]>
|
@amosbird @KochetovNicolai Hi! Can you add an experimental setting for new virtual columns? |
|
@fm4v Hi! I'd like to understand the rationale behind introducing an experimental flag for To my knowledge, there's no other pure virtual column that has or needs an experimental flag. The only related example is |
* [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250705) * Fix benchmark build * Fix Benchmark build due to ClickHouse/ClickHouse#79417 * Revert "Fix Build due to ClickHouse/ClickHouse#80931" This reverts commit 02d12f6. * Fix Build due to ClickHouse/ClickHouse#81886 * Fix Link issue due to ClickHouse/ClickHouse#83121 * Fix Build due to ClickHouse/ClickHouse#82604 * Fix Build due to ClickHouse/ClickHouse#82945 * Fix Build due to ClickHouse/ClickHouse#83214 --------- Co-authored-by: kyligence-git <[email protected]> Co-authored-by: Chang chen <[email protected]>
|
We see severe lock contention when getting the parts snapshot in v25.5.4 and suspect it is related to this PR.
WITH
(
SELECT now() - 3600
) AS start_time,
(
SELECT now()
) AS end_time
SELECT
arrayStringConcat(arrayMap(x -> demangle(addressToSymbol(x)), trace), '\n') AS trace_symbols,
count() AS sz
FROM system.trace_log
WHERE ((event_time >= start_time) AND (event_time <= end_time)) AND (trace_type = 'Real')
GROUP BY trace_symbols
ORDER BY sz DESC
LIMIT 5
SETTINGS allow_introspection_functions = 1
Query id: c0fbbcbb-9f7c-4293-8439-65d217508a4a
Row 1:
──────
trace_symbols: DB::(anonymous namespace)::writeTraceInfo(DB::TraceType, int, siginfo_t*, void*)
std::__1::mutex::lock()
DB::DataPartsLock::DataPartsLock(std::__1::mutex&)
DB::MergeTreeData::getStorageSnapshot(std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::Context const>) const
DB::IdentifierResolver::tryResolveTableIdentifier(DB::Identifier const&, std::__1::shared_ptr<DB::Context const> const&)
DB::QueryAnalyzer::tryResolveIdentifier(DB::IdentifierLookup const&, DB::IdentifierResolveScope&, DB::IdentifierResolveContext)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolveExpressionNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, bool, bool, bool)
DB::QueryAnalyzer::resolveQueryJoinTreeNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolve(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::IQueryTreeNode> const&, std::__1::shared_ptr<DB::Context const>)
DB::QueryAnalysisPass::run(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::Context const>)
DB::QueryTreePassManager::run(std::__1::shared_ptr<DB::IQueryTreeNode>, unsigned long)
DB::buildQueryTreeAndRunPasses(std::__1::shared_ptr<DB::IAST> const&, DB::SelectQueryOptions const&, std::__1::shared_ptr<DB::Context const> const&, std::__1::shared_ptr<DB::IStorage> const&)
DB::InterpreterSelectQueryAnalyzer::InterpreterSelectQueryAnalyzer(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)
DB::TableFunctionView::getActualTableStructure(std::__1::shared_ptr<DB::Context const>, bool) const
DB::TableFunctionView::executeImpl(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, DB::ColumnsDescription, bool) const
DB::ITableFunction::execute(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, DB::ColumnsDescription, bool, bool) const
DB::Context::executeTableFunction(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::ITableFunction> const&, std::__1::shared_ptr<DB::Context const> const&)
DB::QueryAnalyzer::resolveTableFunction(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&, bool)
DB::QueryAnalyzer::resolveQueryJoinTreeNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolve(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::IQueryTreeNode> const&, std::__1::shared_ptr<DB::Context const>)
DB::QueryAnalysisPass::run(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::Context const>)
DB::QueryTreePassManager::run(std::__1::shared_ptr<DB::IQueryTreeNode>, unsigned long)
DB::buildQueryTreeAndRunPasses(std::__1::shared_ptr<DB::IAST> const&, DB::SelectQueryOptions const&, std::__1::shared_ptr<DB::Context const> const&, std::__1::shared_ptr<DB::IStorage> const&)
DB::InterpreterSelectQueryAnalyzer::InterpreterSelectQueryAnalyzer(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)
std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> std::__1::__function::__policy_invoker<std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> (DB::InterpreterFactory::Arguments const&)>::__call_impl[abi:ne190107]<std::__1::__function::__default_alloc_func<DB::registerInterpreterSelectQueryAnalyzer(DB::InterpreterFactory&)::$_0, std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> (DB::InterpreterFactory::Arguments const&)>>(std::__1::__function::__policy_storage const*, DB::InterpreterFactory::Arguments const&) (.llvm.1496827156594342616)
DB::InterpreterFactory::get(std::__1::shared_ptr<DB::IAST>&, std::__1::shared_ptr<DB::Context>, DB::SelectQueryOptions const&)
DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, DB::ReadBuffer*, std::__1::shared_ptr<DB::IAST>&)
DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum)
DB::TCPHandler::runImpl()
DB::TCPHandler::run()
Poco::Net::TCPServerConnection::start()
Poco::Net::TCPServerDispatcher::run()
Poco::PooledThread::run()
Poco::ThreadImpl::runnableEntry(void*)
sz: 241180
Row 2:
──────
trace_symbols: DB::(anonymous namespace)::writeTraceInfo(DB::TraceType, int, siginfo_t*, void*)
std::__1::mutex::lock()
DB::DataPartsLock::DataPartsLock(std::__1::mutex&)
DB::MergeTreeData::getStorageSnapshot(std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::Context const>) const
DB::IdentifierResolver::tryResolveTableIdentifier(DB::Identifier const&, std::__1::shared_ptr<DB::Context const> const&)
DB::QueryAnalyzer::tryResolveIdentifier(DB::IdentifierLookup const&, DB::IdentifierResolveScope&, DB::IdentifierResolveContext)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolveExpressionNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, bool, bool, bool)
DB::QueryAnalyzer::resolveQueryJoinTreeNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolveExpressionNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, bool, bool, bool)
DB::QueryAnalyzer::resolveQueryJoinTreeNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolve(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::IQueryTreeNode> const&, std::__1::shared_ptr<DB::Context const>)
DB::QueryAnalysisPass::run(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::Context const>)
DB::QueryTreePassManager::run(std::__1::shared_ptr<DB::IQueryTreeNode>, unsigned long)
DB::buildQueryTreeAndRunPasses(std::__1::shared_ptr<DB::IAST> const&, DB::SelectQueryOptions const&, std::__1::shared_ptr<DB::Context const> const&, std::__1::shared_ptr<DB::IStorage> const&)
DB::InterpreterSelectQueryAnalyzer::InterpreterSelectQueryAnalyzer(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)
DB::StorageView::read(DB::QueryPlan&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::shared_ptr<DB::StorageSnapshot> const&, DB::SelectQueryInfo&, std::__1::shared_ptr<DB::Context const>, DB::QueryProcessingStage::Enum, unsigned long, unsigned long)
DB::(anonymous namespace)::buildQueryPlanForTableExpression(std::__1::shared_ptr<DB::IQueryTreeNode>, std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::SelectQueryInfo const&, DB::SelectQueryOptions const&, std::__1::shared_ptr<DB::PlannerContext>&, bool, bool)
DB::buildJoinTreeQueryPlan(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::SelectQueryInfo const&, DB::SelectQueryOptions&, std::__1::unordered_set<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::shared_ptr<DB::PlannerContext>&)
DB::Planner::buildPlanForQueryNode()
DB::Planner::buildQueryPlanIfNeeded()
DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, DB::ReadBuffer*, std::__1::shared_ptr<DB::IAST>&)
DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum)
DB::TCPHandler::runImpl()
DB::TCPHandler::run()
Poco::Net::TCPServerConnection::start()
Poco::Net::TCPServerDispatcher::run()
Poco::PooledThread::run()
Poco::ThreadImpl::runnableEntry(void*)
sz: 198878
Row 3:
──────
trace_symbols: DB::(anonymous namespace)::writeTraceInfo(DB::TraceType, int, siginfo_t*, void*)
std::__1::mutex::lock()
DB::DataPartsLock::DataPartsLock(std::__1::mutex&)
DB::MergeTreeData::getStorageSnapshot(std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::Context const>) const
DB::IdentifierResolver::tryResolveTableIdentifier(DB::Identifier const&, std::__1::shared_ptr<DB::Context const> const&)
DB::QueryAnalyzer::tryResolveIdentifier(DB::IdentifierLookup const&, DB::IdentifierResolveScope&, DB::IdentifierResolveContext)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolve(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::IQueryTreeNode> const&, std::__1::shared_ptr<DB::Context const>)
DB::QueryAnalysisPass::run(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::Context const>)
DB::QueryTreePassManager::run(std::__1::shared_ptr<DB::IQueryTreeNode>, unsigned long)
DB::buildQueryTreeAndRunPasses(std::__1::shared_ptr<DB::IAST> const&, DB::SelectQueryOptions const&, std::__1::shared_ptr<DB::Context const> const&, std::__1::shared_ptr<DB::IStorage> const&)
DB::InterpreterSelectQueryAnalyzer::InterpreterSelectQueryAnalyzer(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)
std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> std::__1::__function::__policy_invoker<std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> (DB::InterpreterFactory::Arguments const&)>::__call_impl[abi:ne190107]<std::__1::__function::__default_alloc_func<DB::registerInterpreterSelectQueryAnalyzer(DB::InterpreterFactory&)::$_0, std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> (DB::InterpreterFactory::Arguments const&)>>(std::__1::__function::__policy_storage const*, DB::InterpreterFactory::Arguments const&) (.llvm.1496827156594342616)
DB::InterpreterFactory::get(std::__1::shared_ptr<DB::IAST>&, std::__1::shared_ptr<DB::Context>, DB::SelectQueryOptions const&)
DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, DB::ReadBuffer*, std::__1::shared_ptr<DB::IAST>&)
DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum)
DB::TCPHandler::runImpl()
DB::TCPHandler::run()
Poco::Net::TCPServerConnection::start()
Poco::Net::TCPServerDispatcher::run()
Poco::PooledThread::run()
Poco::ThreadImpl::runnableEntry(void*)
sz: 171786
Row 4:
──────
trace_symbols: DB::(anonymous namespace)::writeTraceInfo(DB::TraceType, int, siginfo_t*, void*)
DB::ExecutionThreadContext::wait(std::__1::atomic<bool>&)
DB::ExecutorTasks::tryGetTask(DB::ExecutionThreadContext&)
DB::PipelineExecutor::executeStepImpl(unsigned long, std::__1::atomic<bool>*)
void std::__1::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__1::__function::__default_alloc_func<DB::PipelineExecutor::spawnThreadsImpl(std::__1::shared_ptr<DB::IAcquiredSlot>)::$_0, void ()>>(std::__1::__function::__policy_storage const*)
ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::worker()
void std::__1::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__1::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*>(void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*&&)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*&&)::'lambda'(), void ()>>(std::__1::__function::__policy_storage const*)
ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::worker()
void* std::__1::__thread_proxy[abi:ne190107]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::__1::thread>::ThreadFromThreadPool*>>(void*)
sz: 105504
Row 5:
──────
trace_symbols: DB::(anonymous namespace)::writeTraceInfo(DB::TraceType, int, siginfo_t*, void*)
std::__1::mutex::lock()
DB::DataPartsLock::DataPartsLock(std::__1::mutex&)
DB::MergeTreeData::getStorageSnapshot(std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::Context const>) const
DB::IdentifierResolver::tryResolveTableIdentifier(DB::Identifier const&, std::__1::shared_ptr<DB::Context const> const&)
DB::QueryAnalyzer::tryResolveIdentifier(DB::IdentifierLookup const&, DB::IdentifierResolveScope&, DB::IdentifierResolveContext)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolveExpressionNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, bool, bool, bool)
DB::QueryAnalyzer::resolveQueryJoinTreeNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolveExpressionNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, bool, bool, bool)
DB::QueryAnalyzer::resolveQueryJoinTreeNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolve(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::IQueryTreeNode> const&, std::__1::shared_ptr<DB::Context const>)
DB::QueryAnalysisPass::run(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::Context const>)
DB::QueryTreePassManager::run(std::__1::shared_ptr<DB::IQueryTreeNode>, unsigned long)
DB::buildQueryTreeAndRunPasses(std::__1::shared_ptr<DB::IAST> const&, DB::SelectQueryOptions const&, std::__1::shared_ptr<DB::Context const> const&, std::__1::shared_ptr<DB::IStorage> const&)
DB::InterpreterSelectQueryAnalyzer::InterpreterSelectQueryAnalyzer(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)
DB::TableFunctionView::getActualTableStructure(std::__1::shared_ptr<DB::Context const>, bool) const
DB::TableFunctionView::executeImpl(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, DB::ColumnsDescription, bool) const
DB::ITableFunction::execute(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, DB::ColumnsDescription, bool, bool) const
DB::Context::executeTableFunction(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::ITableFunction> const&, std::__1::shared_ptr<DB::Context const> const&)
DB::QueryAnalyzer::resolveTableFunction(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&, bool)
DB::QueryAnalyzer::resolveQueryJoinTreeNode(std::__1::shared_ptr<DB::IQueryTreeNode>&, DB::IdentifierResolveScope&, DB::QueryExpressionsAliasVisitor&)
DB::QueryAnalyzer::resolveQuery(std::__1::shared_ptr<DB::IQueryTreeNode> const&, DB::IdentifierResolveScope&)
DB::QueryAnalyzer::resolve(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::IQueryTreeNode> const&, std::__1::shared_ptr<DB::Context const>)
DB::QueryAnalysisPass::run(std::__1::shared_ptr<DB::IQueryTreeNode>&, std::__1::shared_ptr<DB::Context const>)
DB::QueryTreePassManager::run(std::__1::shared_ptr<DB::IQueryTreeNode>, unsigned long)
DB::buildQueryTreeAndRunPasses(std::__1::shared_ptr<DB::IAST> const&, DB::SelectQueryOptions const&, std::__1::shared_ptr<DB::Context const> const&, std::__1::shared_ptr<DB::IStorage> const&)
DB::InterpreterSelectQueryAnalyzer::InterpreterSelectQueryAnalyzer(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context const> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&)
std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> std::__1::__function::__policy_invoker<std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> (DB::InterpreterFactory::Arguments const&)>::__call_impl[abi:ne190107]<std::__1::__function::__default_alloc_func<DB::registerInterpreterSelectQueryAnalyzer(DB::InterpreterFactory&)::$_0, std::__1::unique_ptr<DB::IInterpreter, std::__1::default_delete<DB::IInterpreter>> (DB::InterpreterFactory::Arguments const&)>>(std::__1::__function::__policy_storage const*, DB::InterpreterFactory::Arguments const&) (.llvm.1496827156594342616)
DB::InterpreterFactory::get(std::__1::shared_ptr<DB::IAST>&, std::__1::shared_ptr<DB::Context>, DB::SelectQueryOptions const&)
DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, DB::ReadBuffer*, std::__1::shared_ptr<DB::IAST>&)
DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum)
DB::TCPHandler::runImpl()
DB::TCPHandler::run()
Poco::Net::TCPServerConnection::start()
Poco::Net::TCPServerDispatcher::run()
Poco::PooledThread::run()
Poco::ThreadImpl::runnableEntry(void*)
sz: 65665 |
| auto lock = lockParts(); | ||
| snapshot_data->parts = getVisibleDataPartsVectorUnlocked(query_context, lock); | ||
| parts = getVisibleDataPartsVectorUnlocked(query_context, lock); | ||
| snapshot_data->parts = RangesInDataParts(parts); |
We don't need to hold part lock when constructing RangesInDataParts
Nice spot! Do you mind submitting a PR to optimize this?
BTW, how did you notice this? It’s a bit surprising, since the RangesInDataParts c'tor seems simpler than getVisibleDataPartsVectorUnlocked.
We see some lock contention in getStorageSnapshot when upgrading from v25.3 to v25.5. I'm not 100% sure that constructing RangesInDataParts is the problem here, though. We are testing in our prod env whether moving snapshot_data->parts = RangesInDataParts(parts); out of the lock's scope helps.
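The suggested change can be sketched roughly as follows. This is an illustration under stated assumptions, not ClickHouse code: `Storage`, `PartRanges`, `addPart`, and `makeSnapshot` are hypothetical stand-ins for MergeTreeData, RangesInDataParts, and getStorageSnapshot. The point is only that the part list is copied under the mutex and the derived structure is built after the lock is released.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; not the real ClickHouse types.
struct Part
{
    std::string name;
};
using PartPtr = std::shared_ptr<const Part>;

// Plays the role of RangesInDataParts: built from a vector of parts.
struct PartRanges
{
    std::vector<PartPtr> parts;
    explicit PartRanges(std::vector<PartPtr> parts_) : parts(std::move(parts_)) {}
};

class Storage
{
public:
    void addPart(std::string name)
    {
        std::lock_guard<std::mutex> lock(parts_mutex);
        visible_parts.push_back(std::make_shared<const Part>(Part{std::move(name)}));
    }

    PartRanges makeSnapshot() const
    {
        std::vector<PartPtr> parts;
        {
            // Hold the lock only while copying the list of shared pointers.
            std::lock_guard<std::mutex> lock(parts_mutex);
            parts = visible_parts;
        }
        // The derived structure is constructed after the lock is released.
        return PartRanges(std::move(parts));
    }

private:
    mutable std::mutex parts_mutex;
    std::vector<PartPtr> visible_parts;
};
```

Because the snapshot owns its own copy of the pointers, parts added later do not change it. Whether this shortens the critical section enough to matter in practice is exactly what the production test mentioned above would have to show.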
We see some lock contention in getStorageSnapshot when upgrading from v25.3 to v25.5.
Yeah, I saw the trace info you posted. I'm curious — how did you first notice the contention? Was there a slowdown in queries or a drop in QPS that led you to check the trace_log? Or do you have some tooling in place that specifically monitors lock contention? Also, do you happen to have a comparison of the trace_log before and after the upgrade? It would be helpful to see how the contention pattern changed between v25.3 and v25.5.
Was there a slowdown in queries or a drop in QPS
Yes, we have a quota on max concurrent queries and almost never hit it unless something goes wrong.
Or do you have some tooling in place that specifically monitors lock contention
No, it's just my habit: when I see a performance regression, the first thing I check is where the query is spending its time, i.e. I check trace_log.
Also, do you happen to have a comparison of the trace_log before and after the upgrade? It would be helpful to see how the contention pattern changed between v25.3 and v25.5.
Unfortunately I don't have it now. But I've always used system.trace_log to investigate performance regressions, and off the top of my head, when things run properly, the Real trace is similar to the CPU trace and may contain other general traces (network, query pipeline execution...), but not locks.
|
|
I move construction of |
|
and after: |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Support _part_starting_offset virtual column in MergeTree-family tables. This column represents the cumulative row count of all preceding parts, calculated at query time based on the current part list. The cumulative values are retained throughout query execution and remain effective even after part pruning. Related internal logic has been refactored to support this behavior.
When expressions like _part_starting_offset + _part_offset or _part_offset + _part_starting_offset are used in the WHERE clause, key condition analysis will be properly applied, enabling efficient query-then-fetch patterns and cursor-based pagination. This improves the analysis process introduced in #58224. Now only one numeric column is used to filter instead of the (_part, _part_offset) pair.
This PR also improves stability over projection indexes that rely on _part and _part_offset, which are sensitive to part merges. See #78429. However, it may still not be reliable under workloads involving inserts into partitioned tables, materialization of lightweight delete masks, background merges in Collapsing, Replacing, or AggregatingMergeTree tables, etc. A proper query-level snapshot is required for this, and it will be implemented in another PR.
Documentation entry for user-facing changes
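As a rough illustration of why a single numeric value can replace the (_part, _part_offset) pair, the sketch below resolves a global row offset (starting offset plus offset within the part) back to a part index and a local offset with a binary search over the starting offsets. It is an assumption-laden sketch, not the key condition analysis in this PR: `locateRow` and its inputs are hypothetical, and the part list is assumed fixed for the duration of the lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Given the row count of each part (in part-list order) and a global row
// offset, return (part index, offset within that part), or nullopt if the
// offset is past the last row. Hypothetical helper for illustration only.
std::optional<std::pair<std::size_t, uint64_t>> locateRow(
    const std::vector<uint64_t> & part_rows, uint64_t global_offset)
{
    // Starting offset of part i = total rows of parts 0..i-1.
    std::vector<uint64_t> starts;
    starts.reserve(part_rows.size());
    uint64_t total = 0;
    for (uint64_t rows : part_rows)
    {
        starts.push_back(total);
        total += rows;
    }

    if (global_offset >= total)
        return std::nullopt;

    // Find the first start strictly greater than the offset, then step back
    // one: that is the last part whose starting offset is <= the offset.
    auto it = std::upper_bound(starts.begin(), starts.end(), global_offset);
    std::size_t index = static_cast<std::size_t>(it - starts.begin()) - 1;
    return std::make_pair(index, global_offset - starts[index]);
}
```

For hypothetical parts with 100, 10, and 50 rows, a global offset of 105 resolves to part index 1 at local offset 5, and 110 resolves to the first row of part index 2. The same stability caveat applies: if the part list changes between computing the offset and resolving it, the result can point at a different row.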