Reduce memory consumption by mutations with big subqueries used with IN by davenger · Pull Request #46835 · ClickHouse/ClickHouse

davenger · 2023-02-24T16:36:55Z

Changelog category (leave one):

Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

If we run a mutation with IN (subquery) like this:
ALTER TABLE t UPDATE col='new value' WHERE id IN (SELECT id FROM huge_table)
and the table t has multiple parts, then a set for the subquery SELECT id FROM huge_table is built in memory for each part. If there are many parts, this can consume a lot of memory (and lead to an OOM) and CPU.
The solution is to introduce a short-lived cache of sets that are currently being built by mutation tasks. If another task of the same mutation is executed concurrently, it can look up the set in the cache, wait for it to be built, and reuse it.
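The "build once, let concurrent tasks wait and reuse" idea can be illustrated with a minimal sketch. It assumes a cache keyed by a string, with the first caller for a key building the set and later callers waiting on a `std::shared_future`. `ToySet`, `ToySetPtr`, `ToySetCache`, and `getOrBuild` are hypothetical names chosen for illustration, not the classes or methods added by this PR.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <future>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical stand-ins for illustration; these are not the ClickHouse types.
using ToySet = std::unordered_set<int>;
using ToySetPtr = std::shared_ptr<const ToySet>;

class ToySetCache
{
public:
    // The first caller for a key runs `build`; every other caller
    // blocks on the shared future until the set has been published.
    ToySetPtr getOrBuild(const std::string & key, const std::function<ToySetPtr()> & build)
    {
        std::promise<ToySetPtr> promise;
        std::shared_future<ToySetPtr> future;
        bool is_builder = false;
        {
            std::lock_guard<std::mutex> lock(mutex);
            auto it = entries.find(key);
            if (it == entries.end())
            {
                future = promise.get_future().share();
                entries.emplace(key, future);
                is_builder = true;
            }
            else
                future = it->second;
        }

        if (is_builder)
        {
            // Build outside the lock so lookups for other keys are not blocked.
            try { promise.set_value(build()); }
            catch (...) { promise.set_exception(std::current_exception()); }
        }

        assert(future.valid());
        return future.get(); // waits if another caller is still building
    }

private:
    std::mutex mutex;
    std::map<std::string, std::shared_future<ToySetPtr>> entries;
};
```

In this sketch the entries live as long as the cache object does, so keeping the cache itself short-lived is what bounds memory use.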

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

UnamedRus · 2023-02-24T16:42:38Z

Maybe we can have it under a setting?

vdimir · 2023-03-10T12:39:59Z

src/Interpreters/ActionsVisitor.cpp

        if (SetPtr set = data.prepared_sets->get(set_key))
            return set;

+        if (data.prepared_sets_cache)


Suggested change

-        if (data.prepared_sets_cache)
+        /// Used to share sets for a mutation between MutateTasks for different parts
+        if (data.prepared_sets_cache)

vdimir · 2023-03-10T12:47:47Z

src/Storages/StorageMergeTree.h

+    /// NOTE: we only store weak_ptr to PreparedSetsCache, so that the cache is shared between mutation tasks that are executed in parallel.
+    /// The goal is to avoid consuming a lot of memory when the same big sets are used by multiple tasks at the same time.
+    /// If the tasks are executed without time overlap, we will destroy the cache to free memory, and the next task might rebuild the same sets.
+    std::mutex mutation_prepared_sets_cache_mutex;


Is it possible to have several ongoing mutations? Here

ClickHouse/src/Storages/StorageMergeTree.h

Line 149 in c2611c3

std::map<UInt64, MergeTreeMutationEntry> current_mutations_by_version;

we have a map for mutations, so we probably need the same here 🤔

Changed this to a map.
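The lifetime rule described in the code comment (the storage holds only a `weak_ptr`, running tasks hold the `shared_ptr`) combined with one cache per mutation could look roughly like the sketch below. `ToyStorage`, `ToyPreparedSetsCache`, and `getCacheForMutation` are hypothetical names, and keying the map by an integer mutation version is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical placeholder; the shared sets would live inside this object.
struct ToyPreparedSetsCache {};
using ToyCachePtr = std::shared_ptr<ToyPreparedSetsCache>;

class ToyStorage
{
public:
    // Tasks of the same mutation that overlap in time get the same cache.
    // Once the last task releases its shared_ptr the cache is destroyed,
    // and a later task of that mutation starts with a fresh one.
    ToyCachePtr getCacheForMutation(std::int64_t mutation_version)
    {
        std::lock_guard<std::mutex> lock(cache_mutex);

        // Drop map entries whose caches have already been destroyed.
        for (auto it = caches.begin(); it != caches.end();)
            it = it->second.expired() ? caches.erase(it) : std::next(it);

        std::weak_ptr<ToyPreparedSetsCache> & slot = caches[mutation_version];
        ToyCachePtr cache = slot.lock();
        if (!cache)
        {
            cache = std::make_shared<ToyPreparedSetsCache>();
            slot = cache; // the storage keeps only a weak reference
        }
        assert(cache);
        return cache;
    }

private:
    std::mutex cache_mutex;
    std::map<std::int64_t, std::weak_ptr<ToyPreparedSetsCache>> caches;
};
```

Because ownership stays with the tasks, no explicit eviction policy is needed in this sketch: memory is released as soon as no task of that mutation is running.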

vdimir · 2023-03-10T12:48:35Z

tests/queries/0_stateless/02581_share_big_sets_between_mutation_tasks.sql

+SELECT name FROM system.parts WHERE database=currentDatabase() AND table = '02581_trips' AND active ORDER BY name;
+
+-- Run mutation with a 'IN big subquery'
+ALTER TABLE 02581_trips UPDATE description='' WHERE id IN (SELECT (number+5)::UInt32 FROM numbers(10000000)) SETTINGS mutations_sync=2;


It's worth adding cases for other types of mutations.

Added a test case with different mutations.

davenger · 2023-04-14T15:19:44Z

Maybe we can have it under a setting?

Resurrected the use_index_for_in_with_subqueries_max_values setting.

davenger · 2023-04-15T06:59:14Z

Stress test (debug) failure - #48777

src/Core/Settings.h

src/Interpreters/ExpressionAnalyzer.cpp

vdimir · 2023-04-17T13:17:30Z

src/Planner/PlannerContext.h

-    explicit PlannerSet(SetPtr set_, QueryTreeNodePtr subquery_node_)
-        : set(std::move(set_))
+    explicit PlannerSet(QueryTreeNodePtr subquery_node_)
+        : set(promise_to_build_set.get_future())


Where should promise_to_build_set be assigned?

If I understand the question correctly, the answer is: promise_to_build_set is default-constructed, and after that it has a valid internal state, so we can get a future from it.

And its value is then set in CreatingSet via promise_to_fill_set.set_value, got it.
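The point about the default-constructed promise can be shown in isolation. This is a minimal sketch, assuming a class that owns both the promise and the future derived from it; `ToyPlannerSet`, `fill`, and `isReady` are hypothetical names, not the PR's `PlannerSet` interface. One detail worth noting is that the promise member must be declared before the future member, because members are initialized in declaration order.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <unordered_set>
#include <utility>

// Hypothetical stand-ins for illustration; these are not the ClickHouse types.
using ToySet = std::unordered_set<int>;
using ToySetPtr = std::shared_ptr<const ToySet>;

class ToyPlannerSet
{
public:
    // A default-constructed std::promise already owns a valid shared state,
    // so a future can be taken from it before any value exists.
    ToyPlannerSet() : set(promise_to_build_set.get_future().share()) {}

    // Called later by whoever actually builds the set.
    void fill(ToySetPtr value) { promise_to_build_set.set_value(std::move(value)); }

    bool isReady() const
    {
        return set.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
    }

    // Blocks until fill() has been called.
    ToySetPtr get() const
    {
        assert(set.valid());
        return set.get();
    }

private:
    std::promise<ToySetPtr> promise_to_build_set; // declared first: initialized first
    std::shared_future<ToySetPtr> set;
};
```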

src/Storages/StorageMergeTree.cpp

src/DataTypes/DataTypeSet.h

vdimir · 2023-04-17T13:33:25Z

src/Processors/Transforms/CreatingSetsTransform.cpp

+                const SetPtr & ready_set = set_built_by_another_thread.get();
+                if (!ready_set)
+                {
+                    LOG_TRACE(log, "Failed to use set from cache, key: {}", subquery.key);


Why might it fail?

Below I describe a case where the Set from the ExpressionAnalyzer code path might not be built because of size limits.

vdimir · 2023-04-17T13:44:24Z

src/Processors/Transforms/CreatingSetsTransform.cpp

+            {
+                LOG_TRACE(log, "Waiting for set to be build by another thread, key: {}", subquery.key);
+                SharedSet set_built_by_another_thread = std::move(std::get<1>(from_cache));
+                const SetPtr & ready_set = set_built_by_another_thread.get();


Are we supposed to block here and wait until building is finished?

Yes, we block here waiting for the future value to be set.

vdimir · 2023-04-17T13:47:09Z

src/Interpreters/ExpressionAnalyzer.cpp

+        else
+        {
+            LOG_TRACE(getLogger(), "Waiting for set, key: {}", set_key.toString());
+            set = std::get<1>(from_cache).get();


What's the key difference between this code and the similar code in CreatingSetsTransform? There is no loop here.

In the ExpressionAnalyzer code the Set is used as an optimization that can be skipped, e.g. if the set is too big: we set size limits and stop building if they are exceeded. So if building the set fails, we can continue without it. CreatingSetsTransform expects the Set to be built successfully and cannot continue without it. That is why CreatingSetsTransform has a retry loop: it handles the case when another thread was building the set from the ExpressionAnalyzer code and failed due to the limits.
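The difference between the two call sites can be sketched as follows, under the assumption that a builder that gives up publishes a null pointer. All names here (`ToySetCache`, `tryBecomeBuilder`, `tryGetOptional`, `getRequired`) are hypothetical and the cache is a simplified stand-in, not the PR's PreparedSetsCache. If a builder throws in this sketch, waiters see a broken promise; that case is left unhandled for brevity.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical stand-ins for illustration; these are not the ClickHouse types.
using ToySet = std::unordered_set<int>;
using ToySetPtr = std::shared_ptr<const ToySet>;
using Builder = std::function<ToySetPtr()>;

class ToySetCache
{
public:
    // Returns true if the caller must build the set and fulfil `promise`.
    // Otherwise `future` refers to an entry owned by another caller.
    bool tryBecomeBuilder(const std::string & key, std::promise<ToySetPtr> & promise,
                          std::shared_future<ToySetPtr> & future)
    {
        std::lock_guard<std::mutex> lock(mutex);
        auto it = entries.find(key);
        if (it != entries.end())
        {
            const bool finished = it->second.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
            const bool failed = finished && it->second.get() == nullptr;
            if (!failed)
            {
                future = it->second;
                return false;
            }
            entries.erase(it); // a failed attempt may be replaced by a new builder
        }
        future = promise.get_future().share();
        entries.emplace(key, future);
        return true;
    }

private:
    std::mutex mutex;
    std::map<std::string, std::shared_future<ToySetPtr>> entries;
};

// Optimization path: a null result means "continue without the set".
ToySetPtr tryGetOptional(ToySetCache & cache, const std::string & key, const Builder & build_with_limits)
{
    std::promise<ToySetPtr> promise;
    std::shared_future<ToySetPtr> future;
    if (cache.tryBecomeBuilder(key, promise, future))
        promise.set_value(build_with_limits()); // may publish nullptr
    return future.get();
}

// Mandatory path: retry until a usable set exists.
ToySetPtr getRequired(ToySetCache & cache, const std::string & key, const Builder & build_without_limits)
{
    while (true)
    {
        std::promise<ToySetPtr> promise;
        std::shared_future<ToySetPtr> future;
        if (cache.tryBecomeBuilder(key, promise, future))
        {
            promise.set_value(build_without_limits());
            assert(future.valid());
            return future.get();
        }
        if (ToySetPtr ready = future.get())
            return ready;
        // The other builder gave up; loop and try to become the builder.
    }
}
```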

src/Interpreters/PreparedSets.h

vdimir · 2023-04-17T15:40:57Z

src/Interpreters/PreparedSets.cpp

    return true;
 }

+String PreparedSetKey::toString() const


Btw, in the analyzer we need to hash the same set differently depending on the type of the left operand; check out the comment at PlannerContext::createSetKey in #48754

This solution solves the issue (at least it fixes test 00700_decimal_compare), but I'm not sure whether it is too ad hoc.

vdimir

I believe I understand the changes. While reading, I found it a bit difficult to follow where the set state changes and where we reference the same future or promise, but I don't think it can be simplified.

Co-authored-by: Vladimir C <[email protected]>

tavplubix · 2023-05-17T11:19:08Z

@davenger, the new tests are flaky with analyzer, please take a look: link

davenger marked this pull request as draft February 24, 2023 16:37

robot-ch-test-poll2 added the pr-not-for-changelog label Feb 24, 2023

UnamedRus mentioned this pull request Feb 24, 2023

use_index_for_in_with_subqueries_max_values setting #39865

Closed

davenger added the force tests label Feb 24, 2023

davenger force-pushed the reduce_mem_in_mutation_with_subquery branch 5 times, most recently from 3429ce6 to 99eb9e1 Compare March 2, 2023 21:49

davenger removed the force tests label Mar 3, 2023

davenger changed the title ~~[WIP]Don't use large sets in KeyCondition~~ [WIP]Reduce memory consumption by mutations with big subqueries used with IN Mar 3, 2023

davenger force-pushed the reduce_mem_in_mutation_with_subquery branch 2 times, most recently from 8f75251 to 3bd503e Compare March 9, 2023 22:09

vdimir self-assigned this Mar 10, 2023

vdimir reviewed Mar 10, 2023


davenger force-pushed the reduce_mem_in_mutation_with_subquery branch from 35ec895 to 3bad593 Compare April 4, 2023 10:02

davenger added the force tests label Apr 4, 2023

davenger force-pushed the reduce_mem_in_mutation_with_subquery branch 8 times, most recently from d21547a to 803d544 Compare April 6, 2023 11:23

davenger force-pushed the reduce_mem_in_mutation_with_subquery branch 4 times, most recently from 9c6f022 to d911b8c Compare April 11, 2023 21:42

davenger added 5 commits April 14, 2023 16:12

More detailed test

6854dd0

Test with multiple mutations

9e93ddc

Caches for multiple mutations

a7b0558

Test case with different types of columns

3260341

Test with IN StorageSet

1454e84

davenger force-pushed the reduce_mem_in_mutation_with_subquery branch 2 times, most recently from 8c8cd26 to 279ac65 Compare April 14, 2023 14:45

davenger changed the title ~~[WIP]Reduce memory consumption by mutations with big subqueries used with IN~~ Reduce memory consumption by mutations with big subqueries used with IN Apr 14, 2023

davenger added the pr-improvement label and removed the pr-not-for-changelog and force tests labels Apr 14, 2023

davenger marked this pull request as ready for review April 14, 2023 15:17

davenger added 2 commits April 14, 2023 20:07

Cleanups

018f768

Disable long test in debug

034dce5

davenger force-pushed the reduce_mem_in_mutation_with_subquery branch from 279ac65 to 034dce5 Compare April 14, 2023 18:08

vdimir reviewed Apr 17, 2023


vdimir approved these changes Apr 17, 2023


Clarify setting description

fc4fd3e

Co-authored-by: Vladimir C <[email protected]>

davenger force-pushed the reduce_mem_in_mutation_with_subquery branch from b9fc9f1 to fc4fd3e Compare April 17, 2023 16:24

davenger merged commit ba5ca15 into master Apr 17, 2023

davenger deleted the reduce_mem_in_mutation_with_subquery branch April 17, 2023 21:55

davenger mentioned this pull request Apr 26, 2023

Store const Set-s in PreparedSets #46842

Closed

1 task

evillique mentioned this pull request Apr 28, 2023

Abort in InterpreterSelectQuery::executeSubqueriesInSetsAndJoins due to invalid std::promise #49312

Closed

tavplubix mentioned this pull request May 17, 2023

Provide better partitions hint for merge selecting task #49637

Merged

baibaichen mentioned this pull request May 22, 2023

[GLUTEN-1632][CH]Update Clickhouse Version (20230517) apache/gluten#1692

Merged

azat mentioned this pull request Oct 8, 2023

Fix data-race in CreatingSetsTransform (on errors) due to throwing shared exception #55338

Merged


6 participants
