Skip to content

[RFC] Fix memory tracking for OPTIMIZE TABLE/merges#18772

Merged
alexey-milovidov merged 3 commits intoClickHouse:masterfrom
azat:optimize-memory-tracking-fix
Jan 22, 2021
Merged

[RFC] Fix memory tracking for OPTIMIZE TABLE/merges#18772
alexey-milovidov merged 3 commits intoClickHouse:masterfrom
azat:optimize-memory-tracking-fix

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Jan 5, 2021

Changelog category (leave one):

  • Backward Incompatible Change

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

  • Fix memory tracking for OPTIMIZE TABLE/merges
  • Account query memory limits and sampling for OPTIMIZE TABLE/merges

Detailed description / Documentation draft:
Because of BlockerInThread in
MergeTreeDataPartWriterOnDisk::calculateAndSerializePrimaryIndex memory
was accounted incorrectly and grows constantly.

And IIUC there is no need in that blocker, since INSERT SELECT shares
the same thread group.

Plus take memory limits and sampling into account (and because of this marked as backward incompatible since now OPTIMIZE may fail if the query has some memory limits)

@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label Jan 5, 2021
@azat azat marked this pull request as draft January 5, 2021 22:31
@robot-clickhouse robot-clickhouse added pr-backward-incompatible Pull request with backwards incompatible changes and removed pr-improvement Pull request with some product improvements labels Jan 5, 2021
@azat azat changed the title Fix memory tracking for OPTIMIZE TABLE queries Fix memory tracking for OPTIMIZE TABLE/merges Jan 5, 2021
@azat azat changed the title Fix memory tracking for OPTIMIZE TABLE/merges [RFC] Fix memory tracking for OPTIMIZE TABLE/merges Jan 5, 2021
@azat azat marked this pull request as ready for review January 5, 2021 23:49
@alexey-milovidov alexey-milovidov self-assigned this Jan 9, 2021
@alexey-milovidov
Copy link
Copy Markdown
Member

There is an issue - you run long INSERT query (for example, insert a terabyte of data in single query) and you will get constantly growing memory usage for query, that is false, because the memory for indices is freed in background merges.

@alexey-milovidov alexey-milovidov added the st-discussion When the implementation aspects are not clear or when the PR is on hold due to questions. label Jan 9, 2021
@azat azat marked this pull request as draft January 9, 2021 07:52
@azat
Copy link
Copy Markdown
Member Author

azat commented Jan 9, 2021

There is an issue - you run long INSERT query (for example, insert a terabyte of data in single query) and you will get constantly growing memory usage for query, that is false, because the memory for indices is freed in background merges.

@alexey-milovidov are you sure that this is still relevant?

Those index_columns will be releated with IMergeTreeDataPartWriter::releaseIndexColumns() from the MergedBlockOutputStream::writeSuffixAndFinalizePart(), which is called by MergeTreeDataWriter::writeTempPart.
And also if MergeTreeData::MutableDataPartPtr were leaked then that blocker will not help.

Also I've just started one huge INSERT, like

CREATE TABLE test_memory                       
ENGINE = MergeTree                                                                                                                                                                             
ORDER BY (k, v) AS                                                                                                                                                                             SELECT                 
    number AS k,                                                                                                                                                                               
    reinterpretAsString(number) AS v                                                                                                                                                           FROM numbers(toUInt64(1000000000000.))

With the binary from this PR, and I don't see excessive memory usage reported by MemoryTracker (right now 3e9 rows were inserted already, and memory usage is ~70MB, same as w/o this PR).

@alexey-milovidov
Copy link
Copy Markdown
Member

How is it possible if:

  • data parts are added to the working set (with index) in context of INSERT query;
  • data parts are removed from the working set in context of background merge?

azat added 3 commits January 12, 2021 00:49
Because of BlockerInThread in
MergeTreeDataPartWriterOnDisk::calculateAndSerializePrimaryIndex memory
was accounted incorrectly and grows constantly.

And IIUC there is no need in that blocker, since INSERT SELECT shares
the same thread group.
But reduce scope, to avoid leaking too much memory, since there are old
values in last_block_index_columns.

The scope of the MemoryTracker::BlockerInThread has been increased in ClickHouse#8290
@azat azat force-pushed the optimize-memory-tracking-fix branch from 93fac66 to 0560868 Compare January 11, 2021 22:02
@azat
Copy link
Copy Markdown
Member Author

azat commented Jan 11, 2021

How is it possible if:

data parts are added to the working set (with index) in context of INSERT query;
data parts are removed from the working set in context of background merge?

You are right, and blocker cannot be removed here.

However the scope should be decreased, because otherwise deallocating of old columns from last_block_index_columns will leak (this has been introduced in #8290, cc @CurtizJ )

Here is an example of OPTIMIZE TABLE that "eats too much memory" w/o this patches:

2021.01.12 01:07:12.606957 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Trace> MergeTreeSequentialSource: Reading 201976 marks from part 27_1_1_21, total 827273557 rows starting from the beginning of the part
2021.01.12 01:07:24.254478 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage (total): 1.01 GiB.
2021.01.12 01:07:24.267584 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage: 1.00 GiB.
2021.01.12 01:07:30.000157 [ 29606 ] {} <Debug> AsynchronousMetrics: MemoryTracking: was 1.31 GiB, peak 1.31 GiB, will set to 634.98 MiB (RSS), difference: -702.49 MiB
2021.01.12 01:07:44.612275 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage: 2.00 GiB.
2021.01.12 01:07:48.381828 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage (total): 2.00 GiB.
2021.01.12 01:07:49.609813 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf98*98855d8a} <Debug> MemoryTracker: Current memory usage: 3.00 GiB.
2021.01.12 01:07:52.051876 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage (total): 3.00 GiB.
2021.01.12 01:07:53.212005 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage: 4.00 GiB.
2021.01.12 01:07:55.816859 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage (total): 4.00 GiB.
2021.01.12 01:07:57.281031 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage: 5.00 GiB.
2021.01.12 01:07:59.745307 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage (total): 5.00 GiB.
2021.01.12 01:08:01.004794 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage: 6.00 GiB.
2021.01.12 01:08:03.524911 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage (total): 6.00 GiB.
2021.01.12 01:08:04.757638 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage: 7.00 GiB.
2021.01.12 01:08:07.623491 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage (total): 7.00 GiB.
2021.01.12 01:08:08.698261 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage: 8.00 GiB.
2021.01.12 01:08:14.263123 [ 29480 ] {b9cc8aa8-ec48-47fc-aa64-bf9898855d8a} <Debug> MemoryTracker: Current memory usage (total): 8.00 GiB.

While process RSS does not grows:

$ pgrep -f clickhouse.*server$ | xargs ps u                                                                         
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND                                                                 
azat     29353 77.9  1.0 6236800 669548 pts/13 Sl+  01:07   0:42 /src/ch/tmp/upstream/clickhouse server 

And also you can see that AsynchronousMetrics adjusts the memory.

P.S. Maybe these two patches should be splitted into 2 separate PRs...

@azat azat marked this pull request as ready for review January 12, 2021 08:10
@alexey-milovidov
Copy link
Copy Markdown
Member

@azat I think we should make all such blockers to account memory for server and not account for query/user.

@azat
Copy link
Copy Markdown
Member Author

azat commented Jan 15, 2021

@azat I think we should make all such blockers to account memory for server and not account for query/user.

Let's try - #19146 (but let's not mix everything into this patchset)

@azat
Copy link
Copy Markdown
Member Author

azat commented Jan 21, 2021

@alexey-milovidov FWIW this is still relevant (and it does not requires any changes even after #19146, since that PR changes the default for the BlockerInThread)

@alexey-milovidov alexey-milovidov merged commit 4e11e7c into ClickHouse:master Jan 22, 2021
@azat azat deleted the optimize-memory-tracking-fix branch January 22, 2021 18:12
alesapin pushed a commit that referenced this pull request Feb 5, 2021
[RFC] Fix memory tracking for OPTIMIZE TABLE/merges

(cherry picked from commit 4e11e7c)
alesapin added a commit that referenced this pull request Feb 5, 2021
alesapin pushed a commit that referenced this pull request Feb 5, 2021
[RFC] Fix memory tracking for OPTIMIZE TABLE/merges

(cherry picked from commit 4e11e7c)
alesapin pushed a commit that referenced this pull request Feb 5, 2021
[RFC] Fix memory tracking for OPTIMIZE TABLE/merges

(cherry picked from commit 4e11e7c)
alesapin added a commit that referenced this pull request Feb 5, 2021
@Zhile
Copy link
Copy Markdown

Zhile commented Mar 19, 2021

@azat Does this also happens in 20.12? If yes, can we also cherry-pick the fix to 20.12?

@azat
Copy link
Copy Markdown
Member Author

azat commented Mar 19, 2021

@azat Does this also happens in 20.12? If yes, can we also cherry-pick the fix to 20.12?

@Zhile yes, it has been backported to 20.12 already (since v20.12.8.5-stable):

@Zhile
Copy link
Copy Markdown

Zhile commented Mar 19, 2021

Many thanks for the info. @azat

@Zhile
Copy link
Copy Markdown

Zhile commented Mar 19, 2021

Hi, @azat

I found the fix is not in 20.12 branch. Can you help to confirm?
https://github.com/ClickHouse/ClickHouse/blob/20.12/src/Storages/MergeTree/MergeList.cpp

Does the new version of 20.12 come from 20.12 branch?

BTW, do you know how to generate a new release for 20.12?

@azat
Copy link
Copy Markdown
Member Author

azat commented Mar 19, 2021

I found the fix is not in 20.12 branch. Can you help to confirm?
https://github.com/ClickHouse/ClickHouse/blob/20.12/src/Storages/MergeTree/MergeList.cpp

Yes, you are right, by some reason it has been reverted in 83ee3cb
Maybe @alesapin can comment on this?

Does the new version of 20.12 come from 20.12 branch?

Yes.

@Zhile
Copy link
Copy Markdown

Zhile commented Mar 19, 2021

Is this safe to merge into 20.12? I see it's reverted because of backward incapability. @azat
If it's safe I think we need to merge to 20.12 too. @alesapin
Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backward-incompatible Pull request with backwards incompatible changes st-discussion When the implementation aspects are not clear or when the PR is on hold due to questions.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants