
Fix memory tracking for two-level GROUP BY when not all rows read from Aggregator (TCP)#11022

Merged
alexey-milovidov merged 2 commits into ClickHouse:master from azat:memory-tracking
May 21, 2020

Conversation

@azat
Member

@azat azat commented May 18, 2020

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix memory tracking for two-level GROUP BY when not all rows read from Aggregator (TCP)

Detailed description / Documentation draft:
Example of such cases:

  • SELECT GROUP BY LIMIT
  • SELECT GROUP BY with subsequent MEMORY_LIMIT_EXCEEDED error

And it has to be two-level aggregation: with single-level aggregation there is only one hash table, and it is cleared correctly. Only with a two-level GROUP BY will some of the hash tables never be cleared, since nobody consumes their rows.
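The broken case can be pictured with a toy model (names like MemoryTracker and Bucket are illustrative only, not ClickHouse's real C++ API): the aggregate states in buckets that the LIMIT never reads must still be destroyed while the query's memory tracker is attached, otherwise the tracker's accounting never returns to zero.

```python
# Toy model of the bug: a two-level GROUP BY keeps one hash table per
# bucket, and when the consumer stops early (e.g. LIMIT), the unread
# buckets' aggregate states must still be freed under the query's
# memory tracker. All names here are illustrative, not ClickHouse's API.

class MemoryTracker:
    def __init__(self):
        self.used = 0
    def alloc(self, n):
        self.used += n
    def free(self, n):
        self.used -= n

class Bucket:
    """One bucket of a two-level aggregation result."""
    def __init__(self, tracker, nbytes):
        self.tracker = tracker
        self.nbytes = nbytes
        tracker.alloc(nbytes)
    def destroy_states(self):
        # Idempotent: already-destroyed buckets free nothing.
        if self.nbytes:
            self.tracker.free(self.nbytes)
            self.nbytes = 0

def run_query_with_limit(num_buckets=8, consumed=3, bucket_bytes=1 << 20):
    tracker = MemoryTracker()
    buckets = [Bucket(tracker, bucket_bytes) for _ in range(num_buckets)]
    # LIMIT: only the first `consumed` buckets are read and destroyed
    # on the normal path.
    for b in buckets[:consumed]:
        b.destroy_states()
    # The fix: on query shutdown, destroy the remaining buckets while the
    # query's tracker is still attached, so tracked usage returns to zero
    # instead of leaving (num_buckets - consumed) buckets unaccounted.
    for b in buckets:
        b.destroy_states()
    return tracker.used

print(run_query_with_limit())  # -> 0 when cleanup runs under the tracker
```

Without the final cleanup loop, the tracker would still show `(num_buckets - consumed) * bucket_bytes` in use at query end, which is the mismatch visible in the logs below.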

Before this patch:

    09:39.015292 [ 3070801 ] {609a0610-e377-4132-9cf3-f49454cf3c96} <Information> executeQuery: Read 1000000 rows, 7.63 MiB in 0.707 sec., 1413826 rows/sec., 10.79 MiB/sec.
    09:39.015348 [ 3070801 ] {609a0610-e377-4132-9cf3-f49454cf3c96} <Debug> MemoryTracker: Peak memory usage (for query): 51.93 MiB.
    09:39.015942 [ 3070801 ] {} <Trace> Aggregator: Destroying aggregate states <-- **problem**
    09:39.017057 [ 3070801 ] {} <Trace> Aggregator: Destroying aggregate states <--
    09:39.017961 [ 3070801 ] {} <Debug> MemoryTracker: Peak memory usage (for query): 51.93 MiB.
    09:39.018029 [ 3070801 ] {} <Information> TCPHandler: Processed in 0.711 sec.

After this patch:

    16:24.544030 [ 3087333 ] {79da208a-b3c0-48d4-9943-c974a3cbb6ea} <Information> executeQuery: Read 1000000 rows, 7.63 MiB in 0.599 sec., 1670199 rows/sec., 12.74 MiB/sec.
    16:24.544084 [ 3087333 ] {79da208a-b3c0-48d4-9943-c974a3cbb6ea} <Debug> MemoryTracker: Peak memory usage (for query): 72.11 MiB.
    16:24.544398 [ 3087333 ] {79da208a-b3c0-48d4-9943-c974a3cbb6ea} <Trace> Aggregator: Destroying aggregate states
    16:24.545485 [ 3087333 ] {79da208a-b3c0-48d4-9943-c974a3cbb6ea} <Trace> Aggregator: Destroying aggregate states
    16:24.547053 [ 3087333 ] {} <Debug> MemoryTracker: Peak memory usage (for query): 72.11 MiB.
    16:24.547093 [ 3087333 ] {} <Information> TCPHandler: Processed in 0.603 sec.

@alexey-milovidov alexey-milovidov self-assigned this May 18, 2020
Member

@alexey-milovidov alexey-milovidov left a comment


Ok. It fixes the memory tracking issue.
But I don't understand how it fixes the following issues:

  • memory leak in the test for uniqState (or is there any memory leak?);
  • memory usage that is higher than usual, sustained at about 6x the query's memory usage but with no leak.

@alexey-milovidov alexey-milovidov added the pr-bugfix label (Pull request with bugfix, not backported by default) May 18, 2020
@azat
Member Author

azat commented May 18, 2020

memory leak in the test for uniqState (or is there any memory leak?);

Hm, I didn't look carefully at RSS/VIRT; indeed, this is a different issue (I will drop the "fixes" tag for #8995)

memory usage that is higher than usual and sustained in about 6x the memory usage of query but with no leak.

Is this about #10818?

Ok, I forgot to verify with multiple threads:

# pwd is hits_100m_obfuscated_v1
docker run --name ch --rm -v $PWD:/var/lib/clickhouse -it yandex/clickhouse-server:20.4 gosu clickhouse:clickhouse clickhouse-server --config=/etc/clickhouse-server/config.xml -- --max_server_memory_usage=$((6<<30))

clickhouse-benchmark --host 172.17.0.2 --max_threads 16 <<< "select WatchID from hits_100m_obfuscated group by WatchID with totals limit 10"

Up to this point it is ok.

But if I force more threads first:

clickhouse-client -q 'select count(*) from (select * from numbers_mt(1000000) settings max_threads=1000)'

clickhouse-benchmark --host 172.17.0.2 --max_threads 16 <<< "select WatchID from hits_100m_obfuscated group by WatchID with totals limit 10"

RSS starts growing; this indeed looks like a memory allocator issue, but in the long term lots of threads will be used by the aggregator anyway, so this issue will arise regardless, no?

P.S. Playing with jemalloc may help there, i.e. any of the following may help:

  • MALLOC_CONF=narenas:16
  • MALLOC_CONF=percpu_arena:percpu
  • MALLOC_CONF=background_thread:true
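The allocator hypothesis can be pictured with a toy model (this is not jemalloc itself; Arena, rss, and simulate are made-up names for illustration): each arena retains freed memory in its own cache, so the more arenas the threads spread across, the more freed-but-unreturned memory the process keeps resident. Limiting the arena count (narenas) or purging caches in the background would both shrink that retained pool.

```python
# Toy model (not jemalloc) of why RSS can grow with thread count:
# each arena keeps freed memory cached instead of returning it to the
# OS, so the more arenas threads touch, the more "freed" memory stays
# resident even though nothing is logically leaked.

class Arena:
    def __init__(self):
        self.live = 0     # bytes currently allocated from this arena
        self.cached = 0   # freed bytes retained by the arena

    def alloc(self, n):
        take = min(self.cached, n)  # reuse cached bytes first
        self.cached -= take
        self.live += n

    def free(self, n):
        self.live -= n
        self.cached += n            # retained, not returned to the OS

def rss(arenas):
    # Resident set of the toy process: live plus arena-cached bytes.
    return sum(a.live + a.cached for a in arenas)

def simulate(num_arenas, num_threads, per_thread=1 << 20):
    arenas = [Arena() for _ in range(num_arenas)]
    for t in range(num_threads):
        a = arenas[t % num_arenas]  # each thread pinned to one arena
        a.alloc(per_thread)
        a.free(per_thread)          # all memory is logically freed...
    return rss(arenas)              # ...but each touched arena keeps its cache

print(simulate(num_arenas=1, num_threads=1000))    # one arena: bounded retention
print(simulate(num_arenas=128, num_threads=1000))  # many arenas: much larger RSS
```

In this model the retained memory scales with the number of arenas touched, not with the number of threads, which matches the intuition behind trying narenas:16 and background_thread:true.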

P.S. I also tried #10956 and it helps with this issue: RSS stays around 6-8 GB

@azat azat force-pushed the memory-tracking branch from c6715e0 to b587120 Compare May 18, 2020 19:57
@azat
Member Author

azat commented May 18, 2020

force-pushed the azat:memory-tracking branch from c6715e0 to b587120 now

Dropped the patch that removed the leak_when_memory_limit_exceeded.sql test

@azat azat force-pushed the memory-tracking branch from dd5d4bf to 47965fb Compare May 19, 2020 09:58
@azat
Member Author

azat commented May 19, 2020

Functional stateless tests (ubsan) — fail: 1, passed: 1848, skipped: 6

2020-05-19 14:11:43 Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table system.trace_log doesn't exist.. 

This can be fixed with some hack, but it will go away anyway after #11029

@azat
Member Author

azat commented May 20, 2020

@alexey-milovidov

Functional stateless tests (ubsan) — fail: 1, passed: 1848, skipped: 6
Can be fixed with some hack, but it will go away anyway after #11029

I can rebase against master to make the tests green, if you are not working on this right now?

And one more question from the big comment above:

RSS starts growing, indeed looks like memory allocator issues, but in the long-term lots of threads will be used by the aggregator and this issue will arise anyway, no?

do you still think that making the thread pool more predictable (instead of a random thread getting a random job) is worth it? (taking into account that #10956 may help in some cases)

@azat
Member Author

azat commented May 20, 2020

@alexey-milovidov also, what do you think about enabling background_thread:true by default (or limiting the number of arenas? or even using the percpu arena, though there are some known performance issues with it)?

@alexey-milovidov
Member

@azat Let's test all three variants.
You can send three draft pull requests, we will wait for performance tests and look.
If memory usage is lower and there are no regressions in performance, I will strongly appreciate these changes.

@azat
Member Author

azat commented May 20, 2020

Let's test all three variants.

Agree

P.S. But I don't think that narenas:16 (or any other arena count) should be tested; the other two variants are enough.

You can send three draft pull requests, we will wait for performance tests and look.

Ok (although I don't like PR storm :) )

If memory usage will be lower

If I didn't mess up during testing again, then all three of them help

@azat
Member Author

azat commented May 20, 2020

And what about this PR?

I can rebase against master to make the tests green, if you are not working on this right now?

azat added 2 commits May 21, 2020 02:06
…m Aggregator (TCP)

- GROUP BY
- GROUP BY LIMIT
- GROUP BY MEMORY_LIMIT_EXCEEDED
@azat azat force-pushed the memory-tracking branch from 47965fb to 77bdb5f Compare May 20, 2020 23:07
@azat
Member Author

azat commented May 20, 2020

Rebased, previous HEAD was 47965fbf5cf2edbdb13c3fd31caceef0cb246ac5

@alexey-milovidov
Member

First thing for me is to fix the unrelated integration test failure... then proceed with the PRs.

@alexey-milovidov alexey-milovidov merged commit 38f36f6 into ClickHouse:master May 21, 2020
@azat azat deleted the memory-tracking branch May 21, 2020 18:26
azat added a commit to azat/ClickHouse that referenced this pull request Sep 10, 2022
While rewriting the test in ClickHouse#11119 it loses the LIMIT, which makes the test
useless; see details in ClickHouse#11022

Signed-off-by: Azat Khuzhin <[email protected]>
nickitat pushed a commit that referenced this pull request Sep 11, 2022
* tests: fix 01281_group_by_limit_memory_tracking flakiness

CI: https://s3.amazonaws.com/clickhouse-test-reports/41092/14d8f297d73f9f813c447474310fbfa0c5b18a0f/stateless_tests__debug__[2/3].html
Signed-off-by: Azat Khuzhin <[email protected]>

* tests: add missing bits into 01281_group_by_limit_memory_tracking

While rewriting the test in #11119 it loses the LIMIT, which makes the test
useless; see details in #11022

Signed-off-by: Azat Khuzhin <[email protected]>

Signed-off-by: Azat Khuzhin <[email protected]>

Labels

no-docs-needed, pr-bugfix (Pull request with bugfix, not backported by default)


3 participants