Skip to content

tests: limit memory usage for tests#76388

Closed
azat wants to merge 1 commit intoClickHouse:masterfrom
azat:tests/limit-memory
Closed

tests: limit memory usage for tests#76388
azat wants to merge 1 commit intoClickHouse:masterfrom
azat:tests/limit-memory

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Feb 18, 2025

Note: for now in draft, since likely there will be some issues

To avoid situation when the tests eats too much RAM and we spend dozen of time to figure out why MEMORY_LIMIT_EXCEEDED happens on the server.

Refs: #75966

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Feb 18, 2025

Workflow [PR], commit [05a9015]

@clickhouse-gh clickhouse-gh bot added the pr-ci label Feb 18, 2025
Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Can we also set up limits on the number of processes and opened files? (they can be set to a reasonably high value).

@alexey-milovidov alexey-milovidov self-assigned this Feb 18, 2025
@azat
Copy link
Copy Markdown
Member Author

azat commented Feb 19, 2025

I guess this make sense as well, though I would not expect them to be huge right now.

What I really think make sense is to implement them as cgroups, to limit the memory properly over limiting address space.

To avoid situation when the tests eats too much RAM and we spend dozen
of time to figure out why MEMORY_LIMIT_EXCEEDED happens on the server.

Refs: ClickHouse#75966
@azat azat force-pushed the tests/limit-memory branch from 887a843 to 05a9015 Compare February 19, 2025 10:01

def prlimit(command, limit):
"""Limit memory usage for each test to avoid MEMORY_LIMIT_EXCEEDED errors on server"""
return f"prlimit --as={limit} {command}"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is the --rss option.

Copy link
Copy Markdown
Member Author

@azat azat Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rudiment, here is a snippet from man page:

RLIMIT_RSS
Specifies the limit (in pages) of the process's resident set (the number of virtual pages resident in RAM). This limit only has effect in Linux 2.4.x, x < 30, and there only affects calls to [madvise](https://linux.die.net/man/2/madvise)(2) specifying MADV_WILLNEED

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok...

if batch_num and batch_total:
aux = f"--run-by-hash-total {batch_total} --run-by-hash-num {batch_num-1}"
statless_test_command = f"clickhouse-test --testname --shard --zookeeper --check-zookeeper-session --hung-check --print-time \
statless_test_command = f"clickhouse-test --memory-limit {20<<30} --testname --shard --zookeeper --check-zookeeper-session --hung-check --print-time \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to expose these settings, hardcoding is ok.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, due to the comment above, this is likely to not work with sanitizers, so let's keep it.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Apr 22, 2025

Dear @azat, this PR hasn't been updated for a while. Will you continue working on it? If not, please close it. Otherwise, ignore this message.

@azat azat closed this Jun 21, 2025
@azat azat mentioned this pull request Jun 21, 2025
1 task
azat added a commit to azat/ClickHouse that referenced this pull request Jun 22, 2025
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

Refs: ClickHouse#82036
Resubmit of: ClickHouse#76388
azat added a commit to azat/ClickHouse that referenced this pull request Jun 22, 2025
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036
azat added a commit to azat/ClickHouse that referenced this pull request Jun 22, 2025
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036
azat added a commit to azat/ClickHouse that referenced this pull request Jun 22, 2025
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036
azat added a commit to azat/ClickHouse that referenced this pull request Jun 22, 2025
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036
azat added a commit to azat/ClickHouse that referenced this pull request Jun 22, 2025
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036
azat added a commit to azat/ClickHouse that referenced this pull request Jun 23, 2025
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036
azat added a commit to azat/ClickHouse that referenced this pull request Jan 1, 2026
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036

v2: use cgroup v2
azat added a commit to azat/ClickHouse that referenced this pull request Jan 1, 2026
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036

v2: use cgroup v2
azat added a commit to azat/ClickHouse that referenced this pull request Jan 1, 2026
Jobs with sanitizers shows lots of OOM, last time (ClickHouse#76143) it happened
due to some test spawns tons of clients, and sanitizer binary takes
650MiB each in RAM

This is resubmit of ClickHouse#76388, which did not work because address space
limit, which is way different thing, especially for the binaries built
with sanitizers

Refs: ClickHouse#82036

v2: use cgroup v2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants