Skip to content

Allow to load marks in threadpool in advance#40821

Merged
kssenii merged 15 commits intoClickHouse:masterfrom
kssenii:improve-marks-cache-loading
Sep 13, 2022
Merged

Allow to load marks in threadpool in advance#40821
kssenii merged 15 commits intoClickHouse:masterfrom
kssenii:improve-marks-cache-loading

Conversation

@kssenii
Copy link
Copy Markdown
Member

@kssenii kssenii commented Aug 30, 2022

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Allow to load marks with threadpool in advance. Regulated by setting load_marks_asynchronously (default: 0).

@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-improvement Pull request with some product improvements label Aug 30, 2022
@tavplubix
Copy link
Copy Markdown
Member

Btw, we have a bug in marks loading: #40399

@kssenii kssenii force-pushed the improve-marks-cache-loading branch from 25a06cd to 8ba9d88 Compare September 1, 2022 17:28
@CurtizJ CurtizJ self-assigned this Sep 2, 2022
@kssenii kssenii force-pushed the improve-marks-cache-loading branch from dc63bfd to 48dc32f Compare September 4, 2022 17:54

MergeTreeMarksLoader::~MergeTreeMarksLoader()
{
if (future.valid())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting to check, at what moment in time, the future becomes invalid. Looks like some race condition is still possible.

@kssenii kssenii force-pushed the improve-marks-cache-loading branch from 9abb137 to 487bc0f Compare September 6, 2022 14:51
Comment on lines +230 to +232
/// thread status should be destructed before shared context because it relies on process list.
thread_status.reset();
shared_context.reset();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider just placing thread_status and shared_context fields in ClientBase in reverse order

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume they are already in the right order: shared_context in order of destruction goes after thread_status in ClientBase. Am I wrong?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yes, so they will be destructed in the right order without this change (but the comment is still useful, we can move it to ClientBase.h)

@kssenii
Copy link
Copy Markdown
Member Author

kssenii commented Sep 7, 2022

I merged branch from PR #41011 into this branch because they have common code (in threadPoolcCallbackRunner.cpp and LocalServer.cpp), so easier to merge them now, than later.

So need to wait before #41011 is merged before merging this PR.

@kssenii kssenii force-pushed the improve-marks-cache-loading branch 4 times, most recently from c851389 to 487bc0f Compare September 9, 2022 18:05
@kssenii kssenii force-pushed the improve-marks-cache-loading branch from 1a2bebb to b7d751b Compare September 11, 2022 11:37
@kssenii
Copy link
Copy Markdown
Member Author

kssenii commented Sep 12, 2022

Results look good for s3 and not good for local disk :/
So I'll disable this setting it by default (but will enable it as default in CI).

  • results after running benchmark over hits_100m_obfuscated with disk s3 and disabled cache:
  1. load_marks_asynchronously=0
ip-172-31-4-213.eu-west-1.compute.internal :) select event, value / 1000 / 1000  from system.events where event = 'WaitMarksLoadMicroseconds'

SELECT
    event,
    (value / 1000) / 1000
FROM system.events
WHERE event = 'WaitMarksLoadMicroseconds'

Query id: dae62540-d6e0-48c0-b305-afb68489583b

┌─event─────────────────────┬─divide(divide(value, 1000), 1000)─┐
│ WaitMarksLoadMicroseconds │                         92.359618 │
└───────────────────────────┴───────────────────────────────────┘

1 row in set. Elapsed: 0.005 sec.
  1. load_marks_asynchronously=1
ip-172-31-4-213.eu-west-1.compute.internal :) select event, value / 1000 / 1000  from system.events where event = 'WaitMarksLoadMicroseconds'

SELECT
    event,
    (value / 1000) / 1000
FROM system.events
WHERE event = 'WaitMarksLoadMicroseconds'

Query id: 1027f644-5563-4d6f-9ffb-f8dba17f08b9

┌─event─────────────────────┬─divide(divide(value, 1000), 1000)─┐
│ WaitMarksLoadMicroseconds │                         17.028783 │
└───────────────────────────┴───────────────────────────────────┘

1 row in set. Elapsed: 0.004 sec.
  • results after running benchmark over hits with local disk:
  1. load_marks_asynchronously=0
ip-172-31-4-213.eu-west-1.compute.internal :) select event, value / 1000 / 1000  from system.events where event = 'WaitMarksLoadMicroseconds'

SELECT
    event,
    (value / 1000) / 1000
FROM system.events
WHERE event = 'WaitMarksLoadMicroseconds'

Query id: 4146b46b-00bf-4728-9a91-8ed01b23be9d

┌─event─────────────────────┬─divide(divide(value, 1000), 1000)─┐
│ WaitMarksLoadMicroseconds │                          1.628356 │
└───────────────────────────┴───────────────────────────────────┘

1 row in set. Elapsed: 0.004 sec.
  1. load_marks_asynchronously=1
ip-172-31-4-213.eu-west-1.compute.internal :) select event, value / 1000 / 1000  from system.events where event = 'WaitMarksLoadMicroseconds'

SELECT
    event,
    (value / 1000) / 1000
FROM system.events
WHERE event = 'WaitMarksLoadMicroseconds'

Query id: e59ebb10-f003-4c99-ac3d-3483340b03a7

┌─event─────────────────────┬─divide(divide(value, 1000), 1000)─┐
│ WaitMarksLoadMicroseconds │                4.8989080000000005 │
└───────────────────────────┴───────────────────────────────────┘

1 row in set. Elapsed: 0.130 sec.

@kssenii
Copy link
Copy Markdown
Member Author

kssenii commented Sep 13, 2022

Integration tests (tsan) [3/4] — fail: 1

test_dns_cache/test.py::test_host_is_drop_from_cache_after_consecutive_failures - flaky in master.

Stateless tests (debug, s3 storage) [1/3] — fail: 1

00632_get_sample_block_cache - flaky in master

Stateless tests (release, s3 storage) — fail: 1

01825_type_json_ghdata_insert_select - DB::Exception: Message: The specified key does not exist. - flaky in master.

@kssenii kssenii merged commit b2c9c04 into ClickHouse:master Sep 13, 2022
@tavplubix
Copy link
Copy Markdown
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants