Balanced reading from JBOD #16423
Merged
KochetovNicolai merged 3 commits into ClickHouse:master on Oct 29, 2020
amosbird commented Oct 29, 2020

/// Before processing next thread, change volume if possible.
/// Different threads will likely start reading from different volumes,
It's actually different disks
amosbird commented Oct 29, 2020

/// Group parts by volume name.
/// We try minimize the number of threads concurrently read from the same volume.
It's disk instead of volume.
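To make the intent of these two code comments concrete, here is a minimal sketch of disk-aware assignment. It is not the code from this PR: `Part` and `assignPartsByDisk` are hypothetical names, and whole parts are handed out one at a time, whereas a real read pool would split work at a finer granularity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a data part; not ClickHouse's real type.
struct Part
{
    std::string name;
    std::string disk; // name of the disk that stores this part
};

// Hand out parts so that consecutive threads start reading from different disks.
// Returns, for each thread, the parts it should read.
std::vector<std::vector<Part>> assignPartsByDisk(const std::vector<Part> & parts, std::size_t threads)
{
    assert(threads > 0);

    // Group parts by disk name.
    std::map<std::string, std::vector<Part>> parts_per_disk;
    for (const auto & part : parts)
        parts_per_disk[part.disk].push_back(part);

    std::vector<const std::vector<Part> *> queues;
    for (const auto & entry : parts_per_disk)
        queues.push_back(&entry.second);

    std::vector<std::size_t> next(queues.size(), 0); // read position inside each disk queue
    std::vector<std::vector<Part>> result(threads);
    std::size_t remaining = parts.size();
    std::size_t disk_idx = 0;
    std::size_t thread = 0;

    while (remaining > 0)
    {
        // Skip disks whose parts have all been handed out.
        while (next[disk_idx] == queues[disk_idx]->size())
            disk_idx = (disk_idx + 1) % queues.size();

        result[thread].push_back((*queues[disk_idx])[next[disk_idx]++]);
        --remaining;

        // Before processing the next thread, change disk if possible.
        thread = (thread + 1) % threads;
        disk_idx = (disk_idx + 1) % queues.size();
    }
    return result;
}
```

With two disks and two threads, each thread ends up reading from its own disk, which is the situation the comments describe.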
Fix comment
This was referenced Dec 18, 2020
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Better read task scheduling for JBOD architecture and `MergeTree` storage. New setting `read_backoff_min_concurrency` which serves as the lower limit to the number of reading threads.

Detailed description / Documentation draft:
Disk-aware read scheduling helps avoid tail latency issues when reading huge amounts of data from a JBOD array. I've observed a lot of read clustering: all threads concurrently read from one disk for 20 seconds, and then switch to another disk for the next 20 seconds.
I've tested it in a production environment with a 12-disk JBOD array, and the results are very promising.
The baseline takes 573.039 sec. With JBOD task split it takes 429.389 sec, and with random read task stealing it takes 185.612 sec.
It works well with the current read backoff mechanism.
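As an illustration of how `read_backoff_min_concurrency` acts as a floor, here is a small sketch. The helper name `threadsAfterBackoff` is hypothetical, and it assumes that one backoff step removes exactly one reading thread; the actual backoff logic may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of reading threads left after one backoff step.
// min_concurrency plays the role of read_backoff_min_concurrency.
std::size_t threadsAfterBackoff(std::size_t current_threads, std::size_t min_concurrency)
{
    assert(min_concurrency > 0);

    // Already at (or below) the floor: backoff must not remove more threads.
    if (current_threads <= min_concurrency)
        return current_threads;

    // Assumption: each backoff step stops a single thread.
    return current_threads - 1;
}
```

For example, with a floor of 4, a pool of 8 threads can shrink to 7, but a pool of 4 stays at 4.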
Update

Random stealing incurs a reader reinitialization cost, so we now use a different scheme. First we check whether any thread stopped by backoff can be resurrected. If not, we steal from the next thread. Thanks to the pre-balanced workloads, the distribution should be fairly uniform in general.
With this steal strategy, the runtime ranges from 105 to 135 sec.
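One possible reading of this two-step strategy, as a sketch. The function `chooseVictim` and its parameters are hypothetical, and it assumes that threads with an index at or above the current active thread count are the ones stopped by backoff.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical sketch: pick the thread whose tasks an idle thread should take.
//   remaining      - indices of threads that still have unread tasks (must be non-empty)
//   idle_thread    - index of the thread that has run out of work
//   active_threads - number of threads still running after backoff; by assumption,
//                    indices >= active_threads belong to threads stopped by backoff
std::size_t chooseVictim(const std::set<std::size_t> & remaining, std::size_t idle_thread, std::size_t active_threads)
{
    assert(!remaining.empty());

    // Step 1: prefer the leftover work of a thread that backoff has stopped.
    auto it = remaining.lower_bound(active_threads);
    if (it != remaining.end())
        return *it;

    // Step 2: otherwise take from the next thread after the idle one, wrapping around.
    it = remaining.upper_bound(idle_thread);
    if (it == remaining.end())
        it = remaining.begin();
    return *it;
}
```

Picking the next thread deterministically, rather than a random one, keeps the choice cheap and relies on the workloads already being balanced across disks.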