*: support read only and recovery when disk full #10264
Conversation
@tier-cap: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected; please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Welcome @tier-cap!
store_heartbeat_tick_interval is
Better to do the checks in the Callback. In the current implementation, we can't tell whether the split fails or is just still waiting to be executed.
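For illustration, a minimal self-contained sketch of the callback-driven pattern being suggested; the propose_split helper and Response type below are stand-ins, not the PR's actual test code:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Stand-in for the raftstore response; only the error flag matters for the test.
struct Response {
    has_error: bool,
}

// Stand-in for an API that executes a request asynchronously and reports the result
// through a callback, like the split request in the test.
fn propose_split(cb: Box<dyn FnOnce(Response) + Send>) {
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(10));
        cb(Response { has_error: true }); // pretend the split was rejected
    });
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Assert on the result delivered to the callback instead of sleeping and polling
    // afterwards, so a request that is still waiting is not mistaken for a failure.
    propose_split(Box::new(move |resp| {
        tx.send(resp).unwrap();
    }));
    let resp = rx.recv_timeout(Duration::from_secs(5)).unwrap();
    assert!(resp.has_error);
}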
done
Better to extract and test the error code.
done
Better to rename leader, follower_1 to peer_1, peer_2, ...
done
Why not call unwrap to simplify the code?
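For illustration only (this is not the code under review), the simplification being suggested is roughly:

// Illustrative only, not the code under review.
fn take_value(maybe_value: Option<u64>) -> u64 {
    // Before: an explicit match on a value the test expects to always be present.
    // match maybe_value {
    //     Some(v) => v,
    //     None => panic!("value must exist here"),
    // }
    // After: unwrap panics in the same situation and reads shorter.
    maybe_value.unwrap()
}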
done
Signed-off-by: tier-cap <[email protected]>
NingLin-P left a comment
Rest LGTM
Also, I found this PR doesn't cover snapshot generating and receiving; those could also consume disk space.
let msg_type = msg.get_message().get_msg_type();
let store_id = self.ctx.store_id();

if disk::disk_full_precheck(store_id) || disk::is_disk_full() {
Checking an atomic for every raft message and proposal may bring extra cost.
Maybe we can reduce the cost by checking the atomic only once for each batch of peers: set a flag like poller_ctx.is_disk_full at PollHandler::begin and then check poller_ctx.is_disk_full for each raft message and proposal instead.
This may add a little delay to detecting disk full, but I think the delay will be much shorter than the disk stats monitor interval (1s).
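A minimal sketch of the suggested pattern; the PollCtx type and method names below are illustrative, following the comment rather than TiKV's exact API:

use std::sync::atomic::{AtomicBool, Ordering};

// Simplified stand-in for the DISK_FULL atomic that the disk stats monitor updates.
static DISK_FULL: AtomicBool = AtomicBool::new(false);

// Hypothetical poll context; field and method names are illustrative.
struct PollCtx {
    is_disk_full: bool,
}

impl PollCtx {
    // Sample the shared atomic once, at the start of a poll batch (PollHandler::begin).
    fn begin(&mut self) {
        self.is_disk_full = DISK_FULL.load(Ordering::Acquire);
    }

    // Consulted for each raft message / proposal in the batch: no per-message atomic
    // load. At worst the flag lags by one batch, which is much shorter than the 1s
    // disk stats monitor interval.
    fn should_reject_write(&self) -> bool {
        self.is_disk_full
    }
}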
Also, I found this PR doesn't cover snapshot generating and receiving; those could also consume disk space.
Yes, currently these are also not limited in this PR: snapshots, running logs, RocksDB compaction and WAL, etc.
Maybe a good strategy is to solve the main problem first and then observe how it behaves online.
Checking an atomic for every raft message and proposal may bring extra cost.
Maybe we can reduce the cost by checking the atomic only once for each batch of peers: set a flag like poller_ctx.is_disk_full at PollHandler::begin and then check poller_ctx.is_disk_full for each raft message and proposal instead. This may add a little delay to detecting disk full, but I think the delay will be much shorter than the disk stats monitor interval (1s).
The impact of atomic variable access on performance should not be that big; we can add a performance test to observe it.
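For reference, one quick way to make that observation is a throwaway micro-benchmark of the atomic load itself; this is illustrative only and not part of the PR:

use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Instant;

static DISK_FULL: AtomicBool = AtomicBool::new(false);

fn main() {
    let iterations: u64 = 100_000_000;
    let mut hits: u64 = 0;
    let start = Instant::now();
    for _ in 0..iterations {
        // The load under discussion: one Ordering::Acquire read per raft message.
        if DISK_FULL.load(Ordering::Acquire) {
            hits += 1;
        }
    }
    // Print `hits` so the loop cannot be optimized away entirely.
    println!("{} loads took {:?} ({} hits)", iterations, start.elapsed(), hits);
}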
It's not just performance but also resource cost (like CPU); the impact may not be big enough to be noticeable, but I think it is an unnecessary cost.
Signed-off-by: tier-cap <[email protected]>
pub fn is_disk_full() -> bool {
    DISK_FULL.load(Ordering::Acquire)
}

pub fn disk_full_precheck(_store_id: u64) -> bool {
    fail_point!("disk_full_peer_1", _store_id == 1, |_| true);
    fail_point!("disk_full_peer_2", _store_id == 2, |_| true);
    false
}
Looks like disk_full_precheck is always used as disk_full_precheck || is_disk_full; how about combining the two into one, like:
pub fn is_disk_full() -> bool {
    fail_point!("disk_full_peer_1", _store_id == 1, |_| true);
    fail_point!("disk_full_peer_2", _store_id == 2, |_| true);
    DISK_FULL.load(Ordering::Acquire)
}
Maybe not. disk_full_precheck is only used for unit tests; if we merge the two, the disk stats worker will clean it up periodically.
And it would add more obscure logic: in other places where the precheck is not needed, you would still have to pass a store_id to the merged function.
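For context, the periodic reset mentioned above could look roughly like the following sketch; available_space_bytes and the reserved_bytes threshold are hypothetical stand-ins, not TiKV's actual worker code:

use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

static DISK_FULL: AtomicBool = AtomicBool::new(false);

// Hypothetical helper: free bytes left on the data disk (e.g. via statvfs in a real build).
fn available_space_bytes(_data_dir: &str) -> u64 {
    0
}

// Runs roughly once per second. The store below would overwrite any value a unit test
// injected into a merged is_disk_full(), which is why the test-only disk_full_precheck
// fail points are kept as a separate function.
fn disk_stats_monitor(data_dir: &str, reserved_bytes: u64) {
    loop {
        let full = available_space_bytes(data_dir) <= reserved_bytes;
        DISK_FULL.store(full, Ordering::Release);
        thread::sleep(Duration::from_secs(1));
    }
}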
NingLin-P left a comment
LGTM
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review. Reviewers can indicate their review by submitting an approval review.
/test
/run-integration-common-test
/merge
@hicqu: It seems you want to merge this PR; I will help you trigger all the tests: /run-all-tests. If you have any questions about the PR merge process, please refer to the pr process. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
This pull request has been accepted and is ready to merge. Commit hash: 3b8e470
/merge
@hicqu: It seems you want to merge this PR; I will help you trigger all the tests: /run-all-tests. If you have any questions about the PR merge process, please refer to the pr process. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
* [feature] support read only and recovery when disk full.
  1. When the disk is full, all business write traffic is rejected.
  2. No matter whether a minority, a majority, or all servers hit disk full, you can recover by adding machines or storage, and then serve normally.
  Signed-off-by: tier-cap <[email protected]>
* [feature] support read only and recovery when disk full. Adjust the implementation per reviews. Signed-off-by: tier-cap <[email protected]>
* [feature] support read only and recovery when disk full. Optimize one test case that was taking a long time. Signed-off-by: tier-cap <[email protected]>
* [feature] support read only and recovery when disk full. Change the unit test implementation per reviews. Signed-off-by: tier-cap <[email protected]>
* [feature] support read only and recovery when disk full. Change the unit test implementation per reviews. Signed-off-by: tier-cap <[email protected]>
* [feature] support read only and recovery when disk full. Remove the unused code and comment. Signed-off-by: tier-cap <[email protected]>
* Clean up tests and fix a bug about transfer leader. Signed-off-by: qupeng <[email protected]>
* A little fix. Signed-off-by: qupeng <[email protected]>
* Disk full recovery: change 3 details. 1. Config change allowed. 2. Campaign success log allowed. 3. Add the raft engine size stats. Signed-off-by: tier-cap <[email protected]>
* Fix one bug. Signed-off-by: tier-cap <[email protected]>
* Change details per review comments. Signed-off-by: tier-cap <[email protected]>
* Simplify some impls per comments. Signed-off-by: tier-cap <[email protected]>
* Change the atomic variable access mode per review comments. Signed-off-by: tier-cap <[email protected]>

Co-authored-by: tier-cap <[email protected]>
Co-authored-by: qupeng <[email protected]>
Signed-off-by: tiancaiamao <[email protected]>
What problem does this PR solve?
closes #10597
Solve the availability problem when the disk is full, no matter whether one, two, a majority, or all of the nodes are affected.
What is changed and how it works?
Check List
Test
Release note