Audit storage: validate consistency of replica and shard location metadata #9628
Merged
…-helium/foundationdb into persisted-validate-data-consistency
…sisted-validate-data-consistency
Moved AuditUtils to fdbserver/
Throw/Send audit_storage_error when there is a data corruption. Added doAuditStorage() for resuming Audit.
CI reports posted: foundationdb-pr-clang-ide, foundationdb-pr, foundationdb-pr-clang, and foundationdb-pr-cluster-tests on Linux CentOS 7; foundationdb-pr-macos-m1 and foundationdb-pr-macos on macOS Ventura 13.x; Doxense CI Report for Windows 10.
liquid-helium approved these changes May 1, 2023
This was referenced May 4, 2023
kakaiu added a commit to kakaiu/foundationdb that referenced this pull request May 11, 2023
…adata (apple#9628)
* Implemented AuditUtils.actor.cpp Moved AuditUtils to fdbserver/
* Persist AuditStorageState.
* Passed persisted AuditStorageState test.
* Added audit_storage_error to indicate a corruption is caught. Throw/Send audit_storage_error when there is a data corruption. Added doAuditStorage() for resuming Audit.
* Load and resume AuditStorage when DD restarts.
* Generate audit id monotonically.
* Fixed minor issue AuditId/Type was not set.
* Adding getLatestAuditStates.
* Improved persisted errors and added AuditStorageCommand.actor.cpp for fdbcli.
* Added `audit_storage` fdbcli command.
* fmt.
* Fixed null shared_ptr issue.
* Improve audit data.
* Change DDAuditFailed to SevWarn.
* Sev.
* set SERVE_AUDIT_STORAGE_PARALLELISM to 1.
* Moved AuditUtils* to fdbclient/.
* Added getAuditStatus fdbcli command.
* Refactor audit storage fdb cli commands.
* Added auditStorage in sim.
* Cleanup.
* Resolved comments.
* Resolved comments.
* Added SystemData for metadata audit. Refactored audit workflow to make sure all sub-tasks are executed w/o early exit.
* Improvements.
* Persisted Failed state after too many retries.
* Added retryCount for resumeAuditStorage().
* resolving conflict.
* Resolved conflicts.
* allow-merged-to-run
* add timeout to audit client
* fmt
* validate replica
* add audit serverKey
* address comments and fmt
* fix audit_storage_exceeded_request_limit
* fix segfault in getLatestAuditStatesImpl
* fix bugs
* remove timeout from workload
* fix bugs
* audit local view of shard assignment
* fmt
* fix-stuck-issue-and-make-dd-audit-storage-self-retry
* fix timeout
* fix timeout
* fix bugs and cleanup
* fix nit
* change name state to coreState for audit metadata
* address comments
* code clean
* fmt
* setup debug
* cleanup
* clean up
* code cleanup
* code clean
* remove tmp file
* fmt
* trace portion of shards that of anonymous physical shard
* remove unnecessary actor cleanup
* do not give up when tr is too old
* address commits
* refactor
* clean
* fmt
* fix-command-help-text
* fix-auditstate-restore-and-enable-restore-to-metadata-audit
* address comments
* fmrt
* debug and improve efficient of resume audit
* small change
* fix audit cli
* bypass completed audit when dd restart
* fix auditStorageCommandActor
* make mismatch key range more visable
* address comments
* make local shard metadata check can make progress by retries
* address comments
* address comments
* partition location metadata validation by range and server
* unset MIN_TRACE_SEVERITY
* address comments and SS auto proceed until failed then notify dd
* persistNewAuditState should checkMoveKeysLock
* audit storage location metadata partitioned by range and move shard assignment history def to the end of SS structure
* code cleanup
* fix error message in metadata validation
* fix registerAuditsForShardAssignmentHistoryCollection input for local shard validation
* add comments to code and add guard to make sure the SS audit does not proceeds automatically for many times without being notified by DD --- to support audit cancellation later
* fix coalesceRangeList
* replace rangeOverlapping func with operator and use struct instead of complicated type for return value of getKeyServer/serverKey/shardInfo
* simplify shard assignment history
* shardAssignmentRecordRequests should be unorder_map
* address comments, make trackShardAssignment simple, make anyChildAuditFailed cover all audit children, keep only one audit actor run at a time on each SS
* only run validate shard info once at a time, other audit type does not have this limitation
---------
Co-authored-by: He Liu <[email protected]>
Co-authored-by: He Liu <[email protected]>
Co-authored-by: Zhe Wang <[email protected]>
Introduction
AuditStorage audits the system's data storage by checking data consistency. It is triggered when a client requests an audit or when a consistency check is required at the end of a simulation run.
When a client issues an auditStorage request, the cluster controller (CC) receives it first and forwards it to the data distributor (DD), which is responsible for processing audit requests.
DD checks for ongoing audits before processing a new audit request. If an ongoing audit has the same audit type and range as the new request, DD returns the audit ID of that audit. If an ongoing audit is unrelated (a different type or range), DD returns an error indicating that the system is busy, because currently only one ongoing audit is allowed at a time. If there is no ongoing audit, DD creates a new audit and persists its state.
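The admission rule described above can be sketched as follows. This is a minimal illustrative model in Python, not the FoundationDB implementation: `DataDistributorSketch`, `Audit`, `AuditBusyError`, and the string audit types are hypothetical names, and "persisting" is modeled as writing to an in-memory dict.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

KeyRange = Tuple[bytes, bytes]  # [begin, end), a simplification for this sketch


class AuditBusyError(Exception):
    """Hypothetical error: an unrelated audit is already in progress."""


@dataclass
class Audit:
    audit_id: int
    audit_type: str
    key_range: KeyRange


@dataclass
class DataDistributorSketch:
    """Toy model of DD's handling of a new audit request (assumed structure)."""
    ongoing: Optional[Audit] = None
    next_id: int = 1
    persisted: Dict[int, Audit] = field(default_factory=dict)

    def handle_request(self, audit_type: str, key_range: KeyRange) -> int:
        if self.ongoing is not None:
            same = (self.ongoing.audit_type, self.ongoing.key_range) == (audit_type, key_range)
            if same:
                # Same type and range: hand back the ID of the ongoing audit.
                return self.ongoing.audit_id
            # Unrelated ongoing audit: only one audit may run at a time.
            raise AuditBusyError("another audit is in progress")
        # No ongoing audit: create one and persist its state before replying.
        audit = Audit(self.next_id, audit_type, key_range)
        self.next_id += 1
        self.persisted[audit.audit_id] = audit
        self.ongoing = audit
        return audit.audit_id
```

In this sketch a repeated request for the same type and range is idempotent, which is what lets a caller safely re-send a request whose delivery is uncertain.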
AuditStorage requests are asynchronous. DD replies to CC immediately with the audit ID (of the new audit or of the matching ongoing one), without waiting for the audit to finish. If DD is unable to get and persist the result, it captures the failure and automatically retries the audit until one of three outcomes is reached: (1) the maximum number of retry attempts is exceeded, resulting in a "Failed" result; (2) the audit completes without any error, resulting in a "Complete" result; or (3) the audit completes with errors detected, resulting in an "Error" result.
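The three terminal outcomes can be illustrated with a small retry loop. This is a sketch under assumptions: `run_once`, `AuditPhase`, and the Python exception `AuditStorageError` (loosely mirroring the `audit_storage_error` mentioned in the commit list) are hypothetical names, and any other exception is treated as a retriable failure.

```python
from enum import Enum
from typing import Callable


class AuditPhase(Enum):
    COMPLETE = "Complete"
    ERROR = "Error"
    FAILED = "Failed"


class AuditStorageError(Exception):
    """Hypothetical stand-in: the audit ran and detected an inconsistency."""


def run_audit_with_retries(run_once: Callable[[], None], max_retries: int) -> AuditPhase:
    """Run one audit attempt at a time until a terminal outcome is reached."""
    retries = 0
    while True:
        try:
            run_once()
            return AuditPhase.COMPLETE      # finished, nothing wrong found
        except AuditStorageError:
            return AuditPhase.ERROR         # finished, inconsistency detected
        except Exception:
            retries += 1                    # could not get a result: retry
            if retries > max_retries:
                return AuditPhase.FAILED    # retry budget exhausted
```

Note that a detected inconsistency is a result, not a failure, so it ends the loop instead of triggering another attempt.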
In some cases, CC might not know whether the request has been delivered to DD, for example, when DD restarts after CC sends an audit request. In such cases, CC replies with "request_maybe_delivered" to the client. The client can then issue a new audit request if necessary.
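A client-side handler for this case might look like the sketch below. It assumes that re-sending is acceptable because, as described earlier, a request matching an ongoing audit's type and range yields that audit's ID. `send_request` and the Python exception `RequestMaybeDelivered` are hypothetical names used only for illustration.

```python
from typing import Callable


class RequestMaybeDelivered(Exception):
    """Hypothetical stand-in for the "request_maybe_delivered" reply."""


def trigger_audit(send_request: Callable[[], int], max_attempts: int = 3) -> int:
    """Send an audit request, re-issuing it when delivery is uncertain."""
    for _ in range(max_attempts):
        try:
            return send_request()  # returns the audit ID on success
        except RequestMaybeDelivered:
            continue               # delivery unknown: issue a new request
    raise RequestMaybeDelivered(f"no confirmed delivery after {max_attempts} attempts")
```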
The following design principles are followed in developing audit storage:
(1) An audit request generates an AuditStorage;
(2) An AuditStorage automatically retries on failure;
(3) No component of AuditStorage may block or kill the storage servers (SS), DD, or CC;
(4) Audit storage must be retriable, that is, able to make progress by retrying. A large audit is partitioned into tasks that are assigned to SSes. Each SS runs its assigned tasks until it completes all of them or one fails. Upon completing each task, the SS persists its progress. If a task fails, the SS notifies DD, and DD loads the progress made by the SS and resends the remaining tasks to the SS.
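Principle (4) amounts to computing, after a failure, which parts of the assigned range have no persisted progress yet. The sketch below shows one way to do that; `remaining_ranges` and the half-open `(begin, end)` byte-string ranges are assumptions made for illustration and do not reflect how the PR stores audit progress.

```python
from typing import List, Tuple

KeyRange = Tuple[bytes, bytes]  # half-open [begin, end)


def remaining_ranges(assigned: KeyRange, completed: List[KeyRange]) -> List[KeyRange]:
    """Return the parts of `assigned` not covered by any completed range."""
    begin, end = assigned
    gaps: List[KeyRange] = []
    cursor = begin  # everything before the cursor is known to be done
    for c_begin, c_end in sorted(completed):
        if c_end <= cursor or c_begin >= end:
            continue  # entirely behind the cursor or outside the assignment
        if c_begin > cursor:
            gaps.append((cursor, c_begin))  # unfinished gap before this range
        cursor = max(cursor, c_end)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))  # unfinished tail
    return gaps
```

With an assignment of `(b"a", b"z")` and persisted progress for `(b"a", b"f")` and `(b"k", b"p")`, only `(b"f", b"k")` and `(b"p", b"z")` would be resent.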
Current limitations:
(1) Testing storage servers (TSS) are not covered;
(2) If a bad assignment is written consistently to the metadata, it is not detected. For example, if DD assigns a removed SSID (or an invalid SSID, such as 0) to both KeyServer and ServerKey, the current implementation of AuditStorage cannot detect it.
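Limitation (2) follows from the nature of a cross-check between two views: it can only report disagreement. The toy model below represents KeyServer as a map from range to server IDs and ServerKey as a map from server ID to ranges. The data layout, the exact-match comparison of ranges, and the function name `find_mismatches` are all simplifying assumptions, not the PR's actual structures.

```python
from typing import Dict, List, Set, Tuple

KeyRange = Tuple[bytes, bytes]


def find_mismatches(
    key_servers: Dict[KeyRange, Set[str]],
    server_keys: Dict[str, Set[KeyRange]],
) -> List[Tuple[KeyRange, str]]:
    """Report (range, server) pairs present in one view but not the other."""
    mismatches: List[Tuple[KeyRange, str]] = []
    for rng, servers in key_servers.items():
        for server in servers:
            if rng not in server_keys.get(server, set()):
                mismatches.append((rng, server))
    for server, ranges in server_keys.items():
        for rng in ranges:
            if server not in key_servers.get(rng, set()):
                mismatches.append((rng, server))
    return sorted(mismatches)


# A removed server recorded in BOTH views passes the cross-check,
# because the two views agree with each other.
rng = (b"a", b"m")
consistent_but_bad = find_mismatches({rng: {"removed-ss"}}, {"removed-ss": {rng}})

# A server recorded in only one view is reported.
one_sided = find_mismatches({rng: {"ss1", "ss2"}}, {"ss1": {rng}})
```

Catching the first case would need a third source of truth, such as the list of live servers, which this kind of pairwise comparison does not consult.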
AuditStorageTest 100k:
20230429-063732-zhewang-b06297784516243e compressed=True data_size=32945656 duration=3028909 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:45:18 sanity=False started=100000 stopped=20230429-072250 submitted=20230429-063732 timeout=5400 username=zhewang
100k correctness with two unrelated failures:
20230429-063413-zhewang-422a958e38f9fa0b compressed=True data_size=32915465 duration=5142264 ended=100000 fail=2 fail_fast=10 max_runs=100000 pass=99998 priority=100 remaining=0 runtime=1:19:41 sanity=False started=100000 stopped=20230429-075354 submitted=20230429-063413 timeout=5400 username=zhewang
Code-Reviewer Section
The general pull request guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
release-branch or main if this is the youngest branch)