Skip to content

Conversation

@9aman
Copy link
Contributor

@9aman 9aman commented Jan 13, 2025

Pauseless Ingestion Failure Resolution

Please refer to PR: #14741 for happy path. This PR aims to only cover the failure scenarios. Once the above one is merged a better diff covering only failures will be visible.

To view only diff covering failure scenarios, for the time being, refer to:

Summary

This PR aims to provide ways to resolve the failure scenarios that we can encounter during pauseless ingestion. The detailed list of failure scenarios can be found here: link along with the failure handling strategies: link

Following sequence diagrams summarizes the failure scenarios and the resolution.
Screenshot 2025-01-03 at 2 53 46 PM
Screenshot 2025-01-03 at 2 54 45 PM

Failure Scenarios & Resolution Approaches

Failures encountered during the commit protocol can be categorized into two types: recoverable and unrecoverable failures.

Recoverable failures are those in which at least one of the servers retains the segment on disk.

Unrecoverable failures occur when none of the servers have the segment on disk.

Recoverable Failures

Recoverable failures will be addressed through RealtimeSegmentValidationManager. This approach will handle scenarios such as upload failures and incomplete commit protocol executions.

The controller or server can run into issues in between any of the steps of the commit protocol as listed below:

Request Type: COMMIT_START

  1. Update the Segment ZK metadata for the committing segment (seg__0__0)
    • Change status to COMMITTING
    • Set endOffset
  2. Create Segment ZK metadata for the new segment (seg__0__1) with status IN_PROGRESS
  3. Update the Ideal State for the:
    • Committing segment (seg__0__0) to ONLINE
    • New/ Consuming segment (seg__0__1) to CONSUMING

Request Type: COMMIT_END_METADATA
4. Update Segment ZK metadata for the committing segment (seg__0__0):
- Change status to DONE.
- Update deepstore url.
- Any additional metadata.

The RealtimeSegmentValidationManager figures out which step of the commit protocol failed and how can it be fixed. This is very similar to how commit protocol failures were handled before with some minor changes.

Non-recoverable Failures

These failures require ingesting the segment again from upstream, followed by build, upload and ZK metadata update.

9aman and others added 15 commits January 2, 2025 16:57
1. Changing FSM
2. Changing the 3 steps performed during the commit protocol to update ZK and Ideal state
1. Changes in the commit protocol to start segment commit before the build
2. Changes in the BaseTableDataManager to ensure that the locally built segment is replaced by a downloaded one
   only when the CRC is present in the ZK Metadata
3. Changes in the download segment method to allow waited download in case of pauseless consumption
…segment commit end metadata call

Refactoing code for redability
… ingestion by moving it out of streamConfigMap
…auseless ingestion in RealtimeSegmentValidationManager
…d by RealtimeSegmentValitdationManager to fix commit protocol failures
@KKcorps KKcorps requested a review from Jackie-Jiang January 14, 2025 01:59
@Jackie-Jiang Jackie-Jiang added feature release-notes Referenced by PRs that need attention when compiling the next release notes ingestion real-time labels Jan 14, 2025
return;
TableConfig tableConfig = indexLoadingConfig.getTableConfig();
// For pauseless tables, we should replace the segment if download url is missing even if crc is same
if (!PauselessConsumptionUtils.isPauselessEnabled(tableConfig)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also LOG that since pauseless is enabled we are not going directly for segment download but waiting for catchup

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KKcorps I don't understand why this is different for pauseless. I see you have added these along with reingestion tests.

private static final ConcurrentHashMap<String, AtomicBoolean> SEGMENT_INGESTION_MAP = new ConcurrentHashMap<>();

// Semaphore to enforce global concurrency limit
private static final Semaphore REINGESTION_SEMAPHORE = new Semaphore(MAX_PARALLEL_REINGESTIONS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this count configurable

@KKcorps KKcorps changed the title Resolve failures pauseless ingestion Pauseless Ingestion #2: Handle Failure scenarios without DR Jan 21, 2025
@9aman 9aman force-pushed the resolve-failures-pauseless-ingestion branch from 5782928 to aee514c Compare January 23, 2025 08:33
* SimpleSegmentMetadata deserialized = objectMapper.readValue(json, SimpleSegmentMetadata.class);
*/
@JsonIgnoreProperties(ignoreUnknown = true)
public class SimpleSegmentMetadata {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jackie-Jiang have added this to simplify ser/deser of SegmentZkMetadata. Added few getters for easy access from simpleFields for important keys like crc.

ImmutableSegmentDataManager immutableSegmentDataManager = (ImmutableSegmentDataManager) segmentDataManager;
SegmentMetadataImpl segmentMetadata =
(SegmentMetadataImpl) immutableSegmentDataManager.getSegment().getSegmentMetadata();
SegmentZKMetadata segmentZKMetadata = getSegmentZKMetadata(segmentMetadata);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to reuse the methods in ZKMetadataUtils. You can move ZKMetadataUtils into pinot-common

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// 1. DONE for normal consumption
// 2. COMMITTING for pauseless consumption
Status statusPostSegmentMetadataUpdate =
PauselessConsumptionUtils.isPauselessEnabled(tableConfig) ? Status.COMMITTING : Status.DONE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check this upfront, instead of on each partition

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// Define upload methods in order of preference
List<UploadAttempt> uploadAttempts = Arrays.asList(
// Primary method
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the primary method failed, we should abort the committing segment fix because fallbacks won't be able to handle that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been added for backward compatibility.
I initially relied only on the primary function but that introduced Backward compatibility issues.

if (segmentZKMetadata.getStatus() == Status.COMMITTING) {
LOGGER.info("Updating additional metadata in ZK for segment {} as pauseless is enabled",
segmentZKMetadata.getSegmentName());
segmentZKMetadata.setStartTime(uploadedSegmentZKMetadata.getStartTimeMs());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we simply put all simple fields from the uploaded ZK metadata into current?

Copy link
Contributor Author

@9aman 9aman Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have update the function to copy all the simpleFields. The SegementZkMetadata returned by the uploadLLSegmentToDeepStore now also the previously missing fields like creationTime and numReplicas

@KKcorps KKcorps force-pushed the resolve-failures-pauseless-ingestion branch from c0bb465 to 3b4c9de Compare January 28, 2025 02:47
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new added tests are quite slow. PauselessRealtimeIngestionCommitEndMetadataFailureTest itself is taking more than 5 minutes, and we are adding 4 similar tests. Are we able to merge them into one and reduce the overall testing time?

…can only be updated after a fixed time has elapsed

Reduced the time requirements by creating a FakePauselessLLCRealtimeSegmentManager.
@9aman
Copy link
Contributor Author

9aman commented Jan 29, 2025

The new added tests are quite slow. PauselessRealtimeIngestionCommitEndMetadataFailureTest itself is taking more than 5 minutes, and we are adding 4 similar tests. Are we able to merge them into one and reduce the overall testing time?

@Jackie-Jiang I have updated the tests. Each test now runs in under a minute.

@KKcorps KKcorps merged commit 3525116 into apache:master Jan 29, 2025
21 checks passed
zeronerdzerogeekzerocool pushed a commit to zeronerdzerogeekzerocool/pinot that referenced this pull request Feb 20, 2025
@9aman 9aman mentioned this pull request Mar 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature ingestion real-time release-notes Referenced by PRs that need attention when compiling the next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants