
Conversation

Contributor

@9aman 9aman commented Mar 17, 2025

Context

For pauseless tables, a segment is marked ONLINE before it is built. This increases the chances of data missing in the middle, e.g.:

seg 0: Built and stored successfully
seg 1: Ingested, but the build failed
seg 2: Ingestion and build succeeded

The gap in the data created by the absence of seg 1 is filled by DR: #14920

For Dedup and Partial Upserts, seg 2 may contain incorrect data because it was processed without seg 1's updates.

Proposed Solution

This PR introduces a new API that enables deleting segments from pauseless-enabled tables. For each specified segment, it deletes that segment and all segments with higher sequence IDs in the same partition.

An optional force flag allows bypassing the pauseless and table-state checks.
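Roughly, the selection works per partition: the oldest requested sequence number defines a cut-off, and every segment in that partition at or above the cut-off is deleted. Below is a minimal sketch of that logic (the class, method, and parameter names are assumptions made for illustration, not the actual PinotSegmentRestletResource code):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.pinot.common.utils.LLCSegmentName;

// Illustrative sketch only; error handling and the force-flag checks are omitted.
public class PauselessDeletionSketch {

  public static List<String> selectSegmentsToDelete(List<String> requestedSegments,
      List<String> idealStateSegments) {
    // Track the oldest (lowest sequence number) requested segment per partition.
    Map<Integer, LLCSegmentName> partitionToOldestSegment = new HashMap<>();
    for (String segment : requestedSegments) {
      LLCSegmentName llcSegmentName = new LLCSegmentName(segment);
      partitionToOldestSegment.merge(llcSegmentName.getPartitionGroupId(), llcSegmentName,
          (current, candidate) ->
              candidate.getSequenceNumber() < current.getSequenceNumber() ? candidate : current);
    }

    // Every segment in the same partition whose sequence number is at or above the
    // oldest requested one is selected for deletion as well.
    List<String> segmentsToDelete = new ArrayList<>();
    for (String segmentName : idealStateSegments) {
      LLCSegmentName candidate = new LLCSegmentName(segmentName);
      LLCSegmentName oldest = partitionToOldestSegment.get(candidate.getPartitionGroupId());
      if (oldest != null && candidate.getSequenceNumber() >= oldest.getSequenceNumber()) {
        segmentsToDelete.add(segmentName);
      }
    }
    return segmentsToDelete;
  }
}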

Testing

  • Unit tests for the API endpoints


codecov-commenter commented Mar 18, 2025

Codecov Report

Attention: Patch coverage is 45.28302% with 29 lines in your changes missing coverage. Please review.

Project coverage is 63.19%. Comparing base (59551e4) to head (3fd8332).
Report is 1929 commits behind head on master.

Files with missing lines                                    Patch %   Lines
...ler/api/resources/PinotSegmentRestletResource.java       45.28%    26 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15299      +/-   ##
============================================
+ Coverage     61.75%   63.19%   +1.43%     
- Complexity      207     1375    +1168     
============================================
  Files          2436     2812     +376     
  Lines        133233   158931   +25698     
  Branches      20636    24339    +3703     
============================================
+ Hits          82274   100430   +18156     
- Misses        44911    50910    +5999     
- Partials       6048     7591    +1543     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.15% <45.28%> (+1.44%) ⬆️
java-21 63.18% <45.28%> (+1.55%) ⬆️
skip-bytebuffers-false 63.18% <45.28%> (+1.44%) ⬆️
skip-bytebuffers-true 63.15% <45.28%> (+35.42%) ⬆️
temurin 63.19% <45.28%> (+1.43%) ⬆️
unittests 63.18% <45.28%> (+1.43%) ⬆️
unittests1 56.21% <ø> (+9.32%) ⬆️
unittests2 33.82% <45.28%> (+6.09%) ⬆️

Flags with carried forward coverage won't be shown.


@KKcorps KKcorps changed the title Allows segments deletion in buld for pauseless tables Allows segments deletion in build for pauseless tables Mar 18, 2025
+ "The retention period controls how long deleted segments are retained before permanent removal. "
+ "It follows this precedence: input parameter → table config → cluster setting → 7d default. "
+ "Use 0d or -1d for immediate deletion without retention.")
public SuccessResponse deletePauselessSegments(
Contributor

API should be named better. Like reconcile pauseless table

Contributor Author

It's not reconciling, right?
It's more of a way to manually reset the table to a stable state. It requires too much manual intervention to call it reconciliation.
It basically deletes from the oldest segment for that partition onwards and does not care about the state of the table.

Contributor

How about pruneSegmentsFromSequence or truncatePartitionSegments
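For reference, the retention precedence spelled out in the API doc string above is a simple fallback chain. A minimal sketch (the helper and its parameters are assumptions for illustration, not the PR's actual code):

// Hypothetical helper illustrating the documented precedence:
// input parameter -> table config -> cluster setting -> 7d default.
private static String resolveRetentionPeriod(String requestParam, String tableConfigRetention,
    String clusterDefaultRetention) {
  if (requestParam != null && !requestParam.isEmpty()) {
    return requestParam;            // explicit request parameter wins
  }
  if (tableConfigRetention != null && !tableConfigRetention.isEmpty()) {
    return tableConfigRetention;    // then the table config
  }
  if (clusterDefaultRetention != null && !clusterDefaultRetention.isEmpty()) {
    return clusterDefaultRetention; // then the cluster-level setting
  }
  return "7d";                      // documented default
}

Per the doc string, passing 0d or -1d results in immediate deletion instead of keeping the segments for a retention window.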

Map<Integer, LLCSegmentName> partitionToOldestSegment = new HashMap<>();

for (String segment : segments) {
  LLCSegmentName llcSegmentName = new LLCSegmentName(segment);
Contributor

Let's add NPE check here

Contributor

Also a minor same table name check

Contributor Author

Let's add NPE check here

I have thrown an exception in case the segment name passed is invalid.

Contributor Author

Also a minor same table name check

Done. Thanks for pointing this out.
I have updated the UTs to reflect this.
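For reference, the two checks discussed in this thread could look roughly like the following sketch; it assumes the LLCSegmentName constructor rejects names that don't follow the LLC naming scheme, and the helper name and exception types are illustrative rather than the PR's exact code:

// Hypothetical validation helper reflecting the review feedback above.
private static LLCSegmentName validateSegment(String segment, String rawTableName) {
  LLCSegmentName llcSegmentName;
  try {
    // Assumed to throw for names that don't follow the LLC naming scheme.
    llcSegmentName = new LLCSegmentName(segment);
  } catch (IllegalArgumentException e) {
    throw new IllegalArgumentException("Invalid LLC segment name: " + segment, e);
  }
  // Same-table-name check suggested in the review: reject segments from other tables.
  if (!llcSegmentName.getTableName().equals(rawTableName)) {
    throw new IllegalArgumentException(
        "Segment " + segment + " does not belong to table: " + rawTableName);
  }
  return llcSegmentName;
}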

Contributor

@noob-se7en noob-se7en left a comment


Can we have some check in the Validation job to at least automate the alerting part?

Map<String, Map<String, String>> segmentsToInstanceState = idealState.getRecord().getMapFields();

for (String segmentName : segmentsToInstanceState.keySet()) {
  LLCSegmentName llcSegmentName = new LLCSegmentName(segmentName);
Contributor

Add a check here for when the segment name doesn't fit the LLC segment name scheme. It can happen if the segment was uploaded by some task or by the user manually.

We should skip that segment.

Contributor Author

I thought of throwing an exception in this case. Do you feel customers usually do this for realtime tables as well?

Contributor

Yeah, I have seen instances in our envs.

Contributor Author

Also, if it's an uploaded segment, how do we know the sequence ID for it?
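For the ideal-state loop above, the skip behaviour the reviewer suggests could look roughly like this (a minimal sketch, assuming uploaded segments simply fail LLC-name parsing; the exact handling in the PR may differ):

for (String segmentName : segmentsToInstanceState.keySet()) {
  LLCSegmentName llcSegmentName;
  try {
    llcSegmentName = new LLCSegmentName(segmentName);
  } catch (IllegalArgumentException e) {
    // Uploaded / externally created segments have no partition group or sequence number,
    // so they cannot take part in sequence-based deletion; skip them.
    continue;
  }
  // proceed with the partition / sequence-number comparison for this LLC segment
}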

Contributor

KKcorps commented Mar 20, 2025

Since we are deleting even the latest consuming segment, are we relying on RealtimeSegmentValidationManager to create new ones after deletion?

I feel like we should trigger it by default in this API.

Contributor Author

9aman commented Mar 27, 2025

Since we are deleting even the latest consuming segment, are we relying on RealtimeSegmentValidationManager to create new ones after deletion?

I feel like we should trigger it by default in this API.

I think these two operations should not be clubbed together. The rationale for this is that this is an API for DR and that should be a rare scenario.

The user can then trigger the RealtimeSegmentValidationManager.

@9aman 9aman force-pushed the delete_segment_from_specific_segments branch from aee5f5b to 3fd8332 Compare March 27, 2025 11:49
@KKcorps KKcorps merged commit eacd6c0 into apache:master Mar 27, 2025
22 checks passed