-
Notifications
You must be signed in to change notification settings - Fork 2.2k
[Fix][Zeta] Fix operation thread leak when master switch #9464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes an operation thread leak when the master switch is triggered by updating how coordinator service calls are retried and by introducing a retryable exception. Key changes include:
- Adding a new test serializer hook and a test operation (ReturnRetryTimesOperation) to simulate retryable exceptions.
- Updating the coordinator service logic to throw a retryable exception with fixed maxRetry and retryPause values.
- Introducing a new SeaTunnelEngineRetryableException in the common module.
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| seatunnel-engine/seatunnel-engine-server/src/test/java/org/apache/seatunnel/engine/server/operation/TestSerializerHook.java | Added a test hook for Hazelcast’s data serialization. |
| seatunnel-engine/seatunnel-engine-server/src/test/java/org/apache/seatunnel/engine/server/operation/ReturnRetryTimesOperation.java | Introduced a test operation that throws a retryable exception. |
| seatunnel-engine/seatunnel-engine-server/src/test/java/org/apache/seatunnel/engine/server/CoordinatorServiceTest.java | Added a test to verify retry behavior for retryable exceptions. |
| seatunnel-engine/seatunnel-engine-server/src/main/java/org/apache/seatunnel/engine/server/SeaTunnelServer.java | Modified retry logic by replacing property‐based configuration with fixed values. |
| seatunnel-engine/seatunnel-engine-common/src/main/java/org/apache/seatunnel/engine/common/exception/SeaTunnelEngineRetryableException.java | Added a new exception class for retryable errors. |
Comments suppressed due to low confidence (2)
seatunnel-engine/seatunnel-engine-server/src/test/java/org/apache/seatunnel/engine/server/CoordinatorServiceTest.java:110
- The test assertion expects the retry count to reach 250, which might become fragile if the static counter is not reset or if the new hardcoded retry values affect the overall retry behavior; consider making the expected value configurable or resetting the counter before the test.
.getMessage()
seatunnel-engine/seatunnel-engine-server/src/main/java/org/apache/seatunnel/engine/server/SeaTunnelServer.java:246
- The hardcoded retry values (maxRetry = 3 and retryPause = 500) replace the previous dynamic configuration; please document the rationale behind these specific choices to ensure clarity on how they address the thread leak issue.
int maxRetry = 3;
| public class ReturnRetryTimesOperation extends Operation | ||
| implements IdentifiedDataSerializable, AllowedDuringPassiveState { | ||
|
|
||
| private static final AtomicInteger retryTimes = new AtomicInteger(0); |
Copilot
AI
Jun 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a static AtomicInteger for tracking retry times may lead to cross-test state contamination; consider resetting or scoping it per test to ensure independent behavior.
| private static final AtomicInteger retryTimes = new AtomicInteger(0); | |
| private final AtomicInteger retryTimes = new AtomicInteger(0); |
Purpose of this pull request
Fix operation thread leak when master switch, the log will print lots of
Does this PR introduce any user-facing change?
no
How was this patch tested?
add new test.
Check list
New License Guide