Skip to content

Conversation

@dahai1996
Copy link
Contributor

Purpose of the pull request

when using task group for jobs,every workflow will rob taskGroup When it gets a "WAIT_TASK_GROUP" message from the queue.And the workflow will not go to check next message until the current message rob taskGroup successfully.
When every workflow is going to "rob taskGroup",but taskGroup has no resource.
it's get a deadlock problem.
I test my code on version 3.0.1 , and it works well.

Brief change log

move the failed message to the fail of queue.

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(or)

If your pull request contain incompatible change, you should also add it to docs/docs/en/guide/upgrede/incompatible.md

@ruanwenjun
Copy link
Member

ruanwenjun commented Dec 14, 2022

Good catch, this PR will change the event order, but not all event can work well after change the order.

Can we throw a exception to tell the looper this event need to put in the end?

@SbloodyS SbloodyS added first time contributor First-time contributor 3.0.x labels Dec 14, 2022
@SbloodyS SbloodyS added this to the 3.0.4 milestone Dec 14, 2022
@dahai1996
Copy link
Contributor Author

dahai1996 commented Dec 14, 2022

Good catch, this PR will change the event order, but not all event can work well after change the order.

Can we throw a exception to tell the looper this event need to put in the end?

yes,it will change the order, I thought about making a temporary list to save the failed event.
or we can make a new Exception for "rob taskGroup failed"

@codecov-commenter
Copy link

codecov-commenter commented Dec 14, 2022

Codecov Report

❌ Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.38%. Comparing base (ca5af01) to head (732663c).
⚠️ Report is 1168 commits behind head on dev.

Files with missing lines Patch % Lines
.../server/master/runner/WorkflowExecuteRunnable.java 0.00% 5 Missing ⚠️
...r/server/master/event/StateEventHandleFailure.java 0.00% 4 Missing ⚠️
...er/master/event/TaskWaitTaskGroupStateHandler.java 0.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##                dev   #13191      +/-   ##
============================================
- Coverage     39.39%   39.38%   -0.02%     
+ Complexity     4282     4280       -2     
============================================
  Files          1066     1067       +1     
  Lines         40482    40493      +11     
  Branches       4657     4662       +5     
============================================
  Hits          15947    15947              
- Misses        22746    22757      +11     
  Partials       1789     1789              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@dahai1996
Copy link
Contributor Author

Good catch, this PR will change the event order, but not all event can work well after change the order.

Can we throw a exception to tell the looper this event need to put in the end?

i do it by throwing a exception,pls check

@sonarqubecloud
Copy link

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

37.0% 37.0% Coverage
0.0% 0.0% Duplication

Copy link
Member

@ruanwenjun ruanwenjun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ruanwenjun ruanwenjun added the bug Something isn't working label Dec 16, 2022
Copy link
Contributor

@davidzollo davidzollo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
Good job

@davidzollo
Copy link
Contributor

Thanks for your first contribution, Looking forward to your deep participation ^-^

@davidzollo davidzollo merged commit 7a0a2c2 into apache:dev Dec 16, 2022
@davidzollo davidzollo changed the title Solve the deadlock problem caused by queuing [Fix]Solve the deadlock problem caused by queuing Dec 16, 2022
zhongjiajie pushed a commit that referenced this pull request Dec 28, 2022
* Solve the deadlock problem caused by queuing

* Solve the deadlock problem caused by queuing

* Solve the deadlock problem caused by queuing

* Solve the deadlock problem caused by queuing,move the event to the tail by throwing a exception

Co-authored-by: wfs <[email protected]>

(cherry picked from commit 7a0a2c2)
@zhuangchong zhuangchong modified the milestones: 3.0.4, 3.1.9 Sep 11, 2023
@zhuangchong zhuangchong added the 3.1.x for 3.1.x version label Sep 11, 2023
@zhuangchong zhuangchong added the release cherry-pick Mark this issue/PR had cherry-pick for release version label Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3.1.x for 3.1.x version backend bug Something isn't working first time contributor First-time contributor release cherry-pick Mark this issue/PR had cherry-pick for release version

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants