@bibibox bibibox commented Apr 23, 2024

The current errTask handling has the following problems:

  1. It uses the taskInfo pointer as the key, so the same task may end up with different keys and be handled separately.
  2. When handling items in errTask, if the pod is not found and cleaning the task from the cache fails, the task is added to errTask again, which causes an endless loop.
  3. The default controller rate-limiting queue QPS is only 10. Since pods whose PVC is not found are currently added to errTask, this QPS may be too small.

This PR fixes these problems.
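
To illustrate the first problem, here is a minimal Go sketch of why a pointer makes a poor deduplication key. It is not the Volcano code: `TaskInfo` is a simplified stand-in, and `taskKey`, `distinctByPointer`, and `distinctByKey` are hypothetical helpers introduced only for this example.

```go
package main

import "fmt"

// TaskInfo is a simplified, hypothetical stand-in for the scheduler's task type.
type TaskInfo struct {
	Namespace string
	Name      string
}

// taskKey builds a stable "namespace/name" key (hypothetical helper).
func taskKey(t *TaskInfo) string {
	return t.Namespace + "/" + t.Name
}

// distinctByPointer counts the entries a set keyed by pointer would hold.
func distinctByPointer(tasks []*TaskInfo) int {
	seen := make(map[*TaskInfo]struct{})
	for _, t := range tasks {
		seen[t] = struct{}{}
	}
	return len(seen)
}

// distinctByKey counts the entries a set keyed by the string key would hold.
func distinctByKey(tasks []*TaskInfo) int {
	seen := make(map[string]struct{})
	for _, t := range tasks {
		seen[taskKey(t)] = struct{}{}
	}
	return len(seen)
}

func main() {
	// Two separately allocated copies of the same logical task.
	a := &TaskInfo{Namespace: "default", Name: "pod-0"}
	b := &TaskInfo{Namespace: "default", Name: "pod-0"}
	tasks := []*TaskInfo{a, b}

	fmt.Println("distinct by pointer:", distinctByPointer(tasks)) // 2
	fmt.Println("distinct by key:", distinctByKey(tasks))         // 1
}
```

With pointer keys, the two copies are tracked as two unrelated entries. With a string key they collapse into one, which is the behavior a retry queue needs in order to deduplicate.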

@volcano-sh-bot volcano-sh-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 23, 2024
@bibibox bibibox changed the title from "Release-1.9-dev" to "fix errTask channel memory leak" Apr 23, 2024
sc.resyncTask(task)
reSynced = true
} else {
klog.V(4).Infof("sync task <%s/%s> success", task.Namespace, task.Name)
Member

Successfully synced task xxx

Contributor Author

done

reSynced = true
} else {
klog.V(4).Infof("sync task <%s/%s> success", task.Namespace, task.Name)
sc.errTasks.Forget(obj)
Member

If CustomBindErrHandler executes successfully, we should also forget it.

Contributor Author

We already forget it when the task syncs successfully; we only need to resync the task when CustomBindErrHandler fails.

Member

Ok.
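
The forget-on-success, requeue-on-failure pattern discussed in this thread can be sketched as follows. This is a simplified illustration under stated assumptions, not the scheduler's implementation: `retryTracker`, `process`, and `errTransient` are hypothetical, and the tracker only counts failures per key. Its method names mirror those of client-go's rate-limiting work queue (`AddRateLimited`, `Forget`, `NumRequeues`), but no delay or rate limiting is modeled here.

```go
package main

import (
	"errors"
	"fmt"
)

// errTransient is a hypothetical error used to simulate a failed sync.
var errTransient = errors.New("transient failure")

// retryTracker is a hypothetical, much-simplified stand-in for a
// rate-limited retry queue: it only counts failures per key.
type retryTracker struct {
	failures map[string]int
}

func newRetryTracker() *retryTracker {
	return &retryTracker{failures: make(map[string]int)}
}

// AddRateLimited records one more failed attempt for key.
func (r *retryTracker) AddRateLimited(key string) { r.failures[key]++ }

// Forget clears the failure history for key.
func (r *retryTracker) Forget(key string) { delete(r.failures, key) }

// NumRequeues reports how many failures are currently recorded for key.
func (r *retryTracker) NumRequeues(key string) int { return r.failures[key] }

// process runs syncFn for key, forgetting the key on success and
// requeueing it on failure.
func process(r *retryTracker, key string, syncFn func(string) error) {
	if err := syncFn(key); err != nil {
		r.AddRateLimited(key)
		return
	}
	r.Forget(key)
}

func main() {
	r := newRetryTracker()
	attempts := 0
	// flaky fails twice, then succeeds.
	flaky := func(string) error {
		attempts++
		if attempts < 3 {
			return errTransient
		}
		return nil
	}
	for i := 0; i < 3; i++ {
		process(r, "default/pod-0", flaky)
		fmt.Println("recorded failures:", r.NumRequeues("default/pod-0"))
	}
}
```

Clearing the history on success matters because a queue that backs off per key would otherwise keep growing the delay for a task that has already recovered.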

} else {
jobErr = fmt.Errorf("failed to find Job <%v> for Task %v/%v",
pi.Job, pi.Namespace, pi.Name)
klog.Warningf("failed to find job <%v> for Task <%v/%v>", pi.Job, pi.Namespace, pi.Name)
Member

Upper case here.

Contributor Author

done

taskKey, ok := obj.(string)
if !ok {
klog.Errorf("failed to convert %v to *schedulingapi.TaskInfo", obj)
klog.Errorf("Failed to convert %v to string.", obj)
Member

The obj format is incorrect. Is it more reasonable to exclude Forget?

Contributor Author

done
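
For reference, the comma-ok type assertion used in the snippet above can be sketched in isolation. `keyFromItem` is a hypothetical helper, and how the caller treats a malformed item (for example, whether it calls Forget) is left out because the thread does not spell it out.

```go
package main

import "fmt"

// keyFromItem converts a queue item to its string key using the comma-ok
// form, so an item of the wrong type is reported instead of causing a panic.
func keyFromItem(obj interface{}) (string, error) {
	key, ok := obj.(string)
	if !ok {
		return "", fmt.Errorf("failed to convert %v (type %T) to string", obj, obj)
	}
	return key, nil
}

func main() {
	// One well-formed key and one item of an unexpected type.
	for _, item := range []interface{}{"default/pod-0", 42} {
		key, err := keyFromItem(item)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println("process:", key)
	}
}
```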

@Monokaix
Member

Should also merge to master.

Signed-off-by: Wenbo Zhang <[email protected]>
@wangyang0616
Member

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 24, 2024
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: william-wang

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 25, 2024
@volcano-sh-bot volcano-sh-bot merged commit ffd0588 into volcano-sh:release-1.9 Apr 25, 2024