Skip to content

Conversation

@dashpole
Copy link
Contributor

@dashpole dashpole commented Mar 8, 2017

In our e2e and e2e_node tests, we had a number of different timeouts for deletion.
Recent changes to the way deletion works (#41644, #41456) have resulted in some timeouts in e2e tests. #42661 was the most recent fix for this.
Most of these tests are not meant to test pod deletion latency, but rather just to clean up pods after a test is finished.
For this reason, we should change all these tests to use a standard, fairly high timeout for deletion.

cc @vishh @Random-Liu

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 8, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Mar 8, 2017
@ncdc
Copy link
Member

ncdc commented Mar 8, 2017

cc @derekwaynecarr @sjenning

@dashpole
Copy link
Contributor Author

dashpole commented Mar 8, 2017

@k8s-bot unit test this

@dashpole
Copy link
Contributor Author

dashpole commented Mar 8, 2017

@k8s-bot kops aws e2e test this

@dashpole
Copy link
Contributor Author

dashpole commented Mar 8, 2017

/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Mar 8, 2017
@ncdc
Copy link
Member

ncdc commented Mar 8, 2017

Seems ok to me. @dchen1107 @yujuhong @vishh @kubernetes/sig-node-pr-reviews wdyt?

@smarterclayton
Copy link
Contributor

Not 100% clear - this is to ensure we get a stronger signal to debug flakes? or it's to make flakes go away? Or a flaw in the tests that they have timeouts with force deletion?

@ncdc
Copy link
Member

ncdc commented Mar 8, 2017

I thought it was to make flakes go away by waiting longer for pods to be deleted. If that's not the case, I would like to know the purpose.

@dashpole
Copy link
Contributor Author

dashpole commented Mar 8, 2017

@smarterclayton not a flaw in the tests. This change has two motivations:

  • Make flakes go away by increasing timeouts (for some tests, which currently have low timeouts)
  • Make timeouts consistent across tests.

@dashpole
Copy link
Contributor Author

dashpole commented Mar 8, 2017

Making timeouts consistent should prevent needing to update each timeout individually as I have in the past: #42661 #41157

@smarterclayton
Copy link
Contributor

Ok, I just wanted to make sure this wasn't "we have bugs in cleanup of pods when short termination deadlines are set".

/approve

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 8, 2017
@ncdc
Copy link
Member

ncdc commented Mar 8, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 8, 2017
@ncdc ncdc added the kind/bug Categorizes issue or PR as related to a bug. label Mar 8, 2017
@ncdc ncdc added this to the v1.6 milestone Mar 8, 2017
@yujuhong
Copy link
Contributor

yujuhong commented Mar 8, 2017

The idea SGTM; didn't review the PR.

@dashpole
Copy link
Contributor Author

dashpole commented Mar 8, 2017

@k8s-bot gce etcd3 e2e test this

@caesarxuchao
Copy link
Contributor

@dashpole could you let this pull also fix the tests reported in #42646 as well? Let me know if you prefer me sending a PR based on this one.

@dashpole
Copy link
Contributor Author

dashpole commented Mar 8, 2017

@caesarxuchao just saw this. Working on adding it now... If i dont make it before this merges then Ill just open another PR

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 8, 2017
@dashpole
Copy link
Contributor Author

dashpole commented Mar 8, 2017

@caesarxuchao @ncdc I updated this PR to include the timeouts in test/e2e/common/pods.go and test/e2e/generated_clientset.go. PTAL

@caesarxuchao
Copy link
Contributor

Changes in test/e2e/generated_clientset.go lgtm. Thanks.

@yujuhong
Copy link
Contributor

yujuhong commented Mar 9, 2017

See #42776. Pod deletion shouldn't normally take this long.

@dashpole
Copy link
Contributor Author

dashpole commented Mar 9, 2017

I agree that they should not normally take this long. Ideally, we should have separate tests for pod deletion latency. I plan to add these in the near future. However, I still think having uniform pod deletion timeouts is useful.

@dashpole
Copy link
Contributor Author

dashpole commented Mar 9, 2017

Assuming #42779 merges, which fixes deletion problems (#42685, #42776), this PR is no longer required for 1.6. It is still useful, but it is not urgent.

@yujuhong
Copy link
Contributor

yujuhong commented Mar 9, 2017

I agree that they should not normally take this long. Ideally, we should have separate tests for pod deletion latency. I plan to add these in the near future. However, I still think having uniform pod deletion timeouts is useful.

Agreed. Didn't mean to imply that this PR is no longer needed. I think it'd still be good to get this in for 1.6 as part of the test cleanup.

@ethernetdan ethernetdan added the kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. label Mar 9, 2017
@yujuhong yujuhong self-assigned this Mar 9, 2017
@yujuhong
Copy link
Contributor

yujuhong commented Mar 9, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 9, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

The following people have approved this PR: dashpole, ncdc, smarterclayton, yujuhong

Needs approval from an approver in each of these OWNERS Files:

We suggest the following people:
cc @liggitt
You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)

@k8s-github-robot k8s-github-robot merged commit 7c08e81 into kubernetes:master Mar 9, 2017
@dchen1107
Copy link
Member

@smarterclayton RE: #42734 (comment): you did smell a real issue here :-)

@dashpole dashpole deleted the deletion_timeout branch March 14, 2017 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants