Fix pull-containerd-node-e2e failure on master #5095

Merged
dmcgowan merged 1 commit into containerd:master from dims:fix-pull-containerd-node-e2e-failure
Mar 5, 2021
Conversation

@dims
Member

@dims dims commented Feb 26, 2021

Signed-off-by: Davanum Srinivas [email protected]

@dims
Member Author

dims commented Feb 26, 2021

/test pull-containerd-node-e2e

1 similar comment
@dims
Member Author

dims commented Feb 26, 2021

/test pull-containerd-node-e2e

@dmcgowan
Member

Is the single failure an improvement over before still?

@dims
Member Author

dims commented Feb 26, 2021

The latest logs show the following:

Trying to find master named 'bootstrap-e2e-master'
Looking for address 'bootstrap-e2e-master-ip'
ERROR: (gcloud.compute.addresses.describe) Could not fetch resource:
 - The resource 'projects/cri-c8d-pr-node-e2e/regions/us-central1/addresses/bootstrap-e2e-master-ip' was not found

it's failing earlier now @dmcgowan

@adisky
Contributor

adisky commented Mar 3, 2021

/test pull-containerd-node-e2e

@dims dims force-pushed the fix-pull-containerd-node-e2e-failure branch 11 times, most recently from f567bbd to 7568d7b Compare March 3, 2021 20:23
@dims
Member Author

dims commented Mar 3, 2021

/test pull-containerd-node-e2e

@dims dims force-pushed the fix-pull-containerd-node-e2e-failure branch from 7568d7b to 5af99a2 Compare March 3, 2021 20:31
@dims
Member Author

dims commented Mar 3, 2021

/test pull-containerd-node-e2e

1 similar comment
@dims
Member Author

dims commented Mar 3, 2021

/test pull-containerd-node-e2e

@estesp
Member

estesp commented Mar 3, 2021

Seems we still have an issue that nothing is in the bucket?

Mar 03 21:17:06 tmp-node-e2e-d29df482-ubuntu-gke-1804-1-16-v20200330 configure.sh[1831]: + curl -f --ipv4 -Lo containerd.tar.gz --connect-timeout 20 --max-time 300 --retry 6 --retry-delay 10 https://storage.googleapis.com/cri-containerd-staging/containerd/45e1ead35f3386660973ce686998f4e1a91e5740/cri-containerd-cni-1.5.0-beta.2-14-gb84a8f4bc.linux-amd64.tar.gz
Mar 03 21:17:06 tmp-node-e2e-d29df482-ubuntu-gke-1804-1-16-v20200330 configure.sh[1831]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Mar 03 21:17:06 tmp-node-e2e-d29df482-ubuntu-gke-1804-1-16-v20200330 configure.sh[1831]:                                  Dload  Upload   Total   Spent    Left  Speed
Mar 03 21:17:06 tmp-node-e2e-d29df482-ubuntu-gke-1804-1-16-v20200330 configure.sh[1831]: 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Mar 03 21:17:06 tmp-node-e2e-d29df482-ubuntu-gke-1804-1-16-v20200330 configure.sh[1831]: curl: (22) The requested URL returned error: 404

Don't you have to run the pull-containerd-build prow job to upload that file to the bucket? That's what I found in a different PR, but this all looks like too much magic and I might be missing something :)
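
The 404 above can be reproduced without booting a test node. As a minimal sketch, assuming the bucket and path layout are exactly what the failing curl line shows, the two helpers below rebuild the staging URL and probe it with a HEAD request. The function names `staging_url` and `artifact_exists` are hypothetical and are not part of the containerd or test-infra scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only. The URL layout is copied from
# the failing curl line in the log above; nothing here comes from configure.sh.

# Print the staging URL for a given commit SHA and build version string.
staging_url() {
  local commit="$1" version="$2"
  printf 'https://storage.googleapis.com/cri-containerd-staging/containerd/%s/cri-containerd-cni-%s.linux-amd64.tar.gz\n' \
    "$commit" "$version"
}

# Return 0 if the object answers a HEAD request, non-zero otherwise
# (curl -f turns an HTTP error such as 404 into a failing exit status).
artifact_exists() {
  curl -fsI --ipv4 --connect-timeout 20 "$1" >/dev/null
}
```

For the run in the log, `artifact_exists "$(staging_url 45e1ead35f3386660973ce686998f4e1a91e5740 1.5.0-beta.2-14-gb84a8f4bc)" || echo "tarball not uploaded yet"` would be one way to confirm whether the build job has populated the bucket before retriggering the node e2e job.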

@estesp
Member

estesp commented Mar 3, 2021

/test all

@estesp
Member

estesp commented Mar 3, 2021

/test pull-containerd-build

@dims dims force-pushed the fix-pull-containerd-node-e2e-failure branch 2 times, most recently from 2990779 to 1044586 Compare March 3, 2021 21:46
@dims
Member Author

dims commented Mar 3, 2021

/test pull-containerd-node-e2e

@dims dims force-pushed the fix-pull-containerd-node-e2e-failure branch from 1044586 to 074ea8f Compare March 3, 2021 22:47
@dims
Member Author

dims commented Mar 3, 2021

/test pull-containerd-node-e2e

@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from k8s-ci-robot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
@containerd containerd deleted a comment from theopenlab-ci Bot Mar 4, 2021
Member

@estesp estesp left a comment

LGTM

@dims
Member Author

dims commented Mar 4, 2021

one more green run: https://prow.k8s.io/log?container=test&id=1367301014088060928&job=pull-containerd-node-e2e

Ran 200 of 338 Specs in 686.312 seconds
SUCCESS! -- 200 Passed | 0 Failed | 0 Pending | 138 Skipped


Ginkgo ran 1 suite in 11m28.685597405s
Test Suite Passed

Success Finished Test Suite on Host tmp-node-e2e-49ae82d2-cos-85-13310-1209-17

@estesp
Member

estesp commented Mar 4, 2021

/test pull-containerd-node-e2e

@estesp
Member

estesp commented Mar 4, 2021

@dims any idea why the test doesn't "shut down" after completion? https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/directory/pull-containerd-node-e2e/1367480599085846528

It seems the last several runs all pass, but then sit for 30-35 minutes until the 65-minute "timeout" timer strikes, and then fail with a timeout. :(

@dims
Member Author

dims commented Mar 4, 2021

@estesp yep. on my TODO list for today

@dims
Member Author

dims commented Mar 4, 2021

/test pull-containerd-node-e2e

1 similar comment
@dims
Member Author

dims commented Mar 4, 2021

/test pull-containerd-node-e2e

@dims
Member Author

dims commented Mar 5, 2021

/test pull-containerd-node-e2e

@k8s-ci-robot

k8s-ci-robot commented Mar 5, 2021

@dims: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-containerd-build | 5af99a292e62c4521129d5578c66ea60e48b8ead | link | /test pull-containerd-build |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@dims
Member Author

dims commented Mar 5, 2021

/test pull-containerd-node-e2e

@dims
Member Author

dims commented Mar 5, 2021

ok touchdown!

@mikebrow @dmcgowan @estesp This is ready!

Member

@dmcgowan dmcgowan left a comment

LGTM

@dmcgowan dmcgowan merged commit 8e20726 into containerd:master Mar 5, 2021
Member

@mikebrow mikebrow left a comment

LGTM
Thx @dims

@mikebrow
Member

mikebrow commented Mar 5, 2021

now .. where did our master node go.. are we changing master node to main or control node?

2021/03/05 04:25:04 process.go:155: Step 'go run /home/prow/go/src/k8s.io/kubernetes/test/e2e_node/runner/remote/run_remote.go --cleanup --logtostderr --vmodule=*=4 --ssh-env=gce --results-dir=/logs/artifacts --project=cri-c8d-pr-node-e2e --zone=us-central1-f --ssh-user=prow --ssh-key=/workspace/.ssh/google_compute_engine --ginkgo-flags=--nodes=8 --focus="\[NodeConformance\]|\[NodeFeature:.+\]" --skip="\[Flaky\]|\[Slow\]|\[Serial\]" --test_args=--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-process-name=/home/containerd/usr/local/bin/containerd --container-runtime-pid-file= --kubelet-flags="--cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/containerd.service" --extra-log="{\"name\": \"containerd.log\", \"journalctl\": [\"-u\", \"containerd\"]}" --test-timeout=1h5m0s --image-config-file=/home/prow/go/src/k8s.io/test-infra/jobs/e2e_node/containerd/containerd-master/image-config-pr.yaml -node-env=PULL_REFS=master:fa66f93c0c0c6400fbfc41592eeb89815a42fe88,5095:15a4df0ba9bc8a702e01398104f474b32eee3aa3' finished in 17m45.461677165s
2021/03/05 04:25:04 e2e.go:544: Dumping logs locally to: /logs/artifacts
2021/03/05 04:25:04 process.go:153: Running: /workspace/log-dump.sh /logs/artifacts
Checking for custom logdump instances, if any
Using gce provider, skipping check for LOG_DUMP_SSH_KEY and LOG_DUMP_SSH_USER
Project: cri-c8d-pr-node-e2e
Network Project: cri-c8d-pr-node-e2e
Zone: us-central1-f
Dumping logs from master locally to '/logs/artifacts'
Trying to find master named 'bootstrap-e2e-master'
Looking for address 'bootstrap-e2e-master-ip'
ERROR: (gcloud.compute.addresses.describe) Could not fetch resource:
 - The resource 'projects/cri-c8d-pr-node-e2e/regions/us-central1/addresses/bootstrap-e2e-master-ip' was not found

Could not detect Kubernetes master node.  Make sure you've launched a cluster with 'kube-up.sh'
Master not detected. Is the cluster up?
Dumping logs from nodes locally to '/logs/artifacts'
Detecting nodes in the cluster
No nodes found!
WARNING: The following filter keys were not present in any resource : name, zone
WARNING: The following filter keys were not present in any resource : name, zone
INSTANCE_GROUPS=
NODE_NAMES=
2021/03/05 04:25:09 process.go:155: Step '/workspace/log-dump.sh /logs/artifacts' finished in 5.408109272s
2021/03/05 04:25:09 node.go:53: Noop - Node Down()
2021/03/05 04:25:09 process.go:96: Saved XML output to /logs/artifacts/junit_runner.xml.
2021/03/05 04:25:09 process.go:153: Running: bash -c . hack/lib/version.sh && KUBE_ROOT=. kube::version::get_version_vars && echo "${KUBE_GIT_VERSION-}"
2021/03/05 04:25:10 process.go:155: Step 'bash -c . hack/lib/version.sh && KUBE_ROOT=. kube::version::get_version_vars && echo "${KUBE_GIT_VERSION-}"' finished in 215.128515ms

@dims
Member Author

dims commented Mar 5, 2021

there is no master node in node e2e tests :) node-e2e was hacked on top of an existing harness and the scripts reflect that bias.

@dims
Member Author

dims commented Mar 5, 2021

@mikebrow
Member

mikebrow commented Mar 5, 2021

> there is no master node in node e2e tests :) node-e2e was hacked on top of an existing harness and the scripts reflect that bias.

kk so that would be the e2e node bucket over here https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/e2e-tests.md not the node e2e bucket? nvm :)

6 participants