mattip (Contributor) commented Jul 22, 2020

xref issue gh-37855
xref PR gh-40827 (which had this exact change but was reverted)
xref PRs gh-37584 and gh-38796 (which ran into the inconsistency between the tags)

Before this PR, .circleci/docker/build_docker.sh uses CIRCLE_WORKFLOW_ID to tag images, but the check for whether to build an image in .circleci/verbatim-sources/job-specs/docker_jobs.yml uses DOCKER_TAG, which is calculated from git log --oneline --pretty='%H'. Somehow PRs gh-37584 and gh-38796 hit a situation where the pytorch_linux_xenial_py3_clang5_android_ndk_r19c image build failed, but an image with the DOCKER_TAG-formatted tag appeared on http://docker.pytorch.org/pytorch.html. So the image checked in the "Check if image should be built" step is the DOCKER_TAG one, not the CIRCLE_WORKFLOW_ID one.
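
To make the suspected mismatch concrete, here is a minimal Python sketch of the situation as this description understands it. It is not the actual CI logic: the function image_needs_build, the placeholder tag strings, and the set standing in for the registry are all hypothetical, and the real scripts may push more tag names than this models.

```python
def image_needs_build(check_tag, pushed_tags):
    """Return True when no pushed image carries check_tag."""
    return check_tag not in pushed_tags


# Hypothetical placeholder values, not real CI identifiers.
workflow_tag = "workflow-id-placeholder"  # stands in for CIRCLE_WORKFLOW_ID
docker_tag = "docker-tag-placeholder"     # stands in for DOCKER_TAG

# Case 1: the build pushed the image only under the workflow tag,
# so a check keyed on the other tag never finds it.
only_workflow_pushed = image_needs_build(docker_tag, {workflow_tag})

# Case 2: the build pushed the image under both names,
# so the same check finds it and the rebuild is skipped.
both_pushed = image_needs_build(docker_tag, {workflow_tag, docker_tag})

print(only_workflow_pushed, both_pushed)
```

Whether the check and the build step agree therefore depends entirely on which tag names the build step pushes.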

dr-ci bot commented Jul 22, 2020

💊 CI failures summary and remediations

As of commit df67fdb (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 1/2 non-CircleCI failure(s)

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_macos_10_13_py3_test (1/1)

Step: "Test"

Jul 22 03:33:18 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future
Jul 22 03:33:18 At: 
Jul 22 03:33:18   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Jul 22 03:33:18   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Jul 22 03:33:18  
Jul 22 03:33:18 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future 
Jul 22 03:33:18  
Jul 22 03:33:18 At: 
Jul 22 03:33:18   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Jul 22 03:33:18   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Jul 22 03:33:18  
Jul 22 03:33:18 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future 
Jul 22 03:33:18  
Jul 22 03:33:18 At: 
Jul 22 03:33:18   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Jul 22 03:33:18   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Jul 22 03:33:18  
Jul 22 03:33:18 [W tensorpipe_agent.cpp:491] RPC agent for worker3 encountered error when reading incoming request from worker0: EOF: end of file (this is expected to happen during shutdown) 
Jul 22 03:33:18 [W tensorpipe_agent.cpp:491] RPC agent for worker2 encountered error when reading incoming request from worker1: EOF: end of file (this is expected to happen during shutdown) 
Jul 22 03:33:18 [W tensorpipe_agent.cpp:491] RPC agent for worker0 encountered error when reading incoming request from worker1: EOF: end of file (this is expected to happen during shutdown) 
Jul 22 03:33:18 [W tensorpipe_agent.cpp:491] RPC agent for worker0 encountered error when reading incoming request from worker2: EOF: end of file (this is expected to happen during shutdown) 
Jul 22 03:33:18 [W tensorpipe_agent.cpp:491] RPC agent for worker2 encountered error when reading incoming request from worker0: EOF: end of file (this is expected to happen during shutdown) 

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

This comment has been revised 2 times.

ezyang (Contributor) commented Jul 22, 2020

I'm a bit confused now because of these lines: https://github.com/pytorch/pytorch/pull/41846/files#diff-8c89e5adbd6a94d9054a2066d12f1dcbR51

which suggest the new tag name should also already be pushed.

ezyang (Contributor) commented Jul 22, 2020

NACKing this PR; it turns out there isn't a bug and I was just hallucinating.

ezyang closed this Jul 22, 2020