Skip to content

Conversation

@yf225
Copy link
Contributor

@yf225 yf225 commented Oct 5, 2018

Changes in this PR:

  1. Intermediate Docker image is shared from build stage to test stage through ECR, in order to fix the Caffe2 flaky CUDA tests.
  2. There are ~7 Caffe2 operator tests that are only flaky in caffe2_py2_gcc4_8_ubuntu14_04_test on CPU. Disabling those tests on that config only, which is okay to do because we are still running those tests in other test jobs.

After this PR is merged, CircleCI will be running on master automatically, and will be running on PRs if the author rebased their PR onto the newest master (which we will ask all the authors to do when we switch off Jenkins for Linux).

@ezyang
Copy link
Contributor

ezyang commented Oct 5, 2018

I can't do a proper review of the yml, because it's already checked in, but we're due for one.

no_output_timeout: "10h"

This seems like an outlandishly large timeout for a git merge

export GIT_MERGE_TARGET=`git log -n 1 --pretty=format:"%H" origin/master`

Use git rev-parse origin/master instead. Much shorter, and it's appropriate use of Git plumbing. https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain

git checkout -f ${GIT_COMMIT}

Shouldn't you already be checked out at this point?

git reset --hard ${GIT_COMMIT}

And if you just checked out that commit, why do you need to do a hard reset?

working_directory: /var/lib/jenkins/workspace

LOL. I guess it will be forever immortalized in our Docker images haha.

no_output_timeout: "10h"

no_output_timeout here is appropriate, but it still seems ludicrously high. There are a bunch more of these which I will no longer comment on.

docker push ${COMMIT_DOCKER_IMAGE}

I'd quite like to get rid of this ASAP.

- persist_to_workspace:

You're pushing Docker now. Do you need this?

sudo chmod -R 777 /home/circleci/project/pytorch-ci-env

Hmm?

sudo apt-get update

The copypasta BUUURNS

sudo ln /dev/null /dev/raw1394

WTF! (No action needed)

- image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-trusty-py2.7.9:248

Doesn't this mean DockerVersion updates are broken now?

@ezyang
Copy link
Contributor

ezyang commented Oct 5, 2018

So, what's the deployment plan? How long will we run them side-by-side before switching Jenkins off?

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd quite like to see the copy-pasting reduced dramatically, but I'm giving approval to land.

@yf225 yf225 mentioned this pull request Oct 5, 2018
5 tasks
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yf225 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@yf225
Copy link
Contributor Author

yf225 commented Oct 5, 2018

@ezyang Remaining issues are tracked in #12396.

For deployment, I am thinking of making an internal post on Monday, and then make a clean switch from Jenkins to CircleCI on the same day (probably at a set time such as 4pm PT).

@yf225
Copy link
Contributor Author

yf225 commented Oct 5, 2018

@ezyang Yes DockerVersion will be broken and in the short term we will need to open a PR to update the Docker image version. In the long term we might want to make the following change:

Right now the Docker image push flow is "create a new image" -> "test against master" -> "push to ECR" -> "update DockerVersion.groovy". But once all Linux jobs are removed from Jenkins, we might want to do "create a new image" -> "push to ECR" -> "open PR to test against master" -> "merge new image version number into master config.yml". For the ECR clean up job in Jenkins, we will let it check the newest version number from PyTorch repo and do the clean up accordingly.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yf225 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yf225 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yf225 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants