
Make setup linux action be more friendly with gcp linux runners #96289

Closed
weiwangmeta wants to merge 11 commits into master from weiwangmeta/setup-linux-supports-gcp
@weiwangmeta
Contributor

Fixes issues like the following:
https://github.com/pytorch/pytorch/actions/runs/4362155257/jobs/7627059487 has a more serious core dump failure, but the curl failures in its log (the GCP Linux runner trying to fetch EC2-specific metadata such as the AMI ID, instance ID, and instance type) confused the HUD.
image
This PR gets rid of those curl failures.
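
For illustration only, here is a minimal Python sketch of the idea: look up EC2 metadata in a way that tolerates non-EC2 hosts. It is not the change made in this PR (the action itself is not shown here). The function names, the `"unknown"` fallback, and the injectable `fetch` parameter are assumptions; the only real detail is the EC2 instance metadata address `169.254.169.254`.

```python
import urllib.request
from typing import Callable, Dict

# EC2 instance metadata endpoint; normally only reachable from EC2 hosts.
EC2_METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def _http_get(url: str, timeout: float = 2.0) -> str:
    """Fetch a URL and return the body as stripped text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def get_ec2_metadata(
    category: str,
    fetch: Callable[[str], str] = _http_get,
    default: str = "unknown",
) -> str:
    """Return one EC2 metadata value, or `default` when the lookup fails.

    On a non-EC2 host (for example a GCP runner) the request fails; the
    error is swallowed so it never shows up as a failure line in the log.
    """
    try:
        return fetch(EC2_METADATA_URL + category)
    except (OSError, ValueError):
        # urllib.error.URLError and socket timeouts are subclasses of OSError.
        return default


def describe_runner(fetch: Callable[[str], str] = _http_get) -> Dict[str, str]:
    """Collect the three fields named in the PR description (hypothetical helper)."""
    return {
        "ami-id": get_ec2_metadata("ami-id", fetch),
        "instance-id": get_ec2_metadata("instance-id", fetch),
        "instance-type": get_ec2_metadata("instance-type", fetch),
    }
```

With this shape, a runner outside EC2 simply reports `"unknown"` for each field instead of emitting an error that a log classifier might pick up.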

This may have contributed to the impression of "flaky GCP" in #95416

cc @desertfire who first made the request to remove such annoying curl failures.

@weiwangmeta weiwangmeta requested a review from a team as a code owner March 8, 2023 09:10
@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Mar 8, 2023
@pytorch-bot

pytorch-bot bot commented Mar 8, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/96289

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Failures

As of commit cd43128:

BROKEN TRUNK - The following jobs failed but were present on the merge base 3ce1e15:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@weiwangmeta weiwangmeta added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 8, 2023
Contributor

@huydhn huydhn left a comment

LGTM! There are only some minor lint complaints. You could also consider updating the log classifier rule set at https://github.com/pytorch/test-infra/blob/main/aws/lambda/log-classifier/ruleset.toml to string-match the error line that should be shown on HUD.

In this case, the curl error showed up first because it was the first rule that matched.
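
To make the "first rule that matched" behavior concrete, here is a small Python sketch of priority-ordered rule matching. It assumes first-match-wins semantics as described in the comment above; the rule names, patterns, and the `classify` function are hypothetical and are not taken from the real `ruleset.toml` or the log classifier's code.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Hypothetical rules in priority order: (rule name, regex pattern).
RULES: List[Tuple[str, str]] = [
    ("curl failure", r"curl: \(\d+\)"),
    ("core dump", r"core dumped"),
]


def classify(
    log_lines: Iterable[str],
    rules: List[Tuple[str, str]] = RULES,
) -> Optional[Tuple[str, str]]:
    """Return (rule name, matching line) for the first rule that matches any line."""
    lines = list(log_lines)
    for name, pattern in rules:
        regex = re.compile(pattern)
        for line in lines:
            if regex.search(line):
                # The earliest rule in the list wins, even if a more
                # serious failure appears elsewhere in the log.
                return name, line
    return None


log = [
    "curl: (7) Failed to connect to 169.254.169.254 port 80",
    "Segmentation fault (core dumped)",
]
print(classify(log))      # the curl rule is reported
print(classify(log[1:]))  # without the curl line, the core dump surfaces
```

Under these assumptions, removing the spurious curl lines from the log (or reordering the rules) lets the more serious failure be the one that is reported.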

@weiwangmeta
Contributor Author

@pytorchbot merge -f "failure is not due to this PR"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@weiwangmeta weiwangmeta deleted the weiwangmeta/setup-linux-supports-gcp branch March 8, 2023 22:22
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 12, 2023
Fixes issues like the following:
https://github.com/pytorch/pytorch/actions/runs/4362155257/jobs/7627059487 has a more serious core dump failure but the log of curl failures (GCP linux trying to get EC2 specific metadata like EC2 AMI-ID, Instance ID, and Instance Type) confused the HUD.
<img width="848" alt="image" src="https://user-images.githubusercontent.com/109318740/223670567-330521ba-050a-41c3-9efb-fae6ea3398c0.png">
This PR gets rid of those curl failures.

This may have contributed to the impression of "flaky GCP" in #95416

Pull Request resolved: pytorch/pytorch#96289
Approved by: https://github.com/huydhn, https://github.com/yanboliang
ydwu4 added a commit to ydwu4/pytorch that referenced this pull request Mar 13, 2023

Labels

ciflow/inductor, ciflow/trunk, Merged, topic: not user facing

4 participants