@clee2000 clee2000 commented Jun 7, 2024

#127825

The majority of the g5 runner usage comes from inductor (it's something like 2x everything else).
In the past week, inductor ran roughly 1300 times on PRs and 300 times on main. Inductor-periodic ran 50 times on main, so the previous move from inductor to inductor-periodic only results in 250 fewer runs.
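The arithmetic behind the "250 fewer runs" figure can be sketched as below. This is a minimal illustration using only the approximate counts quoted above; the variable names are made up for the example, and it assumes that a job moved to inductor-periodic simply runs at the periodic cadence on main while its PR runs are unchanged.

```python
# Approximate weekly run counts quoted in the PR description.
inductor_pr_runs = 1300    # "1300 ish" inductor runs on PRs
inductor_main_runs = 300   # inductor runs on main
periodic_main_runs = 50    # inductor-periodic runs on main

# Assumption: moving a job from inductor to inductor-periodic only changes
# how often it runs on main (300 -> 50); PR-triggered runs are unaffected.
runs_saved = inductor_main_runs - periodic_main_runs

# Saved runs as a fraction of all inductor runs (PRs + main) in that week.
total_inductor_runs = inductor_pr_runs + inductor_main_runs
share_saved = runs_saved / total_inductor_runs

print(runs_saved)  # 250
print(f"share of inductor runs saved: {share_saved:.3f}")
```

Under these assumptions the earlier move trims only a small slice of the total, since PR-triggered runs dominate.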

I was under the impression that cu124 is currently experimental and that we'll eventually need to switch to it, so this will stay until we switch or inductor uses far fewer runners.

Are we expected to be able to handle two versions of CUDA in CI? Currently we cannot, at least not comfortably.

pytorch-bot bot commented Jun 7, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/128250

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 5 New Failures

As of commit 57bb8d1 with merge base e647ea5:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Jun 7, 2024

pytorch-bot bot commented Jun 7, 2024

Warning: Unknown label ciflow/inductor-cu124.
Currently recognized labels are

  • ciflow/binaries
  • ciflow/binaries_conda
  • ciflow/binaries_libtorch
  • ciflow/binaries_wheel
  • ciflow/inductor
  • ciflow/inductor-perf-compare
  • ciflow/inductor-micro-benchmark
  • ciflow/linux-aarch64
  • ciflow/mps
  • ciflow/nightly
  • ciflow/periodic
  • ciflow/rocm
  • ciflow/slow
  • ciflow/trunk
  • ciflow/unstable
  • ciflow/xpu
  • ciflow/torchbench

Please add the new label to .github/pytorch-probot.yml
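The warning above amounts to a membership check of the PR's `ciflow/` labels against a known list. A rough sketch of that check follows; it is not the bot's actual implementation, the function name `unknown_ciflow_labels` is hypothetical, and the label set is copied from the warning text rather than read from `.github/pytorch-probot.yml`.

```python
# Labels listed in the bot's warning (copied by hand for illustration).
RECOGNIZED_LABELS = {
    "ciflow/binaries", "ciflow/binaries_conda", "ciflow/binaries_libtorch",
    "ciflow/binaries_wheel", "ciflow/inductor", "ciflow/inductor-perf-compare",
    "ciflow/inductor-micro-benchmark", "ciflow/linux-aarch64", "ciflow/mps",
    "ciflow/nightly", "ciflow/periodic", "ciflow/rocm", "ciflow/slow",
    "ciflow/trunk", "ciflow/unstable", "ciflow/xpu", "ciflow/torchbench",
}


def unknown_ciflow_labels(labels, recognized=RECOGNIZED_LABELS):
    """Return the ciflow/ labels that are not in the recognized set.

    Hypothetical helper: non-ciflow labels are ignored entirely.
    """
    return [
        label for label in labels
        if label.startswith("ciflow/") and label not in recognized
    ]


pr_labels = ["topic: not user facing", "ciflow/inductor", "ciflow/inductor-cu124"]
print(unknown_ciflow_labels(pr_labels))  # ['ciflow/inductor-cu124']
```

Once `ciflow/inductor-cu124` is added to the recognized list, the same call would return an empty list.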

@clee2000 clee2000 marked this pull request as ready for review June 7, 2024 20:23
@clee2000 clee2000 requested a review from a team as a code owner June 7, 2024 20:23
@huydhn huydhn left a comment

LGTM! We need to remember to clean this up later.

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Jun 7, 2024

clee2000 commented Jun 7, 2024

@pytorchbot merge -f "no trunk needed"

@pytorchmergebot

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

nWEIdia commented Jun 12, 2024

Is there a label to trigger these cu124 inductor jobs? Currently my PR #128423 does not seem to run the cu124 jobs.
Update: ciflow/inductor-cu124 seems to be it.

TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
…red by ciflow/inductor (pytorch#128250)

Pull Request resolved: pytorch#128250
Approved by: https://github.com/huydhn
@github-actions github-actions bot deleted the csl/inductor_cu124 branch July 12, 2024 01:56