@cyyever
Collaborator

@cyyever cyyever commented Apr 29, 2024

This PR splits a smaller, self-contained change out of #122527. It removes the Caffe2 Python build scripts and some TensorBoard code that depended on Caffe2.
Note that this work was inspired by and co-developed with @r-barnes.

cc @malfet @seemethere @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen @snadampal @albanD

@pytorch-bot

pytorch-bot bot commented Apr 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125143

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 716ddb5 with merge base 31372fa:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added caffe2 module: mkldnn Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration labels Apr 29, 2024
@cyyever cyyever added topic: bc breaking topic category skip-pr-sanity-checks module: build Build system issues module: bazel ciflow/trunk Trigger trunk jobs on your pull request labels Apr 29, 2024
@pytorch-bot pytorch-bot bot added the release notes: releng release notes category label Apr 29, 2024
@cyyever cyyever requested a review from a team as a code owner April 29, 2024 09:53
@cyyever cyyever requested a review from r-barnes April 30, 2024 12:56
@cyyever cyyever added the suppress-bc-linter Suppresses the failures of API backward-compatibility linter (Lint/bc_linter) label Apr 30, 2024
@cyyever
Collaborator Author

cyyever commented Apr 30, 2024

@r-barnes The failures can be ignored.

@cyyever cyyever requested review from albanD, ezyang and malfet April 30, 2024 13:00
@cyyever cyyever added the ciflow/binaries Trigger all binary build and upload jobs on the PR label Apr 30, 2024
@cpuhrsch cpuhrsch added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Apr 30, 2024
huydhn added a commit to pytorch/test-infra that referenced this pull request May 2, 2024
As the next part of #5149,
this PR provides some additional info about unstable and infra flaky
jobs.

* [x] #5149
* [x] Provide a link to the issue that marks the job as unstable
* [x] Give the label(s) that suppress the job
* [x] Print the rule from
https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json
that marks the job as flaky
* [x] Explain the reason for infra flaky (I couldn't find any recent
examples for manual testing)
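
As a rough illustration of the rule-matching item above, here is a minimal Python sketch of how a failed job might be checked against flaky rules and labeled in the summary. The `FlakyRule` shape (a `name` plus a list of `captures` substrings), the matching logic, and the helper names are assumptions made for illustration; they are not taken from the test-infra code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FlakyRule:
    # Hypothetical rule shape: a name that must occur in the job name,
    # plus substrings that must all occur in the failure output.
    name: str
    captures: List[str]


def match_flaky_rule(
    job_name: str, failure_lines: List[str], rules: List[FlakyRule]
) -> Optional[FlakyRule]:
    """Return the first rule matching both the job name and the failure text."""
    text = "\n".join(failure_lines)
    for rule in rules:
        if rule.name in job_name and all(c in text for c in rule.captures):
            return rule
    return None


def describe(job_name: str, failure_lines: List[str], rules: List[FlakyRule]) -> str:
    """Build a one-line summary naming the rule that marked the job as flaky."""
    rule = match_flaky_rule(job_name, failure_lines, rules)
    if rule is None:
        return f"{job_name}: NEW FAILURE"
    return f"{job_name}: FLAKY (matched **{rule.name}** rule in flaky-rules.json)"


# Example rule, modeled on the lint failure shown under Testing (assumed contents).
rules = [FlakyRule("linux", ["The process '/usr/bin/git' failed with exit code 1"])]
print(
    describe(
        "Lint / lintrunner-noclang / linux-job",
        ["The process '/usr/bin/git' failed with exit code 1"],
        rules,
    )
)
```

Returning the matched rule, rather than a boolean, is what lets the summary name the rule that classified the job.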

### Testing

1. From pytorch/pytorch#125264

<details open><summary><b>NEW FAILURE</b> - The following job has
failed:</summary><p>

* [pull / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5,
linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125264#24452065236)
([gh](https://github.com/pytorch/pytorch/actions/runs/8901751864/job/24452065236))

`inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliasing_static_ref`
</p></details>
<details ><summary><b>FLAKY</b> - The following job failed but was
likely due to flakiness present on trunk:</summary><p>

* [Lint / lintrunner-noclang /
linux-job](https://hud.pytorch.org/pr/pytorch/pytorch/125264#24446808455)
([gh](https://github.com/pytorch/pytorch/actions/runs/8901751878/job/24446808455))
(matched **linux** rule in
[flaky-rules.json](https://github.com/pytorch/test-infra/blob/generated-stats/stats/flaky-rules.json))
    `The process '/usr/bin/git' failed with exit code 1`
</p></details>


2. From pytorch/executorch#3318

<details ><summary><b>UNSTABLE</b> - The following job failed but was
likely due to flakiness present on trunk and has been marked as
unstable:</summary><p>

* [Android / test-llama-app / mobile-job
(android)](https://hud.pytorch.org/pr/pytorch/executorch/3318#24282434625)
([gh](https://github.com/pytorch/executorch/actions/runs/8842776071/job/24282434625))
([#3344](pytorch/executorch#3344))
`Credentials could not be loaded, please check your action inputs: Could
not load credentials from any providers`
</p></details>


3. From pytorch/pytorch#125143

* [Lint / lintrunner-noclang /
linux-job](https://hud.pytorch.org/pr/pytorch/pytorch/125143#24373801771)
([gh](https://github.com/pytorch/pytorch/actions/runs/8878104746/job/24373801771))
    `>>> Lint for torch/onnx/_internal/onnx_proto_utils.py:`
</p></details>
<details ><summary><b>FLAKY</b> - The following jobs failed but were
likely due to flakiness present on trunk:</summary><p>

* [BC Lint /
bc_linter](https://hud.pytorch.org/pr/pytorch/pytorch/125143#24450453134)
([gh](https://github.com/pytorch/pytorch/actions/runs/8878104658/job/24450453134))
(suppressed by suppress-bc-linter)
    `Process completed with exit code 1.`
* [pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 2, 5,
linux.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125143#24374207841)
([gh](https://github.com/pytorch/pytorch/actions/runs/8878104713/job/24374207841))
([similar
failure](https://hud.pytorch.org/pytorch/pytorch/commit/1a0b24776212b383d025010e935f33f58a96e276#24348608242))

`test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_bool`
</p></details>
@cyyever cyyever force-pushed the caffe2_python branch 2 times, most recently from 64da49d to 038b531 May 4, 2024 15:14
@cyyever cyyever marked this pull request as draft May 4, 2024 15:14
@cyyever cyyever force-pushed the caffe2_python branch 2 times, most recently from 8e1c9f6 to bbece45 May 9, 2024 23:26
@cyyever cyyever marked this pull request as ready for review May 9, 2024 23:27
@cyyever
Collaborator Author

cyyever commented May 10, 2024

@r-barnes Take a look at this?

@r-barnes
Contributor

@cyyever - Sorry I missed this!

@r-barnes
Contributor

@pytorchbot rebase -b main

@r-barnes
Contributor

Rebasing to see if the 21 failures go away.

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased caffe2_python onto refs/remotes/origin/main, please pull locally before adding more changes (for example, via git checkout caffe2_python && git pull --rebase)

@albanD albanD removed their request for review May 10, 2024 14:46
@r-barnes
Contributor

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge failed

Reason: Approvers from one of the following sets are needed:

  • superuser (pytorch/metamates)
  • Core Reviewers (mruberry, lezcano, Skylion007, ngimel, peterbell10)
  • Core Maintainers (soumith, gchanan, ezyang, dzhulgakov, malfet)
Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@r-barnes
Contributor

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 0 checks:

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@cyyever cyyever deleted the caffe2_python branch May 10, 2024 23:47
tinglvv pushed a commit to tinglvv/pytorch that referenced this pull request May 14, 2024
This PR tries to decompose pytorch#122527 into a smaller one. Caffe2 python build scripts were removed and some tensorboard code using Caffe2 was removed too.
To be noted, this was inspired and is co-dev with @r-barnes.

Pull Request resolved: pytorch#125143
Approved by: https://github.com/r-barnes, https://github.com/albanD

Labels

* caffe2
* ciflow/binaries (Trigger all binary build and upload jobs on the PR)
* ciflow/trunk (Trigger trunk jobs on your pull request)
* Merged
* module: bazel
* module: build (Build system issues)
* module: mkldnn (Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration)
* open source
* release notes: releng (release notes category)
* skip-pr-sanity-checks
* suppress-bc-linter (Suppresses the failures of API backward-compatibility linter (Lint/bc_linter))
* topic: bc breaking (topic category)
* topic: build
* topic: deprecation (topic category)
* topic: improvements (topic category)
* triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)


6 participants