Skip to content

Conversation

@lw
Copy link
Contributor

@lw lw commented Mar 6, 2025

Stack from ghstack (oldest at bottom):

This was added in #126320. It's a very nice feature, which can be used to predict memory usage for different budget values.

However, it had some limitations, notably in terms of resolution (it only sampled 21 points across the whole range thus missed many threshold values) and in distributed settings.

Here I fix those by using recursive binary searches to identify all thresholds (up to a resolution of 1e-3, which can be made configurable) and output them in SVG (to be able to discern different points), plus I add the rank to the filename and store it in a user-define directory.

cc @zou3519 @Chillee @samdow @kshitij12345

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Mar 6, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148678

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit 79f0696 with merge base 5fb0f45 (image):
💚 Looks good so far! There are no failures yet. 💚

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

lw added a commit that referenced this pull request Mar 6, 2025
This was added in #126320. It's a very nice feature, which can be used to predict memory usage for different budget values.

However, it had some limitations, notably in terms of resolution (it only sampled 21 points across the whole range thus missed many threshold values) and in distributed settings.

Here I fix those by using recursive binary searches to identify all thresholds (up to a resolution of 1e-3, which can be made configurable) and output them in SVG (to be able to discern different points), plus I add the rank to the filename and store it in a user-define directory.

ghstack-source-id: 4deba4a
Pull Request resolved: #148678
@lw lw added release notes: autograd release notes category module: functorch Pertaining to torch.func or pytorch/functorch release notes: torch.func release notes category for torch.vmap or torch.func.* APIs labels Mar 6, 2025
@lw
Copy link
Contributor Author

lw commented Mar 6, 2025

Before:
image

After:
image

@lw
Copy link
Contributor Author

lw commented Mar 7, 2025

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 7, 2025
@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: Approvers from one of the following sets are needed:

  • functorch (kshitij12345, srossross, chillee, zou3519)
  • superuser (pytorch/metamates)
  • Core Reviewers (mruberry, lezcano, Skylion007, ngimel, peterbell10, ...)
  • Core Maintainers (soumith, gchanan, ezyang, dzhulgakov, malfet, ...)
Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@lw
Copy link
Contributor Author

lw commented Mar 7, 2025

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@github-actions github-actions bot deleted the gh/lw/7/head branch April 11, 2025 02:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request Merged module: functorch Pertaining to torch.func or pytorch/functorch release notes: autograd release notes category release notes: torch.func release notes category for torch.vmap or torch.func.* APIs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants