
Conversation

@nikitaved (Collaborator) commented Dec 9, 2024

As per title.

The following implementation removes the usage of repeat_interleave, tile, and full_coo_indices and replaces them with broadcasting. This reduces memory traffic (broadcast reads of the same data are far more cache-friendly) and the total number of launched kernels.
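To sketch the idea (a minimal Python illustration, not the PR's actual C++ implementation; the tensor names and sizes here are made up): instead of materializing each repetition pattern with its own kernel, broadcasting produces the same index pairs from zero-copy expanded views, so full-size buffers are written only once.

import torch

nnz, k = 3, 4
old_idx = torch.tensor([0, 2, 5])          # indices along an existing sparse dim
new_idx = torch.arange(k)                  # indices along a dim being broadcast

# Eager approach: each helper op launches a kernel and writes a full buffer.
rows_eager = old_idx.repeat_interleave(k)  # [0,0,0,0, 2,2,2,2, 5,5,5,5]
cols_eager = new_idx.tile(nnz)             # [0,1,2,3, 0,1,2,3, 0,1,2,3]

# Broadcasting approach: broadcast_tensors returns expanded views (no kernel
# launch, no extra memory); the pattern is materialized once, in the final copy.
pairs = torch.stack(torch.broadcast_tensors(
    old_idx.unsqueeze(1),                  # shape (nnz, 1)
    new_idx.unsqueeze(0),                  # shape (1, k)
))                                         # shape (2, nnz, k)
rows_bcast, cols_bcast = pairs.reshape(2, -1)

assert torch.equal(rows_eager, rows_bcast)
assert torch.equal(cols_eager, cols_bcast)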

cc @alexsamardzic @pearu @cpuhrsch @amjames @bhosmer @jcaip

@pytorch-bot bot commented Dec 9, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/142364

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1c535d6 with merge base c29b4ed:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@nikitaved added the module: sparse (Related to torch.sparse) and release notes: sparse (release notes category) labels Dec 9, 2024
@nikitaved nikitaved force-pushed the nikitaved/coo_broadcast_less_memory branch 3 times, most recently from 9db7b0a to fae893d Compare December 9, 2024 13:32
@janeyx99 added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Dec 9, 2024
@amjames (Collaborator) left a comment


Nice! This looks good; just a minor nit in the TORCH_CHECK message.


for (int64_t i = 0; i < self.dim(); ++i) {
  TORCH_CHECK(self_size[i] == 1 || self_size[i] == size[i + new_sparse_dims],
              "The input's lenght ", self_size[i], " at dimension ", i,

Suggested change:
-              "The input's lenght ", self_size[i], " at dimension ", i,
+              "The input's length ", self_size[i], " at dimension ", i,

for (int64_t i = 0; i < self.dim(); ++i) {
  TORCH_CHECK(self_size[i] == 1 || self_size[i] == size[i + new_sparse_dims],
              "The input's lenght ", self_size[i], " at dimension ", i,
              " does not broadcast over the requested shape of lenght ", size[i + new_sparse_dims],

Suggested change:
-              " does not broadcast over the requested shape of lenght ", size[i + new_sparse_dims],
+              " does not broadcast over the requested shape of length ", size[i + new_sparse_dims],

@cpuhrsch (Contributor) left a comment

Awesome!

@nikitaved nikitaved force-pushed the nikitaved/coo_broadcast_less_memory branch from fae893d to 1c535d6 Compare December 11, 2024 11:19
@nikitaved (Collaborator, Author) commented

Thank you for your reviews, guys!

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 11, 2024
@pytorchmergebot commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@github-actions github-actions bot deleted the nikitaved/coo_broadcast_less_memory branch January 11, 2025 02:11

Labels

ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
module: sparse (Related to torch.sparse)
open source
release notes: sparse (release notes category)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)


6 participants