[inductor][cpp][gemm] cache blocking config for dynamic shapes #133538

jgong5 · 2024-08-15T03:06:24Z

Stack from ghstack (oldest at bottom):

cc @voznesenskym @penguinwu @EikanWang @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

[ghstack-poisoned]

pytorch-bot · 2024-08-15T03:06:27Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133538

✅ You can merge normally! (1 Unrelated Failure)

As of commit 6d10526 with merge base 32f3af7:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

trunk / macos-py3-arm64 / test (default, 1, 3, macos-m1-stable) (gh) (trunk failure)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: e8926bc Pull Request resolved: #133538

[ghstack-poisoned]

ghstack-source-id: a83816a Pull Request resolved: #133538

[ghstack-poisoned]

leslie-fang-intel · 2024-09-07T00:44:52Z

torch/_inductor/codegen/cpp_prefix.h

+}
+
+std::tuple<std::shared_ptr<int64_t[]>, int> get_factors(int64_t number) {
+  thread_local std::map<int64_t, std::tuple<std::shared_ptr<int64_t[]>, int>> cache;


nit: why does this cache need to be thread_local? If multiple threads are running workloads, I think they could share these factors, since the factors depend only on the number of threads used for the division. Is the concern about initializing the cache from multiple threads?

Yes, thread_local avoids lock contention when multiple threads need to access and modify the cache.
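To make the trade-off in this exchange concrete, here is a minimal sketch of what a thread_local memo cache for `get_factors` might look like. Only the function signature and the cache declaration come from the diff quoted above; the factor enumeration, the sorted ordering, and everything else in the body are assumptions for illustration, not the PR's actual implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>
#include <vector>

// Sketch only: the body below is assumed, not copied from the PR.
// Returns all positive factors of `number` (ascending) and their count.
std::tuple<std::shared_ptr<int64_t[]>, int> get_factors(int64_t number) {
  // One cache per thread: lookups and inserts need no lock, at the cost of
  // each thread computing (and storing) the factors for itself once.
  thread_local std::map<int64_t, std::tuple<std::shared_ptr<int64_t[]>, int>> cache;

  auto it = cache.find(number);
  if (it != cache.end()) {
    return it->second;  // cache hit: no recomputation, no synchronization
  }

  // Enumerate divisors in O(sqrt(number)) by pairing i with number / i.
  std::vector<int64_t> found;
  for (int64_t i = 1; i * i <= number; ++i) {
    if (number % i == 0) {
      found.push_back(i);
      if (i != number / i) {
        found.push_back(number / i);
      }
    }
  }
  std::sort(found.begin(), found.end());

  // shared_ptr<T[]> (C++17) lets cached entries be returned cheaply by value.
  std::shared_ptr<int64_t[]> factors(new int64_t[found.size()]);
  std::copy(found.begin(), found.end(), factors.get());

  auto result = std::make_tuple(factors, static_cast<int>(found.size()));
  cache[number] = result;
  return result;
}
```

A process-wide cache would instead need a mutex (or similar) around both the lookup and the insert, which is the contention the reply refers to. The per-thread variant trades a small amount of duplicated work and memory for lock-free access.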

[ghstack-poisoned]

jgong5 · 2024-09-07T07:30:25Z

@pytorchbot merge

pytorchmergebot · 2024-09-07T07:32:59Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

…ch#133538) Pull Request resolved: pytorch#133538 Approved by: https://github.com/leslie-fang-intel ghstack dependencies: pytorch#135277, pytorch#133447 Co-authored-by: Wu, Chunyuan <[email protected]>

ghstack-source-id: 1f927ba Pull Request resolved: pytorch/pytorch#133538

Update

74fed56

[ghstack-poisoned]

jgong5 mentioned this pull request Aug 14, 2024

[inductor][cpp][gemm] improve large bs perf with better cache blocking #132729

Closed

pytorch-bot bot added the ciflow/inductor and module: inductor labels Aug 15, 2024

jgong5 pushed a commit that referenced this pull request Aug 15, 2024

[inductor][cpp][gemm] cache blocking config for dynamic shapes

99b6e56

ghstack-source-id: e8926bc Pull Request resolved: #133538

Update

31e1b1b

[ghstack-poisoned]

jgong5 pushed a commit that referenced this pull request Aug 15, 2024

[inductor][cpp][gemm] cache blocking config for dynamic shapes

81ff295

ghstack-source-id: a83816a Pull Request resolved: #133538

pytorchbot added the open source label Aug 15, 2024

jgong5 mentioned this pull request Aug 17, 2024

[inductor][cpp][gemm] improve cache blocking for small K and N #133755

Closed

Jiong Gong and others added 8 commits August 17, 2024 01:22

Update

1d3f3e3

[ghstack-poisoned]

Update

b6064fe

[ghstack-poisoned]

Update

48242df

[ghstack-poisoned]

Update

0896e8c

[ghstack-poisoned]

Update

0f310a1

[ghstack-poisoned]

Update

743e1fb

[ghstack-poisoned]

Update

7f77f54

[ghstack-poisoned]

Update

e485492

[ghstack-poisoned]

jgong5 mentioned this pull request Sep 5, 2024

[inductor][cpp][gemm] fix autotune runtime error from linear_binary fusion #135275

Closed

Update

587bd65

[ghstack-poisoned]

jgong5 mentioned this pull request Sep 5, 2024

[inductor][cpp][gemm] reduce memory alloc overhead by allocating local acc once per thread #135277

Closed

chunyuan-w and others added 5 commits September 5, 2024 21:05

Update

8befd10

[ghstack-poisoned]

Update

5c94914

[ghstack-poisoned]

Update

21457ac

[ghstack-poisoned]

Update

3a4ed57

[ghstack-poisoned]

Update

8088668

[ghstack-poisoned]

jgong5 requested review from chunyuan-w and leslie-fang-intel September 6, 2024 15:00

leslie-fang-intel reviewed Sep 7, 2024

leslie-fang-intel approved these changes Sep 7, 2024

Update

6d10526

[ghstack-poisoned]

jgong5 added the ciflow/trunk and topic: not user facing labels Sep 7, 2024

pytorchmergebot added the merging label Sep 7, 2024

pytorchmergebot added the Merged label Sep 7, 2024

pytorchmergebot closed this in d7c97e7 Sep 7, 2024

pytorchmergebot removed the merging label Sep 7, 2024

github-actions bot deleted the gh/jgong5/68/head branch October 9, 2024 02:04

KnAwnime pushed a commit to KnAwnime/Biblioteka that referenced this pull request Oct 16, 2024

[inductor][cpp][gemm] cache blocking config for dynamic shapes

5015762

ghstack-source-id: 1f927ba Pull Request resolved: pytorch/pytorch#133538


6 participants
