Skip to content

[inductor] show more kernel specific metrics in the benchmark result#96249

Closed
shunting314 wants to merge 4 commits intogh/shunting314/24/basefrom
gh/shunting314/24/head
Closed

[inductor] show more kernel specific metrics in the benchmark result#96249
shunting314 wants to merge 4 commits intogh/shunting314/24/basefrom
gh/shunting314/24/head

Conversation

@shunting314
Copy link
Contributor

@shunting314 shunting314 commented Mar 8, 2023

@pytorch-bot
Copy link

pytorch-bot bot commented Mar 8, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/96249

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 Failures

As of commit 776b771:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base fe05266:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2023

This PR needs a label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@shunting314 shunting314 requested review from Chillee and ngimel March 8, 2023 01:06
…ark result"


Show the following kernel specific metrics in the kernel benchmark:
- shared memory used
- number of registers used
- number of spills.

Depends on triton-lang/triton#1296 


cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
shunting314 added a commit that referenced this pull request Mar 8, 2023
@Chillee
Copy link
Collaborator

Chillee commented Mar 8, 2023

Hmm... I think we might need a way to config what info is shown for each kernel 🤔 I'm not sure we always want this info?

@shunting314
Copy link
Contributor Author

Hmm... I think we might need a way to config what info is shown for each kernel 🤔 I'm not sure we always want this info?

yea, I'll add an option to control those kernel details in the output

…ark result"


Show the following kernel specific metrics in the kernel benchmark:
- shared memory used
- number of registers used
- number of spills.

Depends on triton-lang/triton#1296 


cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
shunting314 added a commit that referenced this pull request Mar 8, 2023
@shunting314
Copy link
Contributor Author

shunting314 commented Mar 8, 2023

Add an option '--no-show-kernel-detail' to skip kernel details in the output. By default we will print kernel details.

Example, run the compiled module as follows:

Screenshot 2023-03-08 at 11 36 52 AM

@Chillee
Copy link
Collaborator

Chillee commented Mar 8, 2023

@shunting314 I don't think we need a flag necessarily. I think we should just turn it on when benchmarking kernels and not with the TORCHINDUCTOR_PROFILE path.

@shunting314
Copy link
Contributor Author

@shunting314 I don't think we need a flag necessarily. I think we should just turn it on when benchmarking kernels and not with the TORCHINDUCTOR_PROFILE path.

Ah, I misunderstood that then. So currently we have 2 places to print kernel bandwidth

  1. in DebugAutotuner when TORCHINDUCTOR_PROFILE is enabled
  2. in the kernel benchmarks implemented in this and previous PRs. This one follows what is done in DebugAutotuner to print the bandwidth information. But the code is separate.

So we won't see the kernel details in the output for TORCHINDUCTOR_PROFILE.

One improvement is we can create a single API to print bandwidth information and both cases can call that. The API will make printing kernel detail optional. Let me know if you want me to do that.

@Chillee
Copy link
Collaborator

Chillee commented Mar 8, 2023

The API will make printing kernel detail optional

Yeah that would be nice. Also, I'm wondering if you'll have some issues landing this since you need a corresponding Triton bump. Might need to guard on the attributes existing.

@shunting314
Copy link
Contributor Author

Also, I'm wondering if you'll have some issues landing this since you need a corresponding Triton bump. Might need to guard on the attributes existing.

Good question lol. I feel TORCHINDUCTOR_BENCHMARK_KERNEL is already a guard. People can use the kernel benchmarks only when the env-var is enabled. In case they enable the env-var but use a stale triton version, we indeed will show some attribute not found errors. I can improve this a bit: print a warning to ask user to update triton version

@Chillee
Copy link
Collaborator

Chillee commented Mar 8, 2023

It would be good to add a test for this stuff though, just to make sure they don't get broken by a random change.

@shunting314
Copy link
Contributor Author

It would be good to add a test for this stuff though, just to make sure they don't get broken by a random change.

yea, will add one

…ark result"


Show the following kernel specific metrics in the kernel benchmark:
- shared memory used
- number of registers used
- number of spills.

Depends on triton-lang/triton#1296 


cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
@shunting314
Copy link
Contributor Author

Duo to some merge issue (partially because of the diff train sev) I have to reland this as #96461 . Sorry for the trouble

@shunting314 shunting314 closed this Mar 9, 2023
shunting314 added a commit that referenced this pull request Mar 9, 2023
…s in the benchmark result"

cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
@facebook-github-bot facebook-github-bot deleted the gh/shunting314/24/head branch June 8, 2023 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants