Add threshold for ops using openmp macro #5584
@pytorchbot test this please
aten/src/TH/generic/THTensorMath.c
Outdated
#endif

#ifdef _OPENMP
LAB_IMPLEMENT_BASIC_FUNCTION(log,TH_MATH_NAME(log),HYPER_TH_OMP_OVERHEAD_THRESHOLD)
CC @cpuhrsch
Thanks @ezyang. I'm not sure what the philosophy here is, but CC @colesbury, who should know more.
@ezyang @colesbury Could you spare some time to review this PR? I want to optimize the nn module based on these macros.
ezyang
left a comment
I'm OK with merging this but I don't regularly interact with this code.
@pytorchbot retest this please
@pytorchbot retest this please
@fmassa @zdevito @vedanuj, some changes in this PR are related to #5010. A parameter named VECTORIZE is added to determine whether or not to use the THVector_(NAME) implementation. This avoids defining a new macro just for sigmoid.
@pytorchbot test this please
} else {\
PRAGMA(simd) \
PRAGMA( omp parallel for if (SIZE > TH_OMP_OVERHEAD_THRESHOLD_OMP) ) \
PRAGMA( omp parallel for if (SIZE > OMP_THRESHOLD * 10) ) \
#define HYPER_TH_OMP_OVERHEAD_THRESHOLD 2000
#define ORDIN_TH_OMP_OVERHEAD_THRESHOLD 20000
#define UNCERTAIN_TH_OMP_OVERHEAD_THRESHOLD 50000
Modify macro LAB_IMPLEMENT_VECTORIZED_FUNCTION to enable optional parameters. This reverts commit 8ef783a. Conflicts: aten/src/TH/generic/THTensorMath.c
@pytorchbot test this please
@ezyang Could you spare some time to review this PR? I want to optimize the nn module based on these macros.
* add threshold for ops using omp macro
* modify interface for ops using omp macro
* modify some thresholds
* implement C macros with optional parameters to avoid duplicating definitions for all pointwise operations
* add a parameter of LAB_IMPLEMENT_BASIC_FUNCTION for vectorizing
* modify the comment
* Revert "add a parameter of LAB_IMPLEMENT_BASIC_FUNCTION for vectorizing"
  Modify macro LAB_IMPLEMENT_VECTORIZED_FUNCTION to enable optional parameters
  This reverts commit 8ef783a.
  Conflicts: aten/src/TH/generic/THTensorMath.c
* fix build error on windows
* retrigger the test
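The commit list mentions C macros with optional parameters. One common way to get that effect is to dispatch on the argument count with a variadic macro. The following is only a sketch of that technique, not the code from this PR: every name in it (IMPLEMENT_BASIC_FUNCTION, IMPL_2, IMPL_3, PICK_IMPL, DEFAULT_THRESHOLD, and the demo_* functions) is hypothetical, and it assumes a preprocessor that expands __VA_ARGS__ in the standard C99 way.

```c
#include <assert.h>

/* Hypothetical default used when the caller omits the threshold. */
#define DEFAULT_THRESHOLD 50000

/* Full form: generates an accessor for the threshold and a wrapper
 * that applies CFUNC to one value. */
#define IMPL_3(NAME, CFUNC, THRESHOLD)                        \
  long NAME##_threshold(void) { return (THRESHOLD); }         \
  double NAME##_apply(double x) { return CFUNC(x); }

/* Short form: forwards to the full form with the default threshold. */
#define IMPL_2(NAME, CFUNC) IMPL_3(NAME, CFUNC, DEFAULT_THRESHOLD)

/* Selects the fourth argument. The caller's arguments shift the
 * candidate list so that the matching IMPL_n lands in that slot. */
#define PICK_IMPL(_1, _2, _3, MACRO, ...) MACRO

/* Accepts either (NAME, CFUNC) or (NAME, CFUNC, THRESHOLD). */
#define IMPLEMENT_BASIC_FUNCTION(...) \
  PICK_IMPL(__VA_ARGS__, IMPL_3, IMPL_2, unused)(__VA_ARGS__)

static double negate(double x) { return -x; }
static double twice(double x) { return 2.0 * x; }

/* Two arguments: the threshold falls back to DEFAULT_THRESHOLD. */
IMPLEMENT_BASIC_FUNCTION(demo_neg, negate)

/* Three arguments: the threshold is given explicitly. */
IMPLEMENT_BASIC_FUNCTION(demo_twice, twice, 2000)
```

With this layout a single macro name covers both call shapes, so adding a threshold to one operation does not require a second macro definition for every other pointwise operation.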
As described in issue #4188 and the benchmark, the optimal OpenMP threshold depends on the operation type, the CPU type, and the tensors' contiguity.
This PR adds a parameter named OMP_THRESHOLD to the TH_TENSOR_APPLYX_OMP macros to control the OpenMP threshold.
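To illustrate the idea, here is a minimal sketch of a loop macro that takes the threshold as a parameter and passes it to the OpenMP `if` clause. It is a simplified stand-in for contiguous arrays, not the actual TH_TENSOR_APPLYX_OMP implementation. APPLY2_CONTIG_OMP, OMP_PARALLEL_FOR_IF, square, and square_all are hypothetical names, and the choice of threshold for the example operation is arbitrary. The threshold values are copied from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Emit a pragma from inside a macro body. */
#define PRAGMA(P) _Pragma(#P)

/* Only emit the OpenMP pragma when the compiler has OpenMP enabled. */
#ifdef _OPENMP
#define OMP_PARALLEL_FOR_IF(COND) PRAGMA(omp parallel for if (COND))
#else
#define OMP_PARALLEL_FOR_IF(COND)
#endif

/* Threshold values as they appear in the diff. */
#define HYPER_TH_OMP_OVERHEAD_THRESHOLD 2000
#define ORDIN_TH_OMP_OVERHEAD_THRESHOLD 20000

/* Hypothetical simplified apply macro: DST[i] = OP(SRC[i]).
 * The loop runs in parallel only when SIZE exceeds OMP_THRESHOLD;
 * below that it runs serially and avoids the thread start-up cost. */
#define APPLY2_CONTIG_OMP(SIZE, DST, SRC, OMP_THRESHOLD, OP)  \
  do {                                                        \
    ptrdiff_t n_ = (ptrdiff_t)(SIZE);                         \
    ptrdiff_t i_;                                             \
    OMP_PARALLEL_FOR_IF(n_ > (OMP_THRESHOLD))                 \
    for (i_ = 0; i_ < n_; i_++) {                             \
      (DST)[i_] = OP((SRC)[i_]);                              \
    }                                                         \
  } while (0)

static double square(double x) { return x * x; }

/* Example operation; the threshold chosen here is illustrative only. */
void square_all(double *dst, const double *src, size_t n) {
  assert(dst != NULL && src != NULL);
  APPLY2_CONTIG_OMP(n, dst, src, ORDIN_TH_OMP_OVERHEAD_THRESHOLD, square);
}
```

Because the threshold is an argument rather than a single global constant, each operation can pass the value that suits its cost, as the diff does when it pairs log with HYPER_TH_OMP_OVERHEAD_THRESHOLD.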