delegate parallelism to Ninja when possible #64733
Conversation
The previous implementation was wrong when CPU scheduling affinity is set. In fact, it is still wrong if Ninja is not being used. When CPU scheduling affinity is set, the number of processors available on the system likely exceeds the number of processors usable by the build. We ought to use `len(os.sched_getaffinity(0))` to determine the effective parallelism. This change is more minimal and instead just delegates to Ninja (which handles this correctly) when it is used.

Test Plan: I verified this worked correctly using Ninja on a 96-core machine with 24 cores available for scheduling by checking:

* the cmake command did not specify "-j"
* the number of top-level jobs in top/pstree never exceeded 26 (24 + 2)

[ghstack-poisoned]
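The affinity-aware processor count described above can be sketched as follows. This is a minimal illustration, not the actual setup.py code; the fallback for platforms without `sched_getaffinity` (macOS, Windows) is an assumption about how one would handle portability:

```python
import os

def effective_cpu_count() -> int:
    """Number of CPUs actually usable by this process.

    os.cpu_count() reports all processors on the system, which
    overestimates parallelism when a scheduling affinity mask is
    set (e.g. under taskset or in a constrained container).
    """
    try:
        # Linux: respects the CPU affinity mask of the current process.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # sched_getaffinity is unavailable on macOS and Windows.
        return os.cpu_count() or 1
```

On the 96-core machine from the test plan with a 24-core affinity mask, `os.cpu_count()` would return 96 while `effective_cpu_count()` returns 24, which is exactly the discrepancy the change addresses.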
@dagitses has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
This PR causes a build failure on Windows when building with `$env:MAX_JOBS="15"` and `python setup.py develop`, with the failure as: See comment
Summary: Reported by cloudhan in #64733 (comment)

Fixes regression introduced by 047e682

cc malfet seemethere

Pull Request resolved: #65444
Reviewed By: dagitses, seemethere
Differential Revision: D31103260
Pulled By: malfet
fbshipit-source-id: 9d5454a64cb8a0b96264119cf16582cc5afed284
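The regression and its fix come down to job-count precedence: an explicitly requested MAX_JOBS must still be forwarded to the build tool as `-j`, and `-j` should be omitted only when no limit was requested and Ninja can pick its own default. A hedged sketch of that decision logic; the function name and return shape are illustrative, not the actual build-script code:

```python
import os

def build_jobs_args(use_ninja: bool) -> list:
    """Decide which -j flag (if any) to pass to the build tool."""
    max_jobs = os.environ.get("MAX_JOBS")
    if max_jobs is not None:
        # An explicit user limit always wins, even with Ninja.
        # (Dropping this branch for Ninja builds was the regression.)
        return ["-j", max_jobs]
    if use_ninja:
        # Ninja picks a sensible default on its own, so omit -j.
        return []
    # Legacy path (e.g. make): compute the parallelism ourselves.
    try:
        n = len(os.sched_getaffinity(0))
    except AttributeError:
        n = os.cpu_count() or 1
    return ["-j", str(n)]
```

With this ordering, `MAX_JOBS=15` yields `-j 15` regardless of generator, matching the reporter's expectation on Windows.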
Stack from ghstack:
The previous implementation was wrong when CPU scheduling affinity is
set. In fact, it is still wrong if Ninja is not being used.
When there is CPU scheduling affinity set, the number of processors
available on the system likely exceeds the number of processors that
are usable to the build. We ought to use `len(os.sched_getaffinity(0))`
to determine the effective parallelism.
This change is more minimal and instead just delegates to Ninja (which
handles this correctly) when it is used.
Test Plan:
I verified this worked correctly using Ninja on a 96-core machine
with 24 cores available for scheduling by checking:
* the cmake command did not specify "-j"
* the number of top-level jobs in top/pstree never exceeded 26 (24 + 2)
And I verified we get the legacy behavior by specifying USE_NINJA=0 on
the build.
And I also verified that MAX_JOBS trumps everything if specified.
Differential Revision: D30968796
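The 26-job ceiling observed in the test plan matches Ninja's default parallelism heuristic, which runs roughly `cores + 2` jobs on machines with more than two cores (so an explicit `-j` is unnecessary when Ninja is trusted to choose). A sketch of that heuristic, mirroring Ninja's `GuessParallelism` as I understand it; treat the small-core special cases as assumptions:

```python
def guess_ninja_parallelism(cores: int) -> int:
    """Approximation of Ninja's default job-count heuristic."""
    if cores == 0:
        return 3  # processor count unknown: guess a little
    if cores == 1:
        return 2
    if cores == 2:
        return 3
    # Plenty of cores: overlap I/O-bound jobs with two extras.
    return cores + 2
```

With 24 schedulable cores this yields 26, which is exactly the top-level job ceiling seen in top/pstree during the test.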