Use process cpu affinity instead of hardware specs to get cpu count #17566
haampie wants to merge 1 commit into spack:develop
Conversation
haampie force-pushed from 71c533e to 5b877fc
haampie force-pushed from 5b877fc to d41123f
I wonder how "72" is possible. We have a default limit of 16 unless users modify the config (or there's a bug in the current logic somewhere).
You're right, I've set [...]. So, with this patch, spack respects Linux cgroups, which are used by Slurm and container runtimes to restrict resources for certain processes. I would say not respecting that is a bug. Of course I could just [...]
Sorry, I don't get the bug being resolved here. Spack spawns 72 processes because it was told to do so. If the 72 processes are hogging more than the 8 physical cpus that Spack should be restricted to, then that seems like a system configuration error to me rather than Spack's fault. Am I missing part of the issue?
So there are two things: [...]

Now the problem with 2 is that it should not derive [...]. E.g. in docker you have this flag to specify the cpus being made available, and you can see the difference between `nproc` and `nproc --all`. If I run spack in a container like the latter, I would assume the effective number of build jobs is computed as [...]. Hopefully that makes sense.
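The `nproc` vs `nproc --all` difference is visible from Python as well; a minimal sketch of the two calls being contrasted in this thread (Linux-only, since `sched_getaffinity` doesn't exist elsewhere):

```python
import os

# What `nproc --all` reports: every hardware thread the machine has.
hardware = os.cpu_count()

# What `nproc` reports: the cpus this process is actually allowed to run on.
# This respects cgroups / taskset / Slurm affinity masks. Linux-specific.
usable = len(os.sched_getaffinity(0))

print(hardware, usable)  # e.g. 72 vs 8 inside a cpu-restricted container
```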
Thanks @haampie, now I understand what you're saying. Can you update the description at the top with the information in #17566 (comment)? I personally see this issue as something in between a minor bug and implementation-defined behavior. I say that because (though not documented) Python multiprocessing calls this function [...], which in turn calls this other function [...], which returns the number of processors "available" for some definition of "available". The Python docs are not of much help: [...] I'd like to hear the opinion of other maintainers before merging this, since it's in my opinion a slightly better implementation, but: [...]
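For reference (the links in the comment above were lost in extraction), CPython's `multiprocessing.cpu_count()` is a thin wrapper that forwards to `os.cpu_count()`, raising `NotImplementedError` if that returns `None` — so both report the hardware-level count, not the affinity-restricted one:

```python
import multiprocessing
import os

# Both report the number of cpus in the system, ignoring the process's
# affinity mask — which is why spack sees 72 here instead of 8.
print(multiprocessing.cpu_count())
print(os.cpu_count())
assert multiprocessing.cpu_count() == os.cpu_count()
```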
Regarding point 4: I mean that using 16 processes on a system confined to fewer cores shouldn't be that much of a problem, and seeing speed-up using more than [...]
I don't think this is a very useful definition of 'available'; the man page talks about hot-pluggable cpus, so it's most likely something at the hardware level.
I would be highly surprised if someone relied on this behaviour. If you explicitly fix the number of cores available to a process, you most likely want that to be reflected in [...]
Not in CI, where [...]
Yeah, maybe it can later move to its own Python package. I would think almost all Linux users do [...]
Yes, but in CI you might still use the same node for multiple builds, where you take < 16 cores per container / build, and spack will still spin up 16 build tasks, making builds slower. Also, "you have issues because you set build jobs to a very large number to start with" is really an issue of spack. I just want it to automatically pick the optimal number of cores as I made them available to the process, which is the easiest way to configure builds in e.g. CI, where you don't have as much control because [...]

I would say the current implementation is a bug, so it could / should be fixed within a patch version update, right?
@haampie I'll try to summarize what I said above, since maybe it was not clear. I think this IS a slightly better implementation, but since: [...]
I'd like to hear opinions from other maintainers before merging it. That said, I think that the impact of this is low.

EDIT: Some arguments in the discussion are also inherently based on an assumption about which I'd like to hear more: [...]
What is the reasoning that leads you to the conclusion that, having e.g. 72 physical cores available, the optimal way to build is [...]? Also, can you please put in the top-level description the configuration setting you need to have to reproduce the issue you are encountering on Slurm? That will make the numbers [...]
In cloud builds we might be using VMs, and that adds another layer of indirection that this PR doesn't handle anyway, right?
If it's seen as a bugfix, which I would say it is, it's only a good thing. Again, it only applies to the edge case where people are using cgroups (through slurm, docker, kubernetes), and if they do, they probably know what they're doing, and they want spack to not spawn 16 jobs if they confine the process to 1 thread.
Except if you use [...]
By optimal I mean: number of build tasks = number of cores available. To the extreme: if I want to have 1 single-threaded spack build job on a node of my cluster, I don't want spack to spin up 72 tasks, which it would do now if I don't force it with [...]
It handles it correctly. VMs do a good job at making [...]
To reiterate: I see this PR as a minor improvement in some heuristics (compute the number of "usable" cpus on Linux instead of the "available" ones) that also brings some minor concerns with it, so I'd like other people to weigh in before merging it. That said, since you are advocating so strongly for this modification, I wonder if I missed some points. Do you concur that: [...]
Also, it seems to me that after some exchange this issue boils down to use cases tied to [...]
Ah, I just discovered that part of the confusion might have been caused by something I had missed, I'm sorry! The whole time I was thinking this was about getting a proper, sensible default for the number of build jobs when the user does not specify [...]. But this is not the case, since [...]. So I would agree that this PR indeed changes the defaults for everybody in a way that might be a bit of a nuisance (e.g. not being able to specify two parallel jobs in a container limited to a single thread, which might still be a useful thing to do). Wouldn't it be better to do the following: [...]
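The concrete suggestion list was lost in extraction, but one plausible reading of the compromise sketched in this comment is: use the affinity-aware count only for the *default*, and let an explicitly configured `build_jobs` exceed it (so two parallel jobs in a single-cpu container stay possible). A sketch with a hypothetical helper name, not actual spack code:

```python
import os


def effective_build_jobs(configured=None, default_cap=16):
    """Pick the number of build jobs.

    If the user configured build_jobs explicitly, trust them — even beyond
    the affinity count. Otherwise default to the cpus available to this
    process (affinity-aware on Linux), capped at default_cap.
    """
    if configured is not None:
        return configured
    try:
        available = len(os.sched_getaffinity(0))  # Linux-specific
    except AttributeError:
        available = os.cpu_count() or 1  # portable fallback: hardware count
    return min(available, default_cap)
```

Under this scheme, `effective_build_jobs(2)` in a 1-cpu container still returns 2, while the unconfigured default never exceeds the cpus the process can actually use.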
Closing this in favor of #22360 |
When using spack in Slurm and/or containers I found that spack does not respect the number of cores made available to the process, but rather takes as many threads as the hardware provides. This is an issue mainly in CI, where spack commands may be executed on different nodes with a different core count, and where many commands like
`spack install` are generated or implicit (e.g. in `spack ci rebuild`), s.t. one cannot easily pass the true number of cores available.

As an example, in my case spack was running `make -j72`, whereas only 8 cores were available. This happens because of the following:

- The node has 72 hardware threads (`nproc --all`).
- I set `build_jobs` to a high/unlimited number with the idea to make spack autodetect the number of cores available. (This seems sensible to me for CI: same config for different nodes; some nodes have a large number of cores, so the default limit of 16 is too small.)
- Through Slurm's `SLURM_CPUS_PER_TASK` variable, I had set it to 8, because I want to run multiple jobs on the same node. (Same story with Kubernetes and cpu requests, or your favorite docker setup with the `--cpus` flag.)
- spack uses the Python call `multiprocessing.cpu_count()` to attempt to limit the number of build jobs to the number of cores available, but this returns the number of cores on the hardware level (72) instead of the number of cores available to the process (8).

Note that this is the same difference as between calling `$ nproc --all` (returns 72) and `$ nproc` (returns 8). The latter is correct.

To fix this I'm using `len(os.sched_getaffinity(0))`, which is Linux-specific, to get the number of cores/threads actually available. According to the man page, using `0` is equivalent to getting the cpu affinity for the current process, so this should be fine.

To reproduce in Docker:
[...]

It should have exactly 2 cores available for the process, but with the default config it will build with `make -j16` if you have at least 16 procs.

To reproduce with Slurm:

[...]

It should have exactly 2 cores available for the process, but builds with `-j16` in the default config if you have at least 16 procs.
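The approach the PR describes can be sketched as a small helper (a sketch, not the actual spack code; the `AttributeError` fallback is an assumption for non-Linux platforms where `os.sched_getaffinity` does not exist):

```python
import multiprocessing
import os


def available_cpu_count():
    """Number of cpus this process may actually use.

    On Linux, len(os.sched_getaffinity(0)) respects the affinity mask set
    by cgroups, taskset, Slurm, docker --cpus, etc. — the `nproc` answer.
    Elsewhere, fall back to the hardware count — the `nproc --all` answer.
    """
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return multiprocessing.cpu_count()
```

Under a 2-cpu cgroup this returns 2, so a build-jobs computation based on it would produce `-j2` rather than the `-j16` seen with the hardware-level count.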