Actually allow parallel test execution under CMake.#8484
Actually allow parallel test execution under CMake.#8484scheibelp merged 2 commits intospack:developfrom
Conversation
davydden
left a comment
There was a problem hiding this comment.
I am sorry, but
or one can invoke ctest instead of make test.
that will break dealii, we have make test which is separate from ctest. The former are quick tests, whereas ctest will run ~10k unit tests.
We need some mechanism to switch between the two cases. Could be a flag in Package class with default being what you suggest.
|
@davydden We can go the other route (setting an environment variable), but in my initial analysis was unable to find a way to ensure that the environment variable is set to the correct parallelism value, taking into account the various ways it could be overridden. I can take another look. |
lib/spack/spack/package.py
Outdated
| inspect.getmodule(self).make(target) | ||
|
|
||
| def _if_ninja_target_execute(self, target): | ||
| def _if_ninja_target(self, target): |
There was a problem hiding this comment.
In general I sympathize with @davydden here. Also a minor naming comment: I think the splitted functions should be named something like _make_has_target and ninja_has_target instead of _if_ninja_target. That in my opinion sounds better when reading the code. For instance:
if self._make_has_target('test'):
inspect.getmodule(self).ctest()versus:
if self._if_make_target('test'):
inspect.getmodule(self).ctest()
lib/spack/spack/package.py
Outdated
| dump_packages(self.spec, packages_dir) | ||
|
|
||
| def _if_make_target_execute(self, target): | ||
| def _if_make_target(self, target): |
There was a problem hiding this comment.
These changes would make #8223 much easier to unit test.
9502108 to
d5133fc
Compare
|
@davydden @adamjstewart @alalazo Per Denis' request, I have implemented more flexibility in the Note that this new implementation invalidates previous comments regarding the new functions |
| if self.generator == 'Unix Makefiles': | ||
| self._if_make_target_execute('test') | ||
| self._if_make_target_execute('test', | ||
| jobs_env='CTEST_PARALLEL_LEVEL') |
There was a problem hiding this comment.
how does one use it? Could you please add a sentence in class documentation?
there is a dedicated PR on documentation for build systems #5015 but it is not merged yet.
There was a problem hiding this comment.
@davydden If you're talking specifically about the cmake.py change you noted above, "one" doesn't "use" it. It's a change in the implementation to provide the parallelism that one could reasonably expect for the test target in the event that it is CMake-generated (i.e. invokes ctest). This implementation uses the expanded functionality of MakeExecutable, which is actually explained in the class documentation therefor.
In your case, where the test target is hand-provided by your package rather than generated by CMake, there will be no effect unless (of course) your target invokes ctest.
There was a problem hiding this comment.
@davydden I'm happy that it does what it set out to do. If everyone else thinks it's a reasonable implementation, I'm happy for it to be merged.
davydden
left a comment
There was a problem hiding this comment.
presumably the default behaviour stays the same as before?
|
@adamjstewart @davydden @alalazo Currently trying to understand the Python 2.6 failure. Insights welcome ... |
|
@davydden The default behavior is that |
4f503d6 to
0948ec5
Compare
|
As of right now the Python 2.6 issue has been fixed and the only unit test failure is transient ( |
I am AFK for 3 weeks, but I don’t mind merging this if you think it works |
0948ec5 to
9de1797
Compare
|
@adamjstewart Looks like someone with access to travis might need to delete the cached |
|
Unfortunately I don't have access (or I don't know how to clear the cache). @tgamblin might. Rebasing also seems to help. |
9de1797 to
3b6b49b
Compare
Split off from _if_make_target_execute() and _if_ninja_target_execute(), respectively. Intended to facilitate testing of other functionality, per spack#8484 (comment). spack#8484 (comment) was considered, but this word order maintains consistency with existing functions.
Split off from _if_make_target_execute() and _if_ninja_target_execute(), respectively. Intended to facilitate testing of other functionality, per spack#8484 (comment). spack#8484 (comment) was considered, but this word order maintains consistency with existing functions.
3b6b49b to
23f7ef2
Compare
|
Rebase against |
23f7ef2 to
588e069
Compare
|
@davydden Is this PR ready for merge? I'd be grateful if you had the time to look this over one last time and merge if you're happy. |
|
I approved it but I don’t have merge rights. You need to ping others |
588e069 to
9e7b5e9
Compare
|
I believe I have resolved the conflicts and rebased on |
lib/spack/spack/build_environment.py
Outdated
| if parallel: | ||
| jobs = "-j%d" % self.jobs | ||
| args = (jobs,) + args | ||
| jobs = kwargs.pop('jobs_arg', '-j{0}') |
There was a problem hiding this comment.
IMO the following would be more straightforward (I don't see anything setting the new jobs_arg variable so IMO it isn't needed):
jobs_arg = 'j{0}'.format(self.jobs)
args = (jobs_arg,) + args
if jobs_env:
...
(I also think this avoids setting multiple jobs arguments, which may be valid and appears to be handled here, but I'm not aware of any system that needs it so I'd prefer to avoid the complexity)
Does it make sense to avoid adding a -jN argument when jobs_env is set?
I think it's worth adding a note that you are using kwargs.pop because jobs_env is intended for use by this function specifically and that the remaining kwargs are to be passed to the make invocation.
There was a problem hiding this comment.
jobs_arg is there for a little more flexibility for new make-like execs, for wrappers using long-form options (ctest has a --parallel alias, for example) or separating the N from the -j, and for the case where one actually does want to avoid -jN when job_env is set. Note that in the case jobs_env was put in place to address (make test for CMake projects) we can't assume for the general CMake case because any given project can provide its own make test target which may or may not honor -jN. @davydden mentioned dealii as a package for which this is done.
I can certainly add a note about the use of kwargs.pop(). I'll defer to you on whether I should go for simplicity or flexibility on jobs_arg -- let me know what you think after my explanation.
There was a problem hiding this comment.
Is there any known case where we can't just supply a flag? Like could we do make -j 4 instead of make -j4? That would make things a bit simpler in my mind. I suppose there could be a flag that means "use all cores" and no option to supply a specific number. But I think that would break this anyway? That wouldn't need to be a MakeExecutable anyway.
There was a problem hiding this comment.
@adamjstewart The generated test target for CMake projects requires the CTEST_PARALLEL_LEVEL environment variable to be set for parallelism (ignoring the -j option to make), which is the whole reason for this PR. Different make-like applications handle non-specific parallelism differently anyway: ninja defaults to core-level parallelism without any -j, whereas make omits the number for "infinite" parallelism. It's messy. I could always go back to the old system of assuming -j{0} in the absence of use cases, at the cost of flexibility for the future. Is that the consensus?
There was a problem hiding this comment.
Personally, I would be fine with the previous behavior, the proposed behavior, or the behavior I'm suggesting. To be clear, what I'm suggesting is replacing jobs_arg='-j{0}' with jobs_flag='-j'. The only difference from our current behavior would be adding a space after the -j.
But if you want to avoid debate and just get this merged, I would avoid making changes to the UI that aren't absolutely necessary until we find a use case for those changes. Although @alalazo and I are generally fans of making things as generic as possible...
There was a problem hiding this comment.
for wrappers using long-form options (ctest has a --parallel alias, for example)
Like a wrapper around ctest, called ctest, which only accepts --parallel?
separating the N from the -j
Do any systems require this?
and for the case where one actually does want to avoid -jN when job_env is set
How does one accomplish that? e.g. if jobs_arg=None the next line the line args = tuple([j.format(self.jobs) for j in jobs]) + args fails. The only way I see to do that now is to pass an empty iterable container, which IMO is not straightforward.
Also, I didn't see any of those cases explaining the handling of jobs_arg as a tuple (other than my inference on the last one)
Overall I'd be fine if jobs_arg was a single argument like jobs_env.
There was a problem hiding this comment.
I have removed the currently-unneeded flexibility, leaving us with the minimum changes required to provide parallel test execution under CMake.
9e7b5e9 to
54664e1
Compare
|
Currently-unneeded flexibility has been removed. |
scheibelp
left a comment
There was a problem hiding this comment.
I have a request to clarify the docstring and a couple questions/requests about handling of env/make_env
| parallelism options on a per-invocation basis. Specifying | ||
| 'parallel' to the call will override whatever the package's | ||
| global setting is, so you can either default to true or false and | ||
| override particular calls. Specifying 'jobs_env' to a particular |
There was a problem hiding this comment.
IMO the following is clearer:
If specified,
jobs_envnames the environment variable which should be used to control the level of parallelism
There was a problem hiding this comment.
Sounds reasonable, but I wanted to avoid giving the impression that the -j{} behavior would be turned off. How about:
If specified, jobs_env names an environment variable to be set to the number of parallel jobs used.
?
There was a problem hiding this comment.
That sounds good to me, although IMO if you want to make sure the user knows that jobs_env doesn't turn off the -j argument, that stating it explicitly would benefit either explanation.
There was a problem hiding this comment.
@scheibelp I think the latest commit resolves all your issues. Please let me know whether you concur.
lib/spack/spack/build_environment.py
Outdated
| if jobs_env: | ||
| # Caller wants us to set an environment variable to | ||
| # control the parallelism. | ||
| make_env = kwargs.pop('env', []).copy() |
There was a problem hiding this comment.
This looks strange to me comparing to the next line: it looks like make_env can be a list. If jobs_env is a string, it looks like the next line will fail (attempt to index a list with a string). Should the default be {}?
There was a problem hiding this comment.
The default should be {}, correct. Apparently the existing tests don't attempt to run the tests for any CMake packages with parallelism enabled.
There was a problem hiding this comment.
@scheibelp Should I try to invoke a package build somehow that runs the package's tests? I'm not at all familiar with this part of the testing system.
There was a problem hiding this comment.
For this sort of error (and to just cover the lines of code) you could add a simple test to the make_executable python test module (where jobs_env is set).
lib/spack/spack/build_environment.py
Outdated
| # control the parallelism. | ||
| make_env = kwargs.pop('env', []).copy() | ||
| make_env[jobs_env] = str(self.jobs) | ||
| kwargs['env'] = make_env |
There was a problem hiding this comment.
In Executable.__call__, if env is set at all (including if it's an empty container), then Executable will use it. Overall the effect for MakeExecutable is that if the user:
- sets
job_env - does not set
env
that the Executable.__call__ will run with an empty environment, which changes the semantics (normally if you don't set env, it uses the current environment). Was that what you were going for?
There was a problem hiding this comment.
@scheibelp Genuinely not sure what to do here. In executable.py:
env_arg = kwargs.get('env', None)
if env_arg is None:
env = os.environ.copy()
env.update(self.default_env)
else:
env = self.default_env.copy()
env.update(env_arg)
In other words, if the environment is not specified, we overlay our defaults on the current environment; otherwise, we overlay our specified environment on the defaults. It would seem that the desirable behavior in the case of specifying jobs_env would be, "use the environment you were always going to use, but with this added. In the case where env was not specified this would have to be, "The current environment with default_env overlaid on top and then jobs_env added to it." This would not, of course, be robust against changes to executable.py. Thoughts?
There was a problem hiding this comment.
It would seem that the desirable behavior in the case of specifying jobs_env would be, "use the environment you were always going to use, but with this added["]
IMO that is the most sensible behavior (e.g. to keep CC etc.)
but with this added. In the case where env was not specified this would have to be, "The current environment with default_env overlaid on top and then jobs_env added to it."
I think I see what you mean: to get the semantics of default_env, you would need to give it priority in MakeExecutable when env was not specified, if Executable were to change that later, MakeExecutable may get left behind.
I don't think you have to deal with that here. However, if desired, you could make it more robust by adding an extra_environment argument to Executable.__call__ that doesn't define the whole environment but rather overlays the specified arguments on top. In that case MakeExecutable would be simplified.
54664e1 to
e531bf7
Compare
lib/spack/spack/util/executable.py
Outdated
|
|
||
| Keyword Arguments: | ||
| env (dict): The environment to run the executable with | ||
| env_dump (dict): Dict to be set to the environment actually used |
There was a problem hiding this comment.
A couple requests:
- I think this should be
dump_env(there are no references toenv_dump, but there are todump_env) - This appears to be used only for testing. IMO it's worth clarifying that in the explanation and also adding a
_to the prefix, i.e._dump_env.
There was a problem hiding this comment.
@scheibelp I wasn't sure what the policy on testing-only was, but I can do that.
e531bf7 to
39a7111
Compare
|
LGTM: but can you update the PR description |
|
@scheibelp Description updated. |
|
Thanks! |
* Add 'extra_env' argument to Executable.__call__: this will be added to the environment but does not affect whether the current environment is reused. If 'env' is not set, then the current environment is copied and the variables from 'extra_env' are added to it. * MakeExecutable can take a 'jobs_env' parameter that specifies the name of an environment variable used to set the level of parallelism. This is added to 'extra_env' (so does not affect whether the current environment is reused). * CMake-based Spack packages set 'jobs_env' when executing the 'test' target for make and ninja (which does not use -j)
* Add 'extra_env' argument to Executable.__call__: this will be added to the environment but does not affect whether the current environment is reused. If 'env' is not set, then the current environment is copied and the variables from 'extra_env' are added to it. * MakeExecutable can take a 'jobs_env' parameter that specifies the name of an environment variable used to set the level of parallelism. This is added to 'extra_env' (so does not affect whether the current environment is reused). * CMake-based Spack packages set 'jobs_env' when executing the 'test' target for make and ninja (which does not use -j)
Closes #4503 (implementing the spirit of the request—parallelism for CMake tests—rather than the spirit—using the
ctestcommand itself).The existing implementation of the test target when building a CMake package using the Makefiles generator invokes
make test, which does not honor the the-joption for parallelism. One can either set an environment variable(difficult to handle the overrides currently available)or one can invokectestinstead ofmake test. This implementation (now) does thelatterformer in order to accommodate packages (see below) for whichmake testruns a set of tests which is distinct from that run byctest.[Description has been edited from the original due to the evolution of implementation to satisfy issues raised), with some phrases struck through to suggest the original meaning. See the editing history for full details.