Skip to content

Conversation

@jeremiedbb
Copy link
Member

@jeremiedbb jeremiedbb commented Feb 27, 2019

Currently being developed in loky. Will be move here once done.

This PR adds a new submodule _clibs which provides helpers for the OpenMP and BLAS libs. It's main purpose is to avoid oversubscription in nested parallelism, e.g. BLAS calls inside an OpenMP prange.
It's adapted from code of @tomMoral from Loky.

It also defines an effective_n_jobs_openmp which differs from joblib's effective_n_jobs when n_jobs is None in which case it's set to omp_get_max_threads, as discussed irl with @ogrisel

Copy link
Contributor

@tomMoral tomMoral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments.

It would be nice to add a test that check openmp actually take this into account.
You can use something like proposed in tommoral/loky#135 with opemmp.omp_get_num_threads.

@jeremiedbb
Copy link
Member Author

requires #13294 for the CI to pass

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a first batch of comments:

@ogrisel
Copy link
Member

ogrisel commented Feb 28, 2019

@jeremiedbb
Copy link
Member Author

Yes I added coverage to appveyor but we dropped appveyor :/

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@ogrisel
Copy link
Member

ogrisel commented Mar 7, 2019

The failure on circle ci is caused by an old version of scikit-image and is unrelated to this PR.

    from numpy.lib.arraypad import _validate_lengths
ImportError: cannot import name '_validate_lengths

@rth
Copy link
Member

rth commented Mar 25, 2019

Once https://github.com/tomMoral/loky/pull/135 is merged can we use most of this PR from loky then, or do we still need some cython files from this PR?

@jeremiedbb
Copy link
Member Author

We'll still need a helper to get the number of threads for openmp (similar to effective_n_jobs) since it's been decided to not include it in loky, and a test for that base on a prange.

@jeremiedbb
Copy link
Member Author

Besides, we'll need to vendor it in sklearn, because even if it's released, we use loky from joblib so we won't have it if we want to support several versions of joblib.

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments

Thanks a lot @jeremiedbb for your PRs on openmp support, that's making my life easier :)


config.add_extension("_threaded_libs_helpers",
sources=["_threaded_libs_helpers.pyx"],
include_dirs=[numpy.get_include()],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: you shouhldn't need to include numpy lib right?

# Check that an OpenMP library is loaded
limits = get_thread_limits()

assert not all([lib is None for lib in [limits['openmp_llvm'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use any(lib for lib...)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would create a generator.

In [1]: any(True for i in range(2))                                             
Out[1]: <generator object <genexpr> at 0x7fb5eff82308>

In [2]: any([True for i in range(2)])                                           
Out[2]: True


if clib == "openblas" and SKIP_OPENBLAS:
raise SkipTest("Possible bug in getting maximum number of threads with"
" OpenBLAS < 0.2.16 and OpenBLAS does not expose it's "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its (without the ')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the typo

return n_threads


def check_num_threads(int n=100):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only useful for the tests?

If that's the case then it should maybe be private, and that should be explained in the docstring so that developers don't think they might use it.

Also if I understand correctly, I would suggest the following docstring (since it's not obvious what default number of threads means here):

Run a short parallel section, and return the number of threads that where effectively used by openmp.

If None, does not do anything.
subset : string or None, optional (default="all")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this redundant with the limits parameter as a dict?
It's not clear to me why one would prefer one over the other.

Also the accepted keys of the dict for limits should be documented

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not redundant as the dict for limit permits to select specific implementation (for instance you can have two different openblas for numpy and scipy) while the subset (rename user_api in the loky version) permit to set globally, for all library implementing the BLAS or openmp protocol some limits.

-------
version : string or None
None means OpenBLAS is not loaded or version < 0.3.4, since OpenBLAS
did not expose it's verion before that.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its
version

@jeremiedbb
Copy link
Member Author

Thanks for the review @NicolasHug but most of the code here has changed. It's being developed in loky (https://github.com/tomMoral/loky/pull/135) and the plan is to vendor it in sklearn once it's done. Not sure if your comments apply to the current version of the code if you to take a look, go ahead :)

@NicolasHug
Copy link
Member

OK, then maybe indicate this in the header to avoid unnecessary reviews

@jeremiedbb jeremiedbb changed the title [MRG] Add helpers for BLAS and OpenMP libs [WIP] Add helpers for BLAS and OpenMP libs Mar 25, 2019
@jeremiedbb
Copy link
Member Author

yep, sorry about that

@tomMoral
Copy link
Contributor

OK, then maybe indicate this in the header to avoid unnecessary reviews

Thanks for the review @NicolasHug , I took your comments into accounts in the loky PR. :)

@ogrisel
Copy link
Member

ogrisel commented Apr 18, 2019

For the record, the helpers to protect against over subscription between Cython prange and 3rd party threadpools is being better tested and improved in this repo: https://github.com/joblib/threadpoolctl/

Once ready, the plan is to contribute it to numpy and vendor a backport in scikit-learn in the mean time.

@rth
Copy link
Member

rth commented Jun 20, 2019

the helpers to protect against over subscription between Cython prange and 3rd party threadpools is being better tested and improved in this repo: joblib/threadpoolctl

If we are going to vendor threadpoolctl soon, do we still need this PR then?

@jeremiedbb
Copy link
Member Author

indeed

@jeremiedbb jeremiedbb closed this Jun 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants