RFC: How should we control/expose the number of threads for our OpenMP-based parallel Cython code? #14265

Before adding OpenMP based parallelism we need to decide how to control the number of threads and how to expose it in the public API.

I've seen several proposals from different people:

&nbsp;&nbsp;**(1)** Use the existing `n_jobs` public parameter, where `None` means 1 (same as for joblib parallelism).

&nbsp;&nbsp;**(2)** Use the existing `n_jobs` public parameter, where `None` means -1 (like numpy, which lets BLAS use as many threads as possible).

&nbsp;&nbsp;**(3)** Add a new public parameter `n_omp_threads` for cases where the underlying parallelism is handled by OpenMP, where `None` means 1.

&nbsp;&nbsp;**(4)** Add a new public parameter `n_omp_threads` for cases where the underlying parallelism is handled by OpenMP, where `None` means -1.

&nbsp;&nbsp;**(5)** Do not expose this in the public API. Use as many threads as possible. Users can still have some control by setting the `OMP_NUM_THREADS` environment variable before the program starts, or by using threadpoolctl at runtime.
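
To make the difference between the `None` defaults in (1)-(4) concrete, here is a minimal sketch of how a public value could be mapped to an actual thread count. The helper `resolve_n_threads` and its arguments `none_means` and `max_threads` are hypothetical names used only for illustration, not an existing scikit-learn function. It assumes negative values follow the joblib convention (-1 means all available, -2 means all but one).

```python
import os


def resolve_n_threads(n_threads=None, none_means=1, max_threads=None):
    """Hypothetical helper: turn a user-facing value into a thread count.

    none_means=1 corresponds to options (1)/(3);
    none_means=-1 corresponds to options (2)/(4).
    """
    if max_threads is None:
        # Assumption: the CPU count is the upper bound when nothing else is given.
        max_threads = os.cpu_count() or 1

    if n_threads is None:
        n_threads = none_means

    if n_threads == 0:
        raise ValueError("n_threads == 0 has no meaning")

    if n_threads < 0:
        # joblib-style convention: -1 -> all threads, -2 -> all but one, ...
        return max(1, max_threads + 1 + n_threads)

    # Positive values are passed through unchanged (not capped in this sketch).
    return n_threads
```

With `max_threads=8`, `resolve_n_threads(None, none_means=1, max_threads=8)` gives 1 while `resolve_n_threads(None, none_means=-1, max_threads=8)` gives 8, which is the whole difference between (1)/(3) and (2)/(4) for a user who never sets the parameter.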

Options (1) and (2) would require improving the documentation of `n_jobs` for each estimator: what the default is, what kind of parallelism is used, what is done in parallel... (see #14228)

@scikit-learn/core-devs, which solution do you prefer?
If it's none of the above, what would you propose instead?
