Shift to using numpy.random.Generator and harmonize signature of seed across all sample functions#662
Conversation
Without having reviewed the code line-by-line, this looks like a proper step towards standardizing seed passing and SPEC 7 compliance. It's nice it can be done without breaking backwards compatibility.
The unit tests fail before the actual testing starts.
The linter caught a couple of small issues with morris, which are now fixed. The other linter issue is that seed/rng is present in the signature of full factorial and finite difference but not used internally by either method. I cannot really fix this; the issue predates this PR.
Thanks @quaquel
SALib methods have to define a seed/rng argument to maintain compatibility with the CLI. Could anyone point me to a draft/interim SPEC 7 document? As far as I can tell it's not yet finalised, but I want to see at least how closely we align with the current discussions.
Did scipy change the interface for seeding? It looks like we had been using the old keyword, and the correct usage is now different. But now all tests that check results within a certain tolerance (for LHS sampling) appear to fail?
Mmm, there were some changes to use rng instead of seed. I am surprised it's breaking without a warning though. But in general, for basic distributions there is the new API, which is recommended because it is better in every way.
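For context, the recommended Generator-based API mentioned here looks like the following minimal sketch. Note that a `Generator` draws from PCG64 rather than the legacy Mersenne Twister, so seeded values differ between the two styles even with the same seed, which is consistent with the tolerance tests breaking:

```python
import numpy as np

# Legacy, global-state style (discouraged): reproducibility depends on
# hidden module-level state.
np.random.seed(42)
legacy = np.random.uniform(0, 1, size=3)

# New Generator-based API (recommended): state is local to the rng object.
rng = np.random.default_rng(42)
modern = rng.uniform(0, 1, size=3)
```

With the same seed, `legacy` and `modern` contain different numbers, because the underlying bit generator changed.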
Otherwise, SPEC-7 is here: https://scientific-python.org/specs/spec-0007/
First, thanks for the commits and minor modifications.
Yes, let's keep this out of scope for this PR. And indeed it is relatively easy to fix. I would suggest checking the validity of the CLI arguments for each method, rather than having a common interface with arguments that for certain methods don't matter.
SPEC-7 is endorsed by several core Scientific Python projects.
So, something like the below should do the trick:

```python
import functools
import warnings


def handle_seed_rng(func):
    @functools.wraps(func)
    def wrapper_handle_seed_rng(*args, **kwargs):
        seed = kwargs.pop("seed", False)
        rng = kwargs.pop("rng", False)
        if seed is not False and rng is not False:
            raise ValueError("please use only rng")
        if seed is not False:
            warnings.warn(
                "the use of seed as a keyword argument is being deprecated, "
                "use rng instead",
                DeprecationWarning,
                stacklevel=2,
            )
            rng = seed
        elif rng is False:
            rng = None
        return func(*args, **kwargs, rng=rng)

    return wrapper_handle_seed_rng
```

It would be trivial to embed this logic in a function decorator and just decorate all the relevant sample functions.
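Applied to a sampling function, the handler above lets both spellings work during a deprecation window. A self-contained sketch follows; the toy `sample` function is illustrative only, not SALib's actual API, and the decorator is repeated so the snippet runs on its own:

```python
import functools
import warnings

import numpy as np


def handle_seed_rng(func):
    """Pop `seed` (deprecated) or `rng` from kwargs and forward a single `rng`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        seed = kwargs.pop("seed", False)
        rng = kwargs.pop("rng", False)
        if seed is not False and rng is not False:
            raise ValueError("please use only rng")
        if seed is not False:
            warnings.warn(
                "the use of seed as a keyword argument is being deprecated, "
                "use rng instead",
                DeprecationWarning,
                stacklevel=2,
            )
            rng = seed
        elif rng is False:
            rng = None
        return func(*args, **kwargs, rng=rng)

    return wrapper


@handle_seed_rng
def sample(n, rng=None):
    """Toy sampler: default_rng accepts an int, None, or an existing Generator."""
    gen = np.random.default_rng(rng)
    return gen.uniform(0, 1, size=n)
```

`sample(3, rng=42)` and `sample(3, seed=42)` then return identical arrays, with the latter emitting a `DeprecationWarning`; passing both keywords raises a `ValueError`.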
Not sure what's happening with the tests. I ran all tests locally before creating the PR, and everything was fine on my end.
In line with the numpy docs, `Generator.spawn` is an equivalent shorthand for the more complicated code I drew on from scipy.
I have addressed a whole bunch of failing tests here that all had to do with the proper handling of the rng keyword argument. I also realized that analyze/morris was still using numpy.random instead of a Generator, so that is also addressed. The remaining failing tests are all assertion errors because the results returned by the updated code do not match what is expected. It is very hard for me to diagnose this, because I have no idea where the numbers against which the new results are compared came from. It might be that in many cases it's just due to the shift in random number generator.
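The analyze/morris change described here amounts to replacing module-level `numpy.random` calls with method calls on a passed-in `Generator`. A minimal sketch, with a hypothetical helper name rather than SALib's actual internals:

```python
import numpy as np


def shuffle_trajectories(trajectories, rng=None):
    """Illustrative helper: draw a random permutation reproducibly."""
    rng = np.random.default_rng(rng)
    # Before: np.random.permutation(...) consumed hidden global state.
    # After: the caller controls reproducibility via the rng argument.
    order = rng.permutation(len(trajectories))
    return [trajectories[i] for i in order]
```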
The code on main still uses the old seeding approach.
I have done some further cleaning and reverted some changes. I now have the following failing tests:

```text
FAILED tests/sample/morris/test_morris.py::TestGroupSampleGeneration::test_generate_x_star - AssertionError
FAILED tests/sample/test_latin.py::TestLatinSample::test_latin_sample_trivial - AssertionError
FAILED tests/sample/test_latin.py::TestLatinSample::test_latin_sample_trivial_group - AssertionError
FAILED tests/sample/test_latin.py::TestLatinSample::test_latin_sample_one_group - AssertionError
FAILED tests/sample/test_latin.py::TestLatinSample::test_latin_sample_two_groups - AssertionError
FAILED tests/test_cli_analyze.py::test_fast - AssertionError: Unexpected FAST results.
FAILED tests/test_cli_analyze.py::test_morris - AssertionError: Results did not match expected values
FAILED tests/test_cli_analyze.py::test_morris_scaled - AssertionError: Results did not match expected values
FAILED tests/test_regression.py::TestMorris::test_regression_morris_vanilla - AssertionError
FAILED tests/test_regression.py::TestMorris::test_regression_morris_scaled - AssertionError
FAILED tests/test_regression.py::TestMorris::test_regression_morris_groups - AssertionError
FAILED tests/test_regression.py::TestMorris::test_regression_morris_groups_brute_optim - AssertionError
FAILED tests/test_regression.py::TestMorris::test_regression_morris_optimal - AssertionError
FAILED tests/test_util.py::test_nonuniform_scale_samples_truncnorm - AssertionError
```

As far as I can tell, having checked each of these tests, these assertions all fail because they test for specific return values. However, since the algorithm for random number generation has changed, the numbers evidently will also change, so it is not surprising that these tests are failing. Please let me know how you want me to continue with this PR.
Thanks @quaquel, I'll need a bit more time to go through those tests. Some have existed since before my time on the project. @willu47, would you have any recollection of which tests are based on known and expected values, and which are based on results that were obtained with the Mersenne Twister RNG? If they are all the latter, then the resolution is simply updating them with values from the new RNG, I guess.
Testing against manually run values as a simple indication that no regression has occurred.
Pandas expects a numpy array instead of lists.
Updated tests: the values used appear to have been extracted manually and used as a simple indicator that nothing has changed unexpectedly. A next step would be to test that values fall within the estimated confidence intervals rather than requiring identical outputs. The test for Python 3.9 is failing due to an issue with Conda. I can confirm that everything works fine locally, however, so I will be merging this shortly.
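The suggested move from exact-value assertions to interval checks could look like this sketch; the helper name and the placeholder numbers are illustrative only, not SALib's actual test code:

```python
import numpy as np


def assert_within_interval(estimate, lower, upper):
    """Fail if any sensitivity estimate falls outside its confidence interval."""
    estimate, lower, upper = map(np.asarray, (estimate, lower, upper))
    assert np.all((estimate >= lower) & (estimate <= upper)), (
        "estimate outside expected interval"
    )


# Placeholder numbers purely for illustration:
assert_within_interval([0.31, 0.44], lower=[0.25, 0.40], upper=[0.35, 0.50])
```

Such a test keeps passing when the RNG algorithm changes, as long as the estimates stay statistically consistent.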
Removing trailing comma
Thanks for finalizing this!
This PR updates all sample functions to accept an integer, `None`, or a `numpy.random.Generator` instance as an argument for the seed. It also ensures that this generator is used throughout the code to ensure reproducibility of the results.

I have kept `seed` as the name of the keyword argument. For SPEC-7 compliance, this should be renamed to `rng`. However, changing this would be backward incompatible. Of course, it would be easy to accept either `seed` or `rng` for the time being and raise a deprecation warning on `seed`. If that is desirable, this is easy to add.

See #634 for some background.
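For what it's worth, the "integer, `None`, or `Generator`" contract maps directly onto `numpy.random.default_rng`, which accepts all three and returns an existing `Generator` unchanged, so one normalization point suffices regardless of the keyword name:

```python
import numpy as np


def normalize_rng(seed):
    """Return a Generator from an int, None, or an existing Generator."""
    return np.random.default_rng(seed)


g = np.random.default_rng(7)
assert normalize_rng(g) is g  # an existing Generator passes through unchanged
assert isinstance(normalize_rng(123), np.random.Generator)
assert isinstance(normalize_rng(None), np.random.Generator)
```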