
Conversation

@MisaOgura (Contributor)

Reference Issues/PRs

In scope of #11000

What does this implement/fix? Explain your changes.

Implements dtype preservation in LocallyLinearEmbedding, with tests covering the various methods and solvers.

Any other comments?

@MisaOgura MisaOgura changed the title Add dtype preservation to LocallyLinearEmbedding ENH Add dtype preservation to LocallyLinearEmbedding Sep 2, 2022
@jeremiedbb (Member)

Thanks for the PR @MisaOgura! Looking quite good. Please add an entry to the what's new in v1.2.rst, and then we'll wait for the CI to be green :)

@MisaOgura (Contributor, Author)

MisaOgura commented Sep 3, 2022

@jeremiedbb

I've looked into the failing CI tests in sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold (apologies, I didn't realise that these tests were skipped when I ran the test suite locally). Below are my initial findings.

It seems that there is some instability in this test:

1. Different random seeds lead to different parameterised cases failing

e.g. running tests locally on M1

# default seed 0

sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-standard] FAILED *
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-hessian] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-modified] FAILED *
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-ltsa] PASSED

# with seed 105

sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-standard] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-hessian] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-modified] FAILED *
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-ltsa] PASSED

2. The same random seed leads to different parameterised cases failing on different platforms

e.g. comparing tests on CI & locally, with the same default seed 0

# on CI

sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-standard] FAILED *
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-hessian] FAILED *
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-modified] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-ltsa] PASSED

# locally

sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-standard] FAILED *
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-hessian] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-modified] FAILED *
sklearn/manifold/tests/test_locally_linear.py::test_lle_manifold[float32-dense-ltsa] PASSED

In both cases above, it is the float32-dense-* cases that are affected by the random seed.

Cf.

A similar phenomenon is flagged in another test in the same file, sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid; in that case the flakiness is attributed to ARPACK's numerical instability.

e.g. running tests locally on M1

# default seed 42

sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid[float32-dense] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid[float32-arpack] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid[float64-dense] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid[float64-arpack] PASSED

# with seed 0

sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid[float32-dense] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid[float32-arpack] FAILED *
sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid[float64-dense] PASSED
sklearn/manifold/tests/test_locally_linear.py::test_lle_simple_grid[float64-arpack] PASSED

@MisaOgura MisaOgura force-pushed the 11000/preserve-dtype-locally-linear-embedding branch from 417a0e2 to cac4a3d on September 3, 2022 12:15
@glemaitre glemaitre self-requested a review September 5, 2022 15:50
@glemaitre (Member)

glemaitre commented Nov 3, 2022

I just resolved the conflict. I will give this PR a go.

@glemaitre (Member) left a comment

I made a pass over the algorithm to check which structures are not yet preserved in np.float32.

Regarding the failure, I think we will only have to define a more lenient tolerance for the 32-bit case.

I would like to first enforce the dtype throughout the algorithm and then focus more specifically on the tests.

@MisaOgura Would you have time to carry on with the changes?
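A minimal sketch of the "more lenient tolerance for the 32-bit case" idea (the helper name and tolerance values are hypothetical, not from the PR): pick the assertion tolerance from the input dtype instead of hard-coding a single value for all cases.

```python
import numpy as np

def reconstruction_tol(dtype):
    # Hypothetical values: float32 accumulates more rounding error,
    # so the reconstruction-error assertion gets a looser tolerance.
    return 1e-2 if dtype == np.float32 else 1e-6

tol32 = reconstruction_tol(np.float32)
tol64 = reconstruction_tol(np.float64)
print(tol32, tol64)
```

A test parameterised over dtypes would then pass `reconstruction_tol(X.dtype)` to its closeness assertion rather than one fixed tolerance.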

"n_jobs": [None, Integral],
}

def _more_tags(self):
@glemaitre (Member):

Could you move this method to the end of the class?

return csr_matrix(
(data.ravel(), ind.ravel(), indptr),
shape=(n_samples, n_samples),
dtype=X.dtype,
@glemaitre (Member):

There is no need to force the dtype here, because the data returned by barycenter_weights preserves the dtype.
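A quick standalone check of that claim (a toy example, not the scikit-learn code): csr_matrix infers its dtype from the data array it is given, so an explicit dtype argument is redundant when the weights already carry the right dtype.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Build a tiny CSR matrix from float32 data without passing dtype;
# the matrix inherits float32 from the data array.
data = np.ones(3, dtype=np.float32)
indices = np.array([0, 1, 2])
indptr = np.array([0, 1, 2, 3])
W = csr_matrix((data, indices, indptr), shape=(3, 3))
print(W.dtype)
```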

evals = np.zeros([N, nev], dtype=X.dtype)

# choose the most efficient way to find the eigenvectors
use_svd = n_neighbors > d_in
@glemaitre (Member):

We should modify the creation of tmp to avoid some later casting:

tmp = np.dot(V.transpose(0, 2, 1), np.ones(n_neighbors, dtype=X.dtype))

@glemaitre (Member):

The same with the initialization of w_reg:

w_reg = np.zeros((N, n_neighbors), dtype=X.dtype)

M = np.zeros((N, N), dtype=X.dtype)
for i in range(N):
s_i = s_range[i]

@glemaitre (Member):

The same for h in the loop below:

h = np.full(s_i, alpha_i, dtype=X.dtype) - np.dot(Vi.T, np.ones(n_neighbors, dtype=X.dtype))

M = np.zeros((N, N), dtype=X.dtype)

use_svd = n_neighbors > d_in

@glemaitre (Member):

Below, we should initialize Gi:

Gi = np.zeros((n_neighbors, n_components + 1), dtype=X.dtype)
