Support for float32 in KDTree and BallTree #15474

@LalliAcqua

Description

Relates to #7059 and #11000.

The conversion from float32 to float64 in Isomap can be traced to the Cython class BinaryTree.

Steps/Code to Reproduce

Debug example:

from sklearn.datasets import load_digits
from sklearn.manifold import Isomap
X, _ = load_digits(return_X_y=True)
X = X.astype('float32')
embedding = Isomap(n_components=2)
X_transformed = embedding.fit_transform(X[:100])

The change of data type occurs at line 157 of isomap, in the call to kneighbors_graph:

kng = kneighbors_graph(self.nbrs_, self.n_neighbors,
                       metric=self.metric, p=self.p,
                       metric_params=self.metric_params,
                       mode='distance', n_jobs=self.n_jobs)

which leads to class KNeighborsMixin in _base.py, method kneighbors (line 531):

chunked_results = Parallel(n_jobs, **parallel_kwargs)(
    delayed_query(
        self._tree, X[s], n_neighbors, return_distance)
    for s in gen_even_slices(X.shape[0], n_jobs)
)

Here, X has dtype float32, while chunked_results is a list of arrays with dtype float64.

The problem arises in the method query of class BinaryTree in _binary_tree.pxi (line 1271). The data are cast to either DTYPE or DTYPE_t, which are defined as np.float64.
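
To illustrate the effect of such a cast, here is a minimal sketch in plain NumPy. It is not the actual Cython code: `query_like` is a hypothetical stand-in for a tree query, and the distance-to-origin computation is only a placeholder for the real neighbor search. The only point is that coercing the input to a fixed `DTYPE` makes the output float64 regardless of the input dtype.

```python
import numpy as np

# Simplified stand-in for the fixed dtype described above (assumption).
DTYPE = np.float64


def query_like(X):
    """Hypothetical stand-in for a tree query that coerces its input to DTYPE."""
    # float32 input is copied and upcast to float64 here.
    X = np.asarray(X, dtype=DTYPE)
    # Placeholder computation: Euclidean distance of each row to the origin.
    return np.sqrt((X ** 2).sum(axis=1))


X32 = np.ones((4, 3), dtype=np.float32)
dist = query_like(X32)
print(X32.dtype, dist.dtype)
```

The input array itself is left untouched; the upcast happens on an internal copy, which is why the dtype change only becomes visible in the returned distances.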

The class BinaryTree should be changed to allow computations in both float32 and float64.
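
One possible shape for this, sketched in plain NumPy under the assumption that the working dtype is resolved from the input instead of being fixed: `resolve_dtype` and `query_preserving` are hypothetical names used only for illustration, and the distance-to-origin computation again stands in for the real query. An actual fix would need to be done at the Cython level.

```python
import numpy as np

# Floating-point dtypes the sketch keeps as-is (assumption for illustration).
SUPPORTED = (np.float32, np.float64)


def resolve_dtype(X):
    """Return the input's dtype if it is float32/float64, else fall back to float64."""
    X = np.asarray(X)
    return X.dtype.type if X.dtype.type in SUPPORTED else np.float64


def query_preserving(X):
    """Hypothetical stand-in for a query that computes in the input's own dtype."""
    dtype = resolve_dtype(X)
    X = np.asarray(X, dtype=dtype)  # no upcast when X is already float32
    # Placeholder computation: Euclidean distance of each row to the origin.
    return np.sqrt((X ** 2).sum(axis=1))


out32 = query_preserving(np.ones((4, 3), dtype=np.float32))
out64 = query_preserving(np.ones((4, 3), dtype=np.float64))
out_int = query_preserving(np.ones((4, 3), dtype=np.int64))
print(out32.dtype, out64.dtype, out_int.dtype)
```

With this kind of dispatch, float32 input yields float32 output, while integer or other inputs still fall back to float64.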
