documentation of k-means param n_init isn't worded nicely for people unfamiliar with the implementation #25539

@magical-inference

Describe the issue linked to the documentation

Currently the doc says:

When n_init='auto', the number of runs will be 10 if using init='random', and 1 if using init='kmeans++'.

in https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html,

and

When n_init='auto', the number of runs will be 10 if using
init='random', and 1 if using init='kmeans++'.

in https://github.com/scikit-learn/scikit-learn/blob/f7e5f412ddba4f30c871749515bbc0393378aa15/sklearn/cluster/_kmeans.py.

Careful readers will make sense of it, but I'm sure we can do better for hasty readers and people unfamiliar with the implementation, because n_init and init look almost identical and are easy to mix up within a single sentence.

Suggest a potential alternative/fix

Suggestion:

When n_init='auto', the number of runs depends on the value of init:
10 if using init='random', 1 if using init='kmeans++'.
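
To make the rule concrete, here is a minimal sketch of the mapping the docstring describes. The helper `resolve_n_init` is hypothetical and is not part of scikit-learn; it only restates the two cases quoted above. It assumes the string that scikit-learn actually accepts for the second case, which is spelled `'k-means++'`, and it deliberately rejects any other `init` value because the quoted text does not cover them.

```python
def resolve_n_init(n_init, init):
    """Hypothetical helper: restates the documented rule, not scikit-learn code."""
    # An explicit integer is used as-is; only 'auto' depends on init.
    if n_init != "auto":
        return int(n_init)
    if init == "random":
        return 10  # random starts vary a lot, so run several times
    if init == "k-means++":
        return 1   # a single run with the smarter seeding
    # The quoted docstring only describes the two string cases above.
    raise ValueError(f"init={init!r} is not covered by this sketch")


for init in ("random", "k-means++"):
    print(f"n_init='auto', init={init!r} -> {resolve_n_init('auto', init)} run(s)")
```

Written this way, it is clear that `init` is the input and the number of runs is the output, which is the reading the suggested wording is trying to make obvious.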
