Should we make most functions private? #14897

@NicolasHug

We have a lot of functions referenced in our API reference.

For our own sanity, I would like to propose making private any function that either:

  • has an obvious estimator equivalent, e.g. k_means or fastica. There should be one obvious way to do things. (Some of them might be subject to debate, like scale which is arguably convenient.)
  • isn't immediately useful for users. For example, I don't know much about OPTICS but I feel like cluster_optics_xi should be private (it's not even mentioned in the user guide).

There are plenty of other functions, and even more that aren't in the API ref.


Motivation:
k_means and fastica are making my life difficult right now as I'm working on #13603. KMeans.fit() does nothing but call k_means, which does all the work.

I cannot change anything in the KMeans class now (e.g. add an attribute), because I would need to change the signature / interface / behaviour of k_means, which is a public function.

(I think that, in general, having the estimator call a public helper is bad design (it should be the other way around), but that's another issue.)
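To make the "other way around" design concrete, here is a minimal sketch. It is a toy 1-D clustering loop, not scikit-learn's actual implementation, and the names `ToyKMeans` and `toy_k_means` are hypothetical. The estimator owns the algorithm and the public function is a thin wrapper around it, so adding an attribute such as `n_iter_` does not touch the function's signature or return value.

```python
import random


class ToyKMeans:
    """Hypothetical 1-D k-means estimator; the class owns the algorithm."""

    def __init__(self, n_clusters=2, max_iter=10, seed=0):
        self.n_clusters = n_clusters
        self.max_iter = max_iter
        self.seed = seed

    def _assign(self, X, centers):
        # Label each point with the index of its nearest center.
        return [
            min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            for x in X
        ]

    def fit(self, X):
        rng = random.Random(self.seed)
        centers = rng.sample(list(X), self.n_clusters)
        n_iter = 0
        for n_iter in range(1, self.max_iter + 1):
            labels = self._assign(X, centers)
            new_centers = []
            for k in range(self.n_clusters):
                members = [x for x, lab in zip(X, labels) if lab == k]
                # Keep the old center if a cluster ends up empty.
                new_centers.append(
                    sum(members) / len(members) if members else centers[k]
                )
            if new_centers == centers:
                break
            centers = new_centers
        self.cluster_centers_ = centers
        self.labels_ = self._assign(X, centers)
        # A new attribute can be added here without changing any
        # public function signature or return value.
        self.n_iter_ = n_iter
        return self


def toy_k_means(X, n_clusters=2, max_iter=10, seed=0):
    """Hypothetical function API: a thin wrapper around the estimator."""
    est = ToyKMeans(n_clusters=n_clusters, max_iter=max_iter, seed=seed).fit(X)
    # The return value stays stable even if the estimator gains attributes.
    return est.cluster_centers_, est.labels_
```

With this layout, `toy_k_means([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])` keeps returning a `(centers, labels)` pair, while `ToyKMeans` is free to grow new fitted attributes.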
