SLEP006 - metadata handling: fit, transform, fit_transform #22987

@adrinjalali

The question of how to handle metadata routing in fit_transform has come up a few times and we don't have a clear solution to it yet. This is the latest conversation: https://github.com/scikit-learn/scikit-learn/pull/22083/files#r811448325

A nice summary by @lorentzenchr:

  1. Handle fit_transform as a separate method with its own set_fit_transform_requests.
  2. Merge the requests of fit and transform (error if inconsistent).
  3. Distinguish between a fit_transform that only calls .fit(X).transform(X) and the rest (where it does something meaningful).

Option 1 is the easiest to implement, but it gets confusing when users put transformers in a pipeline. The user wouldn't necessarily know which meta-estimator does fit().transform() and which meta-estimator would do .fit_transform().
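To make the confusion with Option 1 concrete, here is a toy sketch in plain Python. Nothing in it is scikit-learn API: `RecordingTransformer`, its `requests` attribute, `route`, `meta_a` and `meta_b` are all hypothetical names, standing in for a transformer with per-method requests and for two meta-estimators that differ only in how they call it.

```python
class RecordingTransformer:
    """Toy transformer that records which metadata each method received."""

    # Hypothetical per-method requests, as Option 1 would require.
    # The user configured fit but did not think about fit_transform.
    requests = {
        "fit": {"sample_weight"},
        "transform": set(),
        "fit_transform": set(),
    }

    def __init__(self):
        self.seen = []

    def fit(self, X, **metadata):
        self.seen.append(("fit", sorted(metadata)))
        return self

    def transform(self, X, **metadata):
        self.seen.append(("transform", sorted(metadata)))
        return X

    def fit_transform(self, X, **metadata):
        self.seen.append(("fit_transform", sorted(metadata)))
        return X


def route(transformer, method, metadata):
    """Keep only the metadata that `method` requested (hypothetical helper)."""
    wanted = transformer.requests[method]
    return {key: value for key, value in metadata.items() if key in wanted}


def meta_a(transformer, X, **metadata):
    """Hypothetical meta-estimator that calls fit() and then transform()."""
    transformer.fit(X, **route(transformer, "fit", metadata))
    return transformer.transform(X, **route(transformer, "transform", metadata))


def meta_b(transformer, X, **metadata):
    """Hypothetical meta-estimator that calls fit_transform() directly."""
    return transformer.fit_transform(
        X, **route(transformer, "fit_transform", metadata)
    )


a = RecordingTransformer()
meta_a(a, [[1.0]], sample_weight=[1.0])

b = RecordingTransformer()
meta_b(b, [[1.0]], sample_weight=[1.0])
```

With the same transformer configuration, `meta_a` passes `sample_weight` to `fit`, while `meta_b` silently drops it because the separate fit_transform requests were never set.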

Option 2 is doable by adding machinery in MetadataRequest and MetadataRouter objects. We can have a common test which makes sure fit_transform always accepts fit_requests | transform_requests.
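As a rough illustration of what "merge, error if inconsistent" could mean, here is a minimal sketch that models each method's requests as a plain dict from metadata name to request value. The `merge_requests` function and the dict representation are assumptions made for this example; they are not the MetadataRequest or MetadataRouter API.

```python
def merge_requests(fit_requests, transform_requests):
    """Union of two {metadata_name: request} mappings.

    Raises ValueError if both mappings mention the same metadata
    with different request values.
    """
    merged = dict(fit_requests)
    for name, request in transform_requests.items():
        if name in merged and merged[name] != request:
            raise ValueError(
                f"Inconsistent requests for {name!r}: "
                f"fit={merged[name]!r}, transform={request!r}"
            )
        merged[name] = request
    return merged


# Consistent requests merge into their union.
merged = merge_requests(
    {"sample_weight": True},
    {"sample_weight": True, "groups": False},
)

# Conflicting requests for the same metadata are rejected.
try:
    merge_requests({"sample_weight": True}, {"sample_weight": False})
    conflict_detected = False
except ValueError:
    conflict_detected = True
```

A common test along the lines described above could then check that fit_transform accepts every key of the merged mapping.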

Option 3 requires changes in base.py and would probably lead to much more discussion than the other two options.
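One possible way to tell the two cases in Option 3 apart is to check whether a class overrides the default fit_transform. The sketch below uses a stand-in mixin, `TransformerMixinSketch`, instead of the real one in base.py, and `uses_default_fit_transform` is a hypothetical helper, so treat it as an assumption about how such a check might look rather than a proposal for the actual implementation.

```python
class TransformerMixinSketch:
    """Stand-in for a mixin whose fit_transform is just fit().transform()."""

    def fit_transform(self, X, y=None, **fit_params):
        return self.fit(X, y, **fit_params).transform(X)


def uses_default_fit_transform(estimator):
    """True if the estimator's class did not override fit_transform."""
    return type(estimator).fit_transform is TransformerMixinSketch.fit_transform


class PlainTransformer(TransformerMixinSketch):
    """Inherits the default fit_transform."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class CustomTransformer(TransformerMixinSketch):
    """Overrides fit_transform with its own logic."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X

    def fit_transform(self, X, y=None, **fit_params):
        # Imagine a more efficient or otherwise different joint computation.
        return [row[::-1] for row in X]


plain_is_default = uses_default_fit_transform(PlainTransformer())
custom_is_default = uses_default_fit_transform(CustomTransformer())
```

A check like this only detects an override; it cannot tell whether the overriding method does something meaningfully different, which is part of why this option invites more discussion.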

Also cc @jnothman @agramfort
