RFC Suggesting HistGradientBoosting in RandomForest and GradientBoosting pages #26220

@adrinjalali

Description

Right now we have this in the GradientBoosting API page:

> `sklearn.ensemble.HistGradientBoostingClassifier` is a much faster variant of this algorithm for intermediate datasets (`n_samples >= 10_000`).

After LogisticRegression and LinearRegression, RandomForest is the third most visited page among our models, and I think we'd agree that in most cases users would probably be better off using HGBT models instead. Right now users compare random forests with xgboost, when they could be using HGBT.
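For illustration only (not from the issue itself), here is a minimal sketch, assuming a synthetic dataset, of how close the two estimators' APIs already are; swapping a random forest for HGBT is essentially a one-line change:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, HistGradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic "intermediate-sized" dataset, roughly the regime the docs mention.
X, y = make_classification(n_samples=20_000, n_features=20, random_state=0)

# Typical random forest usage many users start from.
rf = RandomForestClassifier(n_estimators=100, random_state=0)

# Histogram-based gradient boosting, usually much faster to fit at this size.
hgbt = HistGradientBoostingClassifier(random_state=0)

for name, model in [("RandomForest", rf), ("HistGradientBoosting", hgbt)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```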

So my question is: should we add a message about HGBT on the forest/tree pages, and make the statement a bit bolder than just "it's probably faster"?

@amueller has done quite a bit of analysis on this; maybe we could link to that?
