@NicolasHug

Hey Thomas,

This PR passes the known categories from the Hist-GBDT estimator to the BinMapper, instead of letting the BinMapper figure it out.

I think this is preferable when early stopping uses a train/val split: the BinMapper is only fitted on the training part, but we still want it to know all the existing categories, including those that appear only in the validation set. This is the same reason we allow users to pass a "categories" parameter to e.g. the OneHotEncoder.

One benefit is that we don't need _find_bin_categories anymore. There is also no longer any risk of calling bin_mapper.transform on unknown categories.
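
To make the train/val point concrete, here is a rough toy sketch. ToyCategoryMapper and its known_categories argument are hypothetical names made up for illustration; this is not the actual BinMapper code or its API, just the general idea of passing categories in versus inferring them from the training split.

```python
# Toy illustration only: ToyCategoryMapper is hypothetical,
# it is NOT scikit-learn's BinMapper and does not mirror its API.
class ToyCategoryMapper:
    def __init__(self, known_categories=None):
        # If given, these categories are used as-is instead of being
        # inferred from the data passed to fit().
        self.known_categories = known_categories

    def fit(self, values):
        if self.known_categories is not None:
            cats = self.known_categories
        else:
            cats = values
        # Map each distinct category to a bin index, in sorted order.
        self.bin_of_ = {c: i for i, c in enumerate(sorted(set(cats)))}
        return self

    def transform(self, values):
        # Raises KeyError for a category that was never registered.
        return [self.bin_of_[v] for v in values]


X = [0, 1, 1, 2, 0, 3]
# Category 3 appears only in the validation split.
X_train, X_val = X[:4], X[4:]

# Categories inferred from the training split: category 3 is unknown.
inferred = ToyCategoryMapper().fit(X_train)
try:
    inferred.transform(X_val)
    inferred_ok = True
except KeyError:
    inferred_ok = False

# Categories passed in by the caller: the validation split transforms fine.
explicit = ToyCategoryMapper(known_categories=sorted(set(X))).fit(X_train)
val_bins = explicit.transform(X_val)
```

With the categories passed in up front, fitting on the training split alone no longer decides which categories the mapper knows about.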

@NicolasHug
Author

Wow I'm jealous of this neat CI you have

@thomasjpfan thomasjpfan merged commit def2b69 into thomasjpfan:cat_hgbt_256 Sep 21, 2020
@thomasjpfan
Owner

It has been pretty useful to have a CI on my fork as well. :)
