@rcurtin rcurtin commented May 15, 2025

@Patschkowski pointed out in #3909 that using KFoldCV with RandomForest<> did not seem to work. I thought it would be simple to figure out why, but I actually had to dig in fairly deep. As it turned out, the fix was simple:

  • KFoldCV uses the class MetaInfoExtractor, which determines various information about a given machine learning algorithm, like what Train() variants it supports and so forth.

  • To determine which variants are available, MetaInfoExtractor uses the HAS_METHOD_FORM() macro in sfinae_utility.hpp, which checks whether a method either matches a given form exactly or matches it with some number of additional extra arguments.

  • The number of extra arguments is limited to 7... but RandomForest::Train() has 8 extra hyperparameters after the training data and labels! Therefore the check fails and MetaInfoExtractor does not work for RandomForest<>.

  • The solution is to increase the maximum number of allowed extra arguments to 10.

  • I then added tests to cv_test.cpp to ensure that RandomForest<> works correctly both with MetaInfoExtractor and with KFoldCV.

  • I also noticed that the labels and weights types were not templatized for RandomForest<>, so I also generalized those.
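
To make the failure mode above concrete, here is a minimal, self-contained sketch of a capped method-form check. It is not the actual HAS_METHOD_FORM() implementation from sfinae_utility.hpp: the names `Data`, `Labels`, `CountExtraArgs`, `HasTrain`, `SmallModel`, and `ForestLikeModel` are all hypothetical, and the cap is modelled here as a plain count comparison purely for illustration. It assumes C++17.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for the training data and label types.
struct Data {};
struct Labels {};

// Counts the extra parameters that follow (data, labels) in a Train() method.
// Only the type of the member pointer matters; the argument itself is unused.
template<typename Class, typename... Extra>
constexpr std::size_t CountExtraArgs(
    void (Class::*)(const Data&, const Labels&, Extra...))
{
  return sizeof...(Extra);
}

// Primary template: no Train(data, labels, ...) was found.
template<typename Class, std::size_t MaxExtra, typename = void>
struct HasTrain : std::false_type {};

// Chosen via SFINAE when Train(data, labels, extra...) exists; it is true
// only if the number of extra arguments does not exceed MaxExtra.
template<typename Class, std::size_t MaxExtra>
struct HasTrain<Class, MaxExtra,
    std::void_t<decltype(CountExtraArgs(&Class::Train))>>
  : std::bool_constant<(CountExtraArgs(&Class::Train) <= MaxExtra)> {};

// Toy model with 2 extra hyperparameters.
struct SmallModel
{
  void Train(const Data&, const Labels&, std::size_t, double) {}
};

// Toy model with 8 extra hyperparameters (types chosen arbitrarily).
struct ForestLikeModel
{
  void Train(const Data&, const Labels&, std::size_t, std::size_t,
             std::size_t, double, std::size_t, std::size_t, bool, int) {}
};

// With a cap of 7, the 8-argument form goes undetected...
static_assert(HasTrain<SmallModel, 7>::value, "2 extra args fit under 7");
static_assert(!HasTrain<ForestLikeModel, 7>::value, "8 extra args exceed 7");
// ...and raising the cap to 10 makes it detectable again.
static_assert(HasTrain<ForestLikeModel, 10>::value, "8 extra args fit under 10");
```

The block has no `main()`; compiling it (for example with `-std=c++17 -fsyntax-only`) is enough to evaluate the `static_assert`s. The point of the sketch is only that a method with more extra arguments than the cap is reported as absent, even though it exists.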

@rcurtin rcurtin mentioned this pull request May 15, 2025
@shrit shrit merged commit 5ec5a5d into mlpack:master May 22, 2025
14 of 15 checks passed
@rcurtin rcurtin mentioned this pull request May 22, 2025
@rcurtin rcurtin deleted the cv-max-extra-args branch May 22, 2025 14:16