Skip to content

Conversation

@rcurtin
Copy link
Member

@rcurtin rcurtin commented Nov 24, 2023

This is a fix for #3556, in the Python bindings. The generated documentation would print a line like

X_test, y_test, X_train, y_train = preprocess_split(input_=X, input_labels=y)

but preprocess_split() returns a Python dict with several members. Instead, this is what's necessary to get the training/test datasets:

d = preprocessd_split(input_=X, input_labels=y)
X_train = d['training']
y_train = d['training_labels']
X_test = d['test']
y_test = d['test_labels']

These changes update the function that prints this documentation to use the second version. I tested that the generated code runs successfully.

Copy link
Contributor

@NippunSharma NippunSharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@NippunSharma
Copy link
Contributor

As a side note: should we also look into changing what preprocess_split returns and change that to individual matrices. That seems to be a bit more "pythonic" way of doing it?

Copy link

@mlpack-bot mlpack-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second approval provided automatically after 24 hours. 👍

@rcurtin
Copy link
Member Author

rcurtin commented Nov 27, 2023

@NippunSharma agreed, it would be great if we could make the preprocess_split() API cleaner, but I am not sure of a nice automated way to do that with our binding generation system. If you have ideas I am all ears 😄

@rcurtin rcurtin merged commit bbde929 into mlpack:master Nov 27, 2023
@rcurtin rcurtin deleted the py-split-doc-fix branch November 27, 2023 23:58
This was referenced May 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants