Conversation

@verakye (Contributor) commented Mar 12, 2022

Reference Issues/PRs

targets #22406

What does this implement/fix? Explain your changes.

Adapts plot_kernel_ridge_regression.py to follow the notebook style and the documentation style.

Any other comments?

#pariswimlds

@ogrisel (Member) left a comment

Thanks for the PR! It's already a net improvement on its own, but if you are willing, it would be great to improve this example further by pushing new commits to this same PR branch (please consider the suggestions below).

The main point would be to move the last two paragraphs of information from the beginning of the example so that they analyze the cell outputs and figures right after each relevant code cell, before the header of the next section.

Note that the content of those paragraphs is no longer accurate, most likely because the scikit-learn code has evolved and some conclusions no longer hold. For instance:

However, prediction of 100000 target values is more than three times faster with SVR
since it has learned a sparse model using only approx. 1/3 of the 100 training
datapoints as support vectors.

Should probably be rewritten to something like:

However, prediction of 100000 target values could in theory be approximately
three times faster with SVR, since it has learned a sparse model using only
approximately 1/3 of the training datapoints as support vectors. In practice,
however, this is not necessarily the case because of implementation details
in the way the kernel function is computed for each model, which can make the
KRR model as fast or even faster despite computing more arithmetic operations.
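
For what it's worth, here is a minimal sketch of how one could check this behavior on toy data (the dataset, hyper-parameter values, and sizes are illustrative, not the example's exact ones; timings are machine-dependent):

import time

import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = 5 * rng.rand(1000, 1)
y = np.sin(X).ravel() + 0.1 * rng.randn(1000)
X_plot = np.linspace(0, 5, 100000)[:, None]

svr = SVR(kernel="rbf", C=10, gamma=0.1).fit(X, y)
kr = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.1).fit(X, y)

# Fraction of the training points SVR kept as support vectors
print(f"SVR support vectors: {len(svr.support_)} / {X.shape[0]}")

# Compare sparse (SVR) vs dense (KRR) prediction times
for name, model in [("SVR", svr), ("KRR", kr)]:
    t0 = time.time()
    model.predict(X_plot)
    print(f"{name} prediction of {X_plot.shape[0]} values: {time.time() - t0:.3f} s")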

For reference, the rendering of the current state of the example is here:

https://181259-843222-gh.circle-artifacts.com/0/doc/auto_examples/miscellaneous/plot_kernel_ridge_regression.html (from the link named "doc artifact" in the continuous integration report of the last commit).

# Visualize learning curves

from sklearn.model_selection import learning_curve

@ogrisel (Member):

I think the learning curves figure below would benefit from displaying both the train scores (train_scores_kr / train_scores_svr) and the test scores (already displayed).

To be consistent with the previous figure on training and prediction times, it would be nice to use dashed lines for the test scores ("o--") and use the solid line style ("o-") for the training scores.

@ogrisel (Member):

We could also reduce the number of steps from 10 to 7:

train_sizes=np.linspace(0.1, 1, 7),

while increasing the training set size a bit, e.g. from X[:100]/y[:100] to X[:1000]/y[:1000]. Using the full training set would be even more informative, but I am afraid it would increase the execution time of this example too much.
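
Putting both suggestions together, a self-contained sketch of what the updated learning curves could look like (toy data and hyper-parameter values are illustrative, not the example's exact ones):

import matplotlib.pyplot as plt
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import learning_curve
from sklearn.svm import SVR

rng = np.random.RandomState(42)
X = 5 * rng.rand(1000, 1)
y = np.sin(X).ravel() + 0.1 * rng.randn(1000)

svr = SVR(kernel="rbf", C=10, gamma=0.1)
kr = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.1)

# 7 training set sizes from 10% to 100% of the 1000 samples
common = dict(train_sizes=np.linspace(0.1, 1, 7), scoring="neg_mean_squared_error")
sizes, train_scores_svr, test_scores_svr = learning_curve(svr, X, y, **common)
sizes, train_scores_kr, test_scores_kr = learning_curve(kr, X, y, **common)

# Solid lines ("o-") for train scores, dashed lines ("o--") for test
# scores, matching the convention of the training/prediction time figure.
plt.plot(sizes, -train_scores_svr.mean(axis=1), "o-", color="r", label="SVR train")
plt.plot(sizes, -test_scores_svr.mean(axis=1), "o--", color="r", label="SVR test")
plt.plot(sizes, -train_scores_kr.mean(axis=1), "o-", color="g", label="KRR train")
plt.plot(sizes, -test_scores_kr.mean(axis=1), "o--", color="g", label="KRR test")
plt.xlabel("Train size")
plt.ylabel("Mean squared error")
plt.legend(loc="best")
plt.show()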


t0 = time.time()  # start the wall-clock timer
svr.fit(X[:train_size], y[:train_size])  # fit SVR on the first train_size samples
svr_fit = time.time() - t0  # elapsed fit time in seconds
@ogrisel (Member):

In the lines below, it would be interesting to also display the results of the hyper-parameter search for both models, for instance:

print(f"Best SVR with params: {svr.best_params_} and R2 score: {svr.best_score_:.3f}")
print(f"Best KRR with params: {kr.best_params_} and R2 score: {kr.best_score_:.3f}")

Then we could reuse those tuned hyper-parameters for the last figure instead of using arbitrary values for alpha, C and gamma.
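
A sketch of how that could look, with the tuned values reused afterwards (toy data; the grid values are illustrative):

import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = 5 * rng.rand(1000, 1)
y = np.sin(X).ravel() + 0.1 * rng.randn(1000)

# Tune C/gamma for SVR and alpha/gamma for KRR with a small grid search
svr = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid={"C": [1e0, 1e1, 1e2, 1e3], "gamma": np.logspace(-2, 2, 5)},
)
kr = GridSearchCV(
    KernelRidge(kernel="rbf"),
    param_grid={"alpha": [1e0, 0.1, 1e-2, 1e-3], "gamma": np.logspace(-2, 2, 5)},
)
svr.fit(X, y)
kr.fit(X, y)

print(f"Best SVR with params: {svr.best_params_} and R2 score: {svr.best_score_:.3f}")
print(f"Best KRR with params: {kr.best_params_} and R2 score: {kr.best_score_:.3f}")

# Reuse the tuned values for the final figure instead of arbitrary ones
svr_tuned = SVR(kernel="rbf", **svr.best_params_)
kr_tuned = KernelRidge(kernel="rbf", **kr.best_params_)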

@ogrisel (Member) commented Mar 13, 2022

For information, I was curious and tried to update the learning curves as I suggested above; here is the result. Each model uses hyper-parameters tuned for the largest dataset size (1000 samples):

[Figure: updated learning curves showing train and test scores for both SVR and KRR]

I really don't know what to make of this plot: in particular, I do not understand why the training error would be significantly larger than the test error for the smallest training set sizes. I have tried several times with different cv= values and this always seems to happen. This is very confusing.

@ogrisel (Member) commented Mar 13, 2022

For reference, here are the results of the tuning I observed on my machine:

SVR complexity and bandwidth selected and model fitted in 7.026 s
Best SVR with params: {'C': 1000.0, 'gamma': 0.01} and R2 score: 0.77
KRR complexity and bandwidth selected and model fitted in 3.012 s
Best KRR with params: {'alpha': 0.1, 'gamma': 0.1} and R2 score: 0.78

@lesteve added the Quick Review label Mar 14, 2022
@lesteve mentioned this pull request Mar 14, 2022
@lesteve (Member) commented Mar 30, 2022

I pushed some tweaks to follow @ogrisel's suggestions. Some of this is still WIP; I'll try to follow up on it soon.

@lesteve (Member) commented Apr 8, 2022

I am going to restrict this PR to the notebook-style conversion and easy improvements. I will create an issue to track the problem mentioned in #22804 (comment).

lesteve added 2 commits April 8, 2022 17:47
Also use dashed line for test score for consistency
@lesteve (Member) commented Apr 8, 2022

Note that this is not true anymore:

SVR is faster than KRR for all sizes of the training set because of the learned sparse solution.

I am not quite sure what to do about it ...
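
For anyone who wants to double-check this on their own machine, a minimal timing sketch (toy data, arbitrary hyper-parameters; results vary across machines and scikit-learn versions):

import time

import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = 5 * rng.rand(2000, 1)
y = np.sin(X).ravel() + 0.1 * rng.randn(2000)

# Compare fit times of both kernel models on the same data
for name, model in [
    ("SVR", SVR(kernel="rbf", C=10, gamma=0.1)),
    ("KRR", KernelRidge(kernel="rbf", alpha=0.1, gamma=0.1)),
]:
    t0 = time.time()
    model.fit(X, y)
    print(f"{name} fitted in {time.time() - t0:.3f} s")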

@cmarmo added the Needs Decision label and removed the Quick Review label Apr 14, 2022
@lesteve added and removed the Quick Review label Apr 20, 2022
@lesteve (Member) commented Apr 29, 2022

@lesteve added the Quick Review label Apr 29, 2022
@glemaitre merged commit 1393a63 into scikit-learn:main Apr 29, 2022
@glemaitre (Member) commented:

Merging. We should revisit the example; @lesteve already opened an issue for this purpose. In the meantime, we have a better rendering and a slightly more appropriate discussion.

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request May 19, 2022
glemaitre pushed a commit that referenced this pull request May 19, 2022