TST replace Boston in test_gradient_boosting.py #16937
glemaitre merged 17 commits into scikit-learn:master
Conversation
ping @glemaitre 😅
@ogrisel I was looking at the LS loss in GBDT; it seems that we divide by the sum of the weights, meaning that weights of all 1s or all 2s will lead to the same loss.
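A minimal sketch (not scikit-learn's internal loss code) of the normalization described above: because the weighted squared error is divided by the sum of the weights, rescaling all weights by a constant leaves the loss value unchanged, so all-1s and all-2s weights are indistinguishable. `weighted_ls_loss` is a hypothetical helper written only for this illustration.

```python
import numpy as np

def weighted_ls_loss(y_true, y_pred, sample_weight):
    # Hypothetical illustration of a least-squares loss that normalizes
    # by the total sample weight, as discussed in this thread.
    sample_weight = np.asarray(sample_weight, dtype=float)
    return np.sum(sample_weight * (y_true - y_pred) ** 2) / np.sum(sample_weight)

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

loss_ones = weighted_ls_loss(y_true, y_pred, np.ones(3))
loss_twos = weighted_ls_loss(y_true, y_pred, 2 * np.ones(3))

# Scaling every weight by 2 cancels out in numerator and denominator.
assert loss_ones == loss_twos
```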
Ah, actually the issue is with the Huber and LAD losses.
After digging in: FYI, the initial raw predictions given by the LAD and Huber losses divide by the sum of the sample weights as well, so it should work as expected.
I made a small push by merging master.
|
      if last_y_pred is not None:
    -     assert_array_almost_equal(last_y_pred, y_pred)
    +     assert_array_almost_equal(last_y_pred, y_pred, decimal=0)
Before merging, we need to add a FIXME to explain the issue and link to the relevant PR/issue, just so we don't lose track.
    - assert_array_almost_equal(last_y_pred, y_pred, decimal=0)
    + # FIXME: `decimal=0` is very permissive. This is due to the fact that
    + # GBRT with and without `sample_weight` do not use the same implementation
    + # of the median during the initialization with the `DummyRegressor`.
    + # In the future, we should make sure that both implementations are the same.
    + # Refer to https://github.com/scikit-learn/scikit-learn/pull/17377 for some detailed
    + # explanations.
    + assert_array_almost_equal(last_y_pred, y_pred, decimal=0)
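The discrepancy this FIXME refers to can be reproduced outside GBRT: `np.median` interpolates between the two middle values of an even-length array, while a percentile routine that returns the smallest sample whose cumulative weight reaches 50% of the total does not interpolate. The `lower_weighted_median` helper below is hypothetical, written only to illustrate the mismatch; it is not the scikit-learn implementation.

```python
import numpy as np

def lower_weighted_median(values, weights):
    # Hypothetical illustration: return the smallest value at which the
    # cumulative weight reaches half of the total weight (no interpolation).
    order = np.argsort(values)
    values = np.asarray(values, dtype=float)[order]
    weights = np.asarray(weights, dtype=float)[order]
    cum = np.cumsum(weights)
    return values[np.searchsorted(cum, 0.5 * cum[-1])]

data = np.array([1.0, 2.0, 3.0, 4.0])

# Unweighted median interpolates the two middle values: (2 + 3) / 2
print(np.median(data))                                   # 2.5
# The weighted variant with unit weights picks an actual sample value
print(lower_weighted_median(data, np.ones_like(data)))   # 2.0
```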
This is weird. I am not sure why the bitness of Python would influence the results of this test. But maybe we should focus on fixing #17377 first and then see what is the scale of the discrepancy that remains when fitting those models with
The 4 failing tests with
AFAICT the 4 tests are failing at
Not sure why the difference in results. Trying
If it is only on 32 bits, I would not be surprised that it is linked to the underlying trees: #8853 |
Sorry, I misread. This is happening on both architectures; disregard my last comment.
    - ones = np.ones(len(boston.target))
    + ones = np.ones(len(y_reg))
      last_y_pred = None
      for sample_weight in None, ones, 2 * ones:
    - for sample_weight in None, ones, 2 * ones:
    + for sample_weight in [None, ones, 2 * ones]:
Is it always failing for the
    X_reg, y_reg = make_regression(
        n_samples=500, n_features=10, n_informative=8, noise=10, random_state=7
    )
    y_reg = StandardScaler().fit_transform(y_reg.reshape((-1, 1)))
You can use the `scale` function: `from sklearn.preprocessing import scale`.
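For reference, a sketch of the equivalence: both `scale` and a freshly fitted `StandardScaler` center to zero mean and rescale to unit variance, so either spelling works for the one-off transform in this test.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, scale

rng = np.random.RandomState(0)
y = rng.normal(loc=5.0, scale=3.0, size=100)

# Form used in the diff: fit a scaler on the reshaped 1-D target.
y_scaler = StandardScaler().fit_transform(y.reshape((-1, 1))).ravel()

# Suggested shorthand: scale() applies the same centering and scaling.
y_scale = scale(y)

np.testing.assert_allclose(y_scaler, y_scale)
```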
Yes, the failure is always with
      last_y_pred = None
    - for sample_weight in None, ones, 2 * ones:
    + for sample_weight in [None, ones, 2 * ones]:
      clf = GradientBoostingRegressor(n_estimators=100,
Could you rename `clf` to `reg`?
                  sample_weight=sample_weight)
    leaves = clf.apply(boston.data)
    assert leaves.shape == (506, 100)
    assert_raises(ValueError, clf.predict, X_reg)
remove this check since it is covered by the common test
    # FIXME: `rtol=65` is very permissive. This is due to the fact that
    # GBRT with and without `sample_weight` do not use the same
    # implementation of the median during the initialization with the
    # `DummyRegressor`. In the future, we should make sure that both
    # implementations should be the same. See PR #17377 for more.
    assert_allclose(last_y_pred, y_pred, rtol=100)
    - # FIXME: `rtol=65` is very permissive. This is due to the fact that
    - # GBRT with and without `sample_weight` do not use the same
    - # implementation of the median during the initialization with the
    - # `DummyRegressor`. In the future, we should make sure that both
    - # implementations should be the same. See PR #17377 for more.
    - assert_allclose(last_y_pred, y_pred, rtol=100)
    + # FIXME: We temporarily bypass this test. This is due to the fact that
    + # GBRT with and without `sample_weight` do not use the same
    + # implementation of the median during the initialization with the
    + # `DummyRegressor`. In the future, we should make sure that both
    + # implementations should be the same. See PR #17377 for more.
    + # assert_allclose(last_y_pred, y_pred)
    + pass
Let's bypass the test. I think we cannot test this part until we fix the percentile computation.
There is no difference between checking with a tolerance so large that we don't test anything and not testing it at all :)
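To make that point concrete: `assert_allclose` accepts `actual` whenever `abs(actual - desired) <= atol + rtol * abs(desired)`, so `rtol=100` tolerates deviations up to 100 times the target magnitude. A sketch:

```python
import numpy as np
from numpy.testing import assert_allclose

desired = np.array([1.0, 2.0, 3.0])

# Wildly wrong values still pass with rtol=100: the per-element
# tolerance is 100 * |desired|, i.e. 100, 200 and 300 here.
assert_allclose(desired * 50, desired, rtol=100)

# With the default rtol=1e-7, the same comparison raises AssertionError.
try:
    assert_allclose(desired * 50, desired)
except AssertionError:
    print("default tolerance rejects the mismatch")
```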
Thanks @glemaitre, changes made.
Thanks @lucyleeow |
Co-authored-by: Guillaume Lemaitre <[email protected]>
Reference Issues/PRs

Towards #16155

What does this implement/fix? Explain your changes.

Remove the Boston dataset in sklearn/ensemble/tests/test_gradient_boosting.py. Use a subset of the California housing dataset for `test_boston` and use the diabetes dataset for all remaining tests.

Any other comments?

Used the California dataset for `test_boston` due to the high MSE if the diabetes dataset is used (for the parameters in the test, MSE ranged from ~600-1700). Correspondingly, there was also a bigger difference between predictions (`assert_array_almost_equal(last_y_pred, y_pred)`) and predictions were only equal with `decimals=-2`. Happy to change to diabetes/another dataset if California is not suitable.
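For context on the `decimals=-2` remark (the actual keyword of `assert_array_almost_equal` is `decimal`): the check requires `abs(desired - actual) < 1.5 * 10**(-decimal)`, so `decimal=-2` accepts absolute gaps up to 150. A sketch with hypothetical prediction values:

```python
import numpy as np
from numpy.testing import assert_array_almost_equal

y_pred = np.array([1000.0, 1500.0])
last_y_pred = y_pred + 120.0  # off by 120 in absolute terms

# decimal=-2 means the absolute difference must stay below
# 1.5 * 10**2 = 150, so a 120-unit gap still passes -- a very loose check.
assert_array_almost_equal(last_y_pred, y_pred, decimal=-2)

# The default decimal=6 would reject the same gap.
try:
    assert_array_almost_equal(last_y_pred, y_pred)
except AssertionError:
    print("default precision rejects the mismatch")
```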