Skip to content

Conversation

@amueller
Copy link
Contributor

closes #734.

explicitly handle extension in flow creation
@amueller amueller requested review from PGijsbers and mfeurer and removed request for mfeurer July 22, 2019 18:52
@mfeurer
Copy link
Collaborator

mfeurer commented Jul 23, 2019

Yep, that should solve it, thanks for the PR. However, it seems like you forgot to add the MyLR class.

@amueller
Copy link
Contributor Author

damn, should be MyDummy now... I'll fix after lunch.

self.flow_id = flow_id

self._extension = get_extension_by_flow(self)
if extension is None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to remove this? Would work without it but I found it more explicit to directly give the extension.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is totally fine by me.

@amueller
Copy link
Contributor Author

probably fine now ;)

Copy link
Collaborator

@mfeurer mfeurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good now. However, I'm afraid that your unit test fails for scikit-learn 0.18.X:

self = <tests.test_extensions.test_sklearn_extension.test_sklearn_extension.TestSklearnExtensionRunFunctions testMethod=test_run_model_on_task>
    def test_run_model_on_task(self):
        class MyDummy(sklearn.dummy.DummyClassifier):
            pass
        task = openml.tasks.get_task(1)
>       openml.runs.run_model_on_task(MyDummy(), task)
/home/travis/build/openml/openml-python/tests/test_extensions/test_sklearn_extension/test_sklearn_extension.py:1226: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/travis/build/openml/openml-python/openml/runs/functions.py:106: in run_model_on_task
    upload_flow=upload_flow,
/home/travis/build/openml/openml-python/openml/runs/functions.py:222: in run_flow_on_task
    add_local_measures=add_local_measures,
/home/travis/build/openml/openml-python/openml/runs/functions.py:446: in _run_task_get_arffcontent
    X_test=test_x,
/home/travis/build/openml/openml-python/openml/extensions/sklearn/extension.py:1248: in _run_model_on_fold
    pred_y = model_copy.predict(X_test)
/home/travis/miniconda/envs/testenv/lib/python3.6/site-packages/sklearn/dummy.py:174: in predict
    X = check_array(X, accept_sparse=['csr', 'csc', 'coo'])
/home/travis/miniconda/envs/testenv/lib/python3.6/site-packages/sklearn/utils/validation.py:407: in check_array
    _assert_all_finite(array)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
X = array([[nan,  0.,  1., ...,  0.,  0., nan],
       [ 8.,  0.,  1., ..., nan,  0., nan],
       [ 8.,  0.,  1., ..., na...nan,  0., nan],
       [nan,  0., nan, ..., nan,  0., nan],
       [ 8.,  0.,  0., ..., nan,  0., nan]], dtype=float32)
    def _assert_all_finite(X):
        """Like assert_all_finite, but only for ndarray."""
        X = np.asanyarray(X)
        # First try an O(n) time, O(1) space solution for the common case that
        # everything is finite; fall back to O(n) space np.isfinite to prevent
        # false positives from overflow in sum method.
        if (X.dtype.char in np.typecodes['AllFloat'] and not np.isfinite(X.sum())
                and not np.isfinite(X).all()):
            raise ValueError("Input contains NaN, infinity"
>                            " or a value too large for %r." % X.dtype)
E           ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
/home/travis/miniconda/envs/testenv/lib/python3.6/site-packages/sklearn/utils/validation.py:58: ValueError

Could you please check that?

@amueller
Copy link
Contributor Author

I think the failures now are unrelated?

@mfeurer mfeurer merged commit 4c71d1d into openml:develop Jul 26, 2019
@amueller
Copy link
Contributor Author

ping @PGijsbers ;)
This one is what prevents us from pushing flows with non-sklearn components

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Still not able to use non-sklearn estimators without wrapping them in a pipeline

2 participants