cross_val_predict with method='predict_proba' throws a NotFittedError #9639

@sophiechauvin

Description

I am trying to use cross_val_predict with method='predict_proba' to output probabilities instead of class labels. If the classifier has not been fitted beforehand, the call raises a NotFittedError. The error does not occur when calling cross_val_predict with the default method='predict'.

This behavior is new in version 0.19.0; the same scripts worked with version 0.18.1.

Steps/Code to Reproduce

The code below is similar to the example at http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_predict.html#sklearn.model_selection.cross_val_predict

from sklearn import datasets, linear_model
from sklearn.model_selection import cross_val_predict
iris = datasets.load_iris()
X = iris.data
y = iris.target
X, y = X[y != 2], y[y != 2]
classifier = linear_model.SGDClassifier(loss='log')
y_pred = cross_val_predict(classifier, X, y, method='predict_proba')

Expected Results

Cross-validation runs and returns probabilities for each training example, computed by fitting the classifier once per fold and predicting on that fold's held-out samples.

Actual Results

I get an error:

NotFittedError                            Traceback (most recent call last)
<ipython-input-4-66f6870430ff> in <module>()
----> 1 y_pred = cross_val_predict(classifier, X, y, method='predict_proba')

/home/sophie/.virtualenvs/something/lib/python3.5/site-packages/sklearn/model_selection/_validation.py in cross_val_predict(estimator, X, y, groups, cv, n_jobs, verbose, fit_params, pre_dispatch, method)
    639 
    640     # Ensure the estimator has implemented the passed decision function
--> 641     if not callable(getattr(estimator, method)):
    642         raise AttributeError('{} not implemented in estimator'
    643                              .format(method))

/home/sophie/.virtualenvs/something/lib/python3.5/site-packages/sklearn/linear_model/stochastic_gradient.py in predict_proba(self)
    824         http://jmlr.csail.mit.edu/papers/volume2/zhang02c/zhang02c.pdf
    825         """
--> 826         self._check_proba()
    827         return self._predict_proba
    828 

/home/sophie/.virtualenvs/something/lib/python3.5/site-packages/sklearn/linear_model/stochastic_gradient.py in _check_proba(self)
    782 
    783     def _check_proba(self):
--> 784         check_is_fitted(self, "t_")
    785 
    786         if self.loss not in ("log", "modified_huber"):

/home/sophie/.virtualenvs/something/lib/python3.5/site-packages/sklearn/utils/validation.py in check_is_fitted(estimator, attributes, msg, all_or_any)
    735 
    736     if not all_or_any([hasattr(estimator, attr) for attr in attributes]):
--> 737         raise NotFittedError(msg % {'name': type(estimator).__name__})
    738 
    739 

NotFittedError: This SGDClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
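
Reading the traceback, my guess is that predict_proba is exposed as a property on SGDClassifier, so the getattr(estimator, method) call at line 641 runs the property body, which calls _check_proba and therefore check_is_fitted before any fold has been fitted. Below is a minimal sketch of that mechanism without scikit-learn. ToyClassifier, the local NotFittedError, method_is_callable and probe are hypothetical stand-ins written for illustration; they are not the library's code.

```python
class NotFittedError(Exception):
    """Hypothetical stand-in for scikit-learn's NotFittedError."""


class ToyClassifier:
    """Hypothetical estimator exposing predict_proba as a property."""

    def fit(self, X, y):
        # "t_" only exists after fitting, as in the traceback above.
        self.t_ = 1.0
        return self

    def _check_proba(self):
        if not hasattr(self, "t_"):
            raise NotFittedError("This ToyClassifier instance is not fitted yet.")

    @property
    def predict_proba(self):
        # The property body runs on attribute access, not on call.
        self._check_proba()
        return self._predict_proba

    def _predict_proba(self, X):
        # Dummy probabilities, enough to make the method callable.
        return [[0.5, 0.5] for _ in X]


def method_is_callable(estimator, method):
    """Same shape as the check in the traceback: getattr, then callable."""
    return callable(getattr(estimator, method))


def probe(estimator, method):
    """Run the check and report either its result or the error name."""
    try:
        return method_is_callable(estimator, method)
    except NotFittedError:
        return "NotFittedError"


print(probe(ToyClassifier(), "predict_proba"))
print(probe(ToyClassifier().fit([[0.0]], [0]), "predict_proba"))
```

If this reading is right, the check fails on the unfitted estimator only because of how the attribute is looked up, not because predict_proba is missing.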

Versions

Linux-4.4.0-92-generic-x86_64-with-Ubuntu-16.04-xenial
Python 3.5.2 (default, Nov 17 2016, 17:05:23) 
[GCC 5.4.0 20160609]
NumPy 1.12.0
SciPy 0.19.0
Scikit-Learn 0.19.0

Labels: Bug, Easy (well-defined and straightforward way to resolve)