Single sample test time performance

Test-time performance on single samples matters in real-world applications. Currently, prediction time for an individual sample is often dominated by input validation rather than model evaluation. Consider the following line-by-line profile of `GradientBoostingRegressor.decision_function` for a model trained on the Boston dataset with 250 trees:

```
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   645         1          151    151.0     53.7  X = array2d(X, dtype=DTYPE, order='C')
   646         1           49     49.0     17.4  score = self._init_decision_function(X)
   647         1           78     78.0     27.8  predict_stages(self.estimators_, X, self.learning_rate, score)
   648         1            3      3.0      1.1  return score
```

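Input validation (`array2d`) accounts for 53.7% of the time, roughly twice the share of the actual model evaluation in `predict_stages` (27.8%). The major reason is that `sklearn.utils.validation.array2d` calls `scipy.sparse.issparse` twice. This could be fixed, but the overhead of checking that all array values are finite would still be considerable.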
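To get a rough feel for how a finiteness check compares with the conversion itself on a one-row input, a small timing sketch along these lines may help. It uses only NumPy and `timeit`. `convert_only`, `convert_and_check` and `time_per_call` are hypothetical stand-ins, not the actual `array2d` code, and the `float32` dtype and 13-feature row are assumptions chosen to resemble the Boston setup.

```python
import timeit

import numpy as np


def convert_only(X):
    # Conversion step alone: 2-D, float32 (assumed dtype), C-ordered.
    return np.asarray(np.atleast_2d(X), dtype=np.float32, order="C")


def convert_and_check(X):
    # Same conversion, followed by a finiteness check over all values.
    X = convert_only(X)
    if not np.isfinite(X).all():
        raise ValueError("Input contains NaN or infinity.")
    return X


def time_per_call(func, X, number=10000):
    # Average wall-clock seconds per call over `number` repetitions.
    return timeit.timeit(lambda: func(X), number=number) / number


# One sample with 13 features (an assumption for illustration).
sample = np.random.RandomState(0).rand(13)
t_convert = time_per_call(convert_only, sample)
t_checked = time_per_call(convert_and_check, sample)
print("convert only:      %.2e s" % t_convert)
print("convert and check: %.2e s" % t_checked)
```

The absolute numbers depend on the machine and NumPy version, so only the ratio between the two timings is worth reading.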

We should optimize input validation or provide means to turn it off.
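One possible shape for a "turn it off" option is a flag that skips only the finiteness scan while still coercing the input. The following is a minimal sketch under that assumption. `validate_input`, `predict_sum` and the `check_finite` parameter are hypothetical names for illustration, not existing scikit-learn API, and the toy prediction merely stands in for a real model.

```python
import numpy as np


def validate_input(X, check_finite=True):
    # Hypothetical helper: always coerce to a 2-D float32 C-ordered array,
    # but make the scan for NaN/inf values optional.
    X = np.asarray(np.atleast_2d(X), dtype=np.float32, order="C")
    if check_finite and not np.isfinite(X).all():
        raise ValueError("Input contains NaN or infinity.")
    return X


def predict_sum(X, check_finite=True):
    # Toy stand-in for a model's prediction: sums the features of each row.
    X = validate_input(X, check_finite=check_finite)
    return X.sum(axis=1)


# Default path: the input is fully validated.
print(predict_sum([1.0, 2.0, 3.0]))

# Trusted caller: skips the finiteness scan for lower per-sample overhead.
print(predict_sum([1.0, 2.0, 3.0], check_finite=False))
```

Keeping the check on by default preserves the current safety behaviour. Callers that already guarantee clean inputs can opt out, at the cost of NaN or infinite values propagating silently into the result.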


Single sample test time performance #1363
