For binary classification (and perhaps multiclass classification) the log loss can be infinite. The log loss reduction can also be negative infinity, since it is a shift and rescaling of the log loss.
Similarly, the log loss can be `NaN`. This is specifically guarded against in the code, but it nonetheless seems like a bug.
The culprit for both cases lies in the initial calculations in the `ProcessRow()` method of the `Aggregator` for the `BinaryClassifierEvaluator`:
```csharp
Double logloss;
if (!Single.IsNaN(prob))
{
    if (_label > 0)
    {
        // REVIEW: Should we bring back the option to use ln instead of log2?
        logloss = -Math.Log(prob, 2);
    }
    else
        logloss = -Math.Log(1.0 - prob, 2);
}
else
    logloss = Double.NaN;
```
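To see how the infinity arises: when a positive example is scored with `prob == 0` (or a negative example with `prob == 1`), `Math.Log(0, 2)` returns negative infinity, so `logloss` becomes positive infinity. A minimal standalone demonstration (the class name here is just for illustration):

```csharp
using System;

class LogLossInfinityDemo
{
    static void Main()
    {
        // A confidently wrong prediction: positive label, probability 0.
        float prob = 0f;

        // Math.Log(0, 2) == -Infinity, so the negation is +Infinity.
        double logloss = -Math.Log(prob, 2);

        Console.WriteLine(logloss);                             // Infinity
        Console.WriteLine(Double.IsPositiveInfinity(logloss));  // True
    }
}
```

Any average that includes such a row is then infinite as well, and the log loss reduction, being a rescaling of it, goes to negative infinity.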
I propose that, to guard against infinities, we add an epsilon before taking the log, as sketched below.
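A minimal sketch of that guard, keeping the shape of the original snippet (the name and value of `Epsilon` are assumptions for illustration, not a vetted constant):

```csharp
// Assumed smoothing constant; the actual value would need review.
private const double Epsilon = 1e-15;

Double logloss;
if (!Single.IsNaN(prob))
{
    if (_label > 0)
        logloss = -Math.Log(prob + Epsilon, 2);
    else
        logloss = -Math.Log(1.0 - prob + Epsilon, 2);
}
else
    logloss = Double.NaN;
```

Clamping `prob` into `[Epsilon, 1 - Epsilon]` instead of adding would have much the same effect, and it also keeps the argument to `Math.Log` strictly positive even if `prob` ever strays outside `[0, 1]`.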
To guard against `NaN`s, we will need to fix the probability calculations (e.g. in the calibrator(s)).
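It is worth noting where the `NaN` can and cannot come from. Assuming a Platt-style sigmoid calibrator (`1 / (1 + exp(a * score + b))`, a standard form, though not necessarily the exact ML.NET code), finite scores can never produce `NaN`, because `Math.Exp` saturates to `0` or `+Infinity`; a `NaN` probability therefore points at a `NaN` raw score or `NaN` calibrator parameters upstream. The saturation also shows why probabilities of exactly `0` or `1` occur, feeding the infinity problem above:

```csharp
using System;

class SigmoidSaturationDemo
{
    // Hypothetical Platt-style sigmoid; the parameter values are made up.
    static float Calibrate(float score, double a = -1.0, double b = 0.0)
        => (float)(1.0 / (1.0 + Math.Exp(a * score + b)));

    static void Main()
    {
        // Finite scores stay in [0, 1] -- including the endpoints,
        // which is exactly what makes the log loss infinite.
        Console.WriteLine(Calibrate(1000f));   // 1
        Console.WriteLine(Calibrate(-1000f));  // 0

        // A NaN raw score, however, propagates straight through,
        // so that is where the actual fix has to happen.
        Console.WriteLine(Calibrate(Single.NaN)); // NaN
    }
}
```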