Issues with negative values in sample_weight #12464

@agamemnonc

Description

I am not sure how a negative value in sample_weight should be interpreted or why it should be supported. I believe sample_weight should be constrained to non-negative values in several cases, because negative values can lead to some very strange results.

See the example below for r2_score, where negative weights yield a score larger than one, which does not make sense.

Steps/Code to Reproduce

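import numpy as np
from sklearn.metrics import r2_score

np.random.seed(seed=2)
x = np.random.randn(100,)
# y is x plus a small amount of Gaussian noise
y = x + 0.3*np.random.randn(*x.shape)
# standard-normal weights, so roughly half of them are negative
w = np.random.randn(*x.shape)

r2_score(x, y, sample_weight=w)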

Expected Results

A value less than or equal to 1.0
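One way the constraint suggested in the description could look is an explicit check on the weights before they are used. This is only a sketch: `check_non_negative_weights` is a hypothetical helper named here for illustration, not an existing scikit-learn function, and raising `ValueError` is an assumed choice.

```python
import numpy as np


def check_non_negative_weights(sample_weight):
    """Hypothetical helper: return the weights as a float array, or raise
    ValueError if any of them is negative."""
    w = np.asarray(sample_weight, dtype=float)
    if np.any(w < 0):
        raise ValueError("sample_weight must not contain negative values")
    return w
```

Zero weights pass this check, since they simply exclude a sample; only strictly negative entries are rejected.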

Actual Results

1.1919195778883198
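A possible explanation, assuming the usual weighted definition R^2 = 1 - sum(w * (y_true - y_pred)^2) / sum(w * (y_true - weighted_mean)^2): with non-negative weights both sums are non-negative, so the score cannot exceed 1. With negative weights the two sums can have opposite signs, which makes the ratio negative and pushes the score above 1. The sketch below uses NumPy only; `weighted_r2` is a hypothetical name and the small arrays are made up for illustration.

```python
import numpy as np


def weighted_r2(y_true, y_pred, w):
    """Hypothetical re-implementation of a weighted R^2 (illustration only)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.asarray(w, dtype=float)
    # Weighted residual sum of squares
    numerator = np.sum(w * (y_true - y_pred) ** 2)
    # Weighted total sum of squares around the weighted mean
    y_mean = np.average(y_true, weights=w)
    denominator = np.sum(w * (y_true - y_mean) ** 2)
    return 1.0 - numerator / denominator


y_true = np.array([0.0, 1.0, 2.0, 3.0])
y_pred = np.array([1.0, 1.0, 2.0, 3.0])

# Non-negative weights: both sums are non-negative, so the score is at most 1.
r2_pos = weighted_r2(y_true, y_pred, np.ones(4))

# One negative weight: the total sum of squares becomes negative while the
# residual sum stays positive, so the score ends up greater than 1.
r2_neg = weighted_r2(y_true, y_pred, np.array([1.0, 1.0, 1.0, -0.5]))

print(r2_pos, r2_neg)
```

In this toy case the sign flip in the denominator alone is enough to produce a score above 1, without any numerical instability.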

Versions

System:
python: 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)]
executable: C:\Users\nak142\Miniconda3\envs\sklearn_contrib\pythonw.exe
machine: Windows-10-10.0.17134-SP0

BLAS:
macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
lib_dirs: C:/Users/nak142/Miniconda3/envs/sklearn_contrib\Library\lib
cblas_libs: mkl_rt

Python deps:
pip: 10.0.1
setuptools: 40.0.0
sklearn: 0.21.dev0
numpy: 1.15.0
scipy: 1.1.0
Cython: 0.28.5
pandas: None
