Closed
Description
A GaussianProcessRegressor whose kernel combines a ConstantKernel with a squared DotProduct produces large negative predictive variances internally.
Here's the code producing the negative variance warning:
>>> import numpy as np
>>>
>>> from sklearn.gaussian_process import GaussianProcessRegressor
>>> from sklearn.gaussian_process.kernels import (DotProduct,
...                                               ConstantKernel)
>>>
>>>
>>> kernel = ConstantKernel(0.1, (0.01, 10.0)) * \
...     (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)
>>>
>>> # Specify Gaussian Process
>>> gp = GaussianProcessRegressor(kernel=kernel)
>>>
>>> # Generate data and fit GP
>>> rng = np.random.RandomState(4)
>>> X = rng.uniform(0, 5, 10)[:, np.newaxis]
>>> y = np.sin((X[:, 0] - 2.5) ** 2)
>>> gp.fit(X, y)
GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
kernel=0.316**2 * DotProduct(sigma_0=1) ** 2,
n_restarts_optimizer=0, normalize_y=False,
optimizer='fmin_l_bfgs_b', random_state=None)
>>>
>>> # Plot posterior
>>> X_ = np.linspace(0, 5, 100)
>>> y_mean, y_std = gp.predict(X_[:, np.newaxis], return_std=True)
/home/adrin/Projects/github.com/sklearn/.venv/lib/python3.6/site-packages/sklearn/gaussian_process/gpr.py:343: UserWarning: Predicted variances smaller than 0. Setting those variances to 0.
warnings.warn("Predicted variances smaller than 0. "
If we print the computed variances (y_var) in gpr.py before they are set to 0, here's what we see:
[ 2.42145464e+01 4.85567023e+01 4.34729485e+00 1.34989407e+02
-6.81354941e+01 5.30776911e+01 8.10185240e+01 -4.94827674e+01
-7.40955943e+01 -2.02364363e+01 -1.20589902e+01 9.65798936e+01
-3.96335590e+01 1.13534957e+01 -6.16167350e+00 -2.80861813e+01
-4.07486721e+01 -2.39095894e+01 1.16871546e+01 4.60747733e+01
6.05989950e+01 -4.47698228e+01 -4.16105974e+01 -6.73931040e+01
4.08890391e+01 3.71879775e+00 5.12442614e+01 8.00092437e+01
7.09434327e+01 5.31810295e+00 -9.30518449e+00 7.57705512e+01
-1.29907498e+02 2.10651826e+02 -3.36288370e+01 2.89838706e+01
-1.33949448e+02 5.52126093e+01 2.16885722e+01 1.83546804e+00
2.63437968e+01 1.80020376e+01 5.58780224e+01 1.16537045e+02
8.01138442e+01 -3.36979937e+01 1.05969985e+02 1.25843312e+02
-1.67483760e+02 6.32855098e+01 -1.10059500e+02 -1.89635624e+02
-5.40204713e+01 1.70672383e+02 -2.14725018e+02 -1.73279596e+02
-4.93609765e+01 2.63706620e+02 -8.66689829e+01 -1.59377726e+01
1.79508509e+02 6.84593854e+01 2.23730884e+02 6.45890080e+01
1.69288616e+02 1.31872161e+02 -8.22187725e+00 -8.89816401e+01
2.24920680e+02 -2.77729738e+01 4.24908030e+01 1.09571647e+02
2.69768964e+02 -9.73400851e+00 1.74536122e+01 6.66466984e+01
-7.42109446e+01 4.43525795e+01 4.26498238e+01 -1.74072677e+02
2.32544203e+02 1.78199055e+02 4.09884022e+01 -1.12488149e+01
2.11883087e+01 -1.10979607e+01 1.64940864e+02 2.86743262e+01
3.60303895e+02 3.73809525e+01 1.78025798e+02 2.60973291e+02
2.00362066e+02 -5.94849307e+01 6.09383870e+01 -2.33417095e-01
-1.57363098e+01 4.20521569e+02 -1.02415712e+02 -5.21116455e+01]
If the error were due to numerical issues, wouldn't the negative variances be small in magnitude? These seem too large to me.
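One way to probe this is to look at how well conditioned the training kernel matrix is. For 1-D inputs, (sigma_0**2 + x*y)**2 expands to sigma_0**4 + 2*sigma_0**2*x*y + x**2*y**2, so in exact arithmetic the kernel matrix has rank at most 3, and with 10 training points only the alpha=1e-10 term on the diagonal makes it invertible. The sketch below is a rough check under assumptions: the `kernel` helper is hand-written for illustration (it is not scikit-learn's implementation), and it uses the initial hyperparameters from the snippet above, not the optimized ones stored in gp.kernel_.

```python
import numpy as np

# Same training inputs as in the report above.
rng = np.random.RandomState(4)
X = rng.uniform(0, 5, 10)[:, np.newaxis]


def kernel(A, B, constant=0.1, sigma_0=1.0):
    # Hypothetical helper: constant * (sigma_0**2 + <a, b>)**2 written out
    # by hand, using the initial (not the fitted) hyperparameters.
    return constant * (sigma_0 ** 2 + A @ B.T) ** 2


alpha = 1e-10  # default jitter added to the diagonal by GaussianProcessRegressor
K_raw = kernel(X, X)
K = K_raw + alpha * np.eye(len(X))

rank = np.linalg.matrix_rank(K_raw)  # numerical rank without the jitter
cond = np.linalg.cond(K)             # 2-norm condition number with the jitter

print("numerical rank of K (10 x 10):", rank)
print("condition number of K + alpha * I: %.3e" % cond)
```

If the condition number turns out to be very large, solving with (or inverting) this matrix can amplify rounding error far beyond machine precision, so large negative values would not by themselves rule out a numerical cause.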
See #11562
Note to self (issue #11562 is half fixed, finish the fix after this issue is resolved)