GaussianProcessRegressor internally calculates large negative variances for the posterior with DotProduct and ConstantKernel #11663

@adrinjalali

The GaussianProcessRegressor with a DotProduct and a ConstantKernel produces large negative variances internally.

Here's the code producing the negative variance warning:

>>> import numpy as np
>>> 
>>> from sklearn.gaussian_process import GaussianProcessRegressor
>>> from sklearn.gaussian_process.kernels import (DotProduct,
...                                               ConstantKernel)
>>> 
>>> 
>>> kernel = ConstantKernel(0.1, (0.01, 10.0)) * \
...     (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)
>>> 
>>> # Specify Gaussian Process
>>> gp = GaussianProcessRegressor(kernel=kernel)
>>> 
>>> # Generate data and fit GP
>>> rng = np.random.RandomState(4)
>>> X = rng.uniform(0, 5, 10)[:, np.newaxis]
>>> y = np.sin((X[:, 0] - 2.5) ** 2)
>>> gp.fit(X, y)
GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
             kernel=0.316**2 * DotProduct(sigma_0=1) ** 2,
             n_restarts_optimizer=0, normalize_y=False,
             optimizer='fmin_l_bfgs_b', random_state=None)
>>> 
>>> # Plot posterior
>>> X_ = np.linspace(0, 5, 100)
>>> y_mean, y_std = gp.predict(X_[:, np.newaxis], return_std=True)
/home/adrin/Projects/github.com/sklearn/.venv/lib/python3.6/site-packages/sklearn/gaussian_process/gpr.py:343: UserWarning: Predicted variances smaller than 0. Setting those variances to 0.
  warnings.warn("Predicted variances smaller than 0. "

If we print the calculated variances (`y_var`) in gpr.py just before they are clipped to 0, this is what we see:

[ 2.42145464e+01  4.85567023e+01  4.34729485e+00  1.34989407e+02
 -6.81354941e+01  5.30776911e+01  8.10185240e+01 -4.94827674e+01
 -7.40955943e+01 -2.02364363e+01 -1.20589902e+01  9.65798936e+01
 -3.96335590e+01  1.13534957e+01 -6.16167350e+00 -2.80861813e+01
 -4.07486721e+01 -2.39095894e+01  1.16871546e+01  4.60747733e+01
  6.05989950e+01 -4.47698228e+01 -4.16105974e+01 -6.73931040e+01
  4.08890391e+01  3.71879775e+00  5.12442614e+01  8.00092437e+01
  7.09434327e+01  5.31810295e+00 -9.30518449e+00  7.57705512e+01
 -1.29907498e+02  2.10651826e+02 -3.36288370e+01  2.89838706e+01
 -1.33949448e+02  5.52126093e+01  2.16885722e+01  1.83546804e+00
  2.63437968e+01  1.80020376e+01  5.58780224e+01  1.16537045e+02
  8.01138442e+01 -3.36979937e+01  1.05969985e+02  1.25843312e+02
 -1.67483760e+02  6.32855098e+01 -1.10059500e+02 -1.89635624e+02
 -5.40204713e+01  1.70672383e+02 -2.14725018e+02 -1.73279596e+02
 -4.93609765e+01  2.63706620e+02 -8.66689829e+01 -1.59377726e+01
  1.79508509e+02  6.84593854e+01  2.23730884e+02  6.45890080e+01
  1.69288616e+02  1.31872161e+02 -8.22187725e+00 -8.89816401e+01
  2.24920680e+02 -2.77729738e+01  4.24908030e+01  1.09571647e+02
  2.69768964e+02 -9.73400851e+00  1.74536122e+01  6.66466984e+01
 -7.42109446e+01  4.43525795e+01  4.26498238e+01 -1.74072677e+02
  2.32544203e+02  1.78199055e+02  4.09884022e+01 -1.12488149e+01
  2.11883087e+01 -1.10979607e+01  1.64940864e+02  2.86743262e+01
  3.60303895e+02  3.73809525e+01  1.78025798e+02  2.60973291e+02
  2.00362066e+02 -5.94849307e+01  6.09383870e+01 -2.33417095e-01
 -1.57363098e+01  4.20521569e+02 -1.02415712e+02 -5.21116455e+01]

If the error were due to numerical issues, wouldn't the negative variances be small in magnitude? These seem too large to me.
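One way to probe the numerical-issues question is to look at how well conditioned the training kernel matrix is. The sketch below is only a rough check under stated assumptions: it writes the kernel out by hand with NumPy as `constant_value * (sigma_0**2 + x * x')**2`, and it uses the initial hyperparameters from the snippet above (`constant_value=0.1`, `sigma_0=1.0`) rather than the fitted ones, plus the default `alpha=1e-10`. For one-dimensional inputs this kernel is a polynomial in 1, x and x^2, so the 10x10 matrix can have at most three non-negligible eigenvalues.

```python
import numpy as np

# Same training inputs as in the snippet above.
rng = np.random.RandomState(4)
X = rng.uniform(0, 5, 10)[:, np.newaxis]

# Assumption: initial hyperparameters from the report, not the fitted values.
constant_value = 0.1
sigma_0 = 1.0
alpha = 1e-10  # default value added to the diagonal by GaussianProcessRegressor

# ConstantKernel * DotProduct ** 2, written out by hand.
K = constant_value * (sigma_0 ** 2 + X @ X.T) ** 2

# Count eigenvalues that are not negligible relative to the largest one.
eigvals = np.linalg.eigvalsh(K)
n_significant = int(np.sum(eigvals > 1e-8 * eigvals.max()))

# Condition number of the matrix that is actually factorized during fit.
cond = np.linalg.cond(K + alpha * np.eye(len(X)))

print("non-negligible eigenvalues:", n_significant)
print("condition number of K + alpha * I: %.3e" % cond)
```

A very large condition number here would mean the linear system behind the posterior variance is close to singular, so round-off could be amplified far beyond machine precision. This does not by itself establish that it is the cause of the values above.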

See #11562

Note to self (issue #11562 is half fixed, finish the fix after this issue is resolved)
