New unexpected behavior with scikit-learn==0.24.1 #994
Description
Hi, I've noticed that gp_minimize() behaves differently with scikit-learn==0.21.3 and 0.24.1.
With the older version, an objective function that returns the same value several times in a row works fine, but the new version raises an error:
ValueError: array must not contain infs or NaNs
Small example:
from skopt import gp_minimize
from skopt.space import Real
from skopt.utils import use_named_args

dimensions = [Real(name='x', low=0.0, high=1.0)]

@use_named_args(dimensions=dimensions)
def objective_negative(x):
    if x < 0.9:
        return -1.0
    else:
        return -x/2

res = gp_minimize(objective_negative,
                  dimensions=dimensions,
                  x0=None,
                  y0=None,
                  acq_func="LCB",
                  n_calls=3,
                  n_initial_points=2,
                  noise='gaussian',
                  random_state=1234,
                  initial_point_generator='hammersly',
                  verbose=True,
                  kappa=3)
print(res.x_iters)
print(res.func_vals)
With scikit-learn==0.21.3 the result is:
Iteration No: 1 started. Evaluating function at random point.
Iteration No: 1 ended. Evaluation done at random point.
Time taken: 0.0002
Function value obtained: -1.0000
Current minimum: -1.0000
Iteration No: 2 started. Evaluating function at random point.
Iteration No: 2 ended. Evaluation done at random point.
Time taken: 0.1511
Function value obtained: -1.0000
Current minimum: -1.0000
Iteration No: 3 started. Searching for the next optimal point.
Iteration No: 3 ended. Search finished for the next optimal point.
Time taken: 0.1819
Function value obtained: -0.5000
Current minimum: -1.0000
[[0.75], [0.125], [1.0]]
[-1. -1. -0.5]
With scikit-learn==0.24.1 the result is:
Iteration No: 1 started. Evaluating function at random point.
Iteration No: 1 ended. Evaluation done at random point.
Time taken: 0.0002
Function value obtained: -1.0000
Current minimum: -1.0000
Iteration No: 2 started. Evaluating function at random point.
Traceback (most recent call last):
  File "scratch_11.py", line 25, in <module>
    kappa=3)
  File "~/scikit-optimize/skopt/optimizer/gp.py", line 268, in gp_minimize
    callback=callback, n_jobs=n_jobs, model_queue_size=model_queue_size)
  File "~/scikit-optimize/skopt/optimizer/base.py", line 302, in base_minimize
    result = optimizer.tell(next_x, next_y)
  File "~/scikit-optimize/skopt/optimizer/optimizer.py", line 493, in tell
    return self._tell(x, y, fit=fit)
  File "~/scikit-optimize/skopt/optimizer/optimizer.py", line 536, in _tell
    est.fit(self.space.transform(self.Xi), self.yi)
  File "~/scikit-optimize/skopt/learning/gaussian_process/gpr.py", line 195, in fit
    super(GaussianProcessRegressor, self).fit(X, y)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/sklearn/gaussian_process/_gpr.py", line 237, in fit
    self.kernel_.bounds))]
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/sklearn/gaussian_process/_gpr.py", line 508, in _constrained_optimization
    bounds=bounds)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/_minimize.py", line 618, in minimize
    callback=callback, **options)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/lbfgsb.py", line 308, in _minimize_lbfgsb
    finite_diff_rel_step=finite_diff_rel_step)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 262, in _prepare_scalar_function
    finite_diff_rel_step, bounds, epsilon=epsilon)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/_differentiable_functions.py", line 76, in __init__
    self._update_fun()
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/_differentiable_functions.py", line 166, in _update_fun
    self._update_fun_impl()
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/_differentiable_functions.py", line 73, in update_fun
    self.f = fun_wrapped(self.x)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/_differentiable_functions.py", line 70, in fun_wrapped
    return fun(x, *args)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 74, in __call__
    self._compute_if_needed(x, *args)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 68, in _compute_if_needed
    fg = self.fun(x, *args)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/sklearn/gaussian_process/_gpr.py", line 228, in obj_func
    theta, eval_gradient=True, clone_kernel=False)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/sklearn/gaussian_process/_gpr.py", line 481, in log_marginal_likelihood
    alpha = cho_solve((L, True), y_train)  # Line 3
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/scipy/linalg/decomp_cholesky.py", line 194, in cho_solve
    b1 = asarray_chkfinite(b)
  File "~/scikit-optimize/venv/lib/python3.7/site-packages/numpy/lib/function_base.py", line 486, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
I suspect the new behavior comes from GaussianProcessRegressor.
During fitting, y == array([-1., -1.]) just before lines 198-204 of sklearn/gaussian_process/_gpr.py.
Then normalization occurs:
# Normalize target value
if self.normalize_y:
    self._y_train_mean = np.mean(y, axis=0)  # -1.0
    self._y_train_std = np.std(y, axis=0)    # 0.0
    # Remove mean and make unit variance
    y = (y - self._y_train_mean) / self._y_train_std  # division by zero -> array([nan, nan]), which causes the error
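To make the failure mode concrete, here is a minimal standalone sketch (plain NumPy, not taken from the scikit-learn source) of what that normalization step does when all observed objective values are identical:

import numpy as np

y = np.array([-1.0, -1.0])   # two identical objective values from the first iterations
y_mean = np.mean(y, axis=0)  # -1.0
y_std = np.std(y, axis=0)    # 0.0, because the targets are constant

# Dividing by a zero standard deviation yields NaNs, which later fail the
# asarray_chkfinite() check inside scipy's cho_solve().
with np.errstate(divide='ignore', invalid='ignore'):
    y_normalized = (y - y_mean) / y_std
print(y_normalized)  # [nan nan]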
Could you please tell me whether this behavior is going to be fixed, or whether it is considered the new expected behavior?
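As a side note, one possible user-side workaround, sketched here only as an illustration (objective_negative_jittered and the 1e-9 jitter scale are my own choices, and this assumes a tiny perturbation of the objective is acceptable), is to add negligible noise to the returned value so the initial targets are never exactly constant:

import numpy as np
from skopt import gp_minimize
from skopt.space import Real
from skopt.utils import use_named_args

dimensions = [Real(name='x', low=0.0, high=1.0)]
rng = np.random.default_rng(1234)

@use_named_args(dimensions=dimensions)
def objective_negative_jittered(x):
    value = -1.0 if x < 0.9 else -x / 2
    # A negligible jitter keeps np.std(y) > 0 even when the raw objective is
    # flat over the initial points, so normalize_y no longer divides by zero.
    return value + 1e-9 * rng.standard_normal()

res = gp_minimize(objective_negative_jittered,
                  dimensions=dimensions,
                  acq_func="LCB",
                  n_calls=3,
                  n_initial_points=2,
                  noise='gaussian',
                  random_state=1234,
                  initial_point_generator='hammersly',
                  kappa=3)
print(res.x_iters, res.func_vals)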
fridrichmrtn