Bug in scikit-learn GaussianProcessRegressor #22

@dreamer2368

Description

GaussianProcessRegressor behaves erroneously for certain data. Specifically, the optimized kernel length scale collapses to its lower bound (1e-5 in this example) when the model is fitted to data that contains a sign change. The following is a minimal failing example.

import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline  # IPython magic; uncomment when running in a Jupyter notebook

from sklearn.gaussian_process.kernels import ConstantKernel, RBF
from sklearn.gaussian_process import GaussianProcessRegressor

x = np.array([[0.], [1.0]])
xpred = np.linspace(x[0][0], x[-1][0], 1000)
xpred = xpred.reshape(-1, 1)

# Working dataset: both targets are negative, so there is no sign change.
y1 = np.array([-4, -1e-3])
y1 = y1.reshape(-1, 1)

kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100, random_state=1, alpha=1e-10)
gp = gp.fit(x, y1)

# Optimized RBF length scale (k2 is the second factor of the product kernel).
print(gp.kernel_.get_params()['k2__length_scale'])
yavg1, ystd1 = gp.predict(xpred, return_std=True)

plt.figure(1)
plt.plot(x, y1, 'ok')
plt.plot(xpred, yavg1, '-r')
plt.plot(xpred, yavg1 + 2 * ystd1, '--r')
plt.plot(xpred, yavg1 - 2 * ystd1, '--r')
plt.title('With (0, -4) and (1, -1e-3)')

# Failing dataset: nearly identical to the one above, but the second target is positive (sign change).
y2 = np.array([-4, 1e-3])
y2 = y2.reshape(-1, 1)

kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100, random_state=1, alpha=1e-10)
gp = gp.fit(x, y2)

# Optimized RBF length scale; this is the value that collapses to the lower bound.
print(gp.kernel_.get_params()['k2__length_scale'])
yavg2, ystd2 = gp.predict(xpred, return_std=True)

plt.figure(2)
plt.plot(x, y2, 'ok')
plt.plot(xpred, yavg2, '-r')
plt.plot(xpred, yavg2 + 2 * ystd2, '--r')
plt.plot(xpred, yavg2 - 2 * ystd2, '--r')
plt.title('With (0, -4) and (1, +1e-3)')

The cause of this behavior is not yet known. Running the same example with other packages, such as GPy, does not reproduce it, which points to a bug in the scikit-learn package.
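One way to narrow down the cause might be to look at the objective itself rather than at the optimizer. The sketch below is not scikit-learn code: it assumes a zero-mean GP prior (scikit-learn's default when normalize_y=False) with kernel amplitude * RBF and the same alpha as above, and evaluates the standard log marginal likelihood on a grid. The helper names rbf_lml and profile_lml and the grid ranges are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_lml(x, y, length_scale, amplitude, alpha=1e-10):
    """Log marginal likelihood of a zero-mean GP with kernel amplitude * RBF.

    Assumption: zero prior mean and jitter `alpha` on the diagonal,
    mirroring the settings used in the example above.
    """
    sq_dist = (x[:, None] - x[None, :]) ** 2
    K = amplitude * np.exp(-0.5 * sq_dist / length_scale**2)
    K[np.diag_indices_from(K)] += alpha
    L = np.linalg.cholesky(K)
    a = np.linalg.solve(L.T, np.linalg.solve(L, y))  # a = K^{-1} y
    return -0.5 * y @ a - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

def profile_lml(x, y, length_scales, amplitudes):
    """For each length scale, the best likelihood over the amplitude grid."""
    return np.array([max(rbf_lml(x, y, l, c) for c in amplitudes)
                     for l in length_scales])

x = np.array([0.0, 1.0])
datasets = {
    "y1": np.array([-4.0, -1e-3]),  # no sign change
    "y2": np.array([-4.0, 1e-3]),   # sign change
}

# Hypothetical grids; the length-scale grid starts at the lower bound 1e-5.
length_scales = np.logspace(-5, 2, 141)
amplitudes = np.logspace(-2, 3, 101)

profiles, best = {}, {}
for name, y in datasets.items():
    profiles[name] = profile_lml(x, y, length_scales, amplitudes)
    best[name] = length_scales[np.argmax(profiles[name])]
    print(name, "grid-optimal length scale:", best[name])
```

If the profile for y2 peaks at the smallest length scale on the grid, the collapse would follow from the objective under a zero-mean prior rather than from the optimizer; if it peaks elsewhere, the optimizer becomes the more likely culprit. Either way, this sketch does not explain why GPy behaves differently.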

So far, the only workaround is to raise the lower bound of the length scale to a 'reasonable' value. However, this does not address the root cause, and it defeats the purpose of using a GP, which is to tune the length-scale hyperparameter on statistical principles. For now, it is better to replace scikit-learn with another GP package.
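For reference, the workaround described above could look roughly like the following. The lower bound of 1e-1 is an arbitrary value assumed for illustration, not a recommendation, and the number of optimizer restarts is reduced only to keep the sketch fast.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

x = np.array([[0.0], [1.0]])
y2 = np.array([-4.0, 1e-3])  # the failing dataset with a sign change

# Assumed 'reasonable' lower bound; the choice is arbitrary.
lower_bound = 1e-1
kernel = ConstantKernel() * RBF(length_scale_bounds=(lower_bound, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10,
                              random_state=1, alpha=1e-10)

# scikit-learn may emit a ConvergenceWarning if the optimum sits on a bound.
gp.fit(x, y2)

fitted_length_scale = gp.kernel_.get_params()['k2__length_scale']
print(fitted_length_scale)
```

With the sign-change data, the fitted value may simply land on the new bound, in which case the bound rather than the data determines the length scale, which is the concern raised above.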

Labels

bug
