Loadable Function: GPM = pgp_train (X, F, y, theta, opts)

Loadable Function: [GPM, nll] = pgp_train (X, F, y, theta, nu, nlin, corf, opts)

If requested, estimates the hyperparameters for Gaussian Process Regression (inverse length scales and relative noise) via reduced maximum likelihood, and then sets up the model for inference (prediction), storing necessary information in the structure GPM, intended for use with pgp_predict.

X is the matrix of independent variables of the observations, F is the matrix of inducing points (cluster centers), y is a vector containing the dependent variables, and theta contains the (initial) inverse length scales for the regression model. If theta is a row vector, rows of X correspond to observations and columns to variables. If theta is a column vector, columns of X correspond to observations and rows to variables.
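
The orientation rule can be illustrated with a short Python sketch. It is not part of the package: the helper name split_observations is hypothetical, matrices are modelled as nested lists (a list of rows), and a 1-by-1 theta is treated as a row vector, which is an assumption.

```python
def split_observations(X, theta):
    """Return X as a list of observations, using theta's orientation as the cue.

    X and theta are nested lists (list of rows) standing in for Octave matrices.
    """
    theta_is_row = len(theta) == 1  # a single row, e.g. [[0.1, 0.2]]
    if theta_is_row:
        # Rows of X are observations, columns are variables.
        return [list(row) for row in X]
    # theta is a column vector, e.g. [[0.1], [0.2]]:
    # columns of X are observations, so transpose.
    return [list(col) for col in zip(*X)]
```

Both layouts of the same data (three observations of two variables) then yield the same list of observations.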

nu specifies the (initial) relative noise level. If not supplied, it defaults to 1e-5. nlin specifies the number of leading variables to include in the underlying linear trend. If not supplied, it defaults to 0 (constant trend).

corf specifies the decreasing function f used in the correlation function: corr(x,y) = f(norm(theta.*(x-y))). Possible values:

gau
f(t) = exp(-t^2) (gaussian)
exp
f(t) = exp(-t) (exponential)
imq
f(t) = 1/sqrt(1+t^2) (inverse multiquadric)
mt3
f(t) = (1+sqrt(6)*t)*exp(-sqrt(6)*t) (Matern-3/2 covariance)
mt5
f(t) = (1+sqrt(10)*t+10*t^2/3)*exp(-sqrt(10)*t) (Matern-5/2 covariance)
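
As an illustration only, the correlation corr(x,y) = f(norm(theta.*(x-y))) can be sketched in plain Python. The names CORR_FUNCS and corr are hypothetical and not part of the package. The mt3 and mt5 entries assume the standard Matern forms, in which the square root applies to the constant (sqrt(6)*t and sqrt(10)*t); whether the library matches this exactly is not confirmed here.

```python
import math


def _gau(t):
    return math.exp(-t * t)


def _exp(t):
    return math.exp(-t)


def _imq(t):
    return 1.0 / math.sqrt(1.0 + t * t)


def _mt3(t):
    # Assumed standard Matern-3/2 form.
    s = math.sqrt(6.0) * t
    return (1.0 + s) * math.exp(-s)


def _mt5(t):
    # Assumed standard Matern-5/2 form; note s*s/3 == 10*t^2/3.
    s = math.sqrt(10.0) * t
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


# Decreasing functions f(t), keyed by the corf names listed above.
CORR_FUNCS = {"gau": _gau, "exp": _exp, "imq": _imq, "mt3": _mt3, "mt5": _mt5}


def corr(x, y, theta, corf):
    """corr(x, y) = f(norm(theta .* (x - y))) for equal-length sequences."""
    t = math.sqrt(sum((th * (a - b)) ** 2 for th, a, b in zip(theta, x, y)))
    return CORR_FUNCS[corf](t)
```

Every function equals 1 at t = 0 and decreases as the scaled distance grows, so larger entries of theta (shorter length scales) make the correlation fall off faster along that variable.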

opts is a cell array in the form {"option name",option value,...}. Possible options:

maxev
Maximum number of factorizations to be used during training. Default is 500.
tol
Stopping tolerance (minimum trust-region radius). Default is 1e-6. The iteration terminates if the trust-region radius falls below tol.
ftol
Stopping tolerance (minimum objective reduction). Default is 1e-4. The iteration terminates if the relative reduction of two successive downhill steps falls below ftol and the second one is smaller.
numin
Minimum allowable noise. Default is sqrt(1e1*eps).
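
To make the name/value layout concrete, here is a minimal Python sketch that merges a flat list of option pairs over the documented defaults. DEFAULT_OPTS and parse_opts are hypothetical names, a Python list stands in for the Octave cell array, and rejecting unknown option names is an assumption about behaviour that this page does not specify.

```python
import sys

# Defaults as documented above; numin is sqrt(1e1*eps), with eps taken to be
# the double-precision machine epsilon.
DEFAULT_OPTS = {
    "maxev": 500,
    "tol": 1e-6,
    "ftol": 1e-4,
    "numin": (1e1 * sys.float_info.epsilon) ** 0.5,
}


def parse_opts(opts):
    """Merge a flat [name, value, name, value, ...] list over the defaults."""
    if len(opts) % 2 != 0:
        raise ValueError("options must come in name/value pairs")
    merged = dict(DEFAULT_OPTS)
    for name, value in zip(opts[0::2], opts[1::2]):
        if name not in DEFAULT_OPTS:
            # Assumption: unknown names are treated as errors.
            raise ValueError(f"unknown option: {name}")
        merged[name] = value
    return merged
```

For example, parse_opts(["maxev", 200, "tol", 1e-8]) overrides two entries and leaves ftol and numin at their defaults.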

The training options cell array opts is recognized even if the preceding optional arguments (nu, nlin, corf) are omitted. If it is not supplied (that is, the last argument is not a cell array), training is skipped.
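
The rule for detecting opts can be sketched as follows. This is an illustration, not the library's code: split_trailing_opts is a hypothetical helper, a Python list stands in for an Octave cell array, and extra_args means whatever was passed after theta.

```python
def split_trailing_opts(extra_args):
    """Split the arguments after theta into (positional, opts, train).

    Training is requested exactly when the last argument is a cell array
    (modelled here as a Python list), regardless of how many of
    nu, nlin, corf precede it.
    """
    if extra_args and isinstance(extra_args[-1], list):
        return list(extra_args[:-1]), extra_args[-1], True
    # No trailing cell array: training is skipped.
    return list(extra_args), None, False
```

Under this reading, passing only an options list after theta requests training with default nu, nlin, and corf, while passing nu, nlin, and corf without a trailing options list sets up the model without training.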

On return, the function creates the GPM structure, which can subsequently be used for predictions with pgp_predict. If the second output nll is requested, it is set to the resulting negative log likelihood.

See also: pgp_predict