method {gplite} | R Documentation |
Initialize method or type of the model
Description
Functions for initializing the method or type of the model, which can then be passed to gp_init.
The supported methods are:

method_full
Full exact GP, meaning that the inference is performed for the n latent function values (the fitting time scales cubically in n).

method_fitc
Fully independent training (and test) conditional, or FITC, approximation (see Quiñonero-Candela and Rasmussen, 2005; Snelson and Ghahramani, 2006). The fitting time scales as O(n*m^2), where n is the number of data points and m is the number of inducing points (num_inducing). The inducing point locations are chosen using the k-means algorithm.

method_rf
Random features, that is, a linearized GP. Uses random features (or basis functions) to approximate the covariance function, so the inference time scales cubically in the number of approximating basis functions (num_basis). For stationary covariance functions, random Fourier features (Rahimi and Recht, 2007) are used, and for non-stationary kernels a case-specific method is used when possible (for example, drawing the hidden layer parameters randomly for cf_nn). For cf_const and cf_lin this amounts to a standard linear model, with the inference performed in the weight space rather than in the function space. Thus, if the model is linear (only cf_const and cf_lin are used), this can give a substantial speed-up when the number of features is considerably smaller than the number of data points. See the sketch below.
Usage
method_full()
method_fitc(
inducing = NULL,
num_inducing = 100,
bin_along = NULL,
bin_count = 10,
seed = 12345
)
method_rf(num_basis = 400, seed = 12345)
Arguments
inducing
Inducing points to use. If not given, the inducing points are placed using the k-means algorithm (see num_inducing).

num_inducing
Number of inducing points for the approximation. Will be ignored if the inducing points are given by the user.

bin_along
Either an index or a name of the input variable along which to bin the values before placing the inducing inputs. For example, binning along a time variable ensures that the inducing points are distributed over the whole range of that variable (see the sketch below).

bin_count
The number of bins to use if bin_along is given.

seed
Random seed for reproducible results.

num_basis
Number of basis functions for the approximation.
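To illustrate bin_along and bin_count, here is a small sketch; the data and the column name time are made up for illustration. Binning along the time axis before running k-means spreads the inducing points across the whole time range:

# Hypothetical temporal data ('time' is an illustrative column name)
n <- 500
df <- data.frame(time = seq(0, 10, length.out = n), x1 = rnorm(n))
y <- sin(df$time) + 0.3 * rnorm(n)

# 20 inducing points spread over 10 bins along the time axis
gp <- gp_init(cf_sexp(), method = method_fitc(num_inducing = 20,
                                              bin_along = "time",
                                              bin_count = 10))
gp <- gp_optim(gp, df, y)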
Value
The method object.
References
Rahimi, A. and Recht, B. (2007). Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems 20.
Quiñonero-Candela, J. and Rasmussen, C. E. (2005). A unifying view of sparse approximate Gaussian process regression. Journal of Machine Learning Research 6:1939-1959.
Snelson, E. and Ghahramani, Z. (2006). Sparse Gaussian processes using pseudo-inputs. In Advances in Neural Information Processing Systems 18.
Examples
# Generate some toy data
# NOTE: this dataset is so small that in reality there would be no point
# in using a sparse approximation here; we use this small dataset only to
# make the example run fast
set.seed(1242)
n <- 50
x <- matrix(rnorm(n * 3), nrow = n)
f <- sin(x[, 1]) + 0.5 * x[, 2]^2 + x[, 3]
y <- f + 0.5 * rnorm(n)
x <- data.frame(x1 = x[, 1], x2 = x[, 2], x3 = x[, 3])
# Full exact GP with Gaussian likelihood
gp <- gp_init(cf_sexp())
gp <- gp_optim(gp, x, y)
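# A quick sketch of prediction with the fitted model; this assumes that
# gp_pred returns the predictive mean (and variance when var = TRUE)
pred <- gp_pred(gp, x, var = TRUE)
head(pred$mean)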
# Approximate solution using random features (here we use a very small
# number of random features only to make this example run fast)
gp <- gp_init(cf_sexp(), method = method_rf(num_basis = 30))
gp <- gp_optim(gp, x, y)
# Approximate solution using FITC (here we use a very small
# number of inducing points only to make this example run fast)
gp <- gp_init(cf_sexp(), method = method_fitc(num_inducing = 10))
gp <- gp_optim(gp, x, y)
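# As noted in the description, if the model is purely linear (only cf_const
# and cf_lin), method_rf performs the inference exactly in the weight space,
# which can be much faster than the full GP; a minimal sketch of that case:
gp <- gp_init(cf_lin(), method = method_rf())
gp <- gp_optim(gp, x, y)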