optimal_holdout_size_emulation {OptHoldoutSize}    R Documentation

Estimate optimal holdout size under semi-parametric assumptions

Description

Compute optimal holdout size for updating a predictive score given a set of training set sizes and estimates of mean cost per sample at those training set sizes.

This is essentially a wrapper for function mu_fn().

Usage

optimal_holdout_size_emulation(
  nset,
  k2,
  var_k2,
  N,
  k1,
  var_u = 1e+07,
  k_width = 5000,
  k2form = powerlaw,
  theta = powersolve_general(nset, k2, y_var = var_k2)$par,
  npoll = 1000,
  ...
)

Arguments

nset

Training set sizes for which a cost has been evaluated

k2

Estimated values of k2() at training set sizes nset

var_k2

Variance of error in k2 estimate at each training set size.

N

Total number of samples on which the model will be fitted/used

k1

Mean cost per sample with no predictive score in place

var_u

Marginal variance for Gaussian process kernel. Defaults to 1e7

k_width

Kernel width for Gaussian process kernel. Defaults to 5000

k2form

Functional form governing expected cost per sample given sample size. Should take two parameters: n (sample size) and theta (parameters). Defaults to function powerlaw.
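As a minimal sketch of the interface that k2form expects, the functions below each take n and theta and return an expected cost per sample. The names powerlaw_sketch, expdecay_sketch and theta_demo are hypothetical, and the power-law shape a * n^(-b) + c is an assumption about the default form, not a copy of the packaged powerlaw().

```r
# Assumed shape of a power-law form: a * n^(-b) + c, with theta = c(a, b, c).
# The packaged powerlaw() may differ in detail.
powerlaw_sketch <- function(n, theta) {
  theta[1] * n^(-theta[2]) + theta[3]
}

# A hypothetical alternative form with the same (n, theta) interface.
expdecay_sketch <- function(n, theta) {
  theta[1] * exp(-theta[2] * n) + theta[3]
}

# Illustrative parameter values only.
theta_demo <- c(5, 0.5, 0.1)

# Both forms are vectorised over n.
powerlaw_sketch(c(100, 10000), theta_demo)
expdecay_sketch(c(100, 10000), theta_demo)
```

Any function with this two-argument signature could in principle be supplied as k2form, provided theta is given a matching length.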

theta

Current estimates of parameter values for k2form. Defaults to the MLE power-law solution corresponding to nset, k2, and var_k2.

npoll

Number of equally spaced candidate values between 1 and N at which to check for the minimum cost. If NULL, all values are checked (this can be slow). Defaults to 1000.
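The grid check controlled by npoll can be illustrated with a short base-R sketch. It assumes a total cost of the form k1 * n + k2(n) * (N - n) and substitutes a plain parametric k2 for the Gaussian process emulator that the function itself uses, so it shows the search pattern only. The names k2_sketch, total_cost and theta_demo, and all numeric values, are hypothetical.

```r
# Illustrative inputs only.
N <- 100000
k1 <- 0.4
npoll <- 1000
theta_demo <- c(5, 0.5, 0.1)

# Assumed parametric cost per sample at training set size n.
k2_sketch <- function(n, theta) theta[1] * n^(-theta[2]) + theta[3]

# Assumed total cost: n samples at cost k1, the remaining N - n at cost k2(n).
total_cost <- function(n) k1 * n + k2_sketch(n, theta_demo) * (N - n)

# Evaluate the cost at npoll equally spaced candidate sizes between 1 and N.
n_grid <- round(seq(1, N, length.out = npoll))
cost_grid <- total_cost(n_grid)

# Pick the candidate with the smallest cost.
best <- which.min(cost_grid)
c(size = n_grid[best], cost = cost_grid[best])
```

A coarser grid is faster but can only locate the minimum to within the grid spacing, roughly N / npoll samples.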

...

Passed to function optimise()

Value

Object of class 'optholdoutsize_emul' with elements "cost" (minimum cost), "size" (optimal holdout size, OHS), "nset", "k2", "var_k2", "N", "k1", "var_u", "k_width", "theta" (parameters)

Examples


# See examples for mu_fn()
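A minimal call might look like the sketch below. It uses only the arguments listed under Usage and the "size" and "cost" elements listed under Value. The input values are illustrative and not taken from any real dataset, and the call is skipped if the package is not installed.

```r
# Illustrative inputs only.
N <- 100000                           # total number of samples
k1 <- 0.4                             # cost per sample with no score in place
nset <- c(10000, 20000, 30000)        # training set sizes already evaluated
k2 <- c(0.35, 0.26, 0.28)             # estimated cost per sample at each size
var_k2 <- c(0.02^2, 0.01^2, 0.03^2)   # variance of error in each k2 estimate

if (requireNamespace("OptHoldoutSize", quietly = TRUE)) {
  # All other arguments are left at the defaults shown under Usage.
  res <- OptHoldoutSize::optimal_holdout_size_emulation(
    nset, k2, var_k2, N = N, k1 = k1
  )
  print(res$size)   # estimated optimal holdout size
  print(res$cost)   # estimated minimum cost
} else {
  message("Package OptHoldoutSize is not installed; skipping the call.")
}
```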

[Package OptHoldoutSize version 0.1.0.0 Index]