R: Repeatedly estimate model using resampling with replacement

mxBootstrap {OpenMx}

R Documentation

Repeatedly estimate model using resampling with replacement

Description

Bootstrapping is used to quantify the variability of parameter estimates. A new sample is drawn from the model data (uniformly sampling the original data with replacement). The model is re-fitted to this new sample. This process is repeated many times. This yields a series of estimates from these replications which can be used to assess the variability of the parameters.
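The resampling loop described above can be sketched in base R. This is only an illustration of the general technique, not how mxBootstrap is implemented: the simulated data, the `lm` regression, and the names `dat`, `fit_slope`, and `boot_est` are all assumptions made for the example.

```r
# Plain-R sketch of a nonparametric bootstrap (not OpenMx's implementation).
set.seed(1)                                   # arbitrary seed, for reproducibility only
n   <- 100
x   <- rnorm(n)
dat <- data.frame(x = x, y = 0.5 * x + rnorm(n))   # simulated data, hypothetical

# Hypothetical helper: fit the model and return the parameter of interest
fit_slope <- function(d) unname(coef(lm(y ~ x, data = d))["x"])

replications <- 200
boot_est <- numeric(replications)
for (r in seq_len(replications)) {
  rows <- sample.int(n, size = n, replace = TRUE)  # uniform resampling with replacement
  boot_est[r] <- fit_slope(dat[rows, ])            # re-fit the model to the new sample
}

boot_se <- sd(boot_est)                            # variability of the estimate
boot_ci <- quantile(boot_est, c(0.025, 0.975))     # percentile interval
```

The spread of `boot_est` across replications is what stands in for the sampling variability of the estimate.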

Note: mxBootstrap only bootstraps free model parameters.

To bootstrap algebras, see mxBootstrapEval.

To report bootstrapped standardized paths in RAM models, first run mxBootstrap on the model, then pass the result to mxBootstrapStdizeRAMpaths.

Usage

mxBootstrap(model, replications=200, ...,
            data=NULL, plan=NULL, verbose=0L,
            parallel=TRUE, only=as.integer(NA),
            OK=mxOption(model, "Status OK"), checkHess=FALSE, unsafe=FALSE)

Arguments

`model`	The MxModel to be run.
`replications`	The number of resampling replications. If available, replications from prior mxBootstrap invocations will be reused.
`...`	Not used. Forces remaining arguments to be specified by name.
`data`	A character vector of data or model names
`plan`	Deprecated
`verbose`	For levels greater than 0, enables runtime diagnostics
`parallel`	Whether to process the replications in parallel (not yet implemented!)
`only`	When provided, only the given replication from a prior run of `mxBootstrap` will be performed. See details.
`OK`	The set of status codes that are considered successful
`checkHess`	Whether to approximate the Hessian in each replication
`unsafe`	A boolean indicating whether to ignore errors.

Details

By default, all datasets in the given model are resampled independently. If resampling is desired from only some of the datasets then the models containing them can be listed in the ‘data’ parameter.

The frequency column in the mxData object is used to represent a resampled dataset. When resampling, the original row proportions, as given by the original frequency column, are respected.
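One way to picture a resample expressed as a frequency column is a single multinomial draw over the rows. The sketch below uses base R only; the vector `orig_freq` is hypothetical, and keeping the total sample size equal to the sum of the original frequencies is an assumption of this example rather than a documented detail of mxBootstrap.

```r
# Sketch: represent one bootstrap resample as a vector of row frequencies.
set.seed(2)                           # arbitrary seed, for reproducibility only
orig_freq <- c(3, 1, 0, 2, 4)         # hypothetical original frequency column
n_total   <- sum(orig_freq)           # assumed: resample keeps the same total size

# Each of the n_total draws picks a row with probability proportional to
# its original frequency (rmultinom normalises 'prob' internally).
new_freq <- as.vector(rmultinom(1, size = n_total, prob = orig_freq))

new_freq                              # frequencies for the resampled dataset
```

A row whose original frequency is zero can never be drawn, and the new frequencies always sum to `n_total`, so the data themselves need not be copied or reordered.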

When the model has a default compute plan and ‘checkHess’ is kept at FALSE, the Hessian will not be approximated or checked. If, on the other hand, ‘checkHess’ is TRUE, the Hessian will be approximated by finite differences. This procedure is of some value because it can be informative to check whether the Hessian is positive definite (see mxComputeHessianQuality). However, approximating the Hessian is often costly in terms of CPU time. For bootstrapping, the parameter estimates derived from the resampled data are typically of primary interest.

On occasion, replications will fail. Sometimes it can be helpful to exactly reproduce a failed replication to attempt to pinpoint the cause of failure. The ‘only’ option facilitates this kind of investigation. In normal operation, mxBootstrap uses the regular R random number generator to generate a seed for each replication. This seed is used to seed an internal pseudorandom number generator (currently the Mersenne Twister algorithm). These per-replication seeds are stored as part of the bootstrap output. When ‘only’ is specified, the associated stored seed is used to seed the internal random number generator so that identical weights can be regenerated.
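The idea of storing one seed per replication and replaying a single replication can be illustrated in base R. This is a sketch only: it uses R's own generator as a stand-in for the internal one, and `seeds`, `draw_weights`, and `all_weights` are hypothetical names, not part of OpenMx.

```r
# Sketch: per-replication seeds allow any one replication to be regenerated exactly.
set.seed(3)                            # arbitrary seed for the "outer" R generator
replications <- 5
n <- 20                                # hypothetical number of data rows

# One stored seed per replication, drawn from the regular R generator
seeds <- sample.int(.Machine$integer.max, replications)

# Hypothetical helper: seed the generator, then draw resampling weights
# (how many times each row appears in the resample).
# Note that set.seed() here resets R's global generator state.
draw_weights <- function(seed, n) {
  set.seed(seed)
  tabulate(sample.int(n, n, replace = TRUE), nbins = n)
}

all_weights <- lapply(seeds, draw_weights, n = n)

# Later: reproduce replication 3 alone from its stored seed
again3 <- draw_weights(seeds[3], n)
identical(all_weights[[3]], again3)
```

Because each replication depends only on its own seed, replaying one does not require rerunning the replications before it.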

mxBootstrap does not currently offer special support for nested, multilevel, or other dependent data structures. mxBootstrap assumes rows of data are independent. Multilevel models and state space models violate the independence assumption employed by mxBootstrap. By default the unsafe argument prevents multilevel and state space models from using mxBootstrap; however, setting unsafe=TRUE allows multilevel and state space models to use bootstrapping under the – perhaps foolish – assumption that the user is sufficiently knowledgeable to interpret the results.

Value

The given model is returned with the compute plan modified to consist of mxComputeBootstrap. Results of the bootstrap replications are stored inside the compute plan. mxSummary can be used to obtain per-parameter quantiles and standard errors.

Examples

library(OpenMx)

data(multiData1)

manifests <- c("x1", "x2", "y")

biRegModelRaw <- mxModel(
  "Regression of y on x1 and x2",
  type="RAM",
  manifestVars=manifests,
  mxPath(from=c("x1","x2"), to="y", 
         arrows=1, 
         free=TRUE, values=.2, labels=c("b1", "b2")),
  mxPath(from=manifests, 
         arrows=2, 
         free=TRUE, values=.8, 
         labels=c("VarX1", "VarX2", "VarE")),
  mxPath(from="x1", to="x2",
         arrows=2, 
         free=TRUE, values=.2, 
         labels=c("CovX1X2")),
  mxPath(from="one", to=manifests, 
         arrows=1, free=TRUE, values=.1, 
         labels=c("MeanX1", "MeanX2", "MeanY")),
  mxData(observed=multiData1, type="raw"))

biRegModelRawOut <- mxRun(biRegModelRaw)

boot <- mxBootstrap(biRegModelRawOut, 10)   # start with 10
summary(boot)

# Looks good, now do the rest
boot <- mxBootstrap(boot)
summary(boot)

# examine replication 3
boot3 <- mxBootstrap(boot, only=3)

print(coef(boot3))
print(boot$compute$output$raw[3,names(coef(boot3))])

[Package OpenMx version 2.21.11 Index]