mxData {OpenMx} | R Documentation |
Create MxData Object
Description
This function creates a new MxData object. This can be used for all forms of analysis (including WLS: see mxFitFunctionWLS). It packages observed data (e.g. a data frame, matrix, or cov or cor matrix) into an object with additional information allowing it to be processed in an mxModel.
Usage
mxData(observed=NULL, type="none", means = NA, numObs = NA, acov=NA, fullWeight=NA,
thresholds=NA, ..., observedStats=NA, sort=NA, primaryKey = as.character(NA),
weight = as.character(NA), frequency = as.character(NA),
verbose = 0L, .parallel=TRUE, .noExoOptimize=TRUE,
minVariance=sqrt(.Machine$double.eps), algebra=c(),
warnNPDacov=TRUE, warnNPDuseWeight=TRUE, exoFree=NULL,
naAction=c("pass","fail","omit","exclude"),
fitTolerance=sqrt(as.numeric(mxOption(key="Optimality tolerance"))),
gradientTolerance=1e-2)
Arguments
Details
The mxData function creates MxData objects used in mxModels. The ‘observed’ argument may take either a data frame or a matrix, which is then described with the ‘type’ argument. Data types describe compatibility and usage with expectation functions in MxModel objects. Three data types are supported (acov is deprecated).
- raw
The contents of the ‘observed’ argument are treated as raw data. Missing values are permitted and must be designated as the system missing value. The ‘means’ and ‘numObs’ arguments cannot be specified, as the ‘means’ argument is not relevant and the ‘numObs’ argument is automatically populated with the number of rows in the data. Data of this type may use fit functions such as mxFitFunctionML or mxFitFunctionWLS. mxFitFunctionML will automatically use full-information maximum likelihood for raw data.
- cov
The contents of the ‘observed’ argument are treated as a covariance matrix. The ‘means’ argument is not required, but may be included for estimations involving means. The ‘numObs’ argument is required, which should reflect the number of observations or rows in the data described by the covariance matrix. Cov data typically use the mxFitFunctionML fit function, depending on the specified model.
- acov
This type was used for WLS data as created by mxDataWLS. Unless you are using summary data, its use is deprecated. Instead, use type = ‘raw’ and an mxFitFunctionWLS. If type ‘acov’ is set, the ‘observed’ argument will (usually) contain raw data and the ‘observedStats’ slot will contain a list of observed statistics.
- cor
The contents of the ‘observed’ argument are treated as a correlation matrix. The ‘means’ argument is not required, but may be included for estimations involving means. The ‘numObs’ argument is required, which should reflect the number of observations or rows in the data described by the correlation matrix. Models with cor data typically use the mxFitFunctionML fit function.
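As a minimal sketch of the three supported types, the calls below build one MxData object of each type from the same simulated data. The variable names x and y, the seed, and the sample size are illustrative assumptions, not part of the package documentation.

```r
library(OpenMx)

# Simulated two-variable data set (illustrative values only)
set.seed(1)
rawDf <- data.frame(x = rnorm(50), y = rnorm(50))

# type = "raw": numObs is populated from the number of rows
rawData <- mxData(observed = rawDf, type = "raw")

# type = "cov": numObs must be supplied; means are optional
covData <- mxData(observed = cov(rawDf), type = "cov",
                  means = colMeans(rawDf), numObs = nrow(rawDf))

# type = "cor": numObs must be supplied
corData <- mxData(observed = cor(rawDf), type = "cor",
                  numObs = nrow(rawDf))
```

Because cov() and cor() applied to a data frame keep the column names as row and column names, the summary matrices here arrive already labelled.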
Note on data handling: OpenMx uses the names of variables to map them onto other elements of your model, such as expectation functions. Thus for data provided as a data.frame, ensure the columns have appropriate names. Covariance and correlation matrices need to have both the row and column names set and these must be identical, for instance by using dimnames = list(varNames, varNames).
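The naming requirement can be checked in base R before the matrix is handed to mxData. The sketch below uses hypothetical variable names and made-up covariance values.

```r
# Hypothetical variable names for a two-variable covariance matrix
varNames <- c("x", "y")
covMat <- matrix(c(1.0, 0.5,
                   0.5, 2.0),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(varNames, varNames))

# Row and column names must both be set and must be identical
namesOk <- !is.null(rownames(covMat)) &&
  identical(rownames(covMat), colnames(covMat))
```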
Correlation data
To obtain accurate parameter estimates and standard errors, it is necessary to constrain the model implied covariance matrix to have unit variances. This constraint is added automatically if you use an mxModel with type='RAM' or type='LISREL'. Otherwise, you will need to add this constraint yourself.
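For a model that is not of type 'RAM' or 'LISREL', one possible way to add the constraint by hand is sketched below. It assumes a free symmetric matrix named expCov and a helper matrix named unitVec (both hypothetical names) and uses mxConstraint with the diag2vec algebra function. The correlation values and numObs are made up, and the model is only built here, not run.

```r
library(OpenMx)

varNames <- c("x", "y")
corMat <- matrix(c(1.0, 0.4,
                   0.4, 1.0),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(varNames, varNames))

corModel <- mxModel(model = "corSketch",
    mxData(observed = corMat, type = "cor", numObs = 100),
    mxMatrix(name = "expCov", type = "Symm", nrow = 2, ncol = 2,
             values = c(1, .2, 1), free = TRUE,
             dimnames = list(varNames, varNames)),
    # Column vector of ones that the model-implied variances must equal
    mxMatrix(name = "unitVec", type = "Unit", nrow = 2, ncol = 1),
    # Constrain the diagonal of expCov to unit variances
    mxConstraint(diag2vec(expCov) == unitVec, name = "unitVariances"),
    mxExpectationNormal("expCov", dimnames = varNames),
    mxFitFunctionML()
)
```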
WLS data
The observedStats contains the following named objects: cov, slope, means, asymCov, useWeight, and thresholds.
‘cov’ The (polychoric) covariance matrix of raw data variables. An error is raised if any variance is smaller than minVariance.
‘slope’ The regression coefficients from all exogenous predictors to all observed variables. Required for exogenous predictors.
‘means’ The means of the data variables. Required for estimations involving means.
‘thresholds’ Thresholds of ordinal variables. Required for models including ordinal variables.
‘asymCov’ The asymptotic covariance matrix (all entries non-zero). This matrix is sample size independent. Lavaan's NACOV is comparable to asymCov multiplied by N^2.
‘useWeight’ (optional) The weight matrix used in the mxFitFunctionWLS. Can be dense or diagonal for diagonally weighted least squares. This matrix is scaled by the sample size. Lavaan's WLS.V is comparable to useWeight.
A simple Newton-Raphson optimizer is used to obtain the summary statistics from the raw data. There are two parameters that control the accuracy of the optimization. In a first pass, the fit function is optimized to ‘fitTolerance’. However, the fit function becomes imprecise as the amount of data increases due to catastrophic cancellation. To fine-tune the fit, the gradient is optimized to ‘gradientTolerance’.
note: WLS data typically use the mxFitFunctionWLS function.
IMPORTANT: The WLS interface is under heavy development to support both very fast backend processing of raw data while continuing to support modeling applications which require direct access to the object in the front end. Some user-interface changes should be expected as we optimize both these workflows.
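A minimal sketch of the recommended raw-data route to WLS follows: the data are supplied with type = "raw" and the model uses mxFitFunctionWLS, which leaves the summary statistics to be computed from the raw data. The data, the model name, and the matrix name expCov are illustrative assumptions, and the model is only constructed here, not run.

```r
library(OpenMx)

# Simulated continuous data (illustrative values only)
set.seed(2)
wlsDf <- data.frame(x = rnorm(200), y = rnorm(200))
varNames <- names(wlsDf)

wlsModel <- mxModel(model = "wlsSketch",
    # Raw data; no summary statistics are supplied by hand
    mxData(observed = wlsDf, type = "raw"),
    mxMatrix(name = "expCov", type = "Symm", nrow = 2, ncol = 2,
             values = c(1, 0, 1), free = TRUE,
             dimnames = list(varNames, varNames)),
    mxExpectationNormal("expCov", dimnames = varNames),
    mxFitFunctionWLS()
)
```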
Missing values
For raw data, the ‘naAction’ option controls the treatment of missing values. When set to ‘pass’, the data is passed as-is. When set to ‘fail’, the presence of any missing value will trigger an error. When set to ‘omit’, missing data will be discarded row-wise. For example, a single missing value in a row will cause the whole row to be discarded. When set to ‘exclude’, rows with missing data are retained but their ‘frequency’ is set to zero.
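The effect of each option can be mimicked in base R. The sketch below only illustrates the behaviour described above on a made-up data frame; it is not how OpenMx implements ‘naAction’ internally, and the column name freq is a hypothetical choice.

```r
# Small data frame with one incomplete row (illustrative values)
obs <- data.frame(x = c(1.2, NA, 0.7),
                  y = c(0.3, 0.9, 1.1))
incomplete <- !complete.cases(obs)

# "pass": the data are used as-is
passed <- obs

# "fail": the presence of any missing value is treated as an error
wouldFail <- anyNA(obs)

# "omit": rows with any missing value are dropped
omitted <- obs[!incomplete, , drop = FALSE]

# "exclude": all rows are kept, but incomplete rows get frequency zero
excluded <- obs
excluded$freq <- ifelse(incomplete, 0L, 1L)
```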
Weights
In the case of raw data, the optional ‘weight’ argument names a column in the data that contains per-row weights. Similarly, the optional ‘frequency’ argument names a column in the ‘observed’ data that contains per-row frequencies. Frequencies must be integers but weights can be arbitrary real numbers. For data with many repeated response patterns, organizing the data into unique patterns and frequencies can reduce model evaluation time.
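As a sketch of the pattern-and-frequency layout, the base R code below collapses repeated response patterns into unique rows with a count column. The data and the column name freq are illustrative assumptions.

```r
# Binary responses with repeated patterns (illustrative data)
responses <- data.frame(a = c(0, 1, 1, 0, 1, 1),
                        b = c(0, 1, 0, 0, 1, 0))

# Count how often each unique response pattern occurs
responses$freq <- 1L
patterns <- aggregate(freq ~ a + b, data = responses, FUN = sum)

# The compressed table could then be supplied as, for example:
# mxData(observed = patterns, type = "raw", frequency = "freq")
```

Six rows collapse to three unique patterns here; the saving grows with the number of repeated patterns in the data.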
In some cases, the fit function can be evaluated more efficiently when data are sorted. When a primary key is provided, sorting is disabled. Otherwise, sort defaults to TRUE.
The mxData function does not currently place restrictions on the size, shape, or symmetry of matrices input into the ‘observed’ argument. While it is possible to specify MxData objects as covariance or correlation matrices that do not have the properties commonly associated with these matrices, failure to correctly specify these matrices will likely lead to problems in model estimation.
note: MxData objects may not be included in mxAlgebras nor in the mxFitFunctionAlgebra function. To reference data in these functions, use an mxMatrix or a definition variable (data.var) label.
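A definition variable is typically set up as a fixed mxMatrix whose label has the form data.var. In the sketch below, the data column age and the matrix name Age are hypothetical.

```r
library(OpenMx)

# 1 x 1 fixed matrix whose value is taken, row by row, from the data
# column "age" (hypothetical column name)
ageDef <- mxMatrix(type = "Full", nrow = 1, ncol = 1, free = FALSE,
                   labels = "data.age", name = "Age")
```

The matrix Age, rather than the MxData object itself, can then be referenced inside an mxAlgebra.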
Also, while column names are stored in the ‘observed’ slot of MxData objects, these names are not automatically recognized as variable names in mxPaths in RAM models. These models use the ‘manifestVars’ argument of the mxModel function to explicitly identify the variables used in the model.
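The sketch below shows a RAM model in which the observed variables are declared through ‘manifestVars’ so that mxPath can refer to them by name. The data, model name, and starting values are illustrative assumptions, and the model is only constructed, not run.

```r
library(OpenMx)

# Simulated data (illustrative values only)
set.seed(3)
ramDf <- data.frame(x = rnorm(100), y = rnorm(100))

ramModel <- mxModel(model = "ramSketch", type = "RAM",
    # Names must match the column names of the observed data
    manifestVars = c("x", "y"),
    mxData(observed = ramDf, type = "raw"),
    # Variances of x and y
    mxPath(from = c("x", "y"), arrows = 2, free = TRUE, values = 1),
    # Covariance between x and y
    mxPath(from = "x", to = "y", arrows = 2, free = TRUE, values = 0),
    # Means of x and y
    mxPath(from = "one", to = c("x", "y"), arrows = 1,
           free = TRUE, values = 0)
)
```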
Value
Returns a new MxData object.
References
The OpenMx User's guide can be found at https://openmx.ssri.psu.edu/documentation/.
See Also
To generate data, see mxGenerateData. For objects which may be entered as arguments in the ‘observed’ slot, see matrix and data.frame. See MxData for the S4 class created by mxData. For WLS data, see mxDataWLS (deprecated). More information about the OpenMx package may be found here.
Examples
library(OpenMx)
# Simple covariance model. See other mxFitFunctions for examples with different data types
# 1. Create a covariance matrix of x and y
covMatrix <- matrix(nrow = 2, ncol = 2, byrow = TRUE,
    c(0.77642931, 0.39590663,
      0.39590663, 0.49115615)
)
covNames <- c("x", "y")
dimList <- list(covNames, covNames)
dimnames(covMatrix) <- dimList
# 2. Create an MxData object from covMatrix
testData <- mxData(observed=covMatrix, type="cov", numObs = 100)
# 3. Build a model that uses the MxData object
testModel <- mxModel(model="testModel2",
    mxMatrix(name="expCov", type="Symm", nrow=2, ncol=2,
             values=c(.2,.1,.2), free=TRUE, dimnames=dimList),
    mxExpectationNormal("expCov", dimnames=covNames),
    mxFitFunctionML(),
    testData
)
outModel <- mxRun(testModel)
summary(outModel)