| stableSpec {stablespec} | R Documentation |
Stable specifications of constrained structural equation models.
Description
Search for stable specifications (structures) of constrained structural equation models.
Usage
stableSpec(theData = NULL, nSubset = NULL, iteration = NULL,
nPop = NULL, mutRate = NULL, crossRate = NULL, longitudinal = NULL,
numTime = NULL, seed = NULL, co = NULL, consMatrix = NULL,
threshold = NULL, toPlot = NULL, mixture = NULL, log = NULL)
Arguments
theData |
a data frame containing the data to which the model will be
fit. If argument |
nSubset |
number of subsets to draw. In practice, it is suggested to have at least 25 subsets. The default is 10. |
iteration |
number of iterations/generations for NSGA-II. |
nPop |
population size (number of models) in a generation. The default is 50. |
mutRate |
mutation rate. The default is 0.075. |
crossRate |
crossover rate. The default is 0.85. |
longitudinal |
|
numTime |
number of time slices. If the data is cross-sectional, this argument must be set to 1. |
seed |
integer vector representing seeds that are used to subsample data.
The default is an integer vector with range |
co |
whether to use |
consMatrix |
|
threshold |
threshold of stability selection. The default is 0.6. |
toPlot |
if |
mixture |
if the data contains both continuous and
categorical (or ordinal) variables, this argument can be set
to |
log |
an optional logfile to monitor the progress of the algorithm. |
Details
This function performs an exploratory search over recursive (acyclic) SEM models. Models are scored on two objectives: model fit and model complexity. Because these objectives often conflict, NSGA-II is used to search for Pareto-optimal models. To handle the instability of small finite data samples, the data are repeatedly subsampled, and the substructures that are both stable and parsimonious are selected and then used to infer a causal model.
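The notion of Pareto optimality used above can be illustrated with a minimal sketch. It is not the package's NSGA-II implementation: the scores, the column names misfit and complexity, and the helper is_dominated are hypothetical, and both objectives are assumed to be minimized.

```r
# Hypothetical scores for five candidate models (lower is better for both).
scores <- data.frame(
  model      = paste0("m", 1:5),
  misfit     = c(10.0, 7.5, 7.5, 4.0, 3.9),
  complexity = c(1, 2, 3, 4, 6)
)

# Model i is dominated if some other model is no worse on both objectives
# and strictly better on at least one of them.
is_dominated <- function(i, misfit, complexity) {
  no_worse <- misfit <= misfit[i] & complexity <= complexity[i]
  better   <- misfit <  misfit[i] | complexity <  complexity[i]
  any(no_worse & better)
}

dominated <- vapply(seq_len(nrow(scores)), is_dominated, logical(1),
                    misfit = scores$misfit, complexity = scores$complexity)

# The non-dominated models form the Pareto front.
pareto_models <- scores$model[!dominated]
print(pareto_models)
```

In this toy example, model m3 is dominated by m2, which has the same misfit at lower complexity, so it is left out of the front.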
Value
a list of the following elements:
- listofFronts is a list of optimal models for the whole range of model complexity of all subsets.
- causalStab is a list of causal path stability for the whole range of model complexity.
- causalStab_l1 is a list of stability of causal paths of length 1 for the whole range of model complexity.
- edgeStab is a list of edge stability for the whole range of model complexity.
- relCausalPath is an n by n matrix of relevant causal paths, where n is the number of variables. Each positive element i,j represents the stability of the causal path from i to j.
- relCausalPath_l1 is an n by n matrix of relevant causal paths of length 1, where n is the number of variables. Each positive element i,j represents the stability of the causal path of length 1 from i to j.
- relEdge is an n by n matrix of relevant edges, where n is the number of variables. Each positive element i,j represents the stability of the edge between i and j. If argument toPlot = TRUE, then a visualization of relevant model structures is generated. Otherwise an object of graph is returned. An arc represents a causal path, and an (undirected) edge represents a strong association whose direction is undecidable. The graph is annotated with reliability scores, which are the highest selection probabilities in the top-left region of the edge stability graph.
- allSeed is an integer vector of the seeds used in subsampling the data. It can be used to replicate the result in a later computation.
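As an illustration of how a relevance matrix such as relCausalPath might be read, the following sketch lists the positive entries as a table of paths. The matrix values here are made up for the example and do not come from stableSpec(); the names rel_causal_path and paths are hypothetical.

```r
# Toy stand-in for a relCausalPath matrix (hypothetical values, 3 variables).
rel_causal_path <- matrix(c(0, 0.8, 0.7,
                            0, 0,   0.9,
                            0, 0,   0),
                          nrow = 3, byrow = TRUE)

# Positions of the positive entries: row i is the cause, column j the effect.
idx <- which(rel_causal_path > 0, arr.ind = TRUE)

# One row per relevant path, ordered from most to least stable.
paths <- data.frame(from      = idx[, "row"],
                    to        = idx[, "col"],
                    stability = rel_causal_path[idx])
paths <- paths[order(-paths$stability), ]
rownames(paths) <- NULL
print(paths)
```

The same pattern would apply to relCausalPath_l1; for relEdge, an entry i,j describes an edge between i and j rather than a directed path.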
Author(s)
Ridho Rahmadi r.rahmadi@cs.ru.nl, Perry Groot, Tom Heskes. Christoph Stich contributed the parallel support.
References
Rahmadi, R., Groot, P., Heins, M., Knoop, H., and Heskes, T. (2016). Causality on cross-sectional data: Stable specification search in constrained structural equation modeling. Applied Soft Computing, ISSN 1568-4946, http://www.sciencedirect.com/science/article/pii/S1568494616305130.
Rahmadi, R., Groot, P., Heins, M., Knoop, H., and Heskes, T. (2015). Causality on longitudinal data: Stable specification search in constrained structural equation modeling. Proceedings of AALTD 2015, 101.
Fox, J., Nie, Z., and Byrnes, J. (2015). sem: Structural Equation Models. R package version 3.1-6. https://CRAN.R-project.org/package=sem
Ching-Shih Tsou (2013). nsga2R: Elitist Non-dominated Sorting Genetic Algorithm based on R. R package version 1.0. https://CRAN.R-project.org/package=nsga2R
Kalisch, M., Maechler, M., Colombo, D., Maathuis, M. H., and Buehlmann, P. (2012). Causal inference using graphical models with the R package pcalg. Journal of Statistical Software, 47(11), 1-26.
Meinshausen, N., and Buehlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), 417-473.
Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), 182-197.
Chickering, D. M. (2002). Learning equivalence classes of Bayesian-network structures. The Journal of Machine Learning Research, 2, 445-498.
Examples
# Cross-sectional data example
# with an artificial data set of six continuous variables.
# Details about the data set can be found in the documentation.
# As an example, we run only one subset.
# Note that stableSpec() uses foreach to support
# parallel computation, which may issue a warning
# when run sequentially, as in the following example.
# The warning can be ignored.
the_data <- crossdata6V
numSubset <- 1
num_iteration <- 5
num_pop <- 10
mut_rate <- 0.075
cross_rate <- 0.85
longi <- FALSE
num_time <- 1
the_seed <- NULL
the_co <- "covariance"
# assumed that variable 5 does not cause variables 1, 2, and 3
cons_matrix <- matrix(c(5, 1, 5, 2, 5, 3), 3, 2, byrow=TRUE)
th <- 0.1
to_plot <- FALSE
mix <- FALSE
result <- stableSpec(theData=the_data, nSubset=numSubset,
                     iteration=num_iteration,
                     nPop=num_pop, mutRate=mut_rate, crossRate=cross_rate,
                     longitudinal=longi, numTime=num_time, seed=the_seed,
                     co=the_co, consMatrix=cons_matrix, threshold=th,
                     toPlot=to_plot, mixture=mix)
##########################################################
## Parallel computation is possible by
## registering a parallel backend, e.g., with package doParallel.
## For example, add the following lines at the top of
## the example above.
#
# library(parallel)
# library(doParallel)
# cl <- makeCluster(detectCores())
# registerDoParallel(cl)
#
## Then call stableSpec() as normal.
##
## Note that makeCluster() and detectCores() are
## from package parallel, and registerDoParallel()
## is from package doParallel. For more details,
## see the documentation of those packages.
###########################################################