pcgen {pcgen}    R Documentation

Causal inference with genetic effects

Description

Reconstruction of directed networks with random genetic effects, based on phenotypic observations. The pcgen algorithm is a modification of the pc-stable algorithm of Colombo & Maathuis (2014). The method assumes that there are replicates and that the genetic effects are independent.

Usage

pcgen(suffStat, covariates = NULL, QTLs = integer(), alpha = 0.01, m.max = Inf,
fixedEdges = NULL, fixedGaps = NULL, verbose = FALSE, use.res = FALSE, 
res.cor = NULL, max.iter = 50, stop.if.significant = TRUE, return.pvalues = FALSE)

Arguments

suffStat

A data.frame whose first column is the factor G (genotype); subsequent columns contain the traits and, optionally, some QTLs. The first column should be named G. suffStat should not contain covariates.

covariates

A data.frame containing covariates that are always included in each conditional independence test. Should be either NULL (default) or a data.frame with the same number of rows as suffStat. An intercept is already included for each trait in suffStat; covariates should not contain a column of ones.

QTLs

Column numbers in suffStat that correspond to QTLs.

alpha

The significance level used in each conditional independence test. Default is 0.01.

m.max

Maximum size of the conditioning sets. Default is Inf (no limit).

fixedEdges

A logical matrix of dimension (p+1) x (p+1), where p is the number of traits. The first row and column refer to the node G, and subsequent rows and columns to the traits. As in the pcalg package, the edge i - j is never considered for removal if the entry [i, j] or [j, i] (or both) is TRUE. In that case, the edge is guaranteed to be present in the resulting graph.

fixedGaps

A logical matrix of dimension (p+1) x (p+1), where p is the number of traits. The first row and column refer to the node G, and subsequent rows and columns to the traits. As in the pcalg package, the edge i - j is removed before starting the algorithm if the entry [i, j] or [j, i] (or both) is TRUE. In that case, the edge is guaranteed to be absent in the resulting graph.

verbose

If TRUE, the p-values of the conditional independence tests are printed.

use.res

If TRUE, the test for conditional independence of two traits given a set of other traits and G is based on residuals from GBLUP. If FALSE (the default), it is based on bivariate mixed models.

res.cor

If use.res = TRUE, res.cor should be the correlation matrix of the residuals from the GBLUP. These can be obtained with the getResiduals function. See the example below.

max.iter

Maximum number of iterations in the EM-algorithm, used to fit the bivariate mixed model (when use.res = FALSE).

stop.if.significant

If TRUE, the EM-algorithm used in some of the conditional independence tests (when use.res = FALSE) is stopped as soon as the p-value becomes significant, i.e. falls below alpha. This speeds up calculations, and is valid because (1) the PC algorithm only needs an accept/reject decision, and (2) in EM the likelihood is nondecreasing. Should be set to FALSE if the precise p-values are of interest.

return.pvalues

If TRUE, the maximal p-value for each edge is returned.
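
The layout of suffStat, covariates, fixedEdges and fixedGaps described above can be sketched in base R as follows. This is a minimal illustration on made-up data, not part of the package: the object names (d, covs), the trait names Y1 to Y3 and the covariate z are hypothetical. Both [i, j] and [j, i] are set to keep the matrices symmetric, which is a cautious choice rather than a documented requirement of pcgen. The pcgen call itself is left commented out so that the block runs without the package installed.

```r
# Toy data: 10 genotypes with 2 replicates each (made-up values)
set.seed(1)
n.geno <- 10
n.rep  <- 2
n      <- n.geno * n.rep

# suffStat layout: first column is the factor G, followed by the traits
d <- data.frame(G  = factor(rep(paste0("g", seq_len(n.geno)), each = n.rep)),
                Y1 = rnorm(n),
                Y2 = rnorm(n),
                Y3 = rnorm(n))

# Covariates: same number of rows as suffStat, no column of ones
covs <- data.frame(z = rnorm(n))
stopifnot(names(d)[1] == "G", is.factor(d$G), nrow(covs) == nrow(d))

# p traits give (p+1) x (p+1) matrices; row/column 1 is G, then Y1, Y2, Y3
p <- ncol(d) - 1
fixedGaps  <- matrix(FALSE, p + 1, p + 1)
fixedEdges <- matrix(FALSE, p + 1, p + 1)

# Forbid the edge Y1 - Y3 (rows/columns 2 and 4)
fixedGaps[2, 4] <- fixedGaps[4, 2] <- TRUE

# Force the edge G - Y1 (rows/columns 1 and 2)
fixedEdges[1, 2] <- fixedEdges[2, 1] <- TRUE

# Possible call (requires the pcgen package; not run here):
# out <- pcgen(suffStat = d, covariates = covs,
#              fixedGaps = fixedGaps, fixedEdges = fixedEdges)
```
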

Details

The pcgen function is based on the pc function from the pcalg package (Kalisch et al. (2012) and Hauser and Buhlmann (2012)).

Value

If return.pvalues = FALSE, the output is a graph (an object with S3 class "pcgen"). If return.pvalues = TRUE, the output is a list with elements gr (the graph) and pMax (a matrix with the p-values).
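
When return.pvalues = TRUE, the pMax element can be turned into an edge list. The sketch below uses a hand-written mock matrix, not real pcgen output, and assumes that pMax follows the pcalg convention: a symmetric matrix indexed in the order G, traits, in which an edge is retained when its maximal p-value is below alpha. Whether pcgen's pMax matches this convention exactly is an assumption. Note also that with stop.if.significant = TRUE the p-values may not be precise.

```r
# Mock pMax for the nodes G, Y1, Y2, Y3 (illustrative values only)
nodes <- c("G", "Y1", "Y2", "Y3")
pMax <- matrix(c(NA,   1e-4, 0.30, 2e-3,
                 1e-4, NA,   5e-3, 0.45,
                 0.30, 5e-3, NA,   1e-6,
                 2e-3, 0.45, 1e-6, NA),
               nrow = 4, byrow = TRUE,
               dimnames = list(nodes, nodes))
alpha <- 0.01

# Retained edges: maximal p-value below alpha; the upper triangle is used
# so that each undirected pair is listed once
keep <- which(pMax < alpha & upper.tri(pMax), arr.ind = TRUE)

edges <- data.frame(from = nodes[keep[, "row"]],
                    to   = nodes[keep[, "col"]],
                    pMax = pMax[keep],
                    stringsAsFactors = FALSE)

# Sort from strongest to weakest evidence for the edge
edges <- edges[order(edges$pMax), ]
print(edges)
```
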

Author(s)

Willem Kruijer and Pariya Behrouzi. Maintainers: Willem Kruijer willem.kruijer@wur.nl and Pariya Behrouzi pariya.behrouzi@gmail.com

References

1. Kruijer, W., Behrouzi, P., Rodriguez-Alvarez, M. X., Wit, E. C., Mahmoudi, S. M., Yandell, B. and Van Eeuwijk, F., 2018 (in preparation). Reconstruction of networks with direct and indirect genetic effects.
2. Colombo, D. and Maathuis, M.H., 2014. Order-independent constraint-based causal structure learning. The Journal of Machine Learning Research, 15(1), pp.3741-3782.
3. Kalisch, M., Machler, M., Colombo, D., Maathuis, M.H. and Buhlmann, P., 2012. Causal inference using graphical models with the R package pcalg. Journal of Statistical Software, 47(11), pp.1-26.
4. Hauser, A. and Buhlmann, P., 2012. Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research, 13(Aug), pp.2409-2464.

See Also

getResiduals

Examples


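# Load the simulated example data shipped with the package
data(simdata)

# Default analysis: tests based on bivariate mixed models (use.res = FALSE)
out <- pcgen(simdata)

# Residual-based analysis: tests use the correlation matrix of the GBLUP residuals
rs <- getResiduals(suffStat = simdata)
pc.fit1 <- pcgen(suffStat = simdata, alpha = 0.01, verbose = TRUE,
                 use.res = TRUE, res.cor = cor(rs))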

[Package pcgen version 0.2.0 Index]