jointIda {pcalg} | R Documentation |
Estimate Multiset of Possible Total Joint Effects
Description
jointIda()
estimates the multiset of possible total joint effects
of a set of intervention variables (X
) on another variable (Y
)
from observational data. This is a version of ida
that
allows multiple simultaneous interventions.
Usage
jointIda(x.pos, y.pos, mcov, graphEst = NULL, all.pasets = NULL,
technique = c("RRC", "MCD"), type = c("pdag", "cpdag", "dag"))
Arguments
x.pos |
(integer vector) positions of the intervention variables
|
y.pos |
(integer) position of variable |
mcov |
(estimated) covariance matrix. |
graphEst |
(graphNEL object) Estimated CPDAG or PDAG. The CPDAG is typically from |
all.pasets |
(an optional argument and the default is
|
technique |
character string specifying the technique that will be used to estimate the total joint causal effects (given the parent sets), see details below.
|
type |
Type of graph |
Details
It is assumed that we have observational data that are multivariate
Gaussian and faithful to the true (but unknown) underlying causal DAG
(without hidden variables). Under these assumptions, this function
estimates the multiset of possible total joint effects of X
on
Y
. Here the total joint effect of X = (X_1,X_2)
on
Y
is defined via Pearl's do-calculus as the vector
(E[Y|do(X_1=x_1+1,X_2=x_2)]-E[Y|do(X_1=x_1,X_2=x_2)], E[Y|do(X_1=x_1,X_2=x_2+1)]-E[Y|do(X_1=x_1,X_2=x_2)])
,
with a similar definition for more than two variables. These values
are equal to the partial derivatives (evaluated at (x_1,x_2)
) of
E[Y|do(X=x_1',X_2=x_2')]
with respect to x_1'
and
x_2'
. Moreover, under the Gaussian assumption, these partial
derivatives do not depend on the values at which they are evaluated.
We estimate a multiset of possible total joint effects instead of the unique total joint effect, since it is typically impossible to identify the latter when the true underlying causal DAG is unknown (even with an infinite amount of data).
Conceptually, the method
works as follows. First, we estimate the CPDAG or PDAG based on the data.
The CPDAG represents the equivalence class of DAGs and can be estimated
from observational data with the function pc
(see the help file of this function).
The PDAG contains more orientations than the CPDAG and thus, represents a
smaller equivalence class of DAGs, compared to the CPDAG. We can obtain a PDAG if
we have background knowledge of, for example, certain edge orientations of
undirected edges in the CPDAG. We obtain the PDAG by adding these
orientations to the CPDAG using the function addBgKnowledge
(see the help file of this function).
Then using the CPDAG or PDAG we extract a collection of "jointly valid" parent sets of the intervention variables from the
estimated CPDAG. For each set of jointly valid parent sets we apply
RRC (recursive regressions for causal effects) or MCD (modifying the
Cholesky decomposition) to estimate the total joint effect of X
on Y
from the sample covariance matrix (see Section 3 of Nandy et. al, 2015).
Value
A matrix representing the multiset containing the estimated
possible total joint effects of X
on Y
. The number of
rows is equal to length(x.pos)
, i.e., each column represents a
vector of possible joint causal effects.
Note
For a single variable X
, jointIda()
estimates the
same quantities as ida()
.
If graphEst
is of type = "cpdag"
, jointIda()
obtains
all.pasets
by using the semi-local approach described in Section 5 in Nandy et. al, (2015).
Nandy et. al, (2015) show that jointIda()
yields
correct multiplicities of the distinct elements of the resulting multiset (in the sense that it matches
ida()
with method="global"
up to a constant factor).
If graphEst
is of type = "pdag"
, jointIda()
obtains
all.pasets
by using the semi-local approach described in Algorithm 2,
Section 4.2 in Perkovic et. al (2017). For this case, jointIda()
does not necessarily yield
the correct multiplicities of the distinct elements of the resulting multiset (it behaves similarly to
ida()
with method="local"
).
jointIda()
(like idaFast
) also allows direct
computation of the total joint effect of a set of intervention
variables X
on another set of target variables Y
. In
this case, y.pos
must be an integer vector containing positions
of the target variables Y
in the covariance matrix and the
output is a list of matrices that correspond to the variables in
Y
in the same order. This method is slightly more efficient
than looping over jointIda()
with single target variables, if
all.pasets
is not specified.
Author(s)
Preetam Nandy, Emilija Perkovic
References
P. Nandy, M.H. Maathuis and T.S. Richardson (2017). Estimating the effect of joint interventions from observational data in sparse high-dimensional settings. In Annals of Statistics.
E. Perkovic, M. Kalisch and M.H. Maathuis (2017). Interpreting and using CPDAGs with background knowledge. In Proceedings of UAI 2017.
See Also
ida
, the simple version;
pc
for estimating a CPDAG.
Examples
## Create a weighted DAG
p <- 6
V <- as.character(1:p)
edL <- list(
"1" = list(edges=c(3,4), weights=c(1.1,0.3)),
"2" = list(edges=c(6), weights=c(0.4)),
"3" = list(edges=c(2,4,6),weights=c(0.6,0.8,0.9)),
"4" = list(edges=c(2),weights=c(0.5)),
"5" = list(edges=c(1,4),weights=c(0.2,0.7)),
"6" = NULL)
myDAG <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed") ## true DAG
myCPDAG <- dag2cpdag(myDAG) ## true CPDAG
myPDAG <- addBgKnowledge(myCPDAG,1,3) ## true PDAG with background knowledge 1 -> 3
covTrue <- trueCov(myDAG) ## true covariance matrix
n <- 1000
## simulate Gaussian data from the true DAG
dat <- if (require("mvtnorm")) {
set.seed(123)
rmvnorm(n, mean=rep(0,p), sigma=covTrue)
} else readRDS(system.file(package="pcalg", "external", "N_6_1000.rds"))
## estimate CPDAG and PDAG -- see help(pc), help(addBgKnoweldge)
suffStat <- list(C = cor(dat), n = n)
pc.fit <- pc(suffStat, indepTest = gaussCItest, p = p, alpha = 0.01, u2pd="relaxed")
pc.fit.pdag <- addBgKnowledge(pc.fit@graph,1,3)
if (require(Rgraphviz)) {
## plot the true and estimated graphs
par(mfrow = c(1,3))
plot(myDAG, main = "True DAG")
plot(pc.fit, main = "Estimated CPDAG")
plot(pc.fit.pdag, main = "Estimated PDAG")
}
## Suppose that we know the true CPDAG and covariance matrix
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myCPDAG, technique="RRC", type = "cpdag")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myCPDAG, technique="MCD", type = "cpdag")
## Suppose that we know the true PDAG and covariance matrix
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myPDAG, technique="RRC", type = "pdag")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myPDAG, technique="MCD", type = "pdag")
## Instead of knowing the true CPDAG or PDAG, it is enough to know only
## the jointly valid parent sets of the intervention variables
## to use RRC or MCD
## all.jointly.valid.pasets:
ajv.pasets <- list(list(5,c(3,4)),list(integer(0),c(3,4)),list(3,c(3,4)))
jointIda(x.pos=c(1,2), y.pos=6, covTrue, all.pasets=ajv.pasets, technique="RRC")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, all.pasets=ajv.pasets, technique="MCD")
## From the true DAG, we can compute the true total joint effects
## using RRC or MCD
cat("Dim covTrue: ", dim(covTrue),"\n")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myDAG, technique="RRC", type = "dag")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myDAG, technique="MCD", type = "dag")
## When working with data, we have to use the estimated CPDAG or PDAG
## and the sample covariance matrix
jointIda(x.pos=c(1,2), y.pos=6, cov(dat), graphEst=pc.fit@graph, technique="RRC", type = "cpdag")
jointIda(x.pos=c(1,2), y.pos=6, cov(dat), graphEst=pc.fit@graph, technique="MCD", type = "cpdag")
jointIda(x.pos=c(1,2), y.pos=6, cov(dat), graphEst=pc.fit.pdag, technique="RRC", type = "pdag")
jointIda(x.pos=c(1,2), y.pos=6, cov(dat), graphEst=pc.fit.pdag, technique="MCD", type = "pdag")
## RRC and MCD can produce different results when working with data
## jointIda also works when x.pos has length 1 and in the following example
## it gives the same result as ida() (see Note)
##
## When the CPDAG is known
jointIda(x.pos=1, y.pos=6, covTrue, graphEst=myCPDAG, technique="RRC", type = "cpdag")
ida(x.pos=1, y.pos=6, covTrue, graphEst=myCPDAG, method="global", type = "cpdag")
## When the PDAG is known
jointIda(x.pos=1, y.pos=6, covTrue, graphEst=myPDAG, technique="RRC", type = "pdag")
ida(x.pos=1, y.pos=6, covTrue, graphEst=myPDAG, method="global", type = "pdag")
## When the DAG is known
jointIda(x.pos=1, y.pos=6, covTrue, graphEst=myDAG, technique="RRC", type = "dag")
ida(x.pos=1, y.pos=6, covTrue, graphEst=myDAG, method="global")
## Note that, causalEffect(myDAG,y=6,x=1) does not give the correct value in this case,
## since this function requires that the variables are in a causal order.