is.submodel {cna} | R Documentation

The function `is.submodel` checks for each element of a vector of `cna` solution formulas whether it is a submodel of a specified target model `y`. If `y` is the true model in an inverse search (i.e. the ground truth), `is.submodel` identifies the correct models in the `cna` output (see Baumgartner and Thiem 2020, Baumgartner and Ambuehl 2020).

    is.submodel(x, y, strict = FALSE)
    identical.model(x, y)

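| Argument | Description |
| --- | --- |
| `x` | Character vector of atomic and/or complex solution formulas (asf/csf). Must be of length 1 in `identical.model`. |
| `y` | Character string of length 1 specifying the target asf or csf. |
| `strict` | Logical; if `TRUE`, an element of `x` counts as a submodel of `y` only if it is a proper part of `y` (i.e. not identical to `y`). |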

To benchmark the reliability of a method of causal inference, it must be tested to what degree the method recovers the true data-generating structure *Δ*, or proper substructures of *Δ*, from data of varying quality. Reliability benchmarking is done in so-called *inverse searches*, which reverse the order of causal discovery as normally conducted in scientific practice. An inverse search comprises three steps: (1) a causal structure *Δ* is drawn or presupposed as ground truth, (2) artificial data *δ* is simulated from *Δ*, possibly featuring various deficiencies (e.g. noise, limited diversity, or measurement error), and (3) *δ* is processed by the benchmarked method in order to check whether its output meets the tested reliability benchmark (e.g. whether the output is true of or identical to *Δ*).
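The three steps can be sketched in R as follows. This is a minimal illustration, not a full benchmark: the ground truth is fixed by hand instead of being drawn at random, the data are ideal (no noise or fragmentation), and the variable names `groundTruth`, `fullData`, `idealData`, `recovered` and `result` are chosen here for illustration only. It assumes the `cna` package is installed and that `allCombs`, `selectCases`, `cna` and `csf` behave as documented in that package.

```r
library(cna)

# Step (1): presuppose a causal structure as ground truth.
groundTruth <- "(A*b + a*B <-> C)*(C*d + c*D <-> E)"

# Step (2): simulate ideal data. Start from all 2^5 configurations of five
# binary factors and keep only those compatible with the ground truth.
fullData  <- allCombs(c(2, 2, 2, 2, 2)) - 1
idealData <- selectCases(groundTruth, fullData)

# Step (3): process the data with the benchmarked method (here cna) and
# check each recovered complex solution formula against the ground truth.
ana       <- cna(idealData)
recovered <- csf(ana)$condition
result    <- is.submodel(recovered, groundTruth)
result
```

In a realistic benchmark, steps (1) to (3) would be repeated over many randomly drawn structures and over data with deliberately introduced deficiencies.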

The main purpose of `is.submodel` is to execute step (3) of an inverse search that is tailor-made to test the reliability of `cna` [with `randomConds` and `selectCases` designed for steps (1) and (2), respectively]. A solution formula `x` being a submodel of a target formula `y` means that all the causal claims entailed by `x` are true of `y`, which is the case if a causal interpretation of `x` entails conjunctive and disjunctive causal relevance relations that are all likewise entailed by a causal interpretation of `y`. More specifically, `x` is a submodel of `y` if, and only if, the following conditions are satisfied: (i) all factor values causally relevant according to `x` are also causally relevant according to `y`, (ii) all factor values contained in two different disjuncts in `x` are also contained in two different disjuncts in `y`, (iii) all factor values contained in the same conjunct in `x` are also contained in the same conjunct in `y`, and (iv) if `x` is a csf with more than one asf, (i) to (iii) are satisfied for all asfs in `x`. For more details see Baumgartner and Thiem (2020) or Baumgartner and Ambuehl (2020, online appendix).
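Conditions (i) to (iii) can be illustrated with three small candidate formulas checked against one target asf. The sketch below assumes the `cna` package is installed; the names `target`, `res_i`, `res_ii` and `res_iii` are illustrative, and the expected values in the comments follow from applying the conditions above by hand.

```r
library(cna)

target <- "A*b + a*B <-> C"

# Condition (i) violated: D is not causally relevant for C in the target.
res_i <- is.submodel("D <-> C", target)      # expected: FALSE

# Conditions (i) to (iii) satisfied: A and B are relevant in the target and
# sit in two different disjuncts in both formulas.
res_ii <- is.submodel("A + B <-> C", target)  # expected: TRUE

# Condition (iii) violated: A and B share a conjunct in the candidate,
# but no conjunct of the target contains both.
res_iii <- is.submodel("A*B <-> C", target)   # expected: FALSE
```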

`is.submodel` requires two inputs `x` and `y`, where `x` is a character vector of `cna` solution formulas (asf or csf) and `y` is one asf or csf (i.e. a character string of length 1), viz. the target structure or ground truth. The function returns `TRUE` for elements of `x` that are a submodel of `y` according to the definition of submodel-hood given in the previous paragraph. If `strict = TRUE`, `x` counts as a submodel of `y` only if `x` is a proper part of `y` (i.e. `x` is not identical to `y`).

The function `identical.model` returns `TRUE` only if `x` (which must be of length 1) and `y` are identical. It can be used to test whether `y` is completely recovered in an inverse search.
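The three checks can be combined to sort candidates into "submodel", "proper submodel" and "complete recovery". The following is a small sketch under the assumption that the `cna` package is installed; the names `target`, `candidates`, `sub`, `proper` and `same` are illustrative. Because `identical.model` accepts only one formula at a time, it is applied to each candidate in turn.

```r
library(cna)

target     <- "A*b + C <-> D"
candidates <- c("C + b*A <-> D", "C <-> D")

# Submodels, identity with the target allowed.
sub <- is.submodel(candidates, target)                    # expected: TRUE, TRUE

# Proper submodels only: the first candidate is the target itself,
# written in a different order, so it is excluded.
proper <- is.submodel(candidates, target, strict = TRUE)  # expected: FALSE, TRUE

# Complete recovery, checked one formula at a time.
same <- vapply(candidates,
               function(f) isTRUE(as.vector(identical.model(f, target))),
               logical(1))                                # expected: TRUE, FALSE
```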

Both functions return a logical vector of the same length as `x`.

Baumgartner, Michael and Mathias Ambuehl. 2020. “Causal Modeling with Multi-Value and Fuzzy-Set Coincidence Analysis.” *Political Science Research and Methods* 8:526–542.

Baumgartner, Michael and Alrik Thiem. 2020. “Often Trusted But Never (Properly) Tested: Evaluating Qualitative Comparative Analysis.” *Sociological Methods & Research* 49:279–311.

See also: `randomConds`, `selectCases`, `cna`.

    # Binary expressions
    # ------------------
    trueModel.1 <- "(A*b + a*B <-> C)*(C*d + c*D <-> E)"
    candidates.1 <- c("(A + B <-> C)*(C + c*D <-> E)", "A + B <-> C",
                      "(A <-> C)*(C <-> E)", "C <-> E")
    candidates.2 <- c("(A*B + a*b <-> C)*(C*d + c*D <-> E)", "A*b*D + a*B <-> C",
                      "(A*b + a*B <-> C)*(C*A*D <-> E)", "D <-> C",
                      "(A*b + a*B + E <-> C)*(C*d + c*D <-> E)")

    is.submodel(candidates.1, trueModel.1)
    is.submodel(candidates.2, trueModel.1)
    is.submodel(c(candidates.1, candidates.2), trueModel.1)

    is.submodel("C + b*A <-> D", "A*b + C <-> D")
    is.submodel("C + b*A <-> D", "A*b + C <-> D", strict = TRUE)
    identical.model("C + b*A <-> D", "A*b + C <-> D")

    target.1 <- "(A*b + a*B <-> C)*(C*d + c*D <-> E)"
    testformula.1 <- "(A*b + a*B <-> C)*(C*d + c*D <-> E)*(A + B <-> C)"
    is.submodel(testformula.1, target.1)

    # Multi-value expressions
    # -----------------------
    trueModel.2 <- "(A=1*B=2 + B=3*A=2 <-> C=3)*(C=1 + D=3 <-> E=2)"
    is.submodel("(A=1*B=2 + B=3 <-> C=3)*(D=3 <-> E=2)", trueModel.2)
    is.submodel("(A=1*B=1 + B=3 <-> C=3)*(D=3 <-> E=2)", trueModel.2)
    is.submodel(trueModel.2, trueModel.2)
    is.submodel(trueModel.2, trueModel.2, strict = TRUE)

    target.2 <- "C=2*D=1*B=3 + A=1 <-> E=5"
    testformula.2 <- c("C=2 + D=1 <-> E=5", "C=2 + D=1*B=3 <-> E=5",
                       "A=1+B=3*D=1*C=2 <-> E=5", "C=2 + D=1*B=3 + A=1 <-> E=5",
                       "C=2*B=3 + D=1 + B=3 + A=1 <-> E=5")
    is.submodel(testformula.2, target.2)
    identical.model(testformula.2[3], target.2)
    identical.model(testformula.2[1], target.2)

[Package *cna* version 3.2.0 Index]