is.submodel {cna} | R Documentation |

## Identify correctness-preserving submodel relations

### Description

The function `is.submodel` checks for each element of a vector of `cna` solution formulas whether it is a submodel of a specified target model `y`. If `y` is the true model in an inverse search (i.e. the ground truth), `is.submodel` identifies correct models in the `cna` output (see Baumgartner and Thiem 2020, Baumgartner and Ambuehl 2020).

### Usage

```
is.submodel(x, y, strict = FALSE)
identical.model(x, y)
```

### Arguments

`x` |
Character vector of atomic and/or complex solution formulas (asf/csf). Must be of length 1 in `identical.model`. |

`y` |
Character string of length 1 specifying the target asf or csf. |

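`strict` |
Logical; if `TRUE`, `x` counts as a submodel of `y` only if `x` is a proper part of `y` (i.e. `x` is not identical to `y`). |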

### Details

To benchmark the reliability of a method of causal inference, it must be tested to what degree the method recovers the true data-generating structure `\Delta` or proper substructures of `\Delta` from data of varying quality. Reliability benchmarking is done in so-called *inverse searches*, which reverse the order of causal discovery as normally conducted in scientific practice. An inverse search comprises three steps: (1) a causal structure `\Delta` is drawn/presupposed (as ground truth), (2) artificial data `\delta` is simulated from `\Delta`, possibly featuring various deficiencies (e.g. noise, fragmentation, measurement error), and (3) `\delta` is processed by the benchmarked method in order to check whether its output meets the tested reliability benchmark (e.g. whether the output is true of or identical to `\Delta`).

The main purpose of `is.submodel` is to execute step (3) of an inverse search that is tailor-made to test the reliability of `cna` [with `randomConds` and `selectCases` designed for steps (1) and (2), respectively]. A solution formula `x` being a submodel of a target formula `y` means that all the causal claims entailed by `x` are true of `y`, which is the case if a causal interpretation of `x` entails conjunctive and disjunctive causal relevance relations that are all likewise entailed by a causal interpretation of `y`. More specifically, `x` is a submodel of `y` if, and only if, the following conditions are satisfied: (i) all factor values causally relevant according to `x` are also causally relevant according to `y`, (ii) all factor values contained in two different disjuncts in `x` are also contained in two different disjuncts in `y`, (iii) all factor values contained in the same conjunct in `x` are also contained in the same conjunct in `y`, and (iv) if `x` is a csf with more than one asf, (i) to (iii) are satisfied for all asfs in `x`. For more details see Baumgartner and Thiem (2020) or Baumgartner and Ambuehl (2020, online appendix).
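Conditions (i) to (iii) can be made concrete for a single asf with a small base-R sketch. It is a simplified, pairwise reading of the definition and not the implementation used by `is.submodel`. The helper names `parse_asf` and `sketch_submodel_asf` are hypothetical, the requirement that both formulas have the same outcome is an added assumption, and condition (iv) for csfs is left out.

```r
# Hypothetical helper (not part of cna): split one asf into its outcome and a
# list of disjuncts, each a character vector of factor values.
parse_asf <- function(f) {
  sides <- trimws(strsplit(f, "<->", fixed = TRUE)[[1]])
  disjuncts <- strsplit(sides[1], "+", fixed = TRUE)[[1]]
  list(outcome = sides[2],
       disjuncts = lapply(strsplit(disjuncts, "*", fixed = TRUE), trimws))
}

# Hypothetical sketch of conditions (i) to (iii) for a single asf.
sketch_submodel_asf <- function(x, y) {
  px <- parse_asf(x)
  py <- parse_asf(y)
  # Assumption: both formulas must be about the same outcome.
  if (px$outcome != py$outcome) return(FALSE)

  # Indices of the disjuncts of y that contain the factor value v.
  where_in_y <- function(v) {
    which(vapply(py$disjuncts, function(d) v %in% d, logical(1)))
  }

  # (i) every factor value of x also occurs in y
  if (!all(unlist(px$disjuncts) %in% unlist(py$disjuncts))) return(FALSE)

  # (iii) values sharing a conjunct in x share some conjunct in y
  for (d in px$disjuncts) {
    for (u in d) {
      for (v in d) {
        if (length(intersect(where_in_y(u), where_in_y(v))) == 0) return(FALSE)
      }
    }
  }

  # (ii) values located in two different disjuncts of x can be located in two
  #      different disjuncts of y
  n <- length(px$disjuncts)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i == j) next
      for (u in px$disjuncts[[i]]) {
        for (v in px$disjuncts[[j]]) {
          if (length(union(where_in_y(u), where_in_y(v))) < 2) return(FALSE)
        }
      }
    }
  }
  TRUE
}

sketch_submodel_asf("A + B <-> C", "A*b + a*B <-> C")        # TRUE
sketch_submodel_asf("A*B + a*b <-> C", "A*b + a*B <-> C")    # FALSE, fails (iii)
sketch_submodel_asf("A*b*D + a*B <-> C", "A*b + a*B <-> C")  # FALSE, fails (i)
```

Because factor values are compared as plain strings, the same sketch accepts multi-value notation such as `A=1*B=2 + B=3 <-> C=3`. For actual benchmarking, `is.submodel` itself should be used.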

`is.submodel` requires two inputs `x` and `y`, where `x` is a character vector of `cna` solution formulas (asf or csf) and `y` is one asf or csf (i.e. a character string of length 1), viz. the target structure or ground truth. The function returns `TRUE` for elements of `x` that are a submodel of `y` according to the definition of submodel-hood given in the previous paragraph. If `strict = TRUE`, `x` counts as a submodel of `y` only if `x` is a proper part of `y` (i.e. `x` is not identical to `y`).

The function `identical.model` returns `TRUE` only if `x` (which must be of length 1) and `y` are identical. It can be used to test whether `y` is completely recovered in an inverse search.
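The three steps of an inverse search can be strung together roughly as follows. This is a minimal sketch with ideal (noise-free) data, not a full benchmarking design. It assumes that the `cna` functions `allCombs` and `csf`, which are not described on this page, behave as in the package's other help pages, and that `csf` returns a data frame with a `condition` column. The object names `target`, `full`, `ideal`, `sol`, `found`, `correct` and `complete` are illustrative.

```r
library(cna)

# (1) Presuppose a ground truth (the target used in the Examples section).
target <- "(A*b + a*B <-> C)*(C*d + c*D <-> E)"

# (2) Simulate ideal data: all configurations of five binary factors A to E,
#     restricted to those that are compatible with the target.
full  <- allCombs(c(2, 2, 2, 2, 2)) - 1
ideal <- selectCases(target, full)

# (3) Analyze the simulated data and compare every returned csf with the target.
sol   <- cna(ideal)
found <- csf(sol)$condition

# One logical value per recovered model: is it a submodel of the target?
correct <- is.submodel(found, target)

# identical.model() takes a single formula as x, so apply it element-wise.
complete <- vapply(found, identical.model, logical(1), y = target)
```

Summaries such as `all(correct)` or `any(complete)` could then serve as correctness and completeness scores for one trial; with noisy or fragmented data, step (2) would need an additional manipulation of `ideal` before the analysis.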

### Value

Logical vector of the same length as `x`.

### References

Baumgartner, Michael and Mathias Ambuehl. 2020. “Causal Modeling with Multi-Value and Fuzzy-Set Coincidence Analysis.” *Political Science Research and Methods* 8:526–542.

Baumgartner, Michael and Alrik Thiem. 2020. “Often Trusted But Never (Properly) Tested: Evaluating Qualitative Comparative Analysis.” *Sociological Methods & Research* 49:279–311.

### See Also

`randomConds`, `selectCases`, `cna`.

### Examples

```
library(cna)

# Binary expressions
# ------------------
trueModel.1 <- "(A*b + a*B <-> C)*(C*d + c*D <-> E)"
candidates.1 <- c("(A + B <-> C)*(C + c*D <-> E)", "A + B <-> C",
                  "(A <-> C)*(C <-> E)", "C <-> E")
candidates.2 <- c("(A*B + a*b <-> C)*(C*d + c*D <-> E)", "A*b*D + a*B <-> C",
                  "(A*b + a*B <-> C)*(C*A*D <-> E)", "D <-> C",
                  "(A*b + a*B + E <-> C)*(C*d + c*D <-> E)")
# One logical value is returned per candidate formula.
is.submodel(candidates.1, trueModel.1)
is.submodel(candidates.2, trueModel.1)
is.submodel(c(candidates.1, candidates.2), trueModel.1)

# Two formulas that differ only in the order of disjuncts and conjuncts.
is.submodel("C + b*A <-> D", "A*b + C <-> D")
is.submodel("C + b*A <-> D", "A*b + C <-> D", strict = TRUE)
identical.model("C + b*A <-> D", "A*b + C <-> D")

# A test formula consisting of the target plus one additional asf.
target.1 <- "(A*b + a*B <-> C)*(C*d + c*D <-> E)"
testformula.1 <- "(A*b + a*B <-> C)*(C*d + c*D <-> E)*(A + B <-> C)"
is.submodel(testformula.1, target.1)

# Multi-value expressions
# -----------------------
trueModel.2 <- "(A=1*B=2 + B=3*A=2 <-> C=3)*(C=1 + D=3 <-> E=2)"
is.submodel("(A=1*B=2 + B=3 <-> C=3)*(D=3 <-> E=2)", trueModel.2)
is.submodel("(A=1*B=1 + B=3 <-> C=3)*(D=3 <-> E=2)", trueModel.2)
# Comparing the target with itself, with and without strict = TRUE.
is.submodel(trueModel.2, trueModel.2)
is.submodel(trueModel.2, trueModel.2, strict = TRUE)

target.2 <- "C=2*D=1*B=3 + A=1 <-> E=5"
testformula.2 <- c("C=2 + D=1 <-> E=5", "C=2 + D=1*B=3 <-> E=5",
                   "A=1+B=3*D=1*C=2 <-> E=5",
                   "C=2 + D=1*B=3 + A=1 <-> E=5",
                   "C=2*B=3 + D=1 + B=3 + A=1 <-> E=5")
is.submodel(testformula.2, target.2)
# identical.model() expects a single formula as x.
identical.model(testformula.2[3], target.2)
identical.model(testformula.2[1], target.2)
```

[Package *cna* version 3.6.2]