selectCases {cna} | R Documentation |

## Select the cases/configurations compatible with a data generating causal structure

### Description

`selectCases`

selects the cases/configurations that are compatible with a Boolean function, in particular (but not exclusively), a data generating causal structure, from a data frame or `configTable`

.

`selectCases1`

allows for setting consistency (`con`

) and coverage (`cov`

) thresholds. It then selects cases/configurations that are compatible with the data generating structure to degrees `con`

and `cov`

.

### Usage

```
selectCases(cond, x = full.ct(cond), type = "auto", cutoff = 0.5,
rm.dup.factors = FALSE, rm.const.factors = FALSE)
selectCases1(cond, x = full.ct(cond), type = "auto", con = 1, cov = 1,
rm.dup.factors = FALSE, rm.const.factors = FALSE)
```

### Arguments

`cond` |
Character string specifying the Boolean function for which compatible cases are to be selected. |

`x` |
Data frame or |

`type` |
Character vector specifying the type of |

`cutoff` |
Cutoff value in case of |

`rm.dup.factors` |
Logical; if |

`rm.const.factors` |
Logical; if |

`con` , `cov` |
Numeric scalars between 0 and 1 to set the minimum consistency and coverage thresholds. |

### Details

In combination with `allCombs`

, `full.ct`

, `randomConds`

and `makeFuzzy`

, `selectCases`

is useful for simulating data, which are needed for inverse search trials benchmarking the output of the `cna`

function.

`selectCases`

draws those cases/configurations from a data frame or `configTable`

`x`

that are compatible with a data generating causal structure (or any other Boolean or set-theoretic function), which is given to `selectCases`

as a character string `cond`

. If the argument `x`

is not specified, configurations are drawn from `full.ct(cond)`

. `cond`

can be a condition of any of the three types of conditions, *boolean*, *atomic* or *complex* (see `condition`

). To illustrate, if the data generating structure is "A + B <-> C", then a case featuring A=1, B=0, and C=1 is selected by `selectCases`

, whereas a case featuring A=1, B=0, and C=0 is not (because according to the data generating structure, A=1 must be associated with C=1, which is violated in the latter case). The type of the data frame is automatically detected by default, but can be manually specified by giving the argument `type`

one of its non-default values: `"cs"`

(crisp-set), `"mv"`

(multi-value), and `"fs"`

(fuzzy-set).

`selectCases1`

allows for providing consistency (`con`

) and coverage (`cov`

) thresholds, such that some cases that are incompatible with `cond`

are also drawn, as long as `con`

and `cov`

remain satisfied. The solution is identified by an algorithm aiming to find a subset of maximal size meeting the `con`

and `cov`

requirements. In contrast to `selectCases`

, `selectCases1`

only accepts a condition of type *atomic* as its `cond`

argument, i.e. an atomic solution formula. Data drawn by `selectCases1`

can only be modeled with consistency = `con`

and coverage = `cov`

.

### Value

A `configTable`

.

### See Also

`allCombs`

, `full.ct`

, `randomConds`

, `makeFuzzy`

, `configTable`

, `condition`

, `cna`

, `d.jobsecurity`

### Examples

```
# Generate all configurations of 5 dichotomous factors that are compatible with the causal
# chain (A*b + a*B <-> C) * (C*d + c*D <-> E).
groundTruth.1 <- "(A*b + a*B <-> C) * (C*d + c*D <-> E)"
(dat1 <- selectCases(groundTruth.1))
condition(groundTruth.1, dat1)
# Randomly draw a multi-value ground truth and generate all configurations compatible with it.
dat1 <- allCombs(c(3, 3, 4, 4, 3))
groundTruth.2 <- randomCsf(dat1, n.asf=2)
(dat2 <- selectCases(groundTruth.2, dat1))
condition(groundTruth.2, dat2)
# Generate all configurations of 5 fuzzy-set factors compatible with the causal structure
# A*b + C*D <-> E, such that con = .8 and cov = .8.
dat1 <- allCombs(c(2, 2, 2, 2, 2)) - 1
dat2 <- makeFuzzy(dat1, fuzzvalues = seq(0, 0.45, 0.01))
(dat3 <- selectCases1("A*b + C*D <-> E", con = .8, cov = .8, dat2))
condition("A*b + C*D <-> E", dat3)
# Inverse search for the data generating causal structure A*b + a*B + C*D <-> E from
# fuzzy-set data with non-perfect consistency and coverage scores.
dat1 <- allCombs(c(2, 2, 2, 2, 2)) - 1
set.seed(7)
dat2 <- makeFuzzy(dat1, fuzzvalues = 0:4/10)
dat3 <- selectCases1("A*b + a*B + C*D <-> E", con = .8, cov = .8, dat2)
cna(dat3, outcome = "E", con = .8, cov = .8)
# Draw cases satisfying specific conditions from real-life fuzzy-set data.
ct.js <- configTable(d.jobsecurity)
selectCases("S -> C", ct.js) # Cases with higher membership scores in C than in S.
selectCases("S -> C", d.jobsecurity) # Same.
selectCases("S <-> C", ct.js) # Cases with identical membership scores in C and in S.
selectCases1("S -> C", con = .8, cov = .8, ct.js) # selectCases1() makes no distinction
# between "->" and "<->".
condition("S -> C", selectCases1("S -> C", con = .8, cov = .8, ct.js))
# selectCases() not only draws cases compatible with Boolean causal models. Any Boolean
# function of factor values appearing in the data can be given as cond.
selectCases("C=1*B=3", allCombs(2:4))
selectCases("A=1 * !(C=2 + B=3)", allCombs(2:4), type = "mv")
selectCases("A=1 + (C=3 <-> B=1)*D=3", allCombs(c(3,3,3,3)), type = "mv")
```

*cna*version 3.6.2 Index]