randomConds {cna} | R Documentation

Based on a set of factors and a corresponding data type (given as a data frame or `configTable`), `randomAsf` generates a random atomic solution formula (asf) and `randomCsf` a random (acyclic) complex solution formula (csf).

randomAsf(x, outcome = NULL, maxVarNum = if (type == "mv") 8 else 16,
          compl = NULL, how = c("inus", "minimal"))

randomCsf(x, outcome = NULL, n.asf = NULL, compl = NULL,
          maxVarNum = if (type == "mv") 8 else 16)

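| Argument | Description |
| --- | --- |
| `x` | Data frame or `configTable`. |
| `outcome` | Optional character vector (of length 1 in `randomAsf`) specifying which factors in `x` shall be treated as outcomes. |
| `maxVarNum` | Maximal number of factors that can appear in a generated asf or csf. |
| `compl` | Integer vector specifying the maximal complexity of the formula (i.e. number of factors in msc; number of msc in asf). |
| `how` | Character string, either `"inus"` or `"minimal"`. |
| `n.asf` | Integer scalar specifying the number of asf in the csf. Is overridden by `length(outcome)` if `outcome` is not `NULL`. |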

`randomAsf` and `randomCsf` can be used to randomly draw data-generating structures (ground truths) in inverse search trials that benchmark the output of `cna`. In the regularity-theoretic context in which the CNA method is embedded, a causal structure is a redundancy-free Boolean dependency structure. Hence, `randomAsf` and `randomCsf` both produce redundancy-free Boolean dependency structures. `randomAsf` generates structures with one outcome, i.e. atomic solution formulas (asf), whereas `randomCsf` generates structures with multiple outcomes, i.e. complex solution formulas (csf), that are free of cyclic substructures. In a nutshell, `randomAsf` first randomly draws disjunctive normal forms (DNFs) and then eliminates redundancies from these DNFs. `randomCsf` essentially consists of repeated executions of `randomAsf`.
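Because both functions draw at random, repeated calls return different formulas. The following is a minimal sketch of how a draw might be made repeatable. It assumes that the draws go through R's standard random number generator, so that `set.seed()` fixes them; that assumption is not stated in the text above.

```r
library(cna)

# Assumption: the random draws use R's RNG, so fixing the seed
# makes a draw repeatable.
set.seed(42)
asf1 <- randomAsf(full.ct(6))   # one outcome: "<DNF> <-> <outcome>"
set.seed(42)
asf2 <- randomAsf(full.ct(6))

asf1
identical(asf1, asf2)

# A csf combines several asf, one per outcome.
csf1 <- randomCsf(full.ct(6))
csf1
```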

The only mandatory argument of `randomAsf` and `randomCsf` is `x`, a data frame or a `configTable` defining the factors (with their possible values) from which the generated asf and csf are drawn.

The optional argument `outcome` determines which factors in `x` are treated as outcomes. If `outcome` is at its default value `NULL`, `randomAsf` and `randomCsf` randomly draw the factor(s) from `x` to be treated as outcome(s). If `type` (the data type of `x`) is `"mv"`, then values of distinct factors are chosen as outcomes.
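As an illustration, the sketch below fixes the outcome of an asf and then reads it back from the returned string with base R. The variable name `rhs_side` is introduced only for this example.

```r
library(cna)

# Fix the outcome of an asf; full.ct(5) provides the binary factors A to E.
asf <- randomAsf(full.ct(5), outcome = "C")
asf

# Extract the right-hand side of "<DNF> <-> <outcome>" with base R.
rhs_side <- trimws(sub("^.*<->", "", as.character(asf)))
rhs_side

# With outcome left at NULL, randomCsf picks the outcomes at random.
randomCsf(full.ct(5))
```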

The maximal number of factors included in the generated asf and csf can be controlled via the argument `maxVarNum`. This is particularly relevant when `x` is of high dimension, as generating solution formulas with more than 20 factors is computationally demanding and may therefore take a long time (or even exhaust computer memory).

The argument `compl` controls the complexity of the generated asf and csf. More specifically, the *initial* complexity of asf and csf (i.e. the number of factors included in msc and the number of msc included in asf prior to redundancy elimination) is drawn from the vector `compl`. As this complexity may be reduced in the subsequent process of redundancy elimination, the returned asf or csf will often have lower complexity than specified in `compl`. The default value of `compl` is determined by the number of columns in `x`. Assigning unduly high values to `compl` results in an error.
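To check how `compl` bounds the result, the disjuncts and conjuncts of a returned asf can be counted. The helper `dnf_size` is hypothetical (not part of cna) and assumes the crisp-set syntax `"<DNF> <-> <outcome>"` with `+` for disjunction and `*` for conjunction.

```r
library(cna)

# Hypothetical helper (not part of cna): size of the left-hand side DNF of
# a crisp-set asf, as number of disjuncts and factors per disjunct.
dnf_size <- function(asf) {
  lhs <- trimws(sub("<->.*$", "", as.character(asf)))
  disjuncts <- strsplit(lhs, "+", fixed = TRUE)[[1]]
  conjunct_sizes <- lengths(strsplit(disjuncts, "*", fixed = TRUE))
  list(n_disjuncts = length(disjuncts), conjunct_sizes = conjunct_sizes)
}

# Initial complexity is drawn from 2:3; the result can be smaller.
asf <- randomAsf(full.ct(7), compl = 2:3)
asf
dnf_size(asf)
```

Since redundancy elimination only removes elements, the counts should not exceed `max(compl)`, but they can fall below `min(compl)`.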

`randomAsf` has the additional argument `how` with the two possible values `"inus"` and `"minimal"`. With `how = "inus"`, the generated asf is redundancy-free relative to all logically possible configurations of the factors in `x`, i.e. relative to `full.ct(x)`, whereas with `how = "minimal"` redundancy-freeness is imposed only relative to the configurations actually contained in `x`, i.e. relative to `x` itself. Typically `"inus"` should be used; the value `"minimal"` is relevant mainly in repeated `randomAsf` calls from within `randomCsf`. Moreover, setting `how = "minimal"` results in an error if `x` is a `configTable` of type `"fs"`.
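The two settings of `how` can be compared side by side, as in the sketch below. It uses the data sets `d.educate` and `d.jobsecurity` that appear in the examples; the `tryCatch()` wrapper is added here only so that the fuzzy-set case reports its error message instead of aborting.

```r
library(cna)

# Redundancy-free relative to all logically possible configurations.
asf_inus <- randomAsf(d.educate, outcome = "G", how = "inus")

# Redundancy-free only relative to the configurations observed in d.educate.
asf_min <- randomAsf(d.educate, outcome = "G", how = "minimal")

asf_inus
asf_min

# Per the text above, how = "minimal" is rejected for fuzzy-set
# configTables; tryCatch() captures the message instead of stopping.
fs_ct <- configTable(d.jobsecurity, type = "fs")
res <- tryCatch(randomAsf(fs_ct, how = "minimal"),
                error = function(e) conditionMessage(e))
res
```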

The argument `n.asf` controls the number of asf in the generated csf. Its value is limited to `ncol(x)-2` and is overridden by `length(outcome)` if `outcome` is not `NULL`. Analogously to `compl`, `n.asf` specifies the number of asf prior to redundancy elimination, which may reduce that number. That is, `n.asf` provides an upper bound for the number of asf in the resulting csf.
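The upper-bound behavior of `n.asf` can be checked by counting the `<->` operators in the returned string. This is a rough sketch using base R only; the variable names are chosen for this example.

```r
library(cna)

f <- randomCsf(full.ct(7), n.asf = 3)
f

# Count the asf in the csf by counting "<->" occurrences.
f_chr <- as.character(f)
n_asf <- lengths(regmatches(f_chr, gregexpr("<->", f_chr, fixed = TRUE)))
n_asf   # at most 3, possibly fewer after redundancy elimination
```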

The randomly generated formula, a character string.

`is.submodel`, `selectCases`, `full.ct`, `configTable`, `cna`.

# randomAsf
# ---------
# Asf generated from explicitly specified binary factors.
randomAsf(full.ct("H*I*T*R*K"))
randomAsf(full.ct("Johnny*Debby*Aurora*Mars*James*Sonja"))

# Asf generated from a specified number of binary factors.
randomAsf(full.ct(7))

# Asf generated from an existing data frame.
randomAsf(d.educate)

# Specify the outcome.
randomAsf(d.educate, outcome = "G")

# Specify the complexity.
randomAsf(full.ct(7), compl = 2)
randomAsf(full.ct(7), compl = 3:4)

# Redundancy-freeness relative to x instead of full.ct(x).
randomAsf(d.educate, outcome = "G", how = "minimal")

# Asf with multi-value factors.
randomAsf(allCombs(c(3,4,3,5,3,4)))

# Asf from fuzzy-set data.
randomAsf(d.jobsecurity)
randomAsf(d.jobsecurity, outcome = "JSR")

# Generate 20 asf.
replicate(20, randomAsf(full.ct(7), compl = 2:3))

# randomCsf
# ---------
# Csf generated from explicitly specified binary factors.
randomCsf(full.ct("H*I*T*R*K*Q*P"))

# Csf generated from a specified number of binary factors.
randomCsf(full.ct(7))

# Specify the outcomes.
randomCsf(d.volatile, outcome = c("RB","SE"))

# Specify the complexity.
randomCsf(d.volatile, outcome = c("RB","SE"), compl = 2)
randomCsf(full.ct(7), compl = 3:4)

# Specify the maximal number of factors.
randomCsf(d.highdim, maxVarNum = 10)
randomCsf(d.highdim, maxVarNum = 20) # takes a while to complete

# Specify the number of asf.
randomCsf(full.ct(7), n.asf = 3)

# Csf with multi-value factors.
randomCsf(allCombs(c(3,4,3,5,3,4)))

# Generate 20 csf.
replicate(20, randomCsf(full.ct(7), n.asf = 2, compl = 2:3))

# Inverse searches
# ----------------
# === Ideal Data ===
# Draw the data generating structure. (Every run yields different
# targets and data.)
target <- randomCsf(full.ct(5), n.asf = 2)
target
# Select the cases compatible with the target.
x <- selectCases(target)
# Run CNA without an ordering.
mycna <- cna(x, rm.dup.factors = FALSE)
# Extract the csf.
csfs <- csf(mycna)
# Check whether the target is completely returned.
any(unlist(lapply(csfs$condition, identical.model, target)))

# === Data fragmentation (20% missing observations) ===
# Draw the data generating structure. (Every run yields different
# targets and data.)
target <- randomCsf(full.ct(7), n.asf = 2)
target
# Generate the ideal data.
x <- ct2df(selectCases(target))
# Introduce fragmentation.
x <- x[-sample(1:nrow(x), nrow(x)*0.2), ]
# Run CNA without an ordering.
mycna <- cna(x, rm.dup.factors = FALSE)
# Extract the csf.
csfs <- csf(mycna)
# Check whether (a submodel of) the target is returned.
any(is.submodel(csfs$condition, target))

# === Data fragmentation and noise (20% missing observations, noise ratio of 0.05) ===
# Multi-value data.
# Draw the data generating structure. (Every run yields different
# targets and data.)
fullData <- allCombs(c(4,4,4,4,4))
target <- randomCsf(fullData, n.asf = 2, compl = 2:3)
target
# Generate the ideal data.
x <- ct2df(selectCases(target, fullData))
# Introduce fragmentation.
x <- x[-sample(1:nrow(x), nrow(x)*0.2), ]
# Introduce random noise.
x <- rbind(ct2df(fullData[sample(1:nrow(fullData), nrow(x)*0.05), ]), x)
# Run CNA without an ordering.
mycna <- cna(x, con = .75, cov = .75, maxstep = c(3, 3, 12), rm.dup.factors = FALSE)
# Extract the csf.
csfs <- csf(mycna)
# Check whether no causal fallacy (no false positive) is returned.
if (nrow(csfs) == 0) {
  TRUE
} else {
  any(is.submodel(csfs$condition, target))
}
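A single inverse search says little about recovery rates, since each run draws a new target. The sketch below wraps the ideal-data trial in a function and repeats it. The function `run_trial` and its parameters are hypothetical (not part of cna), and the seed is set only on the assumption that the draws use R's random number generator.

```r
library(cna)

# Hypothetical helper (not part of cna): one ideal-data inverse search trial.
# Returns TRUE if cna() returns the target or a submodel of it.
run_trial <- function(n_factors = 5, n_asf = 2) {
  target <- randomCsf(full.ct(n_factors), n.asf = n_asf)
  x <- selectCases(target)                  # ideal data for the target
  found <- csf(cna(x, rm.dup.factors = FALSE))
  nrow(found) > 0 && any(is.submodel(found$condition, target))
}

set.seed(1)
hits <- replicate(10, run_trial())
mean(hits)   # share of trials in which the target (or a submodel) was found
```

The same wrapper could be extended with fragmentation or noise steps as in the examples, at the cost of longer run times.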

[Package *cna* version 3.2.0 Index]