Backward phase of MMPC {MXM} | R Documentation |
Backward phase of MMPC
Description
Backward phase of MMPC.
Usage
mmpcbackphase(target, dataset, max_k = 3, threshold = 0.05, test = NULL,
wei = NULL, R = 1)
Arguments
target |
The class variable. Provide either a string, an integer, a numeric value, a vector, a factor, an ordered factor or a Surv object. See also Details. |
dataset |
The data-set; provide either a data frame or a matrix (columns = variables, rows = samples). Alternatively, provide an ExpressionSet (in which case rows are samples and columns are features; see Bioconductor for details). |
max_k |
The maximum size of the conditioning set to use in the conditional independence test (see Details). Integer, default value is 3. |
threshold |
Threshold (suitable values in (0, 1)) for assessing the significance of p-values. Default value is 0.05. |
test |
The conditional independence test to use. Supply the name of the test function without quotation marks, e.g. type testIndFisher, not "testIndFisher".
Default value is NULL. See also |
wei |
A vector of weights to be used for weighted regression. The default value is NULL. |
R |
The number of permutations, set to 1 by default (no permutation-based test). There is a trick to avoid performing all permutations. As soon as the number of times the permuted test statistic exceeds the observed test statistic is more than 50 (if threshold = 0.05 and R = 999), the p-value has exceeded the significance level (threshold value) and hence the predictor variable is not significant. There is no need to carry out the remaining permutations, as a decision has already been made. |
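The early-stopping idea described for R can be illustrated with a minimal base R sketch. This is not the MXM implementation: the function name perm_pvalue_early_stop is hypothetical, the test statistic is assumed to be the absolute Pearson correlation, and the p-value is assumed to be (exceedances + 1) / (R + 1), so the exact cut-off may differ slightly from the one used inside the package.

```r
# Hypothetical helper, not part of MXM: permutation p-value with early stopping.
# Assumptions: statistic = |Pearson correlation|, p-value = (exceed + 1) / (R + 1).
perm_pvalue_early_stop <- function(x, y, R = 999, threshold = 0.05) {
  observed <- abs(cor(x, y))
  # Largest value of (exceed + 1) still compatible with p-value <= threshold
  limit <- threshold * (R + 1)
  exceed <- 0   # permuted statistics at least as large as the observed one
  done <- 0     # permutations actually carried out
  for (i in seq_len(R)) {
    done <- i
    if (abs(cor(x, sample(y))) >= observed) exceed <- exceed + 1
    # Once this holds the final p-value must exceed the threshold, so stop
    if (exceed + 1 > limit) break
  }
  list(pvalue = (exceed + 1) / (R + 1),  # a lower bound when stopped early
       permutations = done,
       stopped_early = done < R)
}

set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100, sd = 0.1)   # strongly associated with x
perm_pvalue_early_stop(x, y)$permutations
```

When the loop stops early, the returned p-value is only a lower bound on the full permutation p-value, which is enough to declare the variable non-significant.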
Details
For each of the selected variables (the columns of dataset), the function performs conditional independence tests where the
conditioning sets are formed from the other variables. All possible combinations are tried until the variable
becomes non-significant. The maximum size of the conditioning set is equal to max_k. This function is called by
MMPC when the backward phase is requested.
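The backward search described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by mmpcbackphase: the helpers fisher_pvalue and backward_sketch are hypothetical, the test is assumed to be a Fisher z-test on the partial correlation, and conditioning sets are assumed to be drawn only from variables that are still retained.

```r
# Hypothetical helper: Fisher z-test p-value for the partial correlation
# between a and b given the columns of Z (Z = NULL means unconditional).
fisher_pvalue <- function(a, b, Z = NULL) {
  n <- length(a)
  k <- if (is.null(Z)) 0 else ncol(Z)
  if (k > 0) {
    # Remove the linear effect of the conditioning variables
    a <- resid(lm(a ~ Z))
    b <- resid(lm(b ~ Z))
  }
  r <- cor(a, b)
  z <- 0.5 * log((1 + r) / (1 - r)) * sqrt(n - k - 3)
  2 * pnorm(-abs(z))
}

# Hypothetical sketch of a backward phase: drop a variable as soon as some
# conditioning set of size <= max_k makes it non-significant.
backward_sketch <- function(target, x, max_k = 3, threshold = 0.05) {
  keep <- rep(TRUE, ncol(x))
  for (j in seq_len(ncol(x))) {
    others <- setdiff(which(keep), j)
    for (k in seq_len(min(max_k, length(others)))) {
      # Enumerate index positions, then map to column numbers; this avoids
      # combn() expanding a single integer into 1:n.
      sets <- combn(length(others), k, FUN = function(idx) others[idx],
                    simplify = FALSE)
      for (s in sets) {
        if (fisher_pvalue(target, x[, j], x[, s, drop = FALSE]) > threshold) {
          keep[j] <- FALSE
          break
        }
      }
      if (!keep[j]) break
    }
  }
  which(keep)   # column indices of the variables that survive
}

set.seed(42)
x <- matrix(rnorm(500 * 3), ncol = 3)
target <- 3 * x[, 1] + 2 * x[, 2] + rnorm(500)
backward_sketch(target, x, max_k = 2)
```

The number of tests grows combinatorially with max_k, which is why the search stops for a variable as soon as one conditioning set renders it non-significant.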
Value
A list including:
met |
A numerical vector of size equal to the number of columns of the dataset. |
counter |
The number of tests performed. |
pvalues |
The maximum logged p-value for the association of each predictor variable. |
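Because the returned p-values are on the log scale, they are compared with the log of the threshold rather than with the threshold itself. A small sketch with made-up values (the numbers below are hypothetical, not output of mmpcbackphase):

```r
threshold <- 0.05
# Hypothetical logged p-values, for illustration only
log_pvalues <- c(-12.3, -0.7, -4.1)
# Significant when the logged p-value is below log(threshold)
significant <- log_pvalues < log(threshold)
# Back-transform to the usual scale if needed
pvalues <- exp(log_pvalues)
```

Working on the log scale avoids underflow when p-values are extremely small.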
Author(s)
Ioannis Tsamardinos, Michail Tsagris
R implementation and documentation: Michail Tsagris mtsagris@uoc.gr
References
Tsamardinos, Brown and Aliferis (2006). The max-min hill-climbing Bayesian network structure learning algorithm. Machine learning, 65(1), 31-78.
See Also
MMPC, mmhc.skel, CondIndTests, cv.mmpc
Examples
set.seed(123)
#simulate a dataset with continuous data
dataset <- matrix(runif(500 * 100, 1, 100), ncol = 100)
#define a simulated class variable
target <- 3 * dataset[, 10] + 2 * dataset[, 100] + 3 * dataset[, 20] + rnorm(500, 0, 5)
# MMPC algorithm, without and with the backward phase
m1 <- MMPC(target, dataset, max_k = 3, threshold = 0.05, test = "testIndFisher")
m2 <- MMPC(target, dataset, max_k = 3, threshold = 0.05, test = "testIndFisher", backward = TRUE)
# apply the backward phase directly to the variables selected by m1
x <- dataset[, m1@selectedVars]
mmpcbackphase(target, x, test = testIndFisher)