sd_sis {semidist} | R Documentation |
Feature screening via semi-distance correlation
Description
Implements (grouped) feature screening for classification problems via semi-distance correlation.
Usage
sd_sis(X, y, group_info = NULL, d = NULL, parallel = FALSE)
Arguments
X |
Data of multivariate covariates, which should be an
|
y |
Data of categorical response, which should be a factor of length
|
group_info |
A list specifying the group information, with elements
being sets of indices of covariates in the same group. For example,
Defaults to NULL. If The names of the list can help recognize the group. For example,
|
d |
An integer specifying the minimum number of (single) features to
keep after screening. For example, if Defaults to NULL. |
parallel |
A boolean indicating whether to run the computation in parallel via
|
Value
A list of objects describing the feature screening that was performed:
- group_info: group information;
- measurement: sample semi-distance correlations calculated for the groups specified in group_info;
- selected: indices/names of (single) covariates that are selected after feature screening;
- ordering: order of the calculated measurements of the groups specified in group_info. The first one is the largest, and the last is the smallest.
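The elements listed above can be inspected directly on the returned list. The following is a minimal sketch rather than an official example: it reuses the mtcars data from the Examples section, assumes the semidist package is installed (the call is skipped otherwise), and relies only on the element names documented above.

```r
# Sketch: inspecting the list returned by sd_sis().
# Assumption: the semidist package is installed; otherwise nothing is run.
if (requireNamespace("semidist", quietly = TRUE)) {
  X <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
  y <- factor(mtcars[, "am"])

  res <- semidist::sd_sis(X, y, d = 4)

  # Element names as documented in the Value section
  str(res$group_info)     # groups used for screening
  print(res$measurement)  # sample semi-distance correlation per group
  print(res$ordering)     # groups ordered from largest to smallest measurement
  print(res$selected)     # (single) covariates kept after screening
}
```

Printing each element separately keeps the sketch independent of the exact types stored in the list, which this page does not spell out.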
See Also
sdcor() for calculating the sample semi-distance correlation.
Examples
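# Covariates and a two-level factor response from the built-in mtcars data
X <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
y <- factor(mtcars[, "am"])
# Screen single features, keeping at least 4 of them
sd_sis(X, y, d = 4)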
# Suppose we have prior information for the group structure as
# ("mpg", "drat"), ("disp", "hp") and ("wt", "qsec")
group_info <- list(
  mpg_drat = c("mpg", "drat"),
  disp_hp = c("disp", "hp"),
  wt_qsec = c("wt", "qsec")
)
sd_sis(X, y, group_info, d = 4)
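The group_info argument is described as a list of sets of covariate indices, while the example above passes column names. The sketch below shows one way to derive the index form from the name form. The helper groups_to_indices() is hypothetical and not part of semidist, and the check that every covariate falls in exactly one group is a sanity check of this sketch, not a requirement stated on this page. The final call is skipped when semidist is not installed.

```r
X <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
y <- factor(mtcars[, "am"])

group_names <- list(
  mpg_drat = c("mpg", "drat"),
  disp_hp = c("disp", "hp"),
  wt_qsec = c("wt", "qsec")
)

# Hypothetical helper (not part of semidist): map column names to positions in X
groups_to_indices <- function(groups, X) {
  lapply(groups, function(g) {
    idx <- match(g, colnames(X))
    if (anyNA(idx)) {
      stop("unknown covariate(s): ", paste(g[is.na(idx)], collapse = ", "))
    }
    idx
  })
}

group_idx <- groups_to_indices(group_names, X)

# Sanity check of this sketch: each column of X appears in exactly one group
stopifnot(identical(sort(unlist(group_idx, use.names = FALSE)), seq_len(ncol(X))))

# Assumption: the semidist package is installed; otherwise the call is skipped
if (requireNamespace("semidist", quietly = TRUE)) {
  res_idx <- semidist::sd_sis(X, y, group_idx, d = 4)
  print(res_idx$selected)
}
```

Keeping the list names (mpg_drat, disp_hp, wt_qsec) on the index form preserves the labels that help identify each group.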