R-squared pruning {SFSI} | R Documentation |
R-squared pruning
Description
Pruning features using an R-squared threshold and maximum distance
Usage
Prune(X, alpha = 0.95,
pos = NULL, d.max = NULL,
centered = FALSE, scaled = FALSE,
verbose = FALSE)
Arguments
X |
(numeric matrix) A matrix with observations in rows and features (e.g., SNPs) in columns |
alpha |
(numeric) R-squared threshold used to determine connected sets |
pos |
(numeric vector) Optional vector with positions (e.g., bp) of features |
d.max |
(numeric) Maximum distance, in the units of pos, allowed between features in a connected set |
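centered |
(logical) TRUE or FALSE indicating whether the columns of X are centered |
scaled |
(logical) TRUE or FALSE indicating whether the columns of X are scaled |
verbose |
(logical) TRUE or FALSE indicating whether to print progress |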
Details
The algorithm identifies sets of connected features, defined as features that share an R2 > α (the alpha argument), and retains only one feature per set: the first one to appear.
The sets can be restricted to features lying within a distance less than or equal to d.max, measured on the positions given in pos.
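The pruning rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the SFSI implementation: prune_sketch is a hypothetical helper, it treats two features as connected only when they are directly linked (pairwise R2 > alpha), and it applies d.max as the distance between each feature and the retained one. The actual Prune function may define connected sets differently.

```r
# Hypothetical helper (not part of SFSI): greedy R^2 pruning in base R.
# Assumption: a feature is dropped when it is directly linked (R^2 > alpha)
# to an earlier retained feature and, if pos and d.max are given, lies
# within d.max of that retained feature.
prune_sketch <- function(X, alpha = 0.95, pos = NULL, d.max = NULL) {
  R2 <- cor(X)^2
  p <- ncol(X)
  keep <- rep(TRUE, p)
  for (j in seq_len(p)) {
    if (!keep[j]) next                 # already dropped by an earlier feature
    linked <- R2[j, ] > alpha
    linked[is.na(linked)] <- FALSE     # constant columns give NA correlations
    if (!is.null(pos) && !is.null(d.max)) {
      linked <- linked & (abs(pos - pos[j]) <= d.max)
    }
    linked[seq_len(j)] <- FALSE        # keep the first appearance (j itself)
    keep[linked] <- FALSE
  }
  list(prune.in = which(keep), prune.out = which(!keep))
}

# Toy data: column 2 is a linear function of column 1 (R^2 = 1),
# column 3 is nearly uncorrelated with both
X <- cbind(1:10, 2 * (1:10) + 1, rep(c(1, -1), 5))
prune_sketch(X, alpha = 0.95)$prune.in                                # 1 3
prune_sketch(X, alpha = 0.95, pos = c(0, 1000, 2000), d.max = 500)$prune.in  # 1 2 3
```

In the second call nothing is dropped because columns 1 and 2, although perfectly correlated, are more than d.max apart.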
Value
Returns a list object that contains the elements:
- prune.in: (vector) indices of selected (unconnected) features.
- prune.out: (vector) indices of dropped features.
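As a brief illustration of how the returned indices might be used, the sketch below subsets a matrix by prune.in. The res list here is filled with made-up values that only mimic the shape of the returned object, and the check assumes that prune.in and prune.out together cover every column of X.

```r
# Illustrative values only: a list shaped like the value described above
res <- list(prune.in = c(1L, 3L, 4L), prune.out = c(2L, 5L))
X <- matrix(seq_len(50), nrow = 10, ncol = 5)   # toy 10 x 5 matrix

# Keep only the selected columns; drop = FALSE keeps the result a matrix
# even if a single feature is retained
X_pruned <- X[, res$prune.in, drop = FALSE]
dim(X_pruned)

# Assumed property: the two index vectors partition the columns of X
stopifnot(setequal(c(res$prune.in, res$prune.out), seq_len(ncol(X))))
```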
Examples
require(SFSI)
data(wheatHTP)
index = c(154:156,201:205,306:312,381:387,540:544)
X = M[,index] # Subset markers
colnames(X) = 1:ncol(X)
# See connected sets using R^2=0.8
R2thr = 0.8
R2 = cor(X)^2
nw1 = net(R2, delta=R2thr)
plot(nw1, show.names=TRUE)
# Get pruned features
res = Prune(X, alpha=R2thr)
# See selected (unconnected) features
nw2 = net(R2[res$prune.in,res$prune.in], delta=R2thr)
nw2$xy = nw1$xy[res$prune.in,]
plot(nw2, show.names=TRUE)