notablyDistant {yaImpute} | R Documentation |
Find notably distant targets
Description
Notably distant targets are those with relatively large distances from the closest reference observation. A suitable threshold is used to detect large distances.
Usage
notablyDistant(object,kth=1,threshold=NULL,p=0.01,method="distribution")
Arguments
object |
an object of class |
kth |
the kth neighbor is used. |
threshold |
the threshold distance that identifies notably large distances between observations. |
p |
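the fraction of neighbor distances expected to exceed the computed threshold; used when threshold is NULL, see details. |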
method |
the method used to compute the threshold, see details. |
Details
When threshold is NULL, the function computes one using one of two methods. When method is "distribution", the function assumes that distances follow the lognormal distribution, unless the method used to find neighbors is randomForest, in which case the distances are assumed to follow the beta distribution. The specified p value is used to compute the threshold, which is the point in the distribution beyond which a fraction, p, of the neighbor distances lies.
When method is "quantile", the function uses the quantile function with probs=1-p.
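The two threshold methods can be illustrated with base R alone. The following is a minimal sketch, not the package's internal code: the distances in d are simulated, the lognormal parameters are estimated from the mean and standard deviation of log(d) (an assumption about how such a fit could be done), and the names d, thr_dist, thr_quant, flag_dist and flag_quant are hypothetical. The beta case used for randomForest is not shown.

```r
# Hypothetical neighbor distances; in practice they come from a yai object.
set.seed(1)
d <- rlnorm(200, meanlog = 0, sdlog = 0.5)
p <- 0.01

# "distribution"-style threshold (assumed fitting approach): estimate the
# lognormal parameters on the log scale, then take the point that a
# fraction p of the fitted distribution exceeds.
thr_dist <- qlnorm(1 - p, meanlog = mean(log(d)), sdlog = sd(log(d)))

# "quantile"-style threshold: the empirical (1 - p) quantile of the distances.
thr_quant <- quantile(d, probs = 1 - p, names = FALSE)

# Distances above a threshold would be flagged as notably distant.
flag_dist  <- d > thr_dist
flag_quant <- d > thr_quant
```

The quantile method always flags roughly a fraction p of the observed distances, whereas the distribution method flags only those distances that are extreme relative to the fitted distribution, which may be none at all.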
Value
A list that contains 1) a data frame of the references that are notably distant from other references, 2) a data frame of the targets that are notably distant from the references, 3) the threshold used, and 4) the method used.
Author(s)
Nicholas L. Crookston ncrookston.fs@gmail.com
See Also
Examples
library(yaImpute)
data(iris)
set.seed(12345)
# form some test data: 50 randomly chosen rows serve as references
refs <- sample(rownames(iris),50)
x <- iris[,1:3] # Sepal.Length Sepal.Width Petal.Length
y <- iris[refs,4:5] # Petal.Width Species
# build an msn run, first build dummy variables for species.
sp1 <- as.integer(iris$Species=="setosa")
sp2 <- as.integer(iris$Species=="versicolor")
y2 <- data.frame(cbind(iris[,4],sp1,sp2),row.names=rownames(iris))
y2 <- y2[refs,]
names(y2) <- c("Petal.Width","Sp1","Sp2")
msn <- yai(x=x,y=y2,method="msn")
notablyDistant(msn)