knn_domain_score {viraldomain} | R Documentation
Calculate the K-Nearest Neighbor model domain applicability score
Description
This function fits a K-Nearest Neighbor (KNN) model to the training data and computes a domain applicability score, based on PCA distances, for each observation in the test data.
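The idea of a PCA-distance applicability score can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the helper pc_distance(), and the use of the threshold as a cumulative-variance cutoff are all assumptions made for illustration. The sketch fits a PCA on the training predictors, measures how far each observation lies from the training centre in principal-component space, and reports each test distance as a percentile of the training distances.

```r
# Illustrative only: not the viraldomain implementation.
# The data, pc_distance(), and the meaning given to `threshold` are assumptions.
set.seed(1)
train <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
test  <- data.frame(x1 = c(0, 5), x2 = c(0, 5))

# Fit PCA on the training predictors only.
pca <- prcomp(train, center = TRUE, scale. = TRUE)

# Keep the fewest components whose cumulative variance reaches the threshold.
threshold <- 0.99
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
n_comp <- which(cum_var >= threshold)[1]

# Distance of each observation from the training centre in PC space.
pc_distance <- function(newdata) {
  scores <- predict(pca, newdata)[, seq_len(n_comp), drop = FALSE]
  sqrt(rowSums(scores^2))
}
train_dist <- pc_distance(train)
test_dist  <- pc_distance(test)

# Percentile of each test distance relative to the training distances.
distance_pctl <- ecdf(train_dist)(test_dist) * 100
data.frame(distance = test_dist, distance_pctl = distance_pctl)
```

In this toy setup the second test point lies far outside the training cloud, so its percentile is 100, which is the kind of observation such a score is meant to flag as outside the model's domain.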
Usage
knn_domain_score(
featured,
train_data,
knn_hyperparameters,
test_data,
threshold_value
)
Arguments
featured | The name of the response variable to predict, given as a character string.
train_data | The training dataset containing predictor variables and the response variable.
knn_hyperparameters | A list of hyperparameters for the KNN model, including neighbors, weight_func, and dist_power (the names used in the Examples).
test_data | The test dataset for making predictions.
threshold_value | The threshold value used for computing domain scores.
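To make the role of the hyperparameters concrete, here is a minimal base R sketch of KNN regression. It assumes the usual meanings of the names: neighbors is the number of nearest training points used, and dist_power is the exponent of the Minkowski distance. Weighting by weight_func is deliberately left out, so every neighbour counts equally. knn_predict() and the toy data are hypothetical and are not part of viraldomain.

```r
# Illustrative only: a plain (unweighted) KNN regression in base R.
# knn_predict() is a hypothetical helper, not part of viraldomain.
knn_predict <- function(train_x, train_y, new_x, neighbors = 5, dist_power = 2) {
  train_x <- as.matrix(train_x)
  new_x <- as.matrix(new_x)
  vapply(seq_len(nrow(new_x)), function(i) {
    # Minkowski distance from the i-th new point to every training point.
    diffs <- abs(sweep(train_x, 2, new_x[i, ]))
    d <- rowSums(diffs^dist_power)^(1 / dist_power)
    # Average the response over the nearest `neighbors` training points.
    mean(train_y[order(d)[seq_len(neighbors)]])
  }, numeric(1))
}

train_x <- data.frame(x = 1:10)
train_y <- 2 * (1:10)

# The three nearest points to x = 5 are x = 4, 5, 6, so the prediction is 10.
knn_predict(train_x, train_y, data.frame(x = 5), neighbors = 3)
```

A kernel-weighted variant would replace the plain mean with a weighted mean whose weights decrease with distance.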
Value
A data frame containing the computed domain scores for each observation in the test dataset.
Examples
set.seed(123)
library(dplyr)
# Provides knn_domain_score(); the viral and sero datasets are assumed to ship with it
library(viraldomain)

# Name of the response variable
featured <- "cd_2022"

# Add jitter to the original features
train_data <- viral |>
  transmute(cd_2022 = jitter(cd_2022), vl_2022 = jitter(vl_2022))
test_data <- sero |>
  transmute(cd_2022 = jitter(cd_2022), vl_2022 = jitter(vl_2022))

# KNN hyperparameters and the threshold used for the domain scores
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
threshold_value <- 0.99

# Compute domain scores for the test data
knn_domain_score(featured, train_data, knn_hyperparameters, test_data, threshold_value)
[Package viraldomain version 0.0.3 Index]