perform_clustering {clap} | R Documentation |
Perform clustering based on nearest neighbor distances
Description
Clusters data points using a radius derived from their nearest neighbor distances.
Usage
perform_clustering(data, class_column = NULL)
Arguments
data
A numeric matrix or data frame of data points.
class_column
A character string or unquoted name giving the column that contains class labels. Defaults to NULL.
Details
This function first removes the specified class column from the data, calculates the nearest neighbor distances, and then performs clustering using a radius based on the maximum nearest neighbor distance.
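The three steps described above can be sketched in base R. This is only an illustration and not the package's implementation: the helper name `sketch_radius_clusters`, its return fields, and the grouping rule (each point's cluster is every point within the radius of it) are assumptions. The sketch also accepts `class_column` only as a character string and expects named columns.

```r
# Hypothetical helper for illustration; not the clap implementation.
sketch_radius_clusters <- function(data, class_column = NULL) {
  # Step 1: drop the class column, if one is named (string form only here).
  if (!is.null(class_column)) {
    data <- data[, setdiff(colnames(data), class_column), drop = FALSE]
  }
  x <- as.matrix(data)

  # Step 2: nearest neighbor distance for every point.
  d <- as.matrix(stats::dist(x))  # pairwise Euclidean distances
  diag(d) <- Inf                  # ignore each point's distance to itself
  nn_dist <- unname(apply(d, 1, min))

  # Step 3: take the largest nearest neighbor distance as the radius.
  radius <- max(nn_dist)

  # Assumed grouping rule: a point's cluster is every point within `radius`
  # of it, including the point itself.
  diag(d) <- 0
  members <- lapply(seq_len(nrow(d)), function(i) unname(which(d[i, ] <= radius)))

  list(radius = radius, nn_dist = nn_dist, members = members)
}

set.seed(1)
toy <- data.frame(x = rnorm(20), y = rnorm(20),
                  class = rep(c("a", "b"), each = 10))
res <- sketch_radius_clusters(toy, class_column = "class")
```

Because the radius is the maximum nearest neighbor distance, every point in this sketch has at least one other point inside its radius, so no cluster consists of a single isolated point.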
Value
An object of class 'clap' containing:
- members
A list of clusters with their respective data point IDs.
- cluster_df
A data frame with cluster assignments for each data point.
- data
The original dataset.
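To make the shape of the return value concrete, the following sketch builds a mock object with the three documented components and reads them back. It assumes the object behaves like a named list; the contents, the `id` and `cluster` column names in `cluster_df`, and the names `toy_data` and `mock_result` are invented for illustration and may differ from what the package actually returns.

```r
# Hypothetical stand-in for a returned object; all values are made up.
toy_data <- data.frame(X1 = c(0.1, 0.2, 3.0), X2 = c(0.0, 0.1, 3.1))

mock_result <- structure(
  list(
    members    = list(c(1L, 2L), 3L),                           # point IDs per cluster
    cluster_df = data.frame(id = 1:3, cluster = c(1L, 1L, 2L)), # one row per point
    data       = toy_data                                       # the input dataset
  ),
  class = "clap"
)

component_names <- names(mock_result)  # the three documented components
first_cluster <- mock_result$members[[1]]  # IDs of the points in the first cluster
is_clap <- inherits(mock_result, "clap")
```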
Examples
if (requireNamespace("ggplot2", quietly = TRUE)) {
  # Generate dummy data
  class1 <- matrix(rnorm(100, mean = 0, sd = 1), ncol = 2) +
    matrix(rep(c(1, 1), each = 50), ncol = 2)
  class2 <- matrix(rnorm(100, mean = 0, sd = 1), ncol = 2) +
    matrix(rep(c(-1, -1), each = 50), ncol = 2)
  datanew <- rbind(class1, class2)
  training <- data.frame(datanew, class = factor(c(rep(1, 50), rep(2, 50))))

  # Plot the dummy data to visualize overlaps
  p <- ggplot2::ggplot(training, ggplot2::aes(x = X1, y = X2, color = class)) +
    ggplot2::geom_point() +
    ggplot2::labs(title = "Dummy Data with Overlapping Classes")
  print(p)

  # Perform clustering
  cluster_result <- perform_clustering(training, class_column = class)
}
[Package clap version 0.1.0 Index]