KMEANS_FUNCTION {KMEANS.KNN}    R Documentation
KMEANS_FUNCTION
Description
This function implements the K-Means algorithm for data clustering. It provides options for data preprocessing, such as normalization and imputation of missing values.
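For orientation, the following is a minimal sketch of a single K-Means run (Lloyd's algorithm) in base R. It is not the package's implementation: `lloyd_sketch` is a hypothetical helper, and the centroid update uses the column means for both metrics, which is a simplification when `distance_metric = "manhattan"`. Repeating such a run several times from different random starts and keeping the best result is what an `nstart`-style argument usually controls.

```r
# Hypothetical helper, not part of KMEANS.KNN: one run of Lloyd's algorithm.
lloyd_sketch <- function(x, k, max_iter = 100, distance_metric = "euclidean") {
  x <- as.matrix(x)
  # Start from k randomly chosen rows of the data.
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  clusters <- integer(nrow(x))
  for (iter in seq_len(max_iter)) {
    # Distance from every point (rows) to every centroid (columns).
    d <- sapply(seq_len(k), function(j) {
      diff <- sweep(x, 2, centers[j, ])
      if (distance_metric == "manhattan") {
        rowSums(abs(diff))
      } else {
        sqrt(rowSums(diff^2))
      }
    })
    # Assign each point to its nearest centroid.
    new_clusters <- max.col(-d, ties.method = "first")
    # Stop once the assignments no longer change.
    if (identical(new_clusters, clusters)) break
    clusters <- new_clusters
    # Move each centroid to the mean of its points; leave it alone if empty.
    for (j in seq_len(k)) {
      members <- x[clusters == j, , drop = FALSE]
      if (nrow(members) > 0) centers[j, ] <- colMeans(members)
    }
  }
  list(clusters = clusters, centers = centers, iterations = iter)
}

set.seed(1)
fit <- lloyd_sketch(iris[, -5], k = 3)
```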
Usage
KMEANS_FUNCTION(
  data,
  k,
  max_iter = 100,
  nstart = 25,
  distance_metric = "euclidean",
  scale_data = FALSE,
  impute_data = "mean"
)
Arguments
data
    A data frame containing the numeric data to be clustered.
k
    The number of clusters to form.
max_iter
    The maximum number of iterations for the K-Means algorithm.
nstart
    The number of random centroid initializations to try.
distance_metric
    The distance metric to use ('euclidean' or 'manhattan').
scale_data
    A logical value indicating whether the data should be normalized.
impute_data
    The imputation method for missing values ('mean', 'median', or 'mode').
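The preprocessing options can be pictured with the base R sketch below. It rests on assumptions: `preprocess_sketch` and `stat_mode` are hypothetical helpers, imputation is done per column, and "normalized" is taken to mean z-score standardization (mean 0, standard deviation 1). The package may do any of this differently.

```r
# Hypothetical helper, not part of KMEANS.KNN.
preprocess_sketch <- function(data, impute_data = "mean", scale_data = FALSE) {
  # Most frequent non-missing value of a vector (first one wins on ties).
  stat_mode <- function(v) {
    v <- v[!is.na(v)]
    u <- unique(v)
    u[which.max(tabulate(match(v, u)))]
  }
  # Pick the function that computes the replacement value for a column.
  fill <- switch(impute_data,
    mean = function(v) mean(v, na.rm = TRUE),
    median = function(v) median(v, na.rm = TRUE),
    mode = stat_mode,
    stop("unknown impute_data: ", impute_data)
  )
  # Replace the missing entries of each column with that column's fill value.
  data[] <- lapply(data, function(v) {
    v[is.na(v)] <- fill(v)
    v
  })
  # Assumed meaning of scale_data: center and scale each column.
  if (scale_data) {
    data[] <- lapply(data, function(v) as.numeric(scale(v)))
  }
  data
}

df <- data.frame(a = c(1, NA, 3), b = c(2, 2, NA))
clean <- preprocess_sketch(df, impute_data = "mean")
scaled <- preprocess_sketch(iris[, -5], scale_data = TRUE)
```

Imputing before scaling, as in this sketch, keeps the filled-in values from distorting the column means and standard deviations less than the reverse order would, but the order is a design choice and not something this page specifies.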
Value
A list containing the following elements:
- clusters: A vector indicating the cluster of each point.
- centers: The coordinates of the centroids of each cluster.
- additional_info: Additional information such as total distance and number of iterations.
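For comparison, base R's `stats::kmeans` returns a similarly shaped result. The sketch below only shows those base R fields; how they map onto `clusters`, `centers`, and `additional_info` is an assumption, and the names inside `additional_info` are not documented on this page.

```r
set.seed(42)
# Base R K-Means on the numeric iris columns (Euclidean distance only).
fit <- stats::kmeans(iris[, -5], centers = 3, iter.max = 100, nstart = 25)

fit$cluster       # integer vector of cluster assignments, one per row
fit$centers       # matrix of centroid coordinates, one row per cluster
fit$tot.withinss  # total within-cluster sum of squares
fit$iter          # number of iterations used by the best run
```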
Examples
data(iris)
data_iris <- iris[, -5] # Exclude the species column
results <- KMEANS_FUNCTION(data_iris, k = 3)
print(results$clusters)