KMEANS_FUNCTION {KMEANS.KNN}    R Documentation

KMEANS_FUNCTION

Description

This function implements the K-Means algorithm for data clustering. It provides options for data preprocessing, such as normalization and imputation of missing values.

Usage

KMEANS_FUNCTION(
  data,
  k,
  max_iter = 100,
  nstart = 25,
  distance_metric = "euclidean",
  scale_data = FALSE,
  impute_data = "mean"
)

Arguments

data

A data frame containing the numeric data to be clustered.

k

The number of clusters to form.

max_iter

The maximum number of iterations for the K-Means algorithm.

nstart

The number of times to randomly initialize the centroids.

distance_metric

The distance metric to use: "euclidean" (the default) or "manhattan".

scale_data

A logical value indicating whether the data should be normalized before clustering. Defaults to FALSE.

impute_data

The imputation method for missing values: "mean" (the default), "median", or "mode".
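
The preprocessing controlled by scale_data and impute_data can be pictured with the base R sketch below. It is only an illustration: the helpers impute_column and preprocess are hypothetical names, and the sketch assumes that imputation is applied column by column and that normalization means centering each column and scaling it to unit variance. The package's internal behavior may differ.

```r
# Hypothetical helper: fill NA values in one numeric column.
impute_column <- function(x, method = c("mean", "median", "mode")) {
  method <- match.arg(method)
  if (!anyNA(x)) return(x)
  observed <- x[!is.na(x)]
  fill <- switch(method,
    mean   = mean(observed),
    median = median(observed),
    mode   = {
      # Most frequent observed value; ties resolve to the smallest value.
      counts <- table(observed)
      as.numeric(names(counts)[which.max(counts)])
    }
  )
  x[is.na(x)] <- fill
  x
}

# Hypothetical helper: impute every column, then optionally standardize.
# Assumption: "normalized" means centered and scaled to unit variance.
preprocess <- function(data, scale_data = FALSE, impute_data = "mean") {
  data[] <- lapply(data, impute_column, method = impute_data)
  if (scale_data) {
    data <- as.data.frame(scale(data))
  }
  data
}

toy <- data.frame(a = c(1, 2, NA, 2), b = c(10, NA, 30, 50))
preprocess(toy, impute_data = "median")
preprocess(toy, scale_data = TRUE)
```

Imputing before scaling keeps the filled-in values from distorting the column means and standard deviations that scale() computes on incomplete data.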

Value

A list containing the following elements:

- clusters: A vector indicating the cluster of each point.
- centers: The coordinates of the centroids of each cluster.
- additional_info: Additional information such as total distance and number of iterations.
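
To show how k, max_iter, nstart and distance_metric could fit together with a return value of this shape, here is a minimal base R sketch of K-Means with random restarts. It is not the package's implementation. The names distance_matrix, kmeans_once and kmeans_sketch are hypothetical, "total distance" is assumed to mean the sum of point-to-centroid distances, and the restart with the smallest total is assumed to be the one kept. For simplicity the centroid update uses the column means under both metrics, although the median is the exact minimizer for Manhattan distance.

```r
# Hypothetical helper: n x k matrix of distances from each row to each centroid.
distance_matrix <- function(x, centers, metric) {
  d <- matrix(0, nrow(x), nrow(centers))
  for (j in seq_len(nrow(centers))) {
    diffs <- sweep(x, 2, centers[j, ])  # subtract centroid j from every row
    d[, j] <- if (metric == "euclidean") {
      sqrt(rowSums(diffs^2))
    } else {
      rowSums(abs(diffs))
    }
  }
  d
}

# Hypothetical helper: one run of assignment/update steps from a random start.
kmeans_once <- function(x, k, max_iter, metric) {
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  clusters <- integer(nrow(x))
  iterations <- 0
  for (iter in seq_len(max_iter)) {
    iterations <- iter
    # Assignment step: each point goes to its nearest centroid.
    d <- distance_matrix(x, centers, metric)
    new_clusters <- max.col(-d, ties.method = "first")
    if (all(new_clusters == clusters)) break  # assignments stable: converged
    clusters <- new_clusters
    # Update step: move each non-empty cluster's centroid to its mean.
    for (j in seq_len(k)) {
      members <- x[clusters == j, , drop = FALSE]
      if (nrow(members) > 0) centers[j, ] <- colMeans(members)
    }
  }
  d <- distance_matrix(x, centers, metric)
  total <- sum(d[cbind(seq_len(nrow(x)), clusters)])
  list(clusters = clusters, centers = centers,
       additional_info = list(total_distance = total, iterations = iterations))
}

# Hypothetical wrapper: repeat nstart times and keep the best run.
kmeans_sketch <- function(data, k, max_iter = 100, nstart = 25,
                          distance_metric = c("euclidean", "manhattan")) {
  metric <- match.arg(distance_metric)
  x <- as.matrix(data)
  best <- NULL
  for (s in seq_len(nstart)) {
    run <- kmeans_once(x, k, max_iter, metric)
    if (is.null(best) ||
        run$additional_info$total_distance < best$additional_info$total_distance) {
      best <- run
    }
  }
  best
}

set.seed(1)
fit <- kmeans_sketch(iris[, -5], k = 3, nstart = 5)
table(fit$clusters)
fit$centers
```

Because each run starts from randomly chosen rows, a larger nstart lowers the chance of returning a poor local optimum, at the cost of proportionally more computation.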

Examples

data(iris)
data_iris <- iris[, -5] # Exclude the species column
results <- KMEANS_FUNCTION(data_iris, k = 3)
print(results$clusters)
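
A further example, using only the arguments documented above, switches on normalization and the Manhattan metric and then compares the cluster labels with the species column that was left out. The call is guarded so that the snippet also runs where the package is not installed. The set.seed() call assumes that the function draws its random starts from R's random number generator.

```r
data(iris)
data_iris <- iris[, -5]  # Exclude the species column

if (requireNamespace("KMEANS.KNN", quietly = TRUE)) {
  library(KMEANS.KNN)
  set.seed(42)  # assumption: makes the random initializations reproducible
  results <- KMEANS_FUNCTION(
    data_iris,
    k = 3,
    distance_metric = "manhattan",
    scale_data = TRUE
  )
  # Cross-tabulate cluster labels against the held-out species labels.
  print(table(results$clusters, iris$Species))
  print(results$centers)
}
```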

[Package KMEANS.KNN version 0.1.0 Index]