generate_clusters {rearrr} | R Documentation
Generate n-dimensional clusters
Description
Generates a data.frame (tibble) with clustered groups.
Usage
generate_clusters(
  num_rows,
  num_cols,
  num_clusters,
  compactness = 1.6,
  generator = runif,
  name_prefix = "D",
  cluster_col_name = ".cluster"
)
Arguments
num_rows |
Number of rows. |
num_cols |
Number of columns (dimensions). |
num_clusters |
Number of clusters. |
compactness |
How compact the clusters should be. A larger value leads to more compact clusters (on average). Technically, it is passed to the |
generator |
Function to generate the numeric values. Must have the number of values to generate as its first (and only required) argument, as that is the only argument we pass to it. |
name_prefix |
Prefix string for naming columns. |
cluster_col_name |
Name of cluster factor. |
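As a sketch of the `generator` contract described above: any function that accepts the number of values to draw as its first argument, and needs no other arguments, should qualify. The wrapper name `normal_generator` below is hypothetical; it fixes the extra arguments of `rnorm()` so that only the count has to be passed. The call to `generate_clusters()` is guarded because it assumes rearrr is installed.

```r
# Hypothetical wrapper: takes only the number of values to generate,
# which is the single argument that `generator` receives.
normal_generator <- function(n) {
  rnorm(n, mean = 0, sd = 2)
}

# The wrapper works when called with just a count
vals <- normal_generator(10)
length(vals)

# Pass the wrapper as the generator (requires the rearrr package)
if (requireNamespace("rearrr", quietly = TRUE)) {
  rearrr::generate_clusters(
    num_rows = 20, num_cols = 2, num_clusters = 3,
    generator = normal_generator
  )
}
```

Wrapping the distribution function this way keeps its parameters (here `mean` and `sd`) in one place, since nothing but the count is passed to the generator.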
Details
1. Generates a data.frame with random values using the `generator`.
2. Divides the rows into groups (the clusters).
3. Contracts the distance from each data point to the centroid of its group.
4. Performs MinMax scaling such that the scale of the data points is similar to the generated data.
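The steps above can be illustrated with a minimal base R sketch. This is not the rearrr implementation: the function name `sketch_clusters`, the `multiplier` parameter, the way rows are assigned to groups, and the per-column rescaling are all assumptions made for illustration, and how `compactness` maps to a contraction factor is not reproduced here.

```r
# Illustrative sketch only; NOT the rearrr implementation.
# `sketch_clusters` and `multiplier` are hypothetical names.
sketch_clusters <- function(num_rows, num_cols, num_clusters,
                            multiplier = 0.1, generator = runif) {
  # Step 1: generate random values and shape them into a matrix
  values <- matrix(generator(num_rows * num_cols),
                   nrow = num_rows, ncol = num_cols)
  orig_min <- apply(values, 2, min)
  orig_max <- apply(values, 2, max)

  # Step 2: divide the rows into groups
  # (assumption: cluster labels are cycled over the rows)
  cluster <- rep(seq_len(num_clusters), length.out = num_rows)

  # Step 3: contract each point toward the centroid of its group
  for (k in seq_len(num_clusters)) {
    idx <- cluster == k
    group <- values[idx, , drop = FALSE]
    centroid <- colMeans(group)
    centered <- sweep(group, 2, centroid)            # subtract centroid
    values[idx, ] <- sweep(centered * multiplier, 2, centroid, FUN = "+")
  }

  # Step 4: MinMax scale each column back to the range of the
  # originally generated values
  for (j in seq_len(num_cols)) {
    rng <- range(values[, j])
    values[, j] <- (values[, j] - rng[1]) / (rng[2] - rng[1]) *
      (orig_max[j] - orig_min[j]) + orig_min[j]
  }

  out <- as.data.frame(values)
  names(out) <- paste0("D", seq_len(num_cols))
  out$.cluster <- factor(cluster)
  out
}

set.seed(1)
sketch <- sketch_clusters(num_rows = 20, num_cols = 3, num_clusters = 4)
head(sketch)
```

In this sketch a smaller `multiplier` pulls points closer to their centroid, so after rescaling the groups appear tighter relative to the distances between them.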
Value
data.frame (tibble) with the clustered columns and the cluster grouping factor.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other clustering functions: cluster_groups(), transfer_centroids()
Examples
# Attach packages
library(rearrr)
library(dplyr)
has_ggplot <- require(ggplot2) # Attach if installed
# Set seed
set.seed(10)
# Generate clusters
generate_clusters(num_rows = 20, num_cols = 3, num_clusters = 3, compactness = 1.6)
generate_clusters(num_rows = 20, num_cols = 5, num_clusters = 6, compactness = 2.5)
# Generate clusters and plot them
# Tip: Call this multiple times
# to see the behavior of `generate_clusters()`
if (has_ggplot) {
  generate_clusters(
    num_rows = 50, num_cols = 2,
    num_clusters = 5, compactness = 1.6
  ) %>%
    ggplot(
      aes(x = D1, y = D2, color = .cluster)
    ) +
    geom_point() +
    theme_minimal() +
    labs(x = "D1", y = "D2", color = "Cluster")
}
#
# Plot clusters in 3d view
#
# Generate clusters
clusters <- generate_clusters(
  num_rows = 50, num_cols = 3,
  num_clusters = 5, compactness = 1.6
)
## Not run:
# Plot 3d with plotly
plotly::plot_ly(
  x = clusters$D1,
  y = clusters$D2,
  z = clusters$D3,
  type = "scatter3d",
  mode = "markers",
  color = clusters$.cluster
)
## End(Not run)