ClusteringAlgo {sharp}

R Documentation

(Weighted) clustering algorithm

Description

Runs the (weighted) clustering algorithm specified in the argument `implementation` and returns matrices of variable weights and the co-membership structure. This function does not use stability.

Usage

ClusteringAlgo(
  xdata,
  nc = NULL,
  eps = NULL,
  Lambda = NULL,
  scale = TRUE,
  row = TRUE,
  implementation = HierarchicalClustering,
  ...
)

Arguments

`xdata`	data matrix with observations as rows and variables as columns.
`nc`	matrix of parameters controlling the number of clusters in the underlying algorithm specified in `implementation`. If `nc` is not provided, it is set to `seq(1, nrow(xdata))`.
`eps`	radius in density-based clustering, see `dbscan`. Only used if `implementation=DBSCANClustering`.
`Lambda`	vector of penalty parameters.
`scale`	logical indicating if the data should be scaled to ensure that all variables contribute equally to the clustering of the observations.
`row`	logical indicating if rows (if `row=TRUE`) or columns (if `row=FALSE`) contain the items to cluster.
`implementation`	function to use for clustering. Possible functions include `HierarchicalClustering` (hierarchical clustering), `PAMClustering` (Partitioning Around Medoids), `KMeansClustering` (k-means) and `GMMClustering` (Gaussian Mixture Models). Alternatively, a user-defined function can be used. It must take `xdata` and `Lambda` as arguments and return a binary, symmetric matrix whose diagonal elements are all equal to zero.
`...`	additional parameters passed to the function provided in `implementation`.
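
The binary, symmetric, zero-diagonal matrix described for `implementation` can be illustrated with base R alone. The sketch below is not the package's internal code: the helper `comembership_from_labels` and the toy data are hypothetical, and `hclust` with complete linkage is only one possible choice of algorithm. It scales the variables, clusters the rows, and encodes which pairs of observations share a cluster.

```r
# Hypothetical helper (not part of sharp): turn a vector of cluster labels
# into a binary, symmetric co-membership matrix with a zero diagonal.
comembership_from_labels <- function(labels) {
  # Entry (i, j) is 1 if items i and j have the same label, 0 otherwise
  comembership <- outer(labels, labels, FUN = "==") * 1
  # An item is not counted as co-member of itself
  diag(comembership) <- 0
  comembership
}

# Toy data: 6 observations (rows) and 2 variables (columns),
# drawn around two well-separated centres
set.seed(1)
toy <- rbind(
  matrix(rnorm(6, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(6, mean = 5, sd = 0.1), ncol = 2)
)

# Scale the variables so that each contributes equally to the distances,
# then cluster the rows and cut the tree into 2 clusters
hc <- hclust(dist(scale(toy)), method = "complete")
labels <- cutree(hc, k = 2)

comembership <- comembership_from_labels(labels)
print(comembership)
```

Transposing the data before clustering (`t(toy)`) would group the variables instead of the observations, which is the distinction the `row` argument controls.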

Value

A list with:

`selected`	matrix of binary selection status. Rows correspond to different model parameters. Columns correspond to predictors.
`weight`	matrix of variable weights. Rows correspond to different model parameters. Columns correspond to variables.
`comembership`	array of binary co-membership indicators describing which pairs of items are assigned to the same cluster under each set of model parameters.

Examples


library(sharp)

# Simulation of 15 observations belonging to 3 groups
set.seed(1)
simul <- SimulateClustering(
  n = c(5, 5, 5), pk = 100
)

# Running hierarchical clustering
myclust <- ClusteringAlgo(
  xdata = simul$data, nc = 2:5,
  implementation = HierarchicalClustering
)
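
A possible follow-up, assuming the sharp package is installed, is to look at what the call returns. The sketch below repeats the setup so that it runs on its own and uses only base R inspection functions. It makes no assumption about the dimensions of the returned elements.

```r
# Assumes the sharp package is installed
library(sharp)

# Same simulated data and call as in the example above
set.seed(1)
simul <- SimulateClustering(n = c(5, 5, 5), pk = 100)
myclust <- ClusteringAlgo(
  xdata = simul$data, nc = 2:5,
  implementation = HierarchicalClustering
)

# Names of the elements in the returned list
names(myclust)

# Compact overview of each element (class and dimensions)
str(myclust, max.level = 1)
```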