clusterabilitytest {clusterability}  R Documentation 
Perform a test of clusterability
Description
Performs tests for clusterability of a data set and returns the results in a clusterability object. Optionally reduces the data via PCA or pairwise distances, and standardizes the data, before performing the test.
Usage
clusterabilitytest(data, test, reduction = "pca",
distance_metric = "euclidean", distance_standardize = "std",
pca_center = TRUE, pca_scale = TRUE, is_dist_matrix = FALSE,
completecase = FALSE, d_simulatepvalue = FALSE, d_reps = 2000,
s_m = 999, s_adjust = TRUE, s_digits = 6, s_setseed = NULL,
s_outseed = FALSE)
Arguments
data 
the data set to be used in the test. Must contain only numeric data. 
test 
the test to be performed. Either "dip" or "silverman".
reduction 
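any dimension reduction that is to be performed. Options include "pca" (the default) and "distance", which computes pairwise distances using distance_metric.
For multivariate data, dimension reduction is required, because both tests operate on one-dimensional data.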
distance_metric 
if applicable, the metric to be used in computing pairwise distances. Default is "euclidean".
CAUTION: Not all of the available metrics have been tested; they are provided because they may be useful. If in doubt, use the default "euclidean".
distance_standardize 
how the variables should be standardized, if at all. Default is "std".
pca_center 
if applicable, a logical value indicating whether the variables should be shifted to be zero centered (see prcomp for details). Default is TRUE.
pca_scale 
if applicable, a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place (see prcomp for details). Default is TRUE.
is_dist_matrix 
a logical value indicating whether the data argument is a distance matrix. Default is FALSE.
completecase 
a logical value indicating whether a complete case analysis should be performed. For both tests, missing data must be removed before the test can be performed. This can be done manually by the user or by setting completecase = TRUE. Default is FALSE.
d_simulatepvalue 
for Dip Test, a logical value indicating whether the p-value should be obtained via Monte Carlo simulation. Default is FALSE.
d_reps 
for Dip Test, a positive integer. The number of replicates used in Monte Carlo simulation. Only used if d_simulatepvalue is TRUE. Default is 2000.
s_m 
for Silverman Test, a positive integer. The number of bootstrap replicates used in the test. Default is 999.
s_adjust 
for Silverman Test, a logical value indicating whether p-values are adjusted using the calibration of Hall and York (2001). Default is TRUE.
s_digits 
for Silverman Test, a positive integer indicating the number of digits to which the p-value is rounded. Default is 6.
s_setseed 
for Silverman Test, an integer used to set the seed of the random number generator. If the default value of NULL is used, no seed is set.
s_outseed 
for Silverman Test, a logical value indicating whether to return the state of the random number generator as part of the output. This is used in limited cases for troubleshooting, so the default is FALSE.
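To make the reduction arguments more concrete, the following base R sketch approximates what the two reductions named above produce. It is an illustration under stated assumptions, not the package's internal code: it assumes that the PCA reduction keeps the scores on the first principal component, and that "std" standardization corresponds to centering each variable and scaling it to unit standard deviation.

```r
# Illustrative sketch only; uses base R (stats), not clusterability internals.
data(iris)
x <- as.matrix(iris[, -5])  # 150 x 4 numeric matrix (Species removed)

# Analogue of reduction = "pca" with pca_center = TRUE, pca_scale = TRUE.
# Assumption: the one-dimensional summary is the first principal component.
pca <- prcomp(x, center = TRUE, scale. = TRUE)
pc1 <- pca$x[, 1]

# Analogue of reduction = "distance" with distance_metric = "euclidean".
# Assumption: "std" means each variable has mean 0 and standard deviation 1.
x_std <- scale(x, center = TRUE, scale = TRUE)
d <- as.vector(dist(x_std, method = "euclidean"))

length(pc1)  # one score per observation
length(d)    # one distance per pair of observations, n * (n - 1) / 2
```

Either vector is one-dimensional, which is the form a multimodality test such as the dip test or Silverman's test expects.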
Value
clusterabilitytest returns a clusterability object containing information on the test performed and results. Can be printed using the print.clusterability function.
References
Hall, P. and York, M., 2001. On the calibration of Silverman's test for multimodality. Statistica Sinica, pp.515-536.
Silverman, B.W., 1981. Using kernel density estimates to investigate multimodality. Journal of the Royal Statistical Society. Series B (Methodological), pp.97-99.
Martin Maechler (2016). diptest: Hartigan's Dip Test Statistic for Unimodality - Corrected. R package version 0.75-7. https://CRAN.R-project.org/package=diptest
Schwaiger F, Holzmann H. Package which implements the silvermantest; 2013. Available from: https://www.mathematik.uni-marburg.de/stochastik/R packages/.
See Also
print.clusterability
Examples
### Quick start ###
# Load data and remove Species
data(iris)
iris_num <- iris[, -5]
plot(iris_num)
# Run test using default options
clust_result <- clusterabilitytest(iris_num, "dip")
# Print results
print(clust_result)
### Longer Example: Specifying Parameters ###
# Load data and plot to visualize
data(normals2)
plot(normals2)
# Using Silverman's test, pairwise distances to reduce dimension,
# 1,000 bootstrap replicates, with an RNG seed of 12345
clust_result2 <- clusterabilitytest(normals2, "silverman", reduction = "distance",
                                    s_m = 1000, s_setseed = 12345)
# Print result
print(clust_result2)
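The two ways of handling missing data described under completecase can be sketched as follows. The NA values are injected artificially for illustration, and the names iris_na and iris_cc are hypothetical. The calls to clusterabilitytest run only if the package is installed.

```r
### Sketch: Handling Missing Data ###
data(iris)
iris_num <- iris[, -5]

# Introduce two missing values purely for illustration
iris_na <- iris_num
iris_na[c(3, 10), 1] <- NA

# Option 1: remove incomplete rows manually before testing
iris_cc <- iris_na[complete.cases(iris_na), ]
nrow(iris_cc)  # 148 rows remain

if (requireNamespace("clusterability", quietly = TRUE)) {
  res_manual <- clusterability::clusterabilitytest(iris_cc, "dip")
  print(res_manual)

  # Option 2: let the function perform the complete case analysis
  res_auto <- clusterability::clusterabilitytest(iris_na, "dip",
                                                 completecase = TRUE)
  print(res_auto)
}
```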