dcem_predict {DCEM} | R Documentation |
dcem_predict: Part of DCEM package.
Description
Predict the cluster membership of test data based on the learned parameters, i.e., the output from dcem_train or dcem_star_train.
Usage
dcem_predict(param_list, data)
Arguments
param_list |
(list): List of learned distribution parameters, as returned by dcem_train or dcem_star_train. |
data |
(vector or dataframe): A vector for univariate data. A dataframe (rows represent the observations and columns represent the features) for multivariate data. |
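As an illustration of the two accepted shapes for data, the following base R sketch builds one univariate and one multivariate test set. The object names, column names and values are made up for this example and do not come from the package.

```r
# Univariate test data: a plain numeric vector, one observation per element.
# (Values are illustrative only.)
test_uv <- c(18.2, 71.4, 99.8)

# Multivariate test data: a data frame with one row per observation
# and one column per feature. (Column names are hypothetical.)
test_mv <- data.frame(
  feature_1 = c(1.2, 3.4, 5.6),
  feature_2 = c(0.5, 0.7, 0.9)
)

str(test_uv)
str(test_mv)
```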
Value
A list containing the cluster membership for the test data.
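To make the idea of predicting membership from learned parameters concrete, here is a minimal base R sketch for a univariate Gaussian mixture. It is not the package's implementation: the function predict_membership, the parameter vectors means, sds and priors, and their values are all assumptions chosen for illustration, and they do not reflect the structure of the list returned by dcem_train. Each point is assigned to the component with the highest posterior probability.

```r
# Hypothetical parameters of a three-component univariate Gaussian mixture.
# These names and values are illustrative; they are not DCEM's output format.
means  <- c(20, 70, 100)
sds    <- c(5, 1, 2)
priors <- c(100, 70, 50) / 220

# Assign each point to the component with the highest posterior probability.
predict_membership <- function(x, means, sds, priors) {
  # Prior-weighted density of every point (rows) under every component (columns).
  weighted <- sapply(seq_along(means), function(k) {
    priors[k] * dnorm(x, mean = means[k], sd = sds[k])
  })
  weighted <- matrix(weighted, nrow = length(x))
  # Normalise each row so that it sums to one (posterior probabilities).
  posterior <- weighted / rowSums(weighted)
  # Index of the most probable component for each point.
  max.col(posterior, ties.method = "first")
}

test_points <- c(18, 25, 69.5, 71, 98, 103)
membership <- predict_membership(test_points, means, sds, priors)
print(membership)
```

The sketch works directly with densities for readability; for points far from every component, computing in log space would be a safer choice against numerical underflow.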
References
Parichit Sharma, Hasan Kurban, Mehmet Dalkilic. DCEM: An R package for clustering big data via data-centric modification of Expectation Maximization. SoftwareX, 17, 100944. URL: https://doi.org/10.1016/j.softx.2021.100944
Examples
# Simulate a mixture of univariate samples from three normal distributions
# with means of 20, 70 and 100 and standard deviations of 5, 1 and 2 respectively.
sample_uv_data = as.data.frame(c(rnorm(100, 20, 5), rnorm(70, 70, 1), rnorm(50, 100, 2)))
# Select first few points from each distribution as test data
test_data = as.vector(sample_uv_data[c(1:5, 101:105, 171:175),])
# Remove the test data from the training set
sample_uv_data = as.data.frame(sample_uv_data[-c(1:5, 101:105, 171:175), ])
# Randomly shuffle the samples.
sample_uv_data = as.data.frame(sample_uv_data[sample(nrow(sample_uv_data)),])
# Call dcem_train() on the simulated data with 3 clusters, an iteration
# count of 100 and a threshold of 0.001. Seeding is left at its default.
sample_uv_out = dcem_train(sample_uv_data, num_clusters = 3, iteration_count = 100,
threshold = 0.001)
# Predict the membership for test data
test_data_membership <- dcem_predict(sample_uv_out, test_data)
# Access the output
print(test_data_membership)
[Package DCEM version 2.0.5 Index]