cd.cluster {ACTCD} | R Documentation |
Cluster analysis for cognitive diagnosis based on the Asymptotic Classification Theory
Description
cd.cluster classifies examinees into unlabeled clusters using cluster analysis. Available options include K-means and Hierarchical Agglomerative Cluster Analysis (HACA) with various links.
Usage
cd.cluster (Y, Q, method = c("HACA", "Kmeans"), Kmeans.centers = NULL,
Kmeans.itermax = 10, Kmeans.nstart = 1, HACA.link = c("complete", "ward", "single",
"average", "mcquitty", "median", "centroid"), HACA.cut = NULL)
Arguments
Y |
A required |
Q |
A required |
method |
The clustering algorithm used to classify data. Two options are available, including |
Kmeans.centers |
The number of clusters when |
Kmeans.itermax |
The maximum number of iterations allowed when |
Kmeans.nstart |
The number of random sets to be chosen when |
HACA.link |
The link to be used with HACA. It must be one of |
HACA.cut |
The number of clusters when |
Details
Based on the Asymptotic Classification Theory (Chiu, Douglas & Li, 2009), a sample statistic W (see ACTCD) is calculated from the response matrix and the Q-matrix provided by the user, and is then used as the input for cluster analysis (i.e., K-means or HACA).
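The pipeline described above can be sketched in base R without the package. This is only an illustration under stated assumptions: W is taken to be the attribute-wise sum-score matrix Y %*% Q, following Chiu, Douglas & Li (2009), and the toy Y and Q below are made up. It is not the package's internal code.

```r
# Toy data (hypothetical): 6 examinees, 4 items, 2 attributes.
Y <- matrix(c(1, 1, 1, 1,
              1, 1, 0, 0,
              0, 0, 1, 1,
              0, 0, 0, 0,
              1, 0, 1, 1,
              1, 1, 0, 1),
            nrow = 6, byrow = TRUE)
Q <- matrix(c(1, 0,
              1, 0,
              0, 1,
              0, 1),
            nrow = 4, byrow = TRUE)

# Assumed form of the statistic: one sum score per examinee and attribute.
W <- Y %*% Q

K <- ncol(Q)   # number of attributes
M <- 2^K       # default number of latent clusters

# HACA with complete link, cut into M clusters.
hc <- hclust(dist(W), method = "complete")
haca_class <- cutree(hc, k = M)

# K-means on the same W; the seed only makes the random starts reproducible.
set.seed(1)
km <- kmeans(W, centers = M, iter.max = 10, nstart = 5)
```

Both haca_class and km$cluster are unlabeled memberships: the cluster numbers carry no attribute-profile meaning until a labeling step is applied.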
The number of latent clusters can be specified by the user in Kmeans.centers or HACA.cut. It must be at least 2 and at most 2^K, where K is the number of attributes. Note that if the number of latent clusters is less than the default value (2^K), the clusters cannot be labeled by labeling with the method="1" or method="3" algorithms. See labeling for more information.
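The constraint on the number of clusters can be checked before calling cd.cluster. The helper below is a hypothetical sketch (check_n_clusters is not part of ACTCD); it only encodes the bounds stated above and reports whether the full 2^K clusters were requested.

```r
# Hypothetical helper, not an ACTCD function.
# M: requested number of clusters; K: number of attributes (columns of Q).
check_n_clusters <- function(M, K) {
  upper <- 2^K
  if (M < 2 || M > upper) {
    stop("number of clusters must be between 2 and 2^K = ", upper)
  }
  # TRUE only when the full 2^K clusters are requested; per the text above,
  # fewer clusters cannot be labeled with method = "1" or method = "3".
  M == upper
}

full_k3 <- check_n_clusters(8, 3)     # all 2^3 clusters requested
partial_k3 <- check_n_clusters(5, 3)  # valid, but fewer than 2^3
```

A call such as check_n_clusters(9, 3) stops with an error, since 9 exceeds 2^3.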
Value
W |
The |
size |
A set of integers, indicating the sizes of latent clusters. |
mean.w |
A matrix of cluster centers, representing the average |
wss.w |
The vector of within-cluster sum of squares of |
sqmwss.w |
The vector of square root of mean of within-cluster sum of squares of |
mean.y |
The vector of the mean of sum scores of the clusters. |
class |
The vector of estimated memberships for examinees. |
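To make the per-cluster summaries concrete, the sketch below computes cluster sizes, cluster centers of W, within-cluster sums of squares, and mean sum scores from a membership vector in base R. The data and the names size, mean_w, wss_w and mean_y are illustrative assumptions; the sketch is not the package's implementation and does not cover sqmwss.w.

```r
# Hypothetical W statistics (6 examinees, 2 attributes) and memberships.
W <- matrix(c(2, 2,
              2, 0,
              0, 2,
              0, 0,
              1, 2,
              2, 1),
            ncol = 2, byrow = TRUE)
sum_score <- c(4, 2, 2, 0, 3, 3)   # total score of each examinee
cls <- c(1, 2, 3, 4, 1, 1)         # assumed cluster memberships

# Number of examinees per cluster.
size <- as.vector(table(cls))

# Cluster centers: average W per cluster (one row per cluster).
mean_w <- apply(W, 2, function(col) tapply(col, cls, mean))

# Within-cluster sum of squares of W around each cluster center.
wss_w <- sapply(split(as.data.frame(W), cls),
                function(g) sum(scale(g, scale = FALSE)^2))

# Mean sum score per cluster.
mean_y <- tapply(sum_score, cls, mean)
```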
References
Chiu, C. Y., Douglas, J. A., & Li, X. (2009). Cluster analysis for cognitive diagnosis: theory and applications. Psychometrika, 74(4), 633-665.
See Also
print.cd.cluster, labeling, npar.CDM, ACTCD
Examples
# Classification based on the simulated data and Q matrix
data(sim.dat)
data(sim.Q)
# Information about the dataset
N <- nrow(sim.dat) #number of examinees
J <- nrow(sim.Q) #number of items
K <- ncol(sim.Q) #number of attributes
#the default number of latent clusters is 2^K
cluster.obj <- cd.cluster(sim.dat, sim.Q)
#cluster size
sizeofc <- cluster.obj$size
#W statistics
W <- cluster.obj$W
#User-specified number of latent clusters
M <- 5 # the number of clusters is fixed to 5
cluster.obj <- cd.cluster(sim.dat, sim.Q, method="HACA", HACA.cut=M)
#cluster size
sizeofc <- cluster.obj$size
#W statistics
W <- cluster.obj$W
#User-specified number of latent clusters, now with K-means
M <- 5 # the number of clusters is fixed to 5
cluster.obj <- cd.cluster(sim.dat, sim.Q, method="Kmeans", Kmeans.centers =M)
#cluster size
sizeofc <- cluster.obj$size
#W statistics
W <- cluster.obj$W