R: Constrained Dual Scaling for Successive Categories with...

cds {cds}

R Documentation

Constrained Dual Scaling for Successive Categories with Groups

Description

Uses an alternating nonnegative least squares algorithm combined with a k-means-type algorithm to optimize the constrained group dual scaling criterion outlined in the reference. Parallel computation over the random starts of the grouping matrix is supported via the parallel package.

Usage

cds(x, K = 4, q = NULL, eps.ALS = 0.001, eps.G = 1e-07,
  nr.starts.G = 20, nr.starts.a = 5, maxit.ALS = 20, maxit = 50,
  Gstarts = NULL, astarts = NULL, parallel = FALSE, random.G = FALSE,
  times.a.multistart = 1, info.level = 1, mc.preschedule = TRUE,
  seed = NULL, LB = FALSE, reorder.grps = TRUE, rescale.a = TRUE,
  tol = sqrt(.Machine$double.eps), update.G = TRUE)

Arguments

`x`	an object of class `"dsdata"` (see `cds.sim()`), or a matrix (or object coercible to a matrix) containing the data for n individuals on m objects. The data does not yet contain any additional columns for the rating scale.
`K`	The number of response style groups to look for. If a vector of length greater than one is given, the algorithm is run for each element and a list of class `cdslist` is returned.
`q`	The maximum rating (the scale is assumed to be `1:q`).
`eps.ALS`	Numerical convergence criterion for the alternating least squares part of the algorithm (updates for row and column scores).
`eps.G`	Numerical convergence criterion for the k-means part of the algorithm.
`nr.starts.G`	Number of random starts for the grouping matrix.
`nr.starts.a`	Number of random starts for the row scores.
`maxit.ALS`	Maximum number of iterations for the ALS part of the algorithm. A warning is given if this maximum is reached, although this is often not a concern.
`maxit`	Maximum number of iterations for the k-means part of the algorithm.
`Gstarts`	Facility to supply a list of explicit starting values for the grouping matrix G. Each start consists of a two-element list: `i`, an integer numbering the start, and `G`, the starting configuration as an indicator matrix.
`astarts`	Supply explicit starts for the a vectors, as a list.
`parallel`	logical. Should parallelization over starts for the grouping matrix be used?
`random.G`	logical. Should the k-means part consider the individuals in a random order?
`times.a.multistart`	The number of times that random starts for the row scores are used. If equal to 1, random starts are used only once for each start of the grouping matrix.
`info.level`	Verbosity of the output. Options are 1, 2, 3 and 4.
`mc.preschedule`	Argument to mclapply under Unix.
`seed`	Random seed for random number generators. Only partially implemented.
`LB`	logical. Load-balancing used in parallelization or not? Windows only.
`reorder.grps`	logical. Use the Hungarian algorithm to reorder group names so that the trace of the confusion matrix is maximized.
`rescale.a`	logical. Rescale the row scores to length sqrt(2n) if TRUE (after the algorithm has converged).
`tol`	Tolerance `tol` passed to `lsei` of the limSolve package. Defaults to `sqrt(.Machine$double.eps)`.
`update.G`	logical. Should the G matrix be updated from its starting configuration? Setting this to FALSE is useful when the clustering is known a priori or is not desired.
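
As an illustration of the `Gstarts` format described above, the following base R sketch builds a list of explicit starts, each a two-element list with an integer `i` and an indicator matrix `G`. The helper `make_G`, the sizes `n` and `K`, and the assumption that `G` has one row per individual and one column per group are all hypothetical and not taken from the package itself.

```r
# Hypothetical helper (not part of cds): turn a vector of group labels
# into an n x K indicator matrix with exactly one 1 per row.
make_G <- function(grp, K) {
  G <- matrix(0, nrow = length(grp), ncol = K)
  G[cbind(seq_along(grp), grp)] <- 1
  G
}

set.seed(1)
n <- 10  # assumed number of individuals
K <- 3   # assumed number of groups

# Three explicit starts, each a two-element list with components i and G.
Gstarts <- lapply(1:3, function(i) {
  grp <- sample(rep_len(1:K, n))  # shuffled labels, so every group is non-empty
  list(i = i, G = make_G(grp, K))
})
```

A list of this shape is what the `Gstarts` argument appears to expect; whether `cds()` imposes further requirements on it is not stated here.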

Details

See the reference for more details.
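
One detail that can be illustrated without the reference is the rescaling applied when `rescale.a = TRUE`. The sketch below assumes that "length" means the Euclidean norm; the vector `a` is random stand-in data, not output from `cds()`.

```r
n <- 5             # assumed number of individuals
set.seed(42)
a <- rnorm(2 * n)  # stand-in for a 2n-vector of row scores

# Scale a so that its Euclidean norm equals sqrt(2n).
a_rescaled <- a * sqrt(2 * n) / sqrt(sum(a^2))

# The squared norm of the rescaled vector is now 2n.
sum(a_rescaled^2)
```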

Value

Object of class ds with elements:

`G`	Grouping indicator matrix.
`K`	Number of groups K.
`opt.crit`	Optimum value of the criterion.
`a`	The 2n-vector of row scores.
`bstar`	The m-vector of object scores.
`bkmat`	The matrix of group-specific boundary scores for the ratings.
`alphamat`	The estimated spline coefficients for each group.
`iter`	The number of iterations used for the optimal random start with respect to the grouping matrix.
`time.G.start`	The number of seconds it took for the algorithm to converge for this optimal random start.
`grp`	The grouping of the individuals as obtained by the algorithm.
`kloss`	Loss value from G update (not equivalent to that of ALS updates).
`hitrate`, `confusion`	Hit rate and confusion matrix, available if the original data object contained a grouping vector.
`loss.G`	Optimality criterion values for the random starts of G.
`q`	The number of ratings in the Likert scale `1:q`.
`time.total`	Total time taken by the algorithm over all random starts.
`call`	The function call.
`data`	The input data object.
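
To make the `confusion` and `hitrate` elements more concrete, the following base R sketch computes a confusion matrix and a hit rate from two label vectors. The labels are made up for illustration, and defining the hit rate as the proportion of individuals on the diagonal of the confusion matrix is an assumption rather than something confirmed by the package.

```r
# Made-up labels for illustration only; not output from cds().
true_grp <- c(1, 1, 2, 2, 3, 3, 1, 2)
est_grp  <- c(1, 1, 2, 3, 3, 3, 1, 2)

# Cross-tabulate the true groups against the estimated groups.
confusion <- table(true = factor(true_grp, levels = 1:3),
                   estimated = factor(est_grp, levels = 1:3))

# Proportion of individuals on the diagonal (assumed definition of hit rate).
hitrate <- sum(diag(confusion)) / sum(confusion)
```

This only gives a meaningful hit rate when the group labels are aligned, which is what `reorder.grps = TRUE` is described as doing by maximizing the trace of the confusion matrix.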

Author(s)

Pieter C. Schoonees

References

Schoonees, P.C., Velden, M. van de & Groenen, P.J.F. (2013). Constrained Dual Scaling for Detecting Response Styles in Categorical Data. (EI report series EI 2013-10). Rotterdam: Econometric Institute.

Examples


set.seed(1234)
dat <- cds.sim()
out <- cds(dat)
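
A further, hedged sketch: per the description of `K`, supplying a vector runs the algorithm once per element and returns a list of class `cdslist`. The reduced numbers of random starts below are chosen only to keep the run short and are not recommended settings; the block is skipped when the cds package is not installed.

```r
# Run only if the cds package is available.
if (requireNamespace("cds", quietly = TRUE)) {
  set.seed(1234)
  dat <- cds::cds.sim()
  # Fit K = 2 and K = 3 groups with few random starts (illustrative values).
  out_list <- cds::cds(dat, K = 2:3, nr.starts.G = 2, nr.starts.a = 2)
  class(out_list)
}
```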

[Package cds version 1.0.3 Index]