| poLCA.simdata {poLCA} | R Documentation | 
Create simulated cross-classification data
Description
Uses the latent class model's assumed data-generating process to create a simulated dataset that can be used to test the properties of the poLCA latent class and latent class regression estimator.
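The data-generating process referred to above can be sketched in base R. This is a minimal illustration for the case without covariates, not the package's own code: `sim_lca_sketch` is a hypothetical helper, and it assumes each matrix in `probs` has one row per latent class and one column per outcome, as in the Examples.

```r
# Hypothetical helper, not part of poLCA: simulate from a basic latent class
# model with no covariates.
sim_lca_sketch <- function(N, probs, P) {
  nclass <- length(P)
  # Step 1: draw each observation's latent class from the mixing proportions.
  trueclass <- sample.int(nclass, size = N, replace = TRUE, prob = P)
  # Step 2: for each manifest variable, draw an outcome using the row of
  # conditional response probabilities that matches the observation's class.
  dat <- lapply(probs, function(pr) {
    vapply(trueclass,
           function(r) sample.int(ncol(pr), size = 1, prob = pr[r, ]),
           integer(1))
  })
  names(dat) <- paste0("Y", seq_along(probs))
  list(dat = as.data.frame(dat), trueclass = trueclass)
}

set.seed(1)
probs <- list(matrix(c(0.9, 0.1,
                       0.2, 0.8), ncol = 2, byrow = TRUE),
              matrix(c(0.7, 0.2, 0.1,
                       0.1, 0.3, 0.6), ncol = 3, byrow = TRUE))
sim <- sim_lca_sketch(N = 500, probs = probs, P = c(0.4, 0.6))
head(sim$dat)
```

Manifest variables are conditionally independent given the class, which is why each one can be drawn separately once `trueclass` is known.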
Usage
poLCA.simdata(N = 5000, probs = NULL, nclass = 2, ndv = 4, 
              nresp = NULL, x = NULL, niv = 0, b = NULL, 
              P = NULL, missval = FALSE, pctmiss = NULL)
Arguments
| N | number of observations. | 
| probs | a list of matrices of dimension  | 
| nclass | number of latent classes. If | 
| ndv | number of manifest variables.  If  | 
| nresp | number of possible outcomes for each manifest variable. If  | 
| x | a matrix of concomitant variables with  | 
| niv | number of concomitant variables (covariates).  Setting  | 
| b | when using covariates, an  | 
| P | a vector of mixing proportions (class population shares) of length  | 
| missval | logical. If  | 
| pctmiss | percentage of values to be dropped as missing, if  | 
Details
Note that entering probs overrides nclass, ndv, and nresp.  It also overrides P if the length of the P vector is not equal to the length of the probs list.  Likewise, if probs=NULL, then length(nresp) overrides ndv and length(P) overrides nclass.  Setting niv>1 causes any user-entered value of P to be disregarded.
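A plausible reason that P is disregarded when covariates are used is that each observation then has its own class probabilities, determined by its covariate values and b. The sketch below illustrates this with a multinomial logit link. It rests on assumptions not stated in this section: that b has niv+1 rows (intercept first) and nclass-1 columns, and that the first class is the reference category. `class_priors_sketch` is a hypothetical helper, not a poLCA function.

```r
# Hypothetical helper, not part of poLCA: observation-specific class
# probabilities from covariates via a multinomial logit link.
class_priors_sketch <- function(x, b) {
  xi <- cbind(1, x)            # prepend an intercept column
  eta <- cbind(0, xi %*% b)    # linear predictors; class 1 fixed at 0 (reference)
  expeta <- exp(eta)
  expeta / rowSums(expeta)     # normalize each row to sum to 1
}

set.seed(2)
x <- matrix(rnorm(5 * 3), ncol = 3)   # 5 observations, 3 covariates
b <- matrix(c(1, -2, 1, -1))          # (3 + 1) x (2 - 1) coefficient matrix
priors <- class_priors_sketch(x, b)
round(priors, 3)
```

Each row of `priors` plays the role that the single vector P plays in the model without covariates.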
Value
| dat | a data frame containing the simulated variables. Variable names for manifest variables are Y1, Y2, etc. Variable names for concomitant variables are X1, X2, etc. | 
| probs | a list of matrices of dimension  | 
| nresp | a vector containing the number of possible outcomes for each manifest variable. | 
| b | coefficients on covariates, if used. | 
| P | mixing proportions corresponding to each latent class. | 
| pctmiss | percent of observations missing. | 
| trueclass | the "true" latent class membership of each simulated observation. | 
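When trueclass is cross-tabulated against predicted class memberships from a fitted model, keep in mind that estimated class labels are arbitrary, so a good fit can appear as a permuted rather than a diagonal table. The following base R sketch scores agreement under the best relabeling. `best_agreement` is a hypothetical helper, not part of poLCA, and it enumerates all permutations, so it is only practical for a small number of classes.

```r
# Hypothetical helper, not part of poLCA: share of observations classified
# correctly under the best one-to-one relabeling of predicted classes.
best_agreement <- function(pred, truth, nclass) {
  tab <- table(factor(pred, levels = seq_len(nclass)),
               factor(truth, levels = seq_len(nclass)))
  # Enumerate all permutations of the class labels.
  perms <- function(v) {
    if (length(v) <= 1) return(list(v))
    out <- list()
    for (i in seq_along(v)) {
      for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
    }
    out
  }
  # For each permutation p, predicted class i is matched to true class p[i].
  scores <- vapply(perms(seq_len(nclass)), function(p) {
    as.numeric(sum(tab[cbind(seq_len(nclass), p)]))
  }, numeric(1))
  max(scores) / length(truth)
}

truth <- c(1, 1, 2, 2, 3, 3)
pred  <- c(2, 2, 3, 3, 1, 1)   # same partition, different labels
best_agreement(pred, truth, nclass = 3)
```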
See Also
Examples
# Create a sample data set with 3 classes and no covariates
# and run poLCA to recover the specified parameters.
library(poLCA)
# Each matrix has one row per latent class and one column per outcome.
probs <- list(matrix(c(0.6, 0.1, 0.3,
                       0.6, 0.3, 0.1,
                       0.3, 0.1, 0.6),
                     ncol = 3, byrow = TRUE), # conditional response probabilities for Y1
              matrix(c(0.2, 0.8,
                       0.7, 0.3,
                       0.3, 0.7),
                     ncol = 2, byrow = TRUE), # conditional response probabilities for Y2
              matrix(c(0.3, 0.6, 0.1,
                       0.1, 0.3, 0.6,
                       0.3, 0.6, 0.1),
                     ncol = 3, byrow = TRUE), # conditional response probabilities for Y3
              matrix(c(0.1, 0.1, 0.5, 0.3,
                       0.5, 0.3, 0.1, 0.1,
                       0.3, 0.1, 0.1, 0.5),
                     ncol = 4, byrow = TRUE), # conditional response probabilities for Y4
              matrix(c(0.1, 0.1, 0.8,
                       0.1, 0.8, 0.1,
                       0.8, 0.1, 0.1),
                     ncol = 3, byrow = TRUE)) # conditional response probabilities for Y5
simdat <- poLCA.simdata(N = 1000, probs = probs, P = c(0.2, 0.3, 0.5))
f1 <- cbind(Y1, Y2, Y3, Y4, Y5) ~ 1
lc1 <- poLCA(f1, simdat$dat, nclass = 3)
# Compare predicted and true class memberships.
table(lc1$predclass, simdat$trueclass)
# Create a sample data set with 2 classes and three covariates.
# Then compare predicted class memberships when the model is
# estimated "correctly" with covariates to when it is estimated
# "incorrectly" without covariates.
simdat2 <- poLCA.simdata(N = 1000, ndv = 7, niv = 3, nclass = 2,
                         b = matrix(c(1, -2, 1, -1)))
f2a <- cbind(Y1, Y2, Y3, Y4, Y5, Y6, Y7) ~ X1 + X2 + X3
lc2a <- poLCA(f2a, simdat2$dat, nclass = 2)
f2b <- cbind(Y1, Y2, Y3, Y4, Y5, Y6, Y7) ~ 1
lc2b <- poLCA(f2b, simdat2$dat, nclass = 2)
table(lc2a$predclass, lc2b$predclass)