poLCA.simdata {poLCA}    R Documentation
Create simulated cross-classification data
Description
Uses the latent class model's assumed data-generating process to create a simulated dataset that can be used to test the properties of the poLCA latent class and latent class regression estimator.
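The data-generating process referred to above can be sketched in base R: each observation is first assigned a latent class drawn from the mixing proportions, and each manifest variable is then drawn from the outcome probabilities for that class. The helper name sim_lca_sketch is hypothetical and this is not the package's own implementation; it assumes one matrix per manifest variable, with one row per class.

```r
# Hypothetical helper, not part of poLCA: simulate from a basic latent
# class model (no covariates, no missing values).
sim_lca_sketch <- function(N, probs, P) {
  nclass <- length(P)
  # Step 1: draw each observation's latent class from the mixing proportions.
  trueclass <- sample.int(nclass, size = N, replace = TRUE, prob = P)
  # Step 2: for each manifest variable, draw an outcome from the row of
  # class-conditional probabilities matching the observation's class.
  dat <- lapply(probs, function(pr) {
    stopifnot(nrow(pr) == nclass)
    vapply(trueclass,
           function(r) sample.int(ncol(pr), size = 1, prob = pr[r, ]),
           integer(1))
  })
  names(dat) <- paste0("Y", seq_along(probs))
  list(dat = as.data.frame(dat), trueclass = trueclass)
}

set.seed(1)
probs <- list(matrix(c(0.9, 0.1,
                       0.2, 0.8), ncol = 2, byrow = TRUE),
              matrix(c(0.7, 0.2, 0.1,
                       0.1, 0.3, 0.6), ncol = 3, byrow = TRUE))
sim <- sim_lca_sketch(N = 500, probs = probs, P = c(0.4, 0.6))
head(sim$dat)
table(sim$trueclass)
```

Because the class is drawn before the responses, the manifest variables are independent of one another only conditional on class, which is the local independence assumption an estimator would rely on when recovering the parameters.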
Usage
poLCA.simdata(N = 5000, probs = NULL, nclass = 2, ndv = 4,
nresp = NULL, x = NULL, niv = 0, b = NULL,
P = NULL, missval = FALSE, pctmiss = NULL)
Arguments
N |
number of observations. |
probs |
a list of matrices of dimension |
nclass |
number of latent classes. If |
ndv |
number of manifest variables. If |
nresp |
number of possible outcomes for each manifest variable. If |
x |
a matrix of concomitant variables with |
niv |
number of concomitant variables (covariates). Setting |
b |
when using covariates, an |
P |
a vector of mixing proportions (class population shares) of length |
missval |
logical. If |
pctmiss |
percentage of values to be dropped as missing, if |
Details
Note that entering probs overrides nclass, ndv, and nresp. It also overrides P if the length of the P vector is not equal to the length of the probs list. Likewise, if probs=NULL, then length(nresp) overrides ndv and length(P) overrides nclass. Setting niv>1 causes any user-entered value of P to be disregarded.
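The precedence among the dimension arguments can be illustrated with a small base R sketch. The function resolve_dims_sketch is hypothetical and only mirrors the rules stated above for nclass, ndv, and nresp; it assumes each matrix in probs has one row per class, and it deliberately leaves out the handling of P and of covariates.

```r
# Hypothetical helper, not part of poLCA: resolve model dimensions
# following the precedence rules described in Details.
resolve_dims_sketch <- function(probs = NULL, nclass = 2, ndv = 4,
                                nresp = NULL, P = NULL) {
  if (!is.null(probs)) {
    # probs takes precedence over nclass, ndv, and nresp.
    ndv    <- length(probs)                    # one matrix per manifest variable
    nclass <- nrow(probs[[1]])                 # assumed: one row per class
    nresp  <- vapply(probs, ncol, integer(1))  # one column per outcome
  } else {
    # Without probs, the lengths of nresp and P take precedence.
    if (!is.null(nresp)) ndv <- length(nresp)
    if (!is.null(P)) nclass <- length(P)
  }
  # nresp is left NULL here when neither probs nor nresp is supplied.
  list(nclass = nclass, ndv = ndv, nresp = nresp)
}

# ndv = 4 and nclass = 2 are the defaults, but both are overridden here.
d <- resolve_dims_sketch(nresp = c(2, 3, 3), P = c(0.2, 0.3, 0.5))
str(d)
```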
Value
dat |
a data frame containing the simulated variables. Variable names for manifest variables are Y1, Y2, etc. Variable names for concomitant variables are X1, X2, etc. |
probs |
a list of matrices of dimension |
nresp |
a vector containing the number of possible outcomes for each manifest variable. |
b |
coefficients on covariates, if used. |
P |
mixing proportions corresponding to each latent class. |
pctmiss |
percent of observations missing. |
trueclass |
a vector containing the simulated ("true") latent class membership of each observation. |
See Also
Examples
# Create a sample data set with 3 classes and no covariates
# and run poLCA to recover the specified parameters.
probs <- list(matrix(c(0.6, 0.1, 0.3,
0.6, 0.3, 0.1,
0.3, 0.1, 0.6),
ncol = 3,byrow = TRUE), # conditional resp prob to Y1
matrix(c(0.2, 0.8,
0.7, 0.3,
0.3, 0.7),
ncol = 2, byrow = TRUE), # conditional resp prob to Y2
matrix(c(0.3, 0.6, 0.1,
0.1, 0.3, 0.6,
0.3, 0.6, 0.1),
ncol = 3,byrow = TRUE), # conditional resp prob to Y3
matrix(c(0.1, 0.1, 0.5, 0.3,
0.5, 0.3, 0.1, 0.1,
0.3, 0.1, 0.1, 0.5),
ncol = 4,byrow = TRUE), # conditional resp prob to Y4
matrix(c(0.1, 0.1, 0.8,
0.1, 0.8, 0.1,
0.8, 0.1, 0.1),
ncol = 3,
byrow = TRUE)) # conditional resp prob to Y5
simdat <- poLCA.simdata(N=1000,probs,P=c(0.2,0.3,0.5))
f1 <- cbind(Y1,Y2,Y3,Y4,Y5)~1
lc1 <- poLCA(f1,simdat$dat,nclass=3)
table(lc1$predclass,simdat$trueclass)
# Create a sample dataset with 2 classes and three covariates.
# Then compare predicted class memberships when the model is
# estimated "correctly" with covariates to when it is estimated
# "incorrectly" without covariates.
simdat2 <- poLCA.simdata(N=1000,ndv=7,niv=3,nclass=2,b=matrix(c(1,-2,1,-1)))
f2a <- cbind(Y1,Y2,Y3,Y4,Y5,Y6,Y7)~X1+X2+X3
lc2a <- poLCA(f2a,simdat2$dat,nclass=2)
f2b <- cbind(Y1,Y2,Y3,Y4,Y5,Y6,Y7)~1
lc2b <- poLCA(f2b,simdat2$dat,nclass=2)
table(lc2a$predclass,lc2b$predclass)
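To show how covariates can drive class membership in a simulation like the second example, the following base R sketch computes per-observation class probabilities from a multinomial logit and then draws a class for each observation. It is a sketch under stated assumptions, not the package's code: the first class is taken as the reference category, the first row of b is treated as an intercept, and the covariates are drawn as standard normal.

```r
set.seed(2)
N <- 1000
b <- matrix(c(1, -2, 1, -1))   # same coefficient values as in the example above

# Assumed design: an intercept column plus three standard normal covariates.
X <- cbind(1, matrix(rnorm(N * 3), ncol = 3))

eta    <- X %*% b                    # N x (nclass - 1) linear predictors
expeta <- cbind(1, exp(eta))         # reference class contributes exp(0) = 1
prior  <- expeta / rowSums(expeta)   # N x nclass membership probabilities

# Draw each observation's class from its own row of probabilities.
trueclass <- apply(prior, 1, function(p) sample.int(length(p), 1, prob = p))

summary(prior[, 2])
table(trueclass)
```

Under this setup the class shares vary from one observation to the next, so a single vector of mixing proportions no longer describes the data.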