lmest {LMest} | R Documentation |
Estimate Latent Markov models for categorical responses
Description
Main function for estimating Latent Markov (LM) models for categorical responses.
Usage
lmest(responsesFormula = NULL, latentFormula = NULL,
      data, index, k = 1:4, start = 0,
      modSel = c("BIC", "AIC"), modBasic = 0,
      modManifest = c("LM", "FM"),
      paramLatent = c("multilogit", "difflogit"),
      weights = NULL, tol = 10^-8, maxit = 1000,
      out_se = FALSE, q = NULL, output = FALSE,
      parInit = list(piv = NULL, Pi = NULL, Psi = NULL,
                     Be = NULL, Ga = NULL, mu = NULL,
                     al = NULL, be = NULL, si = NULL,
                     rho = NULL, la = NULL, PI = NULL,
                     fixPsi = FALSE),
      fort = TRUE, seed = NULL, ntry = 0)
Arguments
responsesFormula |
a symbolic description of the model to fit. A detailed description is given in the ‘Details’ section |
latentFormula |
a symbolic description of the model to fit. A detailed description is given in the ‘Details’ section |
data |
a |
index |
a character vector with two elements, the first indicating the name of the unit identifier, and the second the time occasions |
k |
an integer vector specifying the number of latent states (default: 1:4) |
start |
type of starting values (0 = deterministic, 1 = random, 2 = initial values in input) |
modSel |
a string indicating the model selection criterion: "BIC" for the Bayesian Information Criterion and "AIC" for the Akaike Information Criterion |
modBasic |
model on the transition probabilities (0 for time-heterogeneity, 1 for time-homogeneity, from 2 to (TT-1) partial time-homogeneity of a certain order) |
modManifest |
model for manifest distribution ( |
paramLatent |
type of parametrization for the transition probabilities ("multilogit" = standard multinomial logit for every row of the transition matrix, "difflogit" = multinomial logit based on the difference between two sets of parameters) |
weights |
an optional vector of weights for the available responses |
tol |
tolerance level for convergence |
maxit |
maximum number of iterations of the algorithm |
out_se |
to compute the information matrix and standard errors |
q |
number of support points for the AR(1) process (if modManifest = "FM") |
output |
to return additional output: |
parInit |
list of initial model parameters when start = 2 |
fort |
to use fortran routines when possible |
seed |
an integer value with the random number generator state |
ntry |
to set the number of random initializations |
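To make the role of modSel concrete, the following sketch computes AIC and BIC from their textbook definitions and picks the number of states that minimises each. It is not the package's internal code: the helper name info_criteria is hypothetical, and the log-likelihoods, parameter counts, and sample size are invented for illustration.

```r
# Textbook definitions: lk = maximised log-likelihood, np = number of free
# parameters, n = sample size. The helper name is hypothetical.
info_criteria <- function(lk, np, n) {
  data.frame(aic = -2 * lk + 2 * np,
             bic = -2 * lk + log(n) * np)
}

k  <- 1:4                                # candidate numbers of latent states
lk <- c(-1500, -1400, -1380, -1375)      # invented log-likelihoods
np <- c(4, 11, 20, 31)                   # invented parameter counts
crit <- data.frame(k = k, info_criteria(lk, np, n = 500))

# The selected model is the one with the smallest criterion value.
best_aic <- crit$k[which.min(crit$aic)]
best_bic <- crit$k[which.min(crit$bic)]
```

With these invented numbers AIC selects three states and BIC selects two, because the BIC penalty log(n) per parameter is larger than 2 when n = 500.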
Details
lmest is a general function for estimating LM models for categorical responses. The function requires data in long format and two additional columns indicating the unit identifier and the time occasions.
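As a sketch of the long format described here, the snippet below stacks a small wide data frame with base R reshape(). The data and the column names y.1, y.2, y.3 are invented for illustration; only base R is used.

```r
# Hypothetical wide data: one row per unit, one column per time occasion.
wide <- data.frame(id  = 1:3,
                   y.1 = c(0, 1, 2),
                   y.2 = c(1, 1, 0),
                   y.3 = c(2, 0, 1))

# Stack y.1, y.2, y.3 into a single response column "y" and record the
# occasion in a "time" column.
long <- reshape(wide,
                direction = "long",
                idvar = "id",
                varying = c("y.1", "y.2", "y.3"),
                v.names = "y",
                timevar = "time",
                times = 1:3)

# Sort by unit and occasion so each unit's records are contiguous.
long <- long[order(long$id, long$time), ]
rownames(long) <- NULL
long
```

A data frame shaped like this could then be supplied as data = long together with index = c("id", "time").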
Covariates are allowed to affect the manifest distribution (measurement model) or the initial and transition probabilities (latent model). Two different formulas are employed to specify the different LM models, responsesFormula and latentFormula:
responsesFormula is used to specify the measurement model:
  responsesFormula = y1 + y2 ~ NULL
    the LM model without covariates and two responses (y1 and y2) is specified;
  responsesFormula = NULL
    all the columns in the data except the "id" and "time" columns are used as responses to estimate the LM model without covariates;
  responsesFormula = y1 ~ x1 + x2
    the univariate LM model with response (y1) and two covariates (x1 and x2) in the measurement model is specified;
latentFormula is used to specify the LM model with covariates in the latent model:
  responsesFormula = y1 + y2 ~ NULL
  latentFormula = ~ x1 + x2 | x3 + x4
    the LM model with two responses (y1 and y2), two covariates affecting the initial probabilities (x1 and x2), and two others affecting the transition probabilities (x3 and x4) is specified;
  responsesFormula = y1 + y2 ~ NULL
  latentFormula = ~ 1 | x1 + x2 (or latentFormula = ~ NULL | x1 + x2)
    the covariates affect only the transition probabilities and an intercept is specified for the initial probabilities;
  responsesFormula = y1 + y2 ~ NULL
  latentFormula = ~ x1 + x2
    the LM model with two covariates (x1 and x2) affecting both the initial and transition probabilities is specified;
  responsesFormula = y1 + y2 ~ NULL
  latentFormula = ~ NULL | NULL (or latentFormula = ~ 1 | 1)
    the LM model with only an intercept on the initial and transition probabilities is specified.
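The structure of these formulas can be inspected with base R alone. The sketch below only takes the formula objects apart to show which variables sit on each side of the "|" separator; it makes no claim about how the package parses them internally, and the variable names are the placeholders used above.

```r
# Measurement model without covariates: two responses, NULL right-hand side.
resp <- y1 + y2 ~ NULL
all.vars(resp)                 # names of the response variables

# Latent model: the part before "|" refers to the initial probabilities,
# the part after it to the transition probabilities.
lat <- ~ x1 + x2 | x3 + x4
rhs <- lat[[2]]                # the call `|`(x1 + x2, x3 + x4)
initial_vars    <- all.vars(rhs[[2]])
transition_vars <- all.vars(rhs[[3]])
```

Because "|" binds more tightly than "~" in R, a one-sided formula of this kind always has the "|" call as its single right-hand-side element, which is what makes the split above possible.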
The function also allows us to deal with missing responses, including drop-out and non-monotonic missingness, under the missing-at-random assumption. Missing values for the covariates are not allowed. The LM model with individual covariates in the measurement model is estimated only for complete univariate responses.
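A pre-flight check along these lines can catch missing covariate values before a model is fitted. This is a minimal base R sketch under stated assumptions: the helper check_missing and the toy data are hypothetical and are not part of the package.

```r
# Hypothetical long-format data: the response y has missing values,
# the covariate x1 is complete.
dat <- data.frame(id   = rep(1:2, each = 3),
                  time = rep(1:3, times = 2),
                  y    = c(0, NA, 1, 2, 1, NA),
                  x1   = c(1, 1, 1, 0, 0, 0))

# Hypothetical helper: stop if any covariate contains NA, otherwise
# return the number of missing values in each response column.
check_missing <- function(data, responses, covariates) {
  bad <- covariates[vapply(data[covariates], anyNA, logical(1))]
  if (length(bad) > 0) {
    stop("missing values in covariates: ", paste(bad, collapse = ", "))
  }
  colSums(is.na(data[responses]))
}

n_missing <- check_missing(dat, responses = "y", covariates = "x1")
```

Here the check passes and reports two missing responses; a data frame with NA in x1 would instead raise an error.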
For continuous outcomes see the function lmestCont.
Value
Returns an object of class 'LMbasic' for the model without covariates (see LMbasic-class), or an object of class 'LMmanifest' for the model with covariates on the manifest model (see LMmanifest-class), or an object of class 'LMlatent' for the model with covariates on the latent model (see LMlatent-class).
Author(s)
Francesco Bartolucci, Silvia Pandolfi, Fulvia Pennoni, Alessio Farcomeni, Alessio Serafini
References
Bartolucci, F., Pandolfi, S. and Pennoni, F. (2017) LMest: An R Package for Latent Markov Models for Longitudinal Categorical Data, Journal of Statistical Software, 81(4), 1-38.
Bartolucci, F., Farcomeni, A. and Pennoni, F. (2013) Latent Markov Models for Longitudinal Data, Chapman and Hall/CRC Press.
Examples
### Basic LM model
data("data_SRHS_long")
SRHS <- data_SRHS_long[1:2400,]
# Categories rescaled to vary from 0 ("poor") to 4 ("excellent")
SRHS$srhs <- 5 - SRHS$srhs
out <- lmest(responsesFormula = srhs ~ NULL,
             index = c("id","t"),
             data = SRHS,
             k = 3,
             start = 1,
             modBasic = 1,
             seed = 123)
out
summary(out)
## Not run:
# Basic LM model with model selection using BIC
out1 <- lmest(responsesFormula = srhs ~ NULL,
              index = c("id","t"),
              data = SRHS,
              k = 1:5,
              tol = 1e-8,
              modBasic = 1,
              seed = 123, ntry = 2)
out1
out1$Bic
# Basic LM model with model selection using AIC
out2 <- lmest(responsesFormula = srhs ~ NULL,
              index = c("id","t"),
              data = SRHS,
              k = 1:5,
              tol = 1e-8,
              modBasic = 1,
              modSel = "AIC",
              seed = 123, ntry = 2)
out2
out2$Aic
# Criminal data
data(data_criminal_sim)
data_criminal_sim <- data.frame(data_criminal_sim)
responsesFormula <- lmestFormula(data = data_criminal_sim,
                                 response = "y")$responsesFormula
out3 <- lmest(responsesFormula = responsesFormula,
              index = c("id","time"),
              data = data_criminal_sim,
              k = 1:7,
              modBasic = 1,
              tol = 10^-4)
out3
# Example of drug consumption data
data("data_drug")
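# Drop column 6 (used below as weights) and subtract 1 from the responses
long <- data_drug[, -6] - 1
# Add a unit identifier, then stack the response columns into long format
long <- data.frame(id = 1:nrow(long), long)
long <- reshape(long, direction = "long",
                idvar = "id",
                varying = list(2:ncol(long)))
# responsesFormula is omitted, so all columns except "id" and "time" are responses
out4 <- lmest(index = c("id","time"),
              k = 3,
              data = long,
              weights = data_drug[, 6],
              modBasic = 1)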
out4
summary(out4)
### LM model with covariates in the latent model
# Covariates: gender, race, educational level (2 columns), age and age^2
out5 <- lmest(responsesFormula = srhs ~ NULL,
              latentFormula = ~
                I(gender - 1) +
                I(0 + (race == 2) + (race == 3)) +
                I(0 + (education == 4)) +
                I(0 + (education == 5)) +
                I(age - 50) + I((age - 50)^2/100),
              index = c("id","t"),
              data = SRHS,
              k = 2,
              paramLatent = "multilogit",
              start = 0)
out5
summary(out5)
### LM model with the above covariates in the measurement model
out6 <- lmest(responsesFormula = srhs ~ -1 +
                I(gender - 1) +
                I(0 + (race == 2) + (race == 3)) +
                I(0 + (education == 4)) +
                I(0 + (education == 5)) + I(age - 50) +
                I((age - 50)^2/100),
              index = c("id","t"),
              data = SRHS,
              k = 2,
              modManifest = "LM",
              out_se = TRUE,
              tol = 1e-8,
              start = 1,
              seed = 123)
out6
summary(out6)
## End(Not run)