| lca.model {LCAextend} | R Documentation |
fits latent class models for phenotypic measurements in pedigrees with or without familial dependence using an Expectation-Maximization (EM) algorithm
Description
This is the main function for fitting latent class models. It performs some checks of the pedigrees (it exits if an individual has only one
parent in the pedigree, if no child is in the pedigree or if there
are not enough individuals for parameter estimation) and of the
initial values (positivity of the probabilities and their summation to
one). For models with familial dependence, a child's latent class
depends on the parents' classes via
triplet-transition probabilities. For models without
familial dependence, it performs the classical Latent
Class Analysis (LCA), in which all individuals are assumed independent
and the pedigree structure plays no role. The EM algorithm stops when
the difference in log-likelihood between successive iterations is smaller than tol, which
is set by the user.
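The stopping rule can be illustrated in base R. This is only a sketch, not the package's internal code: the log-likelihood trace is made up, and the comparison is assumed to be between successive iterations.

```r
# Tolerance as in the default of the tol argument.
tol <- 0.001

# Made-up log-likelihood values from four EM iterations (illustrative only).
ll_trace <- c(-520.40, -498.10, -497.30, -497.2995)

# Absolute change in log-likelihood between successive iterations.
gain <- abs(diff(ll_trace))

# First iteration at which the change drops below tol, i.e. where EM would stop.
stop_at <- which(gain < tol)[1] + 1
stop_at
```

A smaller tol makes the algorithm run longer before it declares convergence.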
Usage
lca.model(ped, probs, param, optim.param, fit = TRUE,
optim.probs.indic = c(TRUE, TRUE, TRUE, TRUE), tol = 0.001,
x = NULL, var.list = NULL, famdep = TRUE, modify.init = NULL)
Arguments
ped |
a matrix or data frame representing pedigrees and measurements: |
probs |
a list of initial probability parameters (see below
for more details). The function |
param |
a list of initial measurement distribution parameters (see below for more details). The function |
optim.param |
a variable indicating how measurement distribution parameter optimization is performed (see below for more details), |
fit |
a logical variable, if |
optim.probs.indic |
a vector of logical values indicating which probability parameters to estimate, |
tol |
a small number governing the stopping rule of the EM algorithm. Default is 0.001, |
x |
a matrix of covariates (optional), default is NULL, |
var.list |
a list of integers indicating the columns of
|
famdep |
a logical variable indicating if familial dependence model is used or not. Default is TRUE, |
modify.init |
a function to modify initial values of the EM algorithm, or NULL (the default) |
Details
The symptom status vector (column 6 of ped) takes value 1 for
subjects that have been
examined and show no symptoms (i.e. completely unaffected
subjects). When applying the LCA to
measurements available on all subjects, the status vector must take the
value of 2 for every individual with measurements.
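A minimal sketch of that recoding on a toy data frame follows. The column names, the identifier coding and the assumption that measurements start in column 7 are hypothetical; only the position of the status vector (column 6) is taken from the text above.

```r
# Toy pedigree-like data frame (hypothetical names and values).
ped_toy <- data.frame(
  fam    = c(1, 1, 1, 1),
  id     = c(1, 2, 3, 4),
  father = c(0, 0, 1, 1),
  mother = c(0, 0, 2, 2),
  sex    = c(1, 2, 1, 2),
  status = c(1, 2, 2, 1),   # column 6: symptom status
  y1     = c(2, 3, 1, NA),  # assumed measurement columns
  y2     = c(1, 3, 2, NA)
)

# Rows with at least one observed measurement.
meas <- ped_toy[, 7:ncol(ped_toy), drop = FALSE]
has_meas <- rowSums(!is.na(meas)) > 0

# Set the status to 2 for every individual with measurements.
ped_toy[has_meas, 6] <- 2
ped_toy[, 6]
```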
probs is a list of initial probability parameters.
For models with familial dependence:
p: a probability vector, where p[c] is the probability that a symptomatic founder is in class c, for c>=1,
p0: the probability that a founder without symptoms is in class 0,
p.trans: an array of dimension K times K+1 times K+1, where K is the number of latent classes of the model, such that p.trans[c_i,c_1,c_2] is the conditional probability that a symptomatic individual i is in class c_i given that his parents are in classes c_1 and c_2,
p0connect: a vector of length K, where p0connect[c] is the probability that a connector without symptoms is in class 0, given that one of his parents is in class c>=1 and the other in class 0,
p.found: the probability that a founder is symptomatic,
p.child: the probability that a child is symptomatic.
For models without familial dependence, all individuals are independent:
p: a probability vector, where p[c] is the probability that a symptomatic individual is in class c, for c>=1,
p0: the probability that an individual without symptoms is in class 0,
p.aff: the probability that an individual is symptomatic.
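The two layouts of probs can be sketched in base R as follows. The number of classes and all numeric values are arbitrary illustrations, and the sketch assumes that p.trans is normalized over the child class (first index); the object names probs_famdep and probs_indep are hypothetical.

```r
K <- 3  # number of latent classes (illustrative choice)

# Uniform triplet-transition array of dimension K x (K+1) x (K+1):
# for each pair of parental classes, the K child-class probabilities sum to 1.
p.trans <- array(1 / K, dim = c(K, K + 1, K + 1))

# Initial values for a model with familial dependence.
probs_famdep <- list(
  p         = rep(1 / K, K),
  p0        = 0.5,
  p.trans   = p.trans,
  p0connect = rep(0.5, K),
  p.found   = 0.5,
  p.child   = 0.5
)

# Initial values for a model without familial dependence.
probs_indep <- list(p = rep(1 / K, K), p0 = 0.5, p.aff = 0.5)

# Checks in the spirit of "positivity and summation to one".
trans_sums <- apply(probs_famdep$p.trans, c(2, 3), sum)
all(abs(trans_sums - 1) < 1e-12)
abs(sum(probs_famdep$p) - 1) < 1e-12
```

Uniform starting values are only a convenient default for the sketch; other starting points are equally valid as long as the constraints hold.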
param is a list of measurement distribution parameters: the coefficients alpha (cumulative logistic coefficients, see alpha.compute) in
the case of discrete or ordinal data, and the means mu and variance-covariance matrices sigma in the case of continuous data.
optim.param is a variable indicating how the measurement distribution parameters are estimated in the M step. Two possibilities,
optim.noconst.ordi and optim.const.ordi, are available for discrete or ordinal measurements, and four possibilities are available for continuous measurements:
optim.indep.norm (measurements are independent, diagonal variance-covariance matrix),
optim.diff.norm (general variance-covariance matrix but equal for all classes),
optim.equal.norm (variance-covariance matrices differ between classes but, within a class, all variances are equal and all covariances are equal) and
optim.gene.norm (general variance-covariance matrices for all classes).
One of the allowed values of optim.param must be entered without quotes.
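The difference between a quoted and an unquoted value can be illustrated with a base R sketch. It assumes that the allowed values of optim.param are function objects; toy_m_step and run_m_step are hypothetical stand-ins, and whether lca.model itself performs such a check is not stated on this page.

```r
# Hypothetical stand-in for an M-step routine; not part of LCAextend.
toy_m_step <- function(y) mean(y)

# Hypothetical caller that expects a function object, not its name as a string.
run_m_step <- function(optim.param, y) {
  if (!is.function(optim.param)) {
    stop("optim.param must be a function object, not a character string")
  }
  optim.param(y)
}

# Unquoted: the function object itself is passed and can be called.
ok <- run_m_step(toy_m_step, c(1, 2, 3))

# Quoted: only a character string is passed, so the call fails.
bad <- tryCatch(run_m_step("toy_m_step", c(1, 2, 3)),
                error = function(e) conditionMessage(e))
```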
optim.probs.indic is a vector of logical values of length 4 for
models with familial dependence and 2 for models without familial
dependence.
For models with familial dependence:
optim.probs.indic[1] indicates whether p0 will be estimated or not,
optim.probs.indic[2] indicates whether p0connect will be estimated or not,
optim.probs.indic[3] indicates whether p.found will be estimated or not,
optim.probs.indic[4] indicates whether p.connect will be estimated or not.
For models without familial dependence:
optim.probs.indic[1] indicates whether p0 will be estimated or not,
optim.probs.indic[2] indicates whether p.aff will be estimated or not.
All defaults are TRUE. If the dataset contains only nuclear families, there is no information to estimate p0connect and p.connect, and these parameters will not be estimated, irrespective of the indicator value.
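As an illustration, an indicator vector for a model with familial dependence can be built with temporary labels and then passed positionally. The labels follow the order given above and are only a readability aid assumed for this sketch; the choice of which parameters to switch off is arbitrary.

```r
# Labelled indicators, in the order described for models with familial dependence.
indic <- c(p0 = TRUE, p0connect = FALSE, p.found = TRUE, p.connect = FALSE)

# Parameters flagged as not to be estimated.
not_estimated <- names(indic)[!indic]

# Plain positional vector of length 4, in the form shown in Usage.
optim.probs.indic <- unname(indic)
optim.probs.indic
```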
Value
The function returns a list of 4 elements:
param |
the Maximum Likelihood Estimator (MLE) of the
measurement distribution parameters if |
probs |
the MLE of probability parameters if |
When measurements are available on all subjects, the probability parameters p0 and p0connect degenerate to 0 and
p.found, p.child and p.aff to 1 in the output.
weight |
an array of dimension |
ll |
the maximum log-likelihood value (log-ML) if |
References
TAYEB, A., LABBE, A., BUREAU, A. and MERETTE, C. (2011) Solving Genetic Heterogeneity in Extended
Families by Identifying Sub-types of Complex Diseases. Computational Statistics, 26(3): 539-560. DOI: 10.1007/s00180-010-0224-2.
LABBE, A., BUREAU, A. and MERETTE, C. (2009) Integration of Genetic Familial Dependence Structure in Latent Class Models. The International Journal of Biostatistics, 5(1): Article 6.
Examples
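# load the package and the example pedigree data
library(LCAextend)
data(ped.ordi)
fam <- ped.ordi[, 1]  # family identifiers (column 1)
# initial measurement distribution and probability parameters
data(param.ordi)
data(probs)
# fit the model on the first two families of ped.ordi only
lca.model(ped.ordi[fam %in% 1:2, ], probs, param.ordi, optim.noconst.ordi,
          fit = TRUE, optim.probs.indic = c(TRUE, TRUE, TRUE, TRUE), tol = 0.001,
          x = NULL, var.list = NULL, famdep = TRUE, modify.init = NULL)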