sllim {xLLiM} | R Documentation |
EM Algorithm for Student Locally Linear Mapping
Description
EM Algorithm for Student Locally Linear Mapping
Usage
sllim(tapp,yapp,in_K,in_r=NULL,maxiter=100,Lw=0,cstr=NULL,verb=0,in_theta=NULL,
in_phi=NULL)
Arguments
tapp |
An Lt x N matrix of training responses, with variables in rows and observations in columns |
yapp |
A D x N matrix of training covariates, with variables in rows and observations in columns |
in_K |
Initial number of components |
in_r |
Initial assignments (default NULL) |
maxiter |
Maximum number of iterations (default 100). The algorithm stops if the number of iterations exceeds maxiter |
Lw |
Number of hidden components (default 0) |
cstr |
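Constraints on the covariance matrices of X, given as a list such as cstr=list(Sigma="i"). See the Details section for the available options |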
verb |
Verbosity: print out the progression of the algorithm. If verb=0 (default), nothing is printed; if verb=1, the progression is printed |
in_theta |
Initial parameters (default NULL), same structure as the output of this function |
in_phi |
Initial parameters (default NULL), same structure as the output of this function |
Details
This function implements the robust counterpart of the GLLiM model and should be applied when outliers are present in the data.
The SLLiM model implemented in this function addresses the following non-linear mapping issue:
$E(Y \mid X = x) = g(x)$,
where $Y$ is an L-vector of multivariate responses and $X$ is a large D-vector of covariates' profiles such that $D \gg L$. The methods implemented in this package aim at estimating the non-linear regression function $g$.
First, the methods of this package are based on an inverse regression strategy. The inverse conditional relation $p(X \mid Y)$ is specified in such a way that the forward relation of interest $p(Y \mid X)$ can be deduced in closed form. Under some hypotheses on covariance structures, the large number $D$ of covariates is handled by this inverse regression trick, which acts as a dimension reduction technique. The number of parameters to estimate is therefore drastically reduced. Second, we propose to approximate the non-linear regression function by a piecewise affine function. Therefore, a hidden discrete variable $Z$ is introduced, in order to divide the space into $K$ regions such that an affine model holds between responses Y and variables X in each region $k$:
$X = \sum_{k=1}^{K} I_{\{Z=k\}} (A_k Y + b_k + E_k)$,
where $A_k$ is a $D \times L$ matrix of coefficients for regression $k$, $b_k$ is a D-vector of intercepts and $E_k$ is a noise term with covariance matrix proportional to $\Sigma_k$.
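To make the piecewise affine inverse model concrete, the following base-R sketch simulates data from it. It assumes Gaussian noise for brevity, whereas SLLiM itself uses heavy-tailed Student noise. The function simulate_piecewise_affine and its arguments are hypothetical and are not part of xLLiM.

```r
# Sketch: simulate X = A_k Y + b_k + E_k, where the region k is given by a hidden label Z.
# Assumption: Gaussian noise with standard deviation sigma in every region.
simulate_piecewise_affine <- function(N, A, b, sigma) {
  D <- dim(A)[1]
  L <- dim(A)[2]
  K <- dim(A)[3]
  Z <- sample.int(K, N, replace = TRUE)   # hidden region labels
  Y <- matrix(rnorm(L * N), nrow = L)     # L x N responses
  X <- matrix(0, nrow = D, ncol = N)      # D x N covariates
  for (n in seq_len(N)) {
    k <- Z[n]
    A_k <- matrix(A[, , k], nrow = D)     # keep a D x L matrix even when L = 1
    X[, n] <- A_k %*% Y[, n] + b[, k] + sigma * rnorm(D)
  }
  list(Y = Y, X = X, Z = Z)
}

set.seed(1)
K <- 2; L <- 1; D <- 3
A <- array(rnorm(D * L * K), dim = c(D, L, K))  # one D x L slope matrix per region
b <- matrix(rnorm(D * K), nrow = D)             # one D-vector of intercepts per region
sim <- simulate_piecewise_affine(N = 100, A = A, b = b, sigma = 0.1)
```

The simulated Y and X follow the same layout as tapp and yapp in the Examples section: variables in rows and observations in columns.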
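SLLiM is defined as the following hierarchical generalized Student mixture model for the inverse conditional density $(X \mid Y)$:
$p(X = x \mid Y = y, Z = k; \theta, \phi) = S(x; A_k y + b_k, \Sigma_k, \alpha_k^x, \gamma_k^x)$
$p(Y = y \mid Z = k; \theta, \phi) = S(y; c_k, \Gamma_k, \alpha_k, 1)$
$p(Z = k \mid \phi) = \pi_k$
where $(\theta, \phi)$ are the sets of parameters $\theta = (c_k, \Gamma_k, A_k, b_k, \Sigma_k)$ and $\phi = (\pi_k, \alpha_k)$. In the previous expression, $\alpha_k$ and $(\alpha_k^x, \gamma_k^x)$ determine the heaviness of the tail of the generalized Student distribution, which gives robustness to the model. Note that $\alpha_k^x = \alpha_k + L/2$ and $\gamma_k^x = 1 + \frac{1}{2}\delta(y, c_k, \Gamma_k)$, where $\delta$ is the Mahalanobis distance (in squared form, $\delta(y, c_k, \Gamma_k) = (y - c_k)^\top \Gamma_k^{-1} (y - c_k)$).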
The forward conditional density of interest can be deduced from these equations and is also a Student mixture of regressions model.
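As an illustration of how the tail parameters of the inverse conditional density depend on the response, the sketch below evaluates them for a single component in base R. It assumes the relations alpha_k^x = alpha_k + L/2 and gamma_k^x = 1 + delta(y, c_k, Gamma_k)/2 from reference [2], with delta taken as the squared Mahalanobis distance. The function name conditional_tail_params is hypothetical and is not part of xLLiM.

```r
# Sketch: tail parameters of the conditional Student density for one component k.
# Assumptions: alpha_x = alpha_k + L/2 and gamma_x = 1 + delta/2, where delta is
# the squared Mahalanobis distance between y and the component mean c_k.
conditional_tail_params <- function(y, c_k, Gamma_k, alpha_k) {
  L <- length(y)
  # stats::mahalanobis returns the squared distance (y - c_k)' Gamma_k^{-1} (y - c_k)
  delta <- stats::mahalanobis(y, center = c_k, cov = Gamma_k)
  list(alpha_x = alpha_k + L / 2, gamma_x = 1 + delta / 2)
}

# A response at squared distance 5 from the component mean
res <- conditional_tail_params(y = c(1, 2), c_k = c(0, 0),
                               Gamma_k = diag(2), alpha_k = 3)
res$alpha_x  # 4
res$gamma_x  # 3.5
```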
Like gllim, sllim allows the addition of latent variables in order to account for correlation among covariates or when responses are assumed to be only partially observed. Adding latent factors is known to improve prediction accuracy, provided Lw is not too large relative to the number of covariates. When latent factors are added, the dimension of the response is L=Lt+Lw, and L=Lt otherwise.
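For SLLiM, the number of parameters to estimate is:
$(K-1) + K(DL + D + L_t + nbpar_{\Sigma} + nbpar_{\Gamma} + 1)$
where $L = L_w + L_t$ and $nbpar_{\Sigma}$ (resp. $nbpar_{\Gamma}$) is the number of parameters in each of the large (resp. small) covariance matrices $\Sigma_k$ (resp. $\Gamma_k$). For example,
if the constraint on $\Sigma$ is cstr$Sigma="i", then $nbpar_{\Sigma} = 1$, which is the default constraint in the gllim function
if the constraint on $\Sigma$ is cstr$Sigma="d", then $nbpar_{\Sigma} = D$
if the constraint on $\Sigma$ is cstr$Sigma="", then $nbpar_{\Sigma} = D(D+1)/2$
if the constraint on $\Sigma$ is cstr$Sigma="*", then $nbpar_{\Sigma} = D(D+1)/(2K)$.
The rule to compute the number of parameters of $\Gamma$ is the same as for $\Sigma$, replacing D by $L_t$. Currently the $\Gamma_k$ matrices are not constrained and $nbpar_{\Gamma} = L_t(L_t+1)/2$ because, for identifiability reasons, the $L_w$ part is set to the identity matrix.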
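The parameter count can be sketched as a small base-R helper. This is an illustration only: it assumes the count is (K-1) + K(DL + D + Lt + nbpar_Sigma + nbpar_Gamma + 1), with the per-constraint values of nbpar_Sigma being 1, D, D(D+1)/2 and D(D+1)/(2K) for "i", "d", "" and "*" respectively. The function name sllim_nbpar_sketch is hypothetical; it is not part of xLLiM, and the nbpar value returned by sllim remains the reference.

```r
# Sketch: number of free parameters of a SLLiM model under the assumed formula
# (K-1) + K * (D*L + D + Lt + nbpar_Sigma + nbpar_Gamma + 1).
sllim_nbpar_sketch <- function(K, D, Lt, Lw = 0, sigma_cstr = "i") {
  L <- Lt + Lw
  if (sigma_cstr == "i") {
    nbpar_sigma <- 1                      # isotropic: one variance per component
  } else if (sigma_cstr == "d") {
    nbpar_sigma <- D                      # diagonal: D variances per component
  } else if (sigma_cstr == "") {
    nbpar_sigma <- D * (D + 1) / 2        # full covariance per component
  } else if (sigma_cstr == "*") {
    nbpar_sigma <- D * (D + 1) / (2 * K)  # full covariance shared by all components
  } else {
    stop("unknown constraint: ", sigma_cstr)
  }
  nbpar_gamma <- Lt * (Lt + 1) / 2        # unconstrained Gamma_k on the observed part
  (K - 1) + K * (D * L + D + Lt + nbpar_sigma + nbpar_gamma + 1)
}

# Dimensions of the Examples section: K = 5, D = 50, Lt = 2, no latent factor
sllim_nbpar_sketch(K = 5, D = 50, Lt = 2)                    # 789
sllim_nbpar_sketch(K = 5, D = 50, Lt = 2, sigma_cstr = "d")  # 1034
```

The comparison shows how quickly the count grows when the constraint on the large covariance matrices is relaxed, which is why the isotropic constraint is a reasonable starting point when D is large.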
The user must choose the number of mixture components $K$ and, if needed, the number of latent factors $L_w$. For small datasets (fewer than 100 observations), we suggest selecting both $(K, L_w)$ by minimizing the BIC criterion. For larger datasets, to save computation time, we suggest setting $L_w$ using BIC while setting $K$ to an arbitrary value large enough to capture non-linear relations between responses and covariates and small enough to have several observations (at least 10) in each cluster. Indeed, for large datasets, the number of clusters should not have a strong impact on the results as long as it is sufficiently large.
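The BIC-based selection can be sketched in base R as follows. The sketch assumes the usual definition BIC = -2 * log-likelihood + nbpar * log(N). In practice the log-likelihood and the parameter count would come from the LLf and nbpar elements returned by sllim for each candidate fit. The helper select_by_bic is hypothetical, and the numbers in the usage lines are made up purely for illustration.

```r
# Sketch: pick the candidate model with the smallest BIC.
# Assumption: BIC = -2 * loglik + nbpar * log(N), N being the number of observations.
select_by_bic <- function(candidates, loglik, nbpar, N) {
  bic <- -2 * loglik + nbpar * log(N)
  list(best = candidates[which.min(bic)], bic = bic)
}

# Illustrative (made-up) values for three candidate numbers of components
sel <- select_by_bic(candidates = c(2, 3, 5),
                     loglik = c(-500, -480, -478),
                     nbpar = c(100, 150, 250),
                     N = 100)
sel$best  # 2: the small gain in log-likelihood does not offset the extra parameters
```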
Value
Returns a list with the following elements:
LLf |
Final log-likelihood |
LL |
Log-likelihood value at each iteration of the EM algorithm |
theta |
A list containing the estimations of parameters as follows: |
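c |
An L x K matrix of means of the responses (Y), where L=Lt+Lw |
Gamma |
An L x L x K array of K covariance matrices of the responses (Y), where L=Lt+Lw |
A |
A D x L x K array of K matrices of regression coefficients, where L=Lt+Lw |
b |
A D x K matrix with the intercept vectors in columns |
Sigma |
A D x D x K array of covariance matrices of X |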
nbpar |
The number of parameters estimated in the model |
phi |
A list containing the estimations of parameters as follows: |
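r |
An N x K matrix of posterior probabilities |
pi |
A vector of length K of mixture weights, i.e. the prior probabilities of the components |
alpha |
A vector of length K of tail parameters (heaviness of the tail) for each Student component |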
Author(s)
Emeline Perthame (emeline.perthame@inria.fr), Florence Forbes (florence.forbes@inria.fr), Antoine Deleforge (antoine.deleforge@inria.fr)
References
[1] A. Deleforge, F. Forbes, and R. Horaud. High-dimensional regression with Gaussian mixtures and partially-latent response variables. Statistics and Computing, 25(5):893–911, 2015.
[2] E. Perthame, F. Forbes, and A. Deleforge. Inverse regression approach to robust nonlinear high-to-low dimensional mapping. Journal of Multivariate Analysis, 163(C):1–14, 2018. https://doi.org/10.1016/j.jmva.2017.09.009
See Also
xLLiM-package, emgm, sllim_inverse_map, gllim
Examples
library(xLLiM)
data(data.xllim)
responses = data.xllim[1:2,] # 2 responses in rows and 100 observations in columns
covariates = data.xllim[3:52,] # 50 covariates in rows and 100 observations in columns
## Setting 5 components in the model
K = 5
## the model can be initialized by running an EM algorithm for Gaussian Mixtures (EMGM)
r = emgm(rbind(responses, covariates), init=K)
## and then the sllim model is estimated
mod = sllim(responses,covariates,in_K=K,in_r=r)
## if initialization is not specified, the model is automatically initialized by EMGM
## mod = sllim(responses,covariates,in_K=K)
## Adding 1 latent factor
## mod = sllim(responses,covariates,in_K=K,in_r=r,Lw=1)
## Some constraints on the covariance structure of X can be added
## mod = sllim(responses,covariates,in_K=K,in_r=r,cstr=list(Sigma="i"))
# Isotropic covariance matrices
# (same variance among covariates but different in each component)
## mod = sllim(responses,covariates,in_K=K,in_r=r,cstr=list(Sigma="d"))
# Heteroskedastic covariance matrices
# (variances are different among covariates and in each component)
## mod = sllim(responses,covariates,in_K=K,in_r=r,cstr=list(Sigma=""))
# Unconstrained full covariance matrices
## mod = sllim(responses,covariates,in_K=K,in_r=r,cstr=list(Sigma="*"))
# Full covariance matrices but equal for all components