datacggm {cglasso} | R Documentation |
Create a Dataset from a Conditional Gaussian Graphical Model with Censored and/or Missing Values
Description
The ‘datacggm’ function is used to create a dataset from a conditional Gaussian graphical model with censored and/or missing values.
Usage
datacggm(Y, lo = -Inf, up = +Inf, X = NULL, control = list(maxit = 1.0E+4,
thr = 1.0E-4))
Arguments
Y |
a |
lo |
the lower censoring vector; |
up |
the upper censoring vector; |
X |
an optional |
control |
a named list used to pass the arguments to the EM algorithm (see below for more details). The components are:
|
Details
The function ‘datacggm’ returns an R object of class ‘datacggm’, that is, a named list containing the elements needed to fit a conditional graphical lasso (cglasso) model to datasets with censored and/or missing values.
A set of method functions is provided to describe data with censored and/or missing values. For example, the method function ‘print.datacggm’ prints the left- and right-censored values using the following rules: a right-censored value is labeled by appending the symbol ‘+’ to the value, whereas the symbol ‘-’ is appended to left-censored values (see the examples below). Summary statistics can be obtained using the method function ‘summary.datacggm’. The matrices ‘Y’ and ‘X’ are extracted from a ‘datacggm’ object using the function ‘getMatrix’.
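The labeling rule for censored values can be illustrated with a minimal base R sketch. The helper ‘label_censored’ and its ‘digits’ argument are hypothetical and are not part of cglasso; the actual layout produced by ‘print.datacggm’ may differ. The sketch assumes that values at or beyond a censoring bound count as censored.

```r
# Hypothetical helper (not part of cglasso): append "+" to right-censored
# values and "-" to left-censored values; missing values are shown as "NA".
label_censored <- function(y, lo = -Inf, up = +Inf, digits = 2L) {
  out <- formatC(y, format = "f", digits = digits)
  is_na <- is.na(y)
  right <- !is_na & y >= up   # assumption: values at or above 'up' are right-censored
  left  <- !is_na & y <= lo   # assumption: values at or below 'lo' are left-censored
  out[right] <- paste0(out[right], "+")
  out[left]  <- paste0(out[left], "-")
  out[is_na] <- "NA"
  out
}

labels <- label_censored(c(0.25, 1, -1, NA), lo = -1, up = 1)
print(labels, quote = FALSE)
```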
For each column of the matrix ‘Y’, the mean and variance are estimated using a standard EM algorithm based on the assumption of a Gaussian distribution. ‘maxit’ and ‘thr’ set the maximum number of iterations and the convergence threshold, respectively. Marginal means and variances can be extracted using the accessor functions ‘ColMeans’ and ‘ColVars’, respectively. Furthermore, the plotting functions ‘hist.datacggm’ and ‘qqcnorm’ can be used to inspect the marginal distribution of each column of the matrix ‘Y’.
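To make the role of ‘maxit’ and ‘thr’ concrete, the following base R sketch runs a textbook EM algorithm for the mean and variance of a single censored Gaussian column. It is not the cglasso implementation: the function name ‘em_censored_normal’, the starting values, the stopping rule (largest absolute change in the two estimates) and the choice to drop missing entries are all assumptions made for illustration.

```r
# Sketch only (not the cglasso code): EM for the marginal mean and variance
# of one column whose values are censored at 'lo' and/or 'up'.
em_censored_normal <- function(y, lo = -Inf, up = +Inf,
                               maxit = 1.0E+4, thr = 1.0E-4) {
  y <- y[!is.na(y)]            # assumption: missing entries are simply dropped
  left  <- y <= lo             # left-censored entries
  right <- y >= up             # right-censored entries
  mu <- mean(y)                # starting values: naive moments
  s2 <- var(y)
  for (it in seq_len(maxit)) {
    s <- sqrt(s2)
    ey  <- y                   # E[y | data], equal to y when fully observed
    ey2 <- y^2                 # E[y^2 | data]
    if (any(left)) {           # E-step: moments of a normal truncated to y <= lo
      b   <- (lo - mu) / s
      lam <- dnorm(b) / pnorm(b)
      m   <- mu - s * lam
      v   <- s2 * (1 - b * lam - lam^2)
      ey[left]  <- m
      ey2[left] <- v + m^2
    }
    if (any(right)) {          # E-step: moments of a normal truncated to y >= up
      a   <- (up - mu) / s
      lam <- dnorm(a) / pnorm(a, lower.tail = FALSE)
      m   <- mu + s * lam
      v   <- s2 * (1 + a * lam - lam^2)
      ey[right]  <- m
      ey2[right] <- v + m^2
    }
    mu_new <- mean(ey)         # M-step: update mean and variance
    s2_new <- mean(ey2) - mu_new^2
    delta  <- max(abs(mu_new - mu), abs(s2_new - s2))
    mu <- mu_new
    s2 <- s2_new
    if (delta < thr) break     # assumed convergence rule
  }
  list(mean = mu, var = s2, iterations = it)
}

set.seed(123)
y <- pmin(rnorm(5000), 1)      # standard normal sample, right-censored at 1
est <- em_censored_normal(y, up = 1)
```

In this sketch the naive sample mean of ‘y’ is biased downwards by the censoring, whereas the EM estimates should lie close to the true values 0 and 1.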
The status indicator matrix, denoted by ‘R’, can be extracted using the function ‘event’. The entries of this matrix specify the status of each observation using the following codes:
- ‘R[i, j] = 0’ means that y_{ij} is inside the open interval (lo[j], up[j]);
- ‘R[i, j] = -1’ means that y_{ij} is a left-censored value;
- ‘R[i, j] = +1’ means that y_{ij} is a right-censored value;
- ‘R[i, j] = +9’ means that y_{ij} is a missing value.
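The coding scheme above can be reproduced with a short base R sketch. The function ‘status_matrix’ is hypothetical and only mimics the codes listed here; it is not how ‘event’ is implemented, and it assumes that scalar bounds are recycled across the columns of ‘Y’.

```r
# Hypothetical helper (not part of cglasso): build a status matrix with
# codes 0 (observed), -1 (left-censored), +1 (right-censored), 9 (missing).
status_matrix <- function(Y, lo = -Inf, up = +Inf) {
  n <- nrow(Y)
  p <- ncol(Y)
  # assumption: a scalar bound applies to every column
  lo_mat <- matrix(rep(lo, length.out = p), n, p, byrow = TRUE)
  up_mat <- matrix(rep(up, length.out = p), n, p, byrow = TRUE)
  R <- matrix(0L, n, p, dimnames = dimnames(Y))
  R[!is.na(Y) & Y <= lo_mat] <- -1L
  R[!is.na(Y) & Y >= up_mat] <- 1L
  R[is.na(Y)] <- 9L
  R
}

Y <- matrix(c(0.2, -1, 1, NA, 0.5, -0.3), nrow = 3)
R <- status_matrix(Y, lo = -1, up = 1)
R
```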
See below for the other functions related to an object of class ‘datacggm’.
Value
‘datacggm’ returns an R object of S3 class “datacggm”, that is, a nested named list containing the following components:
Y |
the |
X |
the |
Info |
|
Author(s)
Luigi Augugliaro (luigi.augugliaro@unipa.it)
References
Augugliaro, L., Sottile, G., Wit, E.C., and Vinciotti, V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020a) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020b) <doi:10.1093/biostatistics/kxy043>. \ell_1-Penalized censored Gaussian graphical model. Biostatistics 21, e1–e16.
See Also
Related to the R objects of class “datacggm” there are the accessor functions rowNames, colNames, getMatrix, ColMeans, ColVars, upper, lower, event, and qqcnorm, and the method functions is.datacggm, dim.datacggm, summary.datacggm, and hist.datacggm. The function rcggm can be used to simulate a dataset from a conditional Gaussian graphical model, whereas cglasso is the model-fitting function for the l1-penalized censored Gaussian graphical model.
Examples
set.seed(123)
# a dataset from a right-censored Gaussian graphical model
n <- 100L
p <- 3L
Y <- matrix(rnorm(n * p), n, p)
up <- 1
Y[Y >= up] <- up
Z <- datacggm(Y = Y, up = up)
Z
# a dataset from a conditional censored Gaussian graphical model
n <- 100L
p <- 3L
q <- 2L
Y <- matrix(rnorm(n * p), n, p)
up <- 1
lo <- -1
Y[Y >= up] <- up
Y[Y <= lo] <- lo
X <- matrix(rnorm(n * q), n, q)
Z <- datacggm(Y = Y, lo = lo, up = up, X = X)
Z
# a dataset from a conditional censored Gaussian graphical model
# and with missing-at-random values
n <- 100L
p <- 3L
q <- 2L
Y <- matrix(rnorm(n * p), n, p)
NA.id <- matrix(rbinom(n * p, 1L, 0.01), n, p)
Y[NA.id == 1L] <- NA
up <- 1
lo <- -1
Y[Y >= up] <- up
Y[Y <= lo] <- lo
X <- matrix(rnorm(n * q), n, q)
Z <- datacggm(Y = Y, lo = lo, up = up, X = X)
Z