parametric_dsmm {dsmmR} | R Documentation |
Parametric Drifting semi-Markov model specification
Description
Creates a parametric model specification for a drifting
semi-Markov model. Returns an object of class
(dsmm_parametric, dsmm).
Usage
parametric_dsmm(
model_size,
states,
initial_dist,
degree,
f_is_drifting,
p_is_drifting,
p_dist,
f_dist,
f_dist_pars
)
Arguments
model_size |
Positive integer that represents the size of
the drifting semi-Markov model |
states |
Character vector that represents the state space |
initial_dist |
Numerical vector that represents the initial distribution over the state space |
degree |
Positive integer that represents the polynomial degree |
f_is_drifting |
Logical. Specifies if f is drifting or not |
p_is_drifting |
Logical. Specifies if p is drifting or not |
p_dist |
Numerical array that represents the probabilities of the
transition matrix p of the embedded Markov chain |
f_dist |
Character array that represents the discrete sojourn time
distribution f |
f_dist_pars |
Numerical array that represents the parameters of the
sojourn time distributions given in f_dist |
Details
Defined Arguments
For the parametric case, we explicitly define:
- The transition matrix of the embedded Markov chain (J_{t})_{t\in \{0,\dots,n\}}, given in the attribute p_dist:
  - If p is not drifting, it contains the values p(u,v), \forall u, v \in E, given in an array with dimensions of s \times s, where the first dimension corresponds to the previous state u and the second dimension corresponds to the current state v.
  - If p is drifting, for i \in \{ 0,\dots,d \}, it contains the values p_{\frac{i}{d}}(u,v), \forall u, v \in E, given in an array with dimensions of s \times s \times (d + 1), where the first and second dimensions are defined as in the non-drifting case, and the third dimension corresponds to the d+1 different matrices p_{\frac{i}{d}}.
- The conditional sojourn time distribution, given in the attribute f_dist:
  - If f is not drifting, it contains the discrete distribution names (as characters or NA), given in an array with dimensions of s \times s, where the first dimension corresponds to the previous state u and the second dimension corresponds to the current state v.
  - If f is drifting, it contains the discrete distribution names (as characters or NA), given in an array with dimensions of s \times s \times (d + 1), where the first and second dimensions are defined as in the non-drifting case, and the third dimension corresponds to the d+1 different arrays f_{\frac{i}{d}}.
- The conditional sojourn time distribution parameters, given in the attribute f_dist_pars:
  - If f is not drifting, it contains the numerical values (or NA) of the parameters of the corresponding distributions defined in f_dist, given in an array with dimensions of s \times s \times 2, where the first dimension corresponds to the previous state u, the second dimension corresponds to the current state v, and the third dimension corresponds to the first and second parameter of each distribution.
  - If f is drifting, it contains the numerical values (or NA) of the parameters of the corresponding distributions defined in f_dist, given in an array with dimensions of s \times s \times 2 \times (d + 1), where the first three dimensions are defined as in the non-drifting case, and the fourth dimension corresponds to the d+1 different arrays f_{\frac{i}{d}}.
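The array layouts described above can be sketched in base R. This is only an illustration under assumed values: the three-state space, the matrices p_0 and p_1, and all object names are hypothetical and not part of dsmmR.

```r
# Hypothetical toy setup: three states and polynomial degree d = 1.
states <- c("a", "b", "c")
s <- length(states)
d <- 1

# Two s x s transition matrices (rows: previous state u,
# columns: current state v), one for each i in {0, ..., d}.
p_0 <- matrix(c(0,   0.4, 0.6,
                0.5, 0,   0.5,
                0.7, 0.3, 0),
              nrow = s, byrow = TRUE)
p_1 <- matrix(c(0,   0.8, 0.2,
                0.1, 0,   0.9,
                0.5, 0.5, 0),
              nrow = s, byrow = TRUE)

# Drifting case: stack the d + 1 matrices along the third dimension.
p_dist <- array(c(p_0, p_1), dim = c(s, s, d + 1),
                dimnames = list(states, states, NULL))

# Sum over the current state v, for every previous state u and every i.
# Each entry of this s x (d + 1) matrix should be 1.
row_sums <- apply(p_dist, c(1, 3), sum)
rows_ok <- all(abs(row_sums - 1) < 1e-12)
```

Slicing with p_dist[, , i + 1] recovers the matrix p_{\frac{i}{d}}, which is a convenient way to inspect each matrix before passing the array to parametric_dsmm.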
Sojourn time distributions
The following distributions are available in this package for modeling the
conditional sojourn times of the drifting semi-Markov model. They are
selected through the argument f_dist:
- Uniform (n): f(x) = 1/n, for x = 1, 2, \dots, n, where n is a positive integer. This can be specified through the following:
  - f_dist = "unif"
  - f_dist_pars = (n, NA) (n as defined here).
- Geometric (p): f(x) = p (1-p)^{x-1}, for x = 1, 2, \dots, where p \in (0, 1) is the probability of success. This can be specified through the following:
  - f_dist = "geom"
  - f_dist_pars = (p, NA) (p as defined here).
- Poisson (\lambda): f(x) = \frac{\lambda^{x-1} \exp(-\lambda)}{(x-1)!}, for x = 1, 2, \dots, where \lambda > 0. This can be specified through the following:
  - f_dist = "pois"
  - f_dist_pars = (\lambda, NA)
- Negative binomial (\alpha, p): f(x) = \frac{\Gamma(x+\alpha-1)}{\Gamma(\alpha)(x-1)!} p^{\alpha}(1-p)^{x-1}, for x = 1, 2, \dots, where \Gamma is the Gamma function, \alpha \in (0, +\infty) is the parameter describing the target number of successful trials, or the dispersion parameter (the shape parameter of the gamma mixing distribution), and p is the probability of success, 0 < p < 1. This can be specified through the following:
  - f_dist = "nbinom"
  - f_dist_pars = (\alpha, p) (p as defined here)
- Discrete Weibull of type 1 (q, \beta): f(x) = q^{(x-1)^{\beta}} - q^{x^{\beta}}, for x = 1, 2, \dots, where q \in (0, 1) is the first parameter (a probability) and \beta \in (0, +\infty) is the second parameter. This can be specified through the following:
  - f_dist = "dweibull"
  - f_dist_pars = (q, \beta) (q as defined here)
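As a quick sanity check of the formulas above, the five probability mass functions can be written with base R functions. This is a sketch only: the function names are hypothetical, the parameter values are arbitrary, and nothing here is claimed to be how dsmmR evaluates the distributions internally. The shift by one appears because base R counts geometric, Poisson and negative binomial outcomes from 0, while the sojourn times here start at 1.

```r
# Sojourn time pmfs on x = 1, 2, ... (hypothetical helper names).
f_unif     <- function(x, n) ifelse(x >= 1 & x <= n, 1 / n, 0)
f_geom     <- function(x, p) dgeom(x - 1, prob = p)
f_pois     <- function(x, lambda) dpois(x - 1, lambda = lambda)
f_nbinom   <- function(x, alpha, p) dnbinom(x - 1, size = alpha, prob = p)
f_dweibull <- function(x, q, beta) q^((x - 1)^beta) - q^(x^beta)

# Each pmf should sum to (approximately) 1 over a long enough range.
x <- 1:500
sums <- c(unif     = sum(f_unif(x, n = 5)),
          geom     = sum(f_geom(x, p = 0.7)),
          pois     = sum(f_pois(x, lambda = 5)),
          nbinom   = sum(f_nbinom(x, alpha = 4, p = 0.6)),
          dweibull = sum(f_dweibull(x, q = 0.4, beta = 1.5)))

# Direct evaluation of the negative binomial formula at x = 3.
nbinom_direct <- gamma(3 + 4 - 1) / (gamma(4) * factorial(3 - 1)) *
    0.6^4 * (1 - 0.6)^(3 - 1)
```

The sums are close to 1 here because the chosen parameters give light tails. A discrete Weibull with a small \beta decays slowly, so a much longer range of x would be needed before its truncated sum approaches 1.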
Of these discrete distributions, only "dweibull" and "nbinom" require two
parameters. This is why the attribute f_dist_pars is an array of dimensions
s \times s \times 2 if f is not drifting, or
s \times s \times 2 \times (d+1) if f is drifting. For the distributions
with a single parameter, the second parameter is given as NA.
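A minimal sketch of this parameter layout, assuming a hypothetical three-state model with made-up parameter values, is given below. The consistency checks are illustrative and are not claimed to be the validation that parametric_dsmm performs.

```r
# Hypothetical three-state example; all names and values are illustrative.
s <- 3
d <- 1
f_dist <- matrix(c(NA,     "unif", "dweibull",
                   "geom", NA,     "nbinom",
                   "pois", "geom", NA),
                 nrow = s, byrow = TRUE)

# First parameter of every distribution.
pars_1 <- matrix(c(NA,  5,   0.4,
                   0.7, NA,  4,
                   3,   0.6, NA),
                 nrow = s, byrow = TRUE)
# Second parameter, only used by "dweibull" and "nbinom".
pars_2 <- matrix(c(NA, NA, 0.2,
                   NA, NA, 0.6,
                   NA, NA, NA),
                 nrow = s, byrow = TRUE)

# Non-drifting case: dimensions s x s x 2.
f_dist_pars <- array(c(pars_1, pars_2), dim = c(s, s, 2))

# A first parameter should exist exactly where a distribution is named,
# a second one exactly where the distribution has two parameters.
has_first    <- !is.na(f_dist_pars[, , 1])
has_second   <- !is.na(f_dist_pars[, , 2])
needs_first  <- !is.na(f_dist)
needs_second <- matrix(f_dist %in% c("dweibull", "nbinom"), nrow = s)
layout_ok <- identical(has_first, needs_first) &&
    identical(has_second, needs_second)

# Drifting case: dimensions s x s x 2 x (d + 1). The same parameters are
# reused for both values of i here, only to show the shape.
f_dist_pars_drift <- array(c(pars_1, pars_2, pars_1, pars_2),
                           dim = c(s, s, 2, d + 1))
```

Indexing f_dist_pars_drift[, , , i + 1] returns the s \times s \times 2 parameter array that belongs to f_{\frac{i}{d}}.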
Value
Returns an object of the S3 class (dsmm_parametric, dsmm).
It has the following attributes:
- dist: List. Contains 3 arrays, passed down from the arguments:
  - p_drift or p_notdrift, corresponding to whether the defined p transition matrix is drifting or not.
  - f_drift_parametric or f_notdrift_parametric, corresponding to whether the defined f sojourn time distribution is drifting or not.
  - f_drift_parameters or f_notdrift_parameters, which are the defined f sojourn time distribution parameters, depending on whether f is drifting or not.
- initial_dist: Numerical vector. Passed down from the arguments. It contains the initial distribution of the drifting semi-Markov model.
- states: Character vector. Passed down from the arguments. It contains the state space E.
- s: Positive integer. It contains the number of states in the state space, s = |E|, which is given in the attribute states.
- degree: Positive integer. Passed down from the arguments. It contains the polynomial degree d considered for the drifting of the model.
- model_size: Positive integer. Passed down from the arguments. It contains the size of the drifting semi-Markov model n, which represents the length of the embedded Markov chain (J_{t})_{t\in \{0,\dots,n\}}, without the last state.
- f_is_drifting: Logical. Passed down from the arguments. Specifies if f is drifting or not.
- p_is_drifting: Logical. Passed down from the arguments. Specifies if p is drifting or not.
- Model: Character. Possible values:
  - "Model_1": Both p and f are drifting.
  - "Model_2": p is drifting and f is not drifting.
  - "Model_3": f is drifting and p is not drifting.
- A_i: Numerical matrix. Represents the polynomials A_i(t) with degree d that are used for solving the system MJ = P. Used for the methods defined for the object. Not printed when viewing the object.
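The correspondence between the two logical flags and the Model attribute can be summarised with a small helper. The function model_label below is hypothetical and not part of dsmmR. The list above gives no label for the case where neither p nor f is drifting, so raising an error in that case is an assumption of this sketch.

```r
# Hypothetical helper: map the drifting flags to the documented labels.
model_label <- function(p_is_drifting, f_is_drifting) {
    # Both flags must be a single TRUE or FALSE.
    stopifnot(isTRUE(p_is_drifting) || isFALSE(p_is_drifting),
              isTRUE(f_is_drifting) || isFALSE(f_is_drifting))
    if (p_is_drifting && f_is_drifting) {
        "Model_1"
    } else if (p_is_drifting) {
        "Model_2"
    } else if (f_is_drifting) {
        "Model_3"
    } else {
        # Assumption: no label is listed when nothing drifts.
        stop("No label is listed for the case where neither p nor f drifts.")
    }
}

labels <- c(model_label(TRUE, TRUE),
            model_label(TRUE, FALSE),
            model_label(FALSE, TRUE))
```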
References
Barbu, V. S., Limnios, N. (2008). Semi-Markov Chains and Hidden Semi-Markov Models toward Applications: Their Use in Reliability and DNA Analysis. Lecture Notes in Statistics, vol. 191. New York: Springer.
Vergne, N. (2008). Drifting Markov models with Polynomial Drift and Applications to DNA Sequences. Statistical Applications in Genetics and Molecular Biology, 7(1).
Barbu, V. S., Vergne, N. (2019). Reliability and survival analysis for drifting Markov models: modeling and estimation. Methodology and Computing in Applied Probability, 21(4), 1407-1429.
Nakagawa, T., Osaki, S. (1975). The discrete Weibull distribution. IEEE Transactions on Reliability, R-24, 300-301.
See Also
Methods applied to this object: simulate.dsmm, get_kernel.
For the non-parametric drifting semi-Markov model specification: nonparametric_dsmm.
For the theoretical background of drifting semi-Markov models: dsmmR.
Examples
# We can also define states in a flexible way, including spaces.
states <- c("Dollar $", " /1'2'3/ ", " Z E T A ", "O_M_E_G_A")
s <- length(states)
d <- 1
# ===========================================================================
# Defining parametric drifting semi-Markov models.
# ===========================================================================
# ---------------------------------------------------------------------------
# Defining the drifting distributions for Model 1.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s, d + 1).
# Sums over v must be 1 for all u and i = 0, ..., d.
# First matrix.
p_dist_1 <- matrix(c(0,   0.1, 0.4, 0.5,
                     0.5, 0,   0.3, 0.2,
                     0.3, 0.4, 0,   0.3,
                     0.8, 0.1, 0.1, 0),
                   ncol = s, byrow = TRUE)
# Second matrix.
p_dist_2 <- matrix(c(0,   0.3, 0.6, 0.1,
                     0.3, 0,   0.4, 0.3,
                     0.5, 0.3, 0,   0.2,
                     0.2, 0.3, 0.5, 0),
                   ncol = s, byrow = TRUE)
# get `p_dist` as an array of p_dist_1 and p_dist_2.
p_dist_model_1 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s, d + 1).
# First matrix.
f_dist_1 <- matrix(c(NA,         "unif", "dweibull", "nbinom",
                     "geom",     NA,     "pois",     "dweibull",
                     "dweibull", "pois", NA,         "geom",
                     "pois",     NA,     "geom",     NA),
                   nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2 <- matrix(c(NA,     "pois", "geom", "nbinom",
                     "geom", NA,     "pois", "dweibull",
                     "unif", "geom", NA,     "geom",
                     "pois", "pois", "geom", NA),
                   nrow = s, ncol = s, byrow = TRUE)
# get `f_dist` as an array of `f_dist_1` and `f_dist_2`
f_dist_model_1 <- array(c(f_dist_1, f_dist_2), dim = c(s, s, d + 1))
# `f_dist_pars` has dimensions of: (s, s, 2, d + 1).
# First array of coefficients, corresponding to `f_dist_1`.
# First matrix.
f_dist_1_pars_1 <- matrix(c(NA,  5,  0.4, 4,
                            0.7, NA, 5,   0.6,
                            0.2, 3,  NA,  0.6,
                            4,   NA, 0.4, NA),
                          nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_1_pars_2 <- matrix(c(NA,  NA, 0.2, 0.6,
                            NA,  NA, NA,  0.8,
                            0.6, NA, NA,  NA,
                            NA,  NA, NA,  NA),
                          nrow = s, ncol = s, byrow = TRUE)
# Second array of coefficients, corresponding to `f_dist_2`.
# First matrix.
f_dist_2_pars_1 <- matrix(c(NA,  6,   0.4, 3,
                            0.7, NA,  2,   0.5,
                            3,   0.6, NA,  0.7,
                            6,   0.2, 0.7, NA),
                          nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2_pars_2 <- matrix(c(NA, NA, NA, 0.6,
                            NA, NA, NA, 0.8,
                            NA, NA, NA, NA,
                            NA, NA, NA, NA),
                          nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_1 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
                               f_dist_2_pars_1, f_dist_2_pars_2),
                             dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Parametric object for Model 1.
# ---------------------------------------------------------------------------
obj_par_model_1 <- parametric_dsmm(
    model_size = 10000,
    states = states,
    initial_dist = c(0.8, 0.1, 0.1, 0),
    degree = d,
    p_dist = p_dist_model_1,
    f_dist = f_dist_model_1,
    f_dist_pars = f_dist_pars_model_1,
    p_is_drifting = TRUE,
    f_is_drifting = TRUE
)
# p drifting array.
p_drift <- obj_par_model_1$dist$p_drift
p_drift
# f distribution.
f_dist_drift <- obj_par_model_1$dist$f_drift_parametric
f_dist_drift
# parameters for the f distribution.
f_dist_pars_drift <- obj_par_model_1$dist$f_drift_parameters
f_dist_pars_drift
# ---------------------------------------------------------------------------
# Defining Model 2 - p is drifting, f is not drifting.
# ---------------------------------------------------------------------------
# `p_dist` has the same dimensions as in Model 1: (s, s, d + 1).
p_dist_model_2 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s).
f_dist_model_2 <- matrix(c(NA,       "pois", NA,         "nbinom",
                           "geom",   NA,     "geom",     "dweibull",
                           "unif",   "geom", NA,         "geom",
                           "nbinom", "unif", "dweibull", NA),
                         nrow = s, ncol = s, byrow = TRUE)
# `f_dist_pars` has dimensions of: (s, s, 2),
# corresponding to `f_dist_model_2`.
# First matrix.
f_dist_pars_1_model_2 <- matrix(c(NA,  0.2, NA,  3,
                                  0.2, NA,  0.2, 0.5,
                                  3,   0.4, NA,  0.7,
                                  2,   3,   0.7, NA),
                                nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_pars_2_model_2 <- matrix(c(NA,  NA, NA,  0.6,
                                  NA,  NA, NA,  0.8,
                                  NA,  NA, NA,  NA,
                                  0.2, NA, 0.3, NA),
                                nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_2 <- array(c(f_dist_pars_1_model_2,
                               f_dist_pars_2_model_2),
                             dim = c(s, s, 2))
# ---------------------------------------------------------------------------
# Parametric object for Model 2.
# ---------------------------------------------------------------------------
obj_par_model_2 <- parametric_dsmm(
    model_size = 10000,
    states = states,
    initial_dist = c(0.8, 0.1, 0.1, 0),
    degree = d,
    p_dist = p_dist_model_2,
    f_dist = f_dist_model_2,
    f_dist_pars = f_dist_pars_model_2,
    p_is_drifting = TRUE,
    f_is_drifting = FALSE
)
# p drifting array.
p_drift <- obj_par_model_2$dist$p_drift
p_drift
# f distribution.
f_dist_notdrift <- obj_par_model_2$dist$f_notdrift_parametric
f_dist_notdrift
# parameters for the f distribution.
f_dist_pars_notdrift <- obj_par_model_2$dist$f_notdrift_parameters
f_dist_pars_notdrift
# ---------------------------------------------------------------------------
# Defining Model 3 - f is drifting, p is not drifting.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s).
p_dist_model_3 <- matrix(c(0,   0.1,  0.3,  0.6,
                           0.4, 0,    0.1,  0.5,
                           0.4, 0.3,  0,    0.3,
                           0.9, 0.01, 0.09, 0),
                         ncol = s, byrow = TRUE)
# `f_dist` has the same dimensions as in Model 1: (s, s, d + 1).
f_dist_model_3 <- array(c(f_dist_1, f_dist_2), dim = c(s, s, d + 1))
# `f_dist_pars` has the same dimensions as in Model 1: (s, s, 2, d + 1).
f_dist_pars_model_3 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
                               f_dist_2_pars_1, f_dist_2_pars_2),
                             dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Parametric object for Model 3.
# ---------------------------------------------------------------------------
obj_par_model_3 <- parametric_dsmm(
    model_size = 10000,
    states = states,
    initial_dist = c(0.3, 0.2, 0.2, 0.3),
    degree = d,
    p_dist = p_dist_model_3,
    f_dist = f_dist_model_3,
    f_dist_pars = f_dist_pars_model_3,
    p_is_drifting = FALSE,
    f_is_drifting = TRUE
)
# p non-drifting matrix.
p_notdrift <- obj_par_model_3$dist$p_notdrift
p_notdrift
# f distribution.
f_dist_drift <- obj_par_model_3$dist$f_drift_parametric
f_dist_drift
# parameters for the f distribution.
f_dist_pars_drift <- obj_par_model_3$dist$f_drift_parameters
f_dist_pars_drift
# ===========================================================================
# Parametric estimation using methods corresponding to an object
# which inherits from the class `dsmm_parametric`.
# ===========================================================================
### Comments
### 1. Using a larger `klim` and a larger `model_size` will increase the
###    accuracy of the model, at the cost of higher memory requirements
###    and longer computation time.
### 2. For the parametric estimation it is recommended to use a common set
###    of distributions while only the parameters are drifting. This results
###    in higher accuracy.
# ---------------------------------------------------------------------------
# Defining the distributions for Model 1 - both p and f are drifting.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s, d + 1).
# First matrix.
p_dist_1 <- matrix(c(0,   0.2, 0.4, 0.4,
                     0.5, 0,   0.3, 0.2,
                     0.3, 0.4, 0,   0.3,
                     0.5, 0.3, 0.2, 0),
                   ncol = s, byrow = TRUE)
# Second matrix.
p_dist_2 <- matrix(c(0,   0.3, 0.5, 0.2,
                     0.3, 0,   0.4, 0.3,
                     0.5, 0.3, 0,   0.2,
                     0.2, 0.4, 0.4, 0),
                   ncol = s, byrow = TRUE)
# get `p_dist` as an array of p_dist_1 and p_dist_2.
p_dist_model_1 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s, d + 1).
# We will use the same sojourn time distributions for every i = 0, ..., d.
f_dist_1 <- matrix(c(NA,         "unif",   "dweibull", "nbinom",
                     "geom",     NA,       "pois",     "dweibull",
                     "dweibull", "pois",   NA,         "geom",
                     "pois",     "nbinom", "geom",     NA),
                   nrow = s, ncol = s, byrow = TRUE)
# get `f_dist`; `f_dist_1` is recycled to fill all d + 1 slices.
f_dist_model_1 <- array(f_dist_1, dim = c(s, s, d + 1))
# `f_dist_pars` has dimensions of: (s, s, 2, d + 1).
# First array of parameters, corresponding to the first slice of `f_dist`.
# First matrix.
f_dist_1_pars_1 <- matrix(c(NA,  7,  0.4, 4,
                            0.7, NA, 5,   0.6,
                            0.2, 3,  NA,  0.6,
                            4,   4,  0.4, NA),
                          nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_1_pars_2 <- matrix(c(NA,  NA,  0.2, 0.6,
                            NA,  NA,  NA,  0.8,
                            0.6, NA,  NA,  NA,
                            NA,  0.3, NA,  NA),
                          nrow = s, ncol = s, byrow = TRUE)
# Second array of parameters, corresponding to the second slice of `f_dist`
# (the same distributions as in `f_dist_1`).
# First matrix.
f_dist_2_pars_1 <- matrix(c(NA,  6,  0.5, 3,
                            0.5, NA, 4,   0.5,
                            0.4, 5,  NA,  0.7,
                            6,   5,  0.7, NA),
                          nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2_pars_2 <- matrix(c(NA,  NA,  0.4, 0.5,
                            NA,  NA,  NA,  0.6,
                            0.5, NA,  NA,  NA,
                            NA,  0.4, NA,  NA),
                          nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_1 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
                               f_dist_2_pars_1, f_dist_2_pars_2),
                             dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Defining the parametric object for Model 1.
# ---------------------------------------------------------------------------
obj_par_model_1 <- parametric_dsmm(
    model_size = 4000,
    states = states,
    initial_dist = c(0.8, 0.1, 0.1, 0),
    degree = d,
    p_dist = p_dist_model_1,
    f_dist = f_dist_model_1,
    f_dist_pars = f_dist_pars_model_1,
    p_is_drifting = TRUE,
    f_is_drifting = TRUE
)
cat("The object has class of (",
    paste0(class(obj_par_model_1),
           collapse = ', '), ").")
# ---------------------------------------------------------------------------
# Generating a sequence from the parametric object.
# ---------------------------------------------------------------------------
# A larger klim will lead to an increase in accuracy.
klim <- 20
sim_seq <- simulate(obj_par_model_1, klim = klim, seed = 1)
# ---------------------------------------------------------------------------
# Fitting the generated sequence under the same distributions.
# ---------------------------------------------------------------------------
fit_par_model1 <- fit_dsmm(sequence = sim_seq,
                           states = states,
                           degree = d,
                           f_is_drifting = TRUE,
                           p_is_drifting = TRUE,
                           estimation = 'parametric',
                           f_dist = f_dist_model_1)
cat("The object has class of (",
    paste0(class(fit_par_model1),
           collapse = ', '), ").")
cat("\nThe estimated parameters are:\n")
fit_par_model1$dist$f_drift_parameters