parametric_dsmm {dsmmR} | R Documentation |
Parametric Drifting semi-Markov model specification
Description
Creates a parametric model specification for a drifting
semi-Markov model. Returns an object of class
(dsmm_parametric, dsmm)
.
Usage
parametric_dsmm(
model_size,
states,
initial_dist,
degree,
f_is_drifting,
p_is_drifting,
p_dist,
f_dist,
f_dist_pars
)
Arguments
model_size |
Positive integer that represents the size of
the drifting semi-Markov model |
states |
Character vector that represents the state space |
initial_dist |
Numerical vector of |
degree |
Positive integer that represents the polynomial degree |
f_is_drifting |
Logical. Specifies if |
p_is_drifting |
Logical. Specifies if |
p_dist |
Numerical array, that represents the probabilities of the
transition matrix
|
f_dist |
Character array, that represents the discrete sojourn time
distribution
|
f_dist_pars |
Numerical array, that represents the parameters of the
sojourn time distributions given in
|
Details
Defined Arguments
For the parametric case, we explicitly define:
The transition matrix of the embedded Markov chain
, given in the attribute
p_dist
:If
is not drifting, it contains the values:
given in an array with dimensions of
, where the first dimension corresponds to the previous state
and the second dimension corresponds to the current state
.
If
is drifting, for
, it contains the values:
given in an array with dimensions of
, where the first and second dimensions are defined as in the non-drifting case, and the third dimension corresponds to the
different matrices
The conditional sojourn time distribution, given in the attribute
f_dist
:If
is not drifting, it contains the discrete distribution names (as characters or
NA
), given in an array with dimensions of, where the first dimension corresponds to the previous state
, the second dimension corresponds to the current state
.
If
is drifting, it contains the discrete distribution names (as characters or
NA
) given in an array with dimensions of, where the first and second dimensions are defined as in the non-drifting case, and the third dimension corresponds to the
different arrays
The conditional sojourn time distribution parameters, given in the attribute
f_dist_pars
:If
is not drifting, it contains the numerical values (or
NA
) of the corresponding distributions defined inf_dist
, given in an array with dimensions of, where the first dimension corresponds to the previous state
, the second dimension corresponds to the current state
.
If
is drifting, it contains the numerical values (or
NA
) of the corresponding distributions defined inf_dist
, given in an array with dimensions of, where the first and second dimensions are defined as in the non-drifting case, and the third dimension corresponds to the
different arrays
Sojourn time distributions
In this package, the available distributions for the modeling of the
conditional sojourn times, of the drifting semi-Markov model, used through
the argument f_dist
, are the following:
Uniform
:
, for
, where
is a positive integer. This can be specified through the following:
-
f_dist = "unif"
-
f_dist_pars
= (,
NA
) (as defined here).
-
Geometric
:
, for
where
is the probability of success. This can be specified through the following:
-
f_dist
="geom"
-
f_dist_pars
= (,
NA
) (as defined here).
-
Poisson
:
, for
where
. This can be specified through the following:
-
f_dist
="pois"
-
f_dist_pars
= (,
NA
)
-
Negative binomial
:
, for
where
is the Gamma function,
is the parameter describing the target for number of successful trials, or the dispersion parameter (the shape parameter of the gamma mixing distribution).
is the probability of success,
.
-
f_dist
="nbinom"
-
f_dist_pars
= () (
as defined here)
-
Discrete Weibull of type 1
:
, for
with
is the first parameter (probability) and
is the second parameter. This can be specified through the following:
-
f_dist
="dweibull"
-
f_dist_pars
= () (
as defined here)
-
From these discrete distributions, by using "dweibull", "nbinom"
we require two parameters. It's for this reason that the attribute
f_dist_pars
is an array of dimensions
if
is not drifting or
if
is drifting.
Value
Returns an object of the S3 class dsmm_parametric, dsmm
.
It has the following attributes:
-
dist
: List. Contains 3 arrays, passing down from the arguments:-
p_drift
orp_notdrift
, corresponding to whether the definedtransition matrix is drifting or not.
-
f_drift_parametric
orf_notdrift_parametric
, corresponding to whether the definedsojourn time distribution is drifting or not.
-
f_drift_parameters
orf_notdrift_parameters
, which are the definedsojourn time distribution parameters, depending on whether
is drifting or not.
-
-
initial_dist
: Numerical vector. Passing down from the arguments. It contains the initial distribution of the drifting semi-Markov model. -
states
: Character vector. Passing down from the arguments. It contains the state space.
-
s
: Positive integer. It contains the number of states in the state space,, which is given in the attribute
states
. -
degree
: Positive integer. Passing down from the arguments. It contains the polynomial degreeconsidered for the drifting of the model.
-
model_size
: Positive integer. Passing down from the arguments. It contains the size of the drifting semi-Markov model, which represents the length of the embedded Markov chain
, without the last state.
-
f_is_drifting
: Logical. Passing down from the arguments. Specifies ifis drifting or not.
-
p_is_drifting
: Logical. Passing down from the arguments. Specifies ifis drifting or not.
-
Model
: Character. Possible values:-
"Model_1"
: Bothand
are drifting.
-
"Model_2"
:is drifting and
is not drifting.
-
"Model_3"
:is drifting and
is not drifting.
-
-
A_i
: Numerical matrix. Represents the polynomialswith degree
that are used for solving the system
. Used for the methods defined for the object. Not printed when viewing the object.
References
V. S. Barbu, N. Limnios. (2008). semi-Markov Chains and Hidden semi-Markov Models Toward Applications - Their Use in Reliability and DNA Analysis. New York: Lecture Notes in Statistics, vol. 191, Springer.
Vergne, N. (2008). Drifting Markov models with Polynomial Drift and Applications to DNA Sequences. Statistical Applications in Genetics Molecular Biology 7 (1).
Barbu V. S., Vergne, N. (2019). Reliability and survival analysis for drifting Markov models: modeling and estimation. Methodology and Computing in Applied Probability, 21(4), 1407-1429.
T. Nakagawa and S. Osaki. (1975). The discrete Weibull distribution. IEEE Transactions on Reliability, R-24, 300-301.
See Also
Methods applied to this object: simulate.dsmm, get_kernel.
For the non-parametric drifting semi-Markov model specification: nonparametric_dsmm.
For the theoretical background of drifting semi-Markov models: dsmmR.
Examples
# We can also define states in a flexible way, including spaces.
states <- c("Dollar $", " /1'2'3/ ", " Z E T A ", "O_M_E_G_A")
s <- length(states)
d <- 1
# ===========================================================================
# Defining parametric drifting semi-Markov models.
# ===========================================================================
# ---------------------------------------------------------------------------
# Defining the drifting distributions for Model 1.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s, d + 1).
# Sums over v must be 1 for all u and i = 0, ..., d.
# First matrix.
p_dist_1 <- matrix(c(0, 0.1, 0.4, 0.5,
0.5, 0, 0.3, 0.2,
0.3, 0.4, 0, 0.3,
0.8, 0.1, 0.1, 0),
ncol = s, byrow = TRUE)
# Second matrix.
p_dist_2 <- matrix(c(0, 0.3, 0.6, 0.1,
0.3, 0, 0.4, 0.3,
0.5, 0.3, 0, 0.2,
0.2, 0.3, 0.5, 0),
ncol = s, byrow = TRUE)
# get `p_dist` as an array of p_dist_1 and p_dist_2.
p_dist_model_1 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s, d + 1).
# First matrix.
f_dist_1 <- matrix(c(NA, "unif", "dweibull", "nbinom",
"geom", NA, "pois", "dweibull",
"dweibull", "pois", NA, "geom",
"pois", NA, "geom", NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2 <- matrix(c(NA, "pois", "geom", "nbinom",
"geom", NA, "pois", "dweibull",
"unif", "geom", NA, "geom",
"pois", "pois", "geom", NA),
nrow = s, ncol = s, byrow = TRUE)
# get `f_dist` as an array of `f_dist_1` and `f_dist_2`
f_dist_model_1 <- array(c(f_dist_1, f_dist_2), dim = c(s, s, d + 1))
# `f_dist_pars` has dimensions of: (s, s, 2, d + 1).
# First array of coefficients, corresponding to `f_dist_1`.
# First matrix.
f_dist_1_pars_1 <- matrix(c(NA, 5, 0.4, 4,
0.7, NA, 5, 0.6,
0.2, 3, NA, 0.6,
4, NA, 0.4, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_1_pars_2 <- matrix(c(NA, NA, 0.2, 0.6,
NA, NA, NA, 0.8,
0.6, NA, NA, NA,
NA, NA, NA, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second array of coefficients, corresponding to `f_dist_2`.
# First matrix.
f_dist_2_pars_1 <- matrix(c(NA, 6, 0.4, 3,
0.7, NA, 2, 0.5,
3, 0.6, NA, 0.7,
6, 0.2, 0.7, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2_pars_2 <- matrix(c(NA, NA, NA, 0.6,
NA, NA, NA, 0.8,
NA, NA, NA, NA,
NA, NA, NA, NA),
nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_1 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
f_dist_2_pars_1, f_dist_2_pars_2),
dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Parametric object for Model 1.
# ---------------------------------------------------------------------------
obj_par_model_1 <- parametric_dsmm(
model_size = 10000,
states = states,
initial_dist = c(0.8, 0.1, 0.1, 0),
degree = d,
p_dist = p_dist_model_1,
f_dist = f_dist_model_1,
f_dist_pars = f_dist_pars_model_1,
p_is_drifting = TRUE,
f_is_drifting = TRUE
)
# p drifting array.
p_drift <- obj_par_model_1$dist$p_drift
p_drift
# f distribution.
f_dist_drift <- obj_par_model_1$dist$f_drift_parametric
f_dist_drift
# parameters for the f distribution.
f_dist_pars_drift <- obj_par_model_1$dist$f_drift_parameters
f_dist_pars_drift
# ---------------------------------------------------------------------------
# Defining Model 2 - p is drifting, f is not drifting.
# ---------------------------------------------------------------------------
# `p_dist` has the same dimensions as in Model 1: (s, s, d + 1).
p_dist_model_2 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s).
f_dist_model_2 <- matrix(c( NA, "pois", NA, "nbinom",
"geom", NA, "geom", "dweibull",
"unif", "geom", NA, "geom",
"nbinom", "unif", "dweibull", NA),
nrow = s, ncol = s, byrow = TRUE)
# `f_dist_pars` has dimensions of: (s, s, 2),
# corresponding to `f_dist_model_2`.
# First matrix.
f_dist_pars_1_model_2 <- matrix(c(NA, 0.2, NA, 3,
0.2, NA, 0.2, 0.5,
3, 0.4, NA, 0.7,
2, 3, 0.7, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_pars_2_model_2 <- matrix(c(NA, NA, NA, 0.6,
NA, NA, NA, 0.8,
NA, NA, NA, NA,
0.2, NA, 0.3, NA),
nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_2 <- array(c(f_dist_pars_1_model_2,
f_dist_pars_2_model_2),
dim = c(s, s, 2))
# ---------------------------------------------------------------------------
# Parametric object for Model 2.
# ---------------------------------------------------------------------------
obj_par_model_2 <- parametric_dsmm(
model_size = 10000,
states = states,
initial_dist = c(0.8, 0.1, 0.1, 0),
degree = d,
p_dist = p_dist_model_2,
f_dist = f_dist_model_2,
f_dist_pars = f_dist_pars_model_2,
p_is_drifting = TRUE,
f_is_drifting = FALSE
)
# p drifting array.
p_drift <- obj_par_model_2$dist$p_drift
p_drift
# f distribution.
f_dist_notdrift <- obj_par_model_2$dist$f_notdrift_parametric
f_dist_notdrift
# parameters for the f distribution.
f_dist_pars_notdrift <- obj_par_model_2$dist$f_notdrift_parameters
f_dist_pars_notdrift
# ---------------------------------------------------------------------------
# Defining Model 3 - f is drifting, p is not drifting.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s).
p_dist_model_3 <- matrix(c(0, 0.1, 0.3, 0.6,
0.4, 0, 0.1, 0.5,
0.4, 0.3, 0, 0.3,
0.9, 0.01, 0.09, 0),
ncol = s, byrow = TRUE)
# `f_dist` has the same dimensions as in Model 1: (s, s, d + 1).
f_dist_model_3 <- array(c(f_dist_1, f_dist_2), dim = c(s, s, d + 1))
# `f_dist_pars` has the same dimensions as in Model 1: (s, s, 2, d + 1).
f_dist_pars_model_3 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
f_dist_2_pars_1, f_dist_2_pars_2),
dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Parametric object for Model 3.
# ---------------------------------------------------------------------------
obj_par_model_3 <- parametric_dsmm(
model_size = 10000,
states = states,
initial_dist = c(0.3, 0.2, 0.2, 0.3),
degree = d,
p_dist = p_dist_model_3,
f_dist = f_dist_model_3,
f_dist_pars = f_dist_pars_model_3,
p_is_drifting = FALSE,
f_is_drifting = TRUE
)
# p drifting array.
p_notdrift <- obj_par_model_3$dist$p_notdrift
p_notdrift
# f distribution.
f_dist_drift <- obj_par_model_3$dist$f_drift_parametric
f_dist_drift
# parameters for the f distribution.
f_dist_pars_drift <- obj_par_model_3$dist$f_drift_parameters
f_dist_pars_drift
# ===========================================================================
# Parametric estimation using methods corresponding to an object
# which inherits from the class `dsmm_parametric`.
# ===========================================================================
### Comments
### 1. Using a larger `klim` and a larger `model_size` will increase the
### accuracy of the model, with the need of larger memory requirements
### and computational cost.
### 2. For the parametric estimation it is recommended to use a common set
### of distributions while only the parameters are drifting. This results
### in higher accuracy.
# ---------------------------------------------------------------------------
# Defining the distributions for Model 1 - both p and f are drifting.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s, d + 1).
# First matrix.
p_dist_1 <- matrix(c(0, 0.2, 0.4, 0.4,
0.5, 0, 0.3, 0.2,
0.3, 0.4, 0, 0.3,
0.5, 0.3, 0.2, 0),
ncol = s, byrow = TRUE)
# Second matrix.
p_dist_2 <- matrix(c(0, 0.3, 0.5, 0.2,
0.3, 0, 0.4, 0.3,
0.5, 0.3, 0, 0.2,
0.2, 0.4, 0.4, 0),
ncol = s, byrow = TRUE)
# get `p_dist` as an array of p_dist_1 and p_dist_2.
p_dist_model_1 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s, d + 1).
# We will use the same sojourn time distributions.
f_dist_1 <- matrix(c( NA, "unif", "dweibull", "nbinom",
"geom", NA, "pois", "dweibull",
"dweibull", "pois", NA, "geom",
"pois", 'nbinom', "geom", NA),
nrow = s, ncol = s, byrow = TRUE)
# get `f_dist`
f_dist_model_1 <- array(f_dist_1, dim = c(s, s, d + 1))
# `f_dist_pars` has dimensions of: (s, s, 2, d + 1).
# First array of coefficients, corresponding to `f_dist_1`.
# First matrix.
f_dist_1_pars_1 <- matrix(c(NA, 7, 0.4, 4,
0.7, NA, 5, 0.6,
0.2, 3, NA, 0.6,
4, 4, 0.4, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_1_pars_2 <- matrix(c(NA, NA, 0.2, 0.6,
NA, NA, NA, 0.8,
0.6, NA, NA, NA,
NA, 0.3, NA, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second array of coefficients, corresponding to `f_dist_2`.
# First matrix.
f_dist_2_pars_1 <- matrix(c(NA, 6, 0.5, 3,
0.5, NA, 4, 0.5,
0.4, 5, NA, 0.7,
6, 5, 0.7, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2_pars_2 <- matrix(c(NA, NA, 0.4, 0.5,
NA, NA, NA, 0.6,
0.5, NA, NA, NA,
NA, 0.4, NA, NA),
nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_1 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
f_dist_2_pars_1, f_dist_2_pars_2),
dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Defining the parametric object for Model 1.
# ---------------------------------------------------------------------------
obj_par_model_1 <- parametric_dsmm(
model_size = 4000,
states = states,
initial_dist = c(0.8, 0.1, 0.1, 0),
degree = d,
p_dist = p_dist_model_1,
f_dist = f_dist_model_1,
f_dist_pars = f_dist_pars_model_1,
p_is_drifting = TRUE,
f_is_drifting = TRUE
)
cat("The object has class of (",
paste0(class(obj_par_model_1),
collapse = ', '), ").")
# ---------------------------------------------------------------------------
# Generating a sequence from the parametric object.
# ---------------------------------------------------------------------------
# A larger klim will lead to an increase in accuracy.
klim <- 20
sim_seq <- simulate(obj_par_model_1, klim = klim, seed = 1)
# ---------------------------------------------------------------------------
# Fitting the generated sequence under the same distributions.
# ---------------------------------------------------------------------------
fit_par_model1 <- fit_dsmm(sequence = sim_seq,
states = states,
degree = d,
f_is_drifting = TRUE,
p_is_drifting = TRUE,
estimation = 'parametric',
f_dist = f_dist_model_1)
cat("The object has class of (",
paste0(class(fit_par_model1),
collapse = ', '), ").")
cat("\nThe estimated parameters are:\n")
fit_par_model1$dist$f_drift_parameters