maxlogLreg {EstimationTools}R Documentation

Maximum Likelihood Estimation for parametric linear regression models

Description

[Experimental]

Function to compute maximum likelihood estimators (MLE) of regression parameters of any distribution implemented in R with covariates (linear predictors).

Usage

maxlogLreg(
  formulas,
  y_dist,
  support = NULL,
  data = NULL,
  subset = NULL,
  fixed = NULL,
  link = NULL,
  start = NULL,
  lower = NULL,
  upper = NULL,
  optimizer = "nlminb",
  control = NULL,
  silent = FALSE,
  StdE_method = c("optim", "numDeriv"),
  ...
)

Arguments

formulas

a list of formula objects. Each element must have an ~, with the terms on the right separated by + operators. The response variable on the left side is optional. Linear predictor of each parameter must be specified with the name of the parameter followed by the suffix '.fo'. See the examples below for further illustration.

y_dist

a formula object that specifies the distribution of the response variable. On the left side of ~ must be the response, and in the right side must be the name o de probability density/mass function. See the section Details and the examples below for further illustration.

support

a list with the following entries:

  • interval: a two dimensional atomic vector indicating the set of possible values of a random variable having the distribution specified in y_dist.

  • type: character indicating if distribution has a discrete or a continous random variable.

data

an optional data frame containing the variables in the model. If data is not specified, the variables are taken from the environment from which maxlogLreg is called.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

fixed

a list with fixed/known parameters of distribution of interest. Fixed parameters must be passed with its name and its value (known).

link

a list with names of parameters to be linked, and names of the link function object. For names of parameters, please visit documentation of density/mass function. There are three link functions available: log_link, logit_link and NegInv_link. Take into account: the order used in argument over corresponds to the order in argument link.

start

a numeric vector with initial values for the parameters to be estimated. Zero is the default value.

lower

a numeric vector with lower bounds, with the same lenght of argument start (for box-constrained optimization). -Inf is the default value.

upper

a numeric vector with upper bounds, with the same lenght of argument start (for box-constrained optimization). Inf is the default value.

optimizer

a length-one character vector with the name of optimization routine. nlminb, optim and DEoptim are available; nlminb is the default routine.

control

control parameters of the optimization routine. Please, visit documentation of selected optimizer for further information.

silent

logical. If TRUE, warnings of maxlogL are suppressed.

StdE_method

a length-one character vector with the routine for Hessian matrix computation. The This is needed for standard error estimation. The options available are "optim" and "numDeriv". For further information, visit optim or hessian.

...

Further arguments to be supplied to the optimization routine.

Details

maxlogLreg computes programmatically the log-likelihood (log L) function corresponding for the following model:

y_i \stackrel{iid.}{\sim} \mathcal{D}(\theta_{i1},\theta_{i2},\dots, \theta_{ij}, \dots, \theta_{ik})

g(\boldsymbol{\theta}_{j}) = \boldsymbol{\eta}_{j} = \mathbf{X}_j^\top \boldsymbol{\beta}_j,

where,

Then, maxlogLreg maximizes the log L through optim, nlminb or DEoptim. maxlogLreg generates an S3 object of class maxlogL.

Estimation with censorship can be handled with Surv objects (see example 2). The output object stores the corresponding censorship matrix, defined as r_{il} = 1 if sample unit i has status l, or r_{il} = 0 in other case. i=1,2,\dots,n and l=1,2,3 (l=1: observation status, l=2: right censorship status, l=3: left censorship status).

Value

A list with class maxlogL containing the following lists:

fit

A list with output information about estimation and method used.

inputs

A list with all input arguments.

outputs

A list with additional information. The most important outputs are:

  • npar: number of parameters.

  • n: sample size

  • Stde_method: standard error computation method.

  • b_lenght: a list with the number of regression parameters.

  • design_matrix: a list with the \mathbf{X} matrix for each parameter, the response values (called y) and the censorship matrix (called status). See the Details section for further information.

Note

Author(s)

Jaime Mosquera GutiƩrrez, jmosquerag@unal.edu.co

References

Nelder JA, Mead R (1965). “A Simplex Method for Function Minimization.” The Computer Journal, 7(4), 308–313. ISSN 0010-4620, doi:10.1093/comjnl/7.4.308, https://academic.oup.com/comjnl/article-lookup/doi/10.1093/comjnl/7.4.308.

Fox PA, Hall AP, Schryer NL (1978). “The PORT Mathematical Subroutine Library.” ACM Transactions on Mathematical Software, 4(2), 104–126. ISSN 00983500, doi:10.1145/355780.355783, https://dl.acm.org/doi/10.1145/355780.355783.

Nash JC (1979). Compact Numerical Methods for Computers. Linear Algebra and Function Minimisation, 2nd Edition edition. Adam Hilger, Bristol.

Dennis JE, Gay DM, Walsh RE (1981). “An Adaptive Nonlinear Least-Squares Algorithm.” ACM Transactions on Mathematical Software, 7(3), 348–368. ISSN 00983500, doi:10.1145/355958.355965, https://dl.acm.org/doi/10.1145/355958.355965.

See Also

summary.maxlogL, optim, nlminb, DEoptim, DEoptim.control, maxlogL, bootstrap_maxlogL

Other maxlogL: hazard_fun(), maxlogL()

Examples

library(EstimationTools)

#--------------------------------------------------------------------------------
# Example 1: Estimation in simulated normal distribution
n <- 1000
x <- runif(n = n, -5, 6)
y <- rnorm(n = n, mean = -2 + 3 * x, sd = exp(1 + 0.3* x))
norm_data <- data.frame(y = y, x = x)

# It does not matter the order of distribution parameters
formulas <- list(sd.fo = ~ x, mean.fo = ~ x)
support <- list(interval = c(-Inf, Inf), type = 'continuous')

norm_mod <- maxlogLreg(formulas, y_dist = y ~ dnorm, support = support,
                       data = norm_data,
                       link = list(over = "sd", fun = "log_link"))
summary(norm_mod)


#--------------------------------------------------------------------------------
# Example 2: Fitting with censorship
# (data from https://www.itl.nist.gov/div898/handbook/apr/section4/apr413.htm)

failures <- c(55, 187, 216, 240, 244, 335, 361, 373, 375, 386)
fails <- c(failures, rep(500, 10))
status <- c(rep(1, length(failures)), rep(0, 10))
Wei_data <- data.frame(fails = fails, status = status)

# Formulas with linear predictors
formulas <- list(scale.fo=~1, shape.fo=~1)
support <- list(interval = c(0, Inf), type = 'continuous')

# Bounds for optimization. Upper bound set with default values (Inf)
start <- list(
  scale = list(Intercept = 100),
  shape = list(Intercept = 10)
)
lower <- list(
  scale = list(Intercept = 0),
  shape = list(Intercept = 0)
)

mod_weibull <- maxlogLreg(formulas, y_dist = Surv(fails, status) ~ dweibull,
                          support = c(0, Inf), start = start,
                          lower = lower, data = Wei_data)
summary(mod_weibull)


#--------------------------------------------------------------------------------


[Package EstimationTools version 4.0.0 Index]