glmcat {GLMcat} | R Documentation |
Generalized linear models for categorical responses
Description
Estimate generalized linear models implemented under the unified specification (ratio, cdf, Z), where ratio represents the ratio of probabilities (reference, cumulative, adjacent, or sequential), cdf the cumulative distribution function used for the link, and Z the design matrix, which must be specified through the parallel and threshold arguments.
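As a rough illustration of what the four ratio options refer to, the sketch below computes each ratio of probabilities in base R, following the definitions in Peyhardi, Trottier, and Guédon (2015). The probability vector is made up for illustration, and taking the last category as the reference is an assumption of this sketch, not a statement about the package's default.

```r
# Illustrative category probabilities for J = 4 ordered categories
# (hypothetical values, not taken from any dataset).
prob <- c(0.1, 0.2, 0.3, 0.4)
J <- length(prob)
j <- seq_len(J - 1)

# Reference ratio: pi_j / (pi_j + pi_J); the last category is assumed
# to be the reference in this sketch.
reference <- prob[j] / (prob[j] + prob[J])

# Cumulative ratio: pi_1 + ... + pi_j.
cumulative <- cumsum(prob)[j]

# Adjacent ratio: pi_j / (pi_j + pi_{j+1}).
adjacent <- prob[j] / (prob[j] + prob[j + 1])

# Sequential ratio: pi_j / (pi_j + ... + pi_J).
sequential <- prob[j] / rev(cumsum(rev(prob)))[j]

ratios <- rbind(reference, cumulative, adjacent, sequential)
print(round(ratios, 3))
```

Each row holds J - 1 values, one per ratio equation; the model then links these ratios to the linear predictors through the chosen cdf.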
Usage
glmcat(
formula,
data,
ratio = c("reference", "cumulative", "sequential", "adjacent"),
cdf = list(),
parallel = NA,
categories_order = NA,
ref_category = NA,
threshold = c("standard", "symmetric", "equidistant"),
control = list(),
normalization = 1,
na.action = "na.omit",
find_nu = FALSE,
...
)
Arguments
formula |
a symbolic description of the model to be fit. An expression of the form 'y ~ predictors' is interpreted as a specification that the response 'y' is modeled by a linear predictor specified by 'predictors'. |
data |
a data frame, with the dependent variable stored as a factor. |
ratio |
a string indicating the ratio (equivalent to the family). Options are: reference, adjacent, cumulative, and sequential. The desired ratio must be specified by the user, as there is no default value. |
cdf |
the cumulative distribution function whose inverse is used as part of the link function. - If the distribution has no parameters to specify, enter its name as a string, e.g., 'cdf = "normal"'. The default value is 'cdf = "logistic"'. - If there are parameters to specify, enter a list. For example, for the Student distribution: 'cdf = list("student", df=2)'. For the noncentral Student distribution: 'cdf = list("noncentralt", df=2, mu=1)'. |
parallel |
a character vector indicating the name of the variables with a parallel effect. If a variable is categorical, specify the name and the level of the variable as a string, e.g., '"namelevel"'. |
categories_order |
a character vector indicating the increasing order of the categories, e.g., 'c("a", "b", "c")' for 'a < b < c'. Alphabetical order is assumed by default. The order is relevant for the adjacent, cumulative, and sequential ratios. |
ref_category |
a string indicating the reference category. This option applies to models with the reference ratio. |
threshold |
a restriction to impose on the thresholds. Options are: 'standard', 'equidistant', or 'symmetric'. This is valid only for the cumulative ratio. |
control |
a list of control parameters for the estimation algorithm. - 'maxit': the maximum number of iterations for the Fisher scoring algorithm. - 'epsilon': a double that sets the convergence criterion of GLMcat models. - 'beta_init': a vector of appropriate length with the initial values for the first iteration of the algorithm. |
normalization |
the quantile to use for the normalization of the estimated coefficients when the logistic distribution is used as the base cumulative distribution function. |
na.action |
an argument to handle missing data. Available options are 'na.omit', 'na.fail', and 'na.exclude'. It does not include the 'na.pass' option. |
find_nu |
a logical indicating whether, when the Student CDF is used, an optimization algorithm should search for an optimal degrees-of-freedom value for the model. |
... |
additional arguments.
|
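The arguments above can be combined as in the following sketch, which fits a cumulative-ratio model with a parallel effect. It uses only arguments documented on this page and the DisturbedDreams data from the Examples section. Treating the alphabetical order of the response labels as the order of increasing severity is an assumption made here for illustration.

```r
library(GLMcat)
data(DisturbedDreams)

# Category labels in alphabetical order; assumed here to coincide with
# increasing severity of the response.
level_order <- sort(unique(as.character(DisturbedDreams$Level)))

# Cumulative ratio, logistic cdf, Age constrained to a parallel effect,
# and unconstrained ("standard") thresholds.
cum_log_par <- glmcat(
  formula = Level ~ Age, data = DisturbedDreams,
  ratio = "cumulative", cdf = "logistic",
  parallel = "Age",
  categories_order = level_order,
  threshold = "standard"
)
```

To use a distribution with parameters instead, the cdf argument would take a list, for example 'cdf = list("student", df=2)' as described under the cdf argument.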
Details
Fitting models for categorical responses
This function fits generalized linear models for categorical responses using the unified specification framework introduced by Peyhardi, Trottier, and Guédon (2015).
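Under this framework the chosen ratio of probabilities equals the cdf applied to the linear predictor. A minimal base R sketch for the (cumulative, logistic, parallel) case is given below. The parameter values are hypothetical rather than estimated, and the sign convention eta_j = alpha_j + beta * x is an assumption of this sketch, not necessarily the parameterization used internally by the package.

```r
# Hypothetical parameter values for a cumulative model with a logistic cdf
# and a parallel effect of a single predictor x (not estimated from data).
alpha <- c(-1, 0.5, 2)   # one intercept per cumulative ratio, increasing
beta  <- 0.8             # slope shared by all ratios (parallel effect)
x     <- 1.5

# Linear predictors: eta_j = alpha_j + beta * x.
eta <- alpha + beta * x

# The ratio equals the cdf of the linear predictor: P(Y <= j) = F(eta_j).
cum_prob <- plogis(eta)

# Recover the category probabilities by differencing the cumulative ones.
cat_prob <- diff(c(0, cum_prob, 1))
print(round(cat_prob, 3))
```

Because the intercepts are increasing and the slope is shared, the cumulative probabilities are increasing and the resulting category probabilities are positive and sum to one.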
References
Peyhardi J, Trottier C, Guédon Y (2015). “A new specification of generalized linear models for categorical responses.” Biometrika, 102(4), 889–906. doi:10.1093/biomet/asv042.
Examples
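library(GLMcat)
data(DisturbedDreams)
# Reference-ratio model with a logistic cdf; "Very.severe" is the reference category
ref_log_com <- glmcat(formula = Level ~ Age, data = DisturbedDreams,
                      ref_category = "Very.severe",
                      cdf = "logistic", ratio = "reference")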