gldrm {gldrm} | R Documentation |
Fits a generalized linear density ratio model (GLDRM)
Description
A GLDRM is a semiparametric generalized linear model. In contrast to a GLM, which assumes a particular exponential family distribution, the GLDRM uses a semiparametric likelihood to estimate the reference distribution. The reference distribution may be any discrete, continuous, or mixed exponential family distribution. The model parameters, which include both the regression coefficients and the cdf of the unspecified reference distribution, are estimated by maximizing a semiparametric likelihood. Regression coefficients are estimated with no loss of efficiency, i.e. the asymptotic variance is the same as if the true exponential family distribution were known.
Usage
gldrm(
  formula,
  data = NULL,
  link = "identity",
  mu0 = NULL,
  offset = NULL,
  gldrmControl = gldrm.control(),
  thetaControl = theta.control(),
  betaControl = beta.control(),
  f0Control = f0.control()
)
Arguments
formula |
An object of class "formula". |
data |
An optional data frame containing the variables in the model. |
link |
Link function. Can be a character string to be passed to the
|
mu0 |
Mean of the reference distribution. The reference distribution is
not unique unless its mean is restricted to a specific value. This value can
be any number within the range of observed values, but values near the boundary
may cause numerical instability. This is an optional argument with |
offset |
Known component of the linear term. Offset must be passed through
this argument - offset terms in the formula will be ignored.
value and covariate values. If sampling weights are a function of both the
response value and covariates, then |
gldrmControl |
Optional control arguments.
Passed as an object of class "gldrmControl", which is constructed by the
|
thetaControl |
Optional control arguments for the theta update procedure.
Passed as an object of class "thetaControl", which is constructed by the
|
betaControl |
Optional control arguments for the beta update procedure.
Passed as an object of class "betaControl", which is constructed by the
|
f0Control |
Optional control arguments for the |
Details
The arguments linkfun, linkinv, and mu.eta mirror the "link-glm" class. Objects of this class can be created with the stats::make.link function.
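As a rough illustration of the three link components named above, the sketch below builds the standard log link with stats::make.link and compares it with a hand-written list of the same shape. The names log_link and custom_link are made up for this example, and only base R is used; it is not a statement about how gldrm validates the link internally.

```r
# Standard log link from base R; the result has class "link-glm".
log_link <- stats::make.link("log")
class(log_link)
names(log_link)

# A hand-written list with the same three components (illustrative only).
custom_link <- list(
  linkfun = function(mu) log(mu),    # eta = g(mu)
  linkinv = function(eta) exp(eta),  # mu = g^{-1}(eta)
  mu.eta  = function(eta) exp(eta)   # derivative of linkinv with respect to eta
)

# Both versions should agree on a few test values.
mu  <- c(0.5, 1, 2, 4)
eta <- log_link$linkfun(mu)
all.equal(custom_link$linkinv(eta), log_link$linkinv(eta))
all.equal(custom_link$mu.eta(eta), log_link$mu.eta(eta))
```

All three functions in such a list operate elementwise on vectors, which is how the base R link objects behave.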
The "gldrm" class is a list of the following items.
- conv: Logical indicator for whether the gldrm algorithm converged within the iteration limit.
- iter: Number of iterations used. A single iteration is a beta update followed by an f0 update.
- llik: Semiparametric log-likelihood of the fitted model.
- beta: Vector containing the regression coefficient estimates.
- mu: Vector containing the estimated mean response value for each observation in the training data.
- eta: Vector containing the estimated linear combination of covariates for each observation.
- f0: Vector containing the semiparametric estimate of the reference distribution, evaluated at the observed response values. The values of f0 correspond to the support values in spt, sorted in increasing order.
- spt: Vector containing the unique observed response values, sorted in increasing order.
- mu0: Mean of the estimated semiparametric reference distribution. The mean of the reference distribution must be fixed at a value in order for the model to be identifiable. It can be fixed at any value within the range of observed response values, but the gldrm function assigns mu0 to be the mean of the observed response values.
- varbeta: Estimated variance matrix of the regression coefficients.
- seBeta: Standard errors for \hat{\beta}. Equal to sqrt(diag(varbeta)).
- seMu: Standard errors for \hat{\mu}, computed from varbeta.
- seEta: Standard errors for \hat{\eta}, computed from varbeta.
- theta: Vector containing the estimated tilt parameter for each observation. The tilted density function of the response variable is given by f(y|x_i) = f_0(y) \exp(\theta_i y) / \int f_0(u) \exp(\theta_i u) du.
- bPrime: Vector containing the mean of the tilted distribution, b'(\theta_i), for each observation, where b'(\theta_i) = \int u f(u|x_i) du. bPrime should match mu, except in cases where theta is capped for numerical stability.
- bPrime2: Vector containing the variance of the tilted distribution, b''(\theta_i), for each observation, where b''(\theta_i) = \int (u - b'(\theta_i))^2 f(u|x_i) du.
- fTilt: Vector containing the semiparametric fitted probability, \hat{f}(y_i | x_i), for each observation. The semiparametric log-likelihood is equal to \sum_{i=1}^n \log \hat{f}(y_i | x_i).
- sampprobs: If sampling probabilities were passed through the sampprobs argument, then they are returned here in matrix form. Each row corresponds to an observation.
- llikNull: Log-likelihood of the null model with no covariates.
- lr.stat: Likelihood ratio test statistic comparing the fitted model to the null model. It is calculated as 2 \times (llik - llik_0) / (p-1), where llik_0 is the null log-likelihood (llikNull). The asymptotic distribution is F(p-1, n-p) under the null hypothesis.
- lr.pval: P-value of the likelihood ratio statistic.
- fTiltMatrix: Matrix containing the semiparametric density for each observation, i.e. \hat{f}(y | x_i) for each unique y value. The number of rows equals the number of observations and the number of columns equals the number of unique response values observed. Only returned if returnfTilt = TRUE in the gldrmControl arguments.
- score.logf0: Score function for log(f0). Only returned if returnf0ScoreInfo = TRUE in the gldrmControl arguments.
- info.logf0: Information matrix for log(f0). Only returned if returnf0ScoreInfo = TRUE in the gldrmControl arguments.
- formula: Model formula.
- data: Model data frame.
- link: Link function. If a character string was passed to the link argument, then this will be an object of class "link-glm". Otherwise, it will be the list of three functions passed to the link argument.
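The tilt formulas for theta, bPrime, and bPrime2 can be made concrete with a small base R sketch on a discrete reference distribution, where the integrals become sums over the support. The support points, probabilities, and the helper names tilt and theta_for_mean are hypothetical and chosen only for illustration; this is not the package's fitting algorithm.

```r
# Hypothetical support points and reference probabilities (not package output).
spt <- c(1, 2, 3, 4, 5)
f0  <- c(0.10, 0.20, 0.40, 0.20, 0.10)   # sums to 1; reference mean is 3

# Exponentially tilt a discrete reference distribution by theta.
tilt <- function(theta, spt, f0) {
  e <- theta * spt
  w <- f0 * exp(e - max(e))               # subtract max(e) to avoid overflow
  fTilt   <- w / sum(w)                   # f0(y) exp(theta y) / sum_u f0(u) exp(theta u)
  bPrime  <- sum(spt * fTilt)             # mean of the tilted distribution
  bPrime2 <- sum((spt - bPrime)^2 * fTilt)  # variance of the tilted distribution
  list(fTilt = fTilt, bPrime = bPrime, bPrime2 = bPrime2)
}

t0 <- tilt(0, spt, f0)     # theta = 0 reproduces the reference distribution
t1 <- tilt(0.5, spt, f0)   # positive theta shifts mass toward larger values

# Find the theta whose tilted mean equals a target mu (simple root search).
theta_for_mean <- function(mu, spt, f0) {
  stats::uniroot(function(th) tilt(th, spt, f0)$bPrime - mu,
                 interval = c(-20, 20), tol = 1e-10)$root
}
th <- theta_for_mean(4, spt, f0)
tilt(th, spt, f0)$bPrime
```

In this sketch, a target mean must lie strictly inside the range of the support points, since no finite tilt can move the mean onto or beyond the smallest or largest support value.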
Value
An S3 object of class "gldrm". See details.
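To show how the likelihood ratio items described in Details relate to one another, the following sketch recomputes the statistic and an upper-tail F p-value from made-up numbers. The values of llik, llikNull, n, and p are hypothetical, p is assumed to count the regression coefficients including the intercept, and taking the upper tail of F(p-1, n-p) is an assumption about how lr.pval is obtained.

```r
# Hypothetical values, for illustration only (not output from a fitted model).
llik     <- -120.4   # log-likelihood of the fitted model
llikNull <- -135.9   # log-likelihood of the null model
n <- 150             # number of observations
p <- 6               # assumed: number of coefficients, including the intercept

# Statistic as described in Details: 2 * (llik - llik_0) / (p - 1).
lr_stat <- 2 * (llik - llikNull) / (p - 1)

# Upper-tail probability under the stated F(p - 1, n - p) reference distribution.
lr_pval <- stats::pf(lr_stat, df1 = p - 1, df2 = n - p, lower.tail = FALSE)

lr_stat
lr_pval
```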
Examples
data(iris, package="datasets")
# Fit a gldrm with log link
fit <- gldrm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species,
             data=iris, link="log")
fit
# Fit a gldrm with a custom link function, supplied as a list of three functions
link <- list()
link$linkfun <- function(mu) log(mu)^3                          # link: eta = log(mu)^3
link$linkinv <- function(eta) exp(eta^(1/3))                    # inverse link
link$mu.eta <- function(eta) exp(eta^(1/3)) * 1/3 * eta^(-2/3)  # derivative of linkinv
fit2 <- gldrm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species,
              data=iris, link=link)
fit2
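Before passing a custom link to a fitting function, it can help to check that the three functions are mutually consistent. The sketch below does this in base R for the custom link used above: it tests the round trip linkinv(linkfun(mu)) and compares mu.eta with a central finite difference. The grid of mu values and the step size h are arbitrary choices for illustration. Note that this particular link is only defined for eta > 0 (that is, mu > 1), because eta^(1/3) returns NaN for negative eta in R.

```r
# The custom link from the example, written as a single list.
link <- list(
  linkfun = function(mu) log(mu)^3,
  linkinv = function(eta) exp(eta^(1/3)),
  mu.eta  = function(eta) exp(eta^(1/3)) * 1/3 * eta^(-2/3)
)

# Grid spanning the range of iris$Sepal.Length (4.3 to 7.9).
mu  <- seq(4.3, 7.9, length.out = 20)
eta <- link$linkfun(mu)

# Round trip: linkinv should undo linkfun.
round_trip_ok <- isTRUE(all.equal(link$linkinv(eta), mu))

# Central finite difference approximation to d(linkinv)/d(eta).
h <- 1e-6
num_deriv <- (link$linkinv(eta + h) - link$linkinv(eta - h)) / (2 * h)
deriv_ok <- isTRUE(all.equal(link$mu.eta(eta), num_deriv, tolerance = 1e-6))

round_trip_ok
deriv_ok
```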