R: Fitting Binary Regression Models

brm {brm}

R Documentation

Fitting Binary Regression Models

Description

brm is used to estimate the association between two binary variables, and how that varies as a function of other covariates.

Usage

brm(
  y,
  x,
  va,
  vb = NULL,
  param,
  est.method = "MLE",
  vc = NULL,
  optimal = TRUE,
  weights = NULL,
  subset = NULL,
  max.step = NULL,
  thres = 1e-08,
  alpha.start = NULL,
  beta.start = NULL,
  message = FALSE
)

Arguments

`y`	The response vector. Should only take values 0 or 1.
`x`	The exposure vector. Should only take values 0 or 1.
`va`	The covariates matrix for the target model. It can be specified via an object of class "`formula`" or a matrix. In the latter case, no intercept terms will be added to the covariates matrix.
`vb`	The covariates matrix for the nuisance model. It can be specified via an object of class "`formula`" or a matrix. In the latter case, no intercept terms will be added to the covariates matrix. (If not specified, defaults to va.)
`param`	The measure of association. Can take value 'RD' (risk difference), 'RR' (relative risk) or 'OR' (odds ratio)
`est.method`	The method to be used in fitting the model. Can be 'MLE' (maximum likelihood estimation, the default) or 'DR' (doubly robust estimation).
`vc`	The covariates matrix for the probability of exposure, often called the propensity score. It can be specified via an object of class "`formula`" or a matrix. In the latter case, no intercept terms will be added to the covariates matrix. By default we fit a logistic regression model for the propensity score. (If not specified, defaults to va.)
`optimal`	Use the optimal weighting function for the doubly robust estimator? Ignored if the estimation method is 'MLE'. The default is TRUE.
`weights`	An optional vector of 'prior weights' to be used in the fitting process. Should be NULL or a numeric vector.
`subset`	An optional vector specifying a subset of observations to be used in the fitting process.
`max.step`	The maximal number of iterations to be passed into the `optim` function. The default is 1000.
`thres`	Threshold for judging convergence. The default is 1e-6.
`alpha.start`	Starting values for the parameters in the target model.
`beta.start`	Starting values for the parameters in the nuisance model.
`message`	Show optimization details? Ignored if the estimation method is 'MLE'. The default is FALSE.

Details

brm contains two parts: the target model for the dependence measure (RR, RD or OR) and the nuisance model; the latter is required for maximum likelihood estimation. If param="RR" then the target model is log RR(va) = \alpha*va. If param="RD" then the target model is atanh RD(va) = \alpha*va. If param="OR" then the target model is log OR(va) = \alpha*va. For RR and RD, the nuisance model is for the log Odds Product: log OP(vb) = \beta*vb. For OR, the nuisance model is for the baseline risk: logit(P(y=1|x=0,vb)) = \beta*vb. In each case the nuisance model is variation independent of the target model, which ensures that the predicted probabilities lie in [0,1]. See Richardson et al. (2016+) for more details.

If est.method="DR" then given a correctly specified logistic regression model for the propensity score logit(P(x=1|vc)) = \gamma*vc, estimation of the RR or RD is consistent, even if the log Odds Product model is misspecified. This estimation method is not available for the OR. See Tchetgen Tchetgen et al. (2014) for more details.

When estimating RR and RD, est.method="DR" is recommended unless it is known that the log Odds Product model is correctly specified. Optimal weights (optimal=TRUE) are also recommended to increase efficiency.

For the doubly robust estimation method, MLE is used to obtain preliminary estimates of \alpha, \beta and \gamma. The estimate of \alpha is then updated by solving a doubly-robust estimating equation. (The estimate for \beta remains the MLE.)

Value

A list consisting of

`param`	the measure of association.
`point.est`	the point estimates.
`se.est`	the standard error estimates.
`cov`	estimate of the covariance matrix for the estimates.
`conf.lower`	the lower limit of the 95% (marginal) confidence interval.
`conf.upper`	the upper limit of the 95% (marginal) confidence interval.
`p.value`	the two sided p-value for testing zero coefficients.
`coefficients`	the matrix summarizing key information: point estimate, 95% confidence interval and p-value.
`param.est`	the fitted RR/RD/OR.
`va`	the matrix of covariates for the target model.
`vb`	the matrix of covariates for the nuisance model.
`converged`	Did the maximization process converge?

Author(s)

Linbo Wang <linbo.wang@utoronto.ca>, Mark Clements <mark.clements@ki.se>, Thomas Richardson <thomasr@uw.edu>

References

Thomas S. Richardson, James M. Robins and Linbo Wang. "On Modeling and Estimation for the Relative Risk and Risk Difference." Journal of the American Statistical Association: Theory and Methods (2017).

Eric J. Tchetgen Tchetgen, James M. Robins and Andrea Rotnitzky. "On doubly robust estimation in a semiparametric odds ratio model." Biometrika 97.1 (2010): 171-180.

Examples

set.seed(0)
n           = 100
alpha.true  = c(0,-1)
beta.true   = c(-0.5,1)
gamma.true  = c(0.1,-0.5)
params.true = list(alpha.true=alpha.true, beta.true=beta.true, 
     gamma.true=gamma.true)
v.1         = rep(1,n)       # intercept term
v.2         = runif(n,-2,2) 
v           = cbind(v.1,v.2)
pscore.true = exp(v %*% gamma.true) / (1+exp(v %*% gamma.true))
p0p1.true   = getProbRR(v %*% alpha.true,v %*% beta.true)
x           = rbinom(n, 1, pscore.true)  
pA.true       = p0p1.true[,1]
pA.true[x==1] = p0p1.true[x==1,2]
y = rbinom(n, 1, pA.true)

fit.mle = brm(y,x,v,v,'RR','MLE',v,TRUE)
fit.drw = brm(y,x,v,v,'RR','DR',v,TRUE)
fit.dru = brm(y,x,v,v,'RR','DR',v,FALSE)

fit.mle2 = brm(y,x,~v.2, ~v.2, 'RR','MLE', ~v.2,TRUE) # same as fit.mle

[Package brm version 1.1.1 Index]