mn.nreg {RCAL}

R Documentation

Model-assisted inference for population means without regularization

Description

This function implements model-assisted inference for population means with missing data, using non-regularized calibrated estimation.

Usage

mn.nreg(y, tr, x, ploss = "cal", yloss = "gaus", off = 0)

Arguments

`y`	An `n` x `1` vector of outcomes with missing data.
`tr`	An `n` x `1` vector of non-missing indicators (=1 if `y` is observed or 0 if `y` is missing).
`x`	An `n` x `p` matrix of covariates (excluding a constant), used in both propensity score and outcome regression models.
`ploss`	A loss function used in propensity score estimation (either "ml" or "cal").
`yloss`	A loss function used in outcome regression (either "gaus" for continuous outcomes or "ml" for binary outcomes).
`off`	An offset value (e.g., the true value in simulations) used to calculate the z-statistic from augmented IPW estimation.

Details

This function involves two steps: first fitting the propensity score and outcome regression models, and then applying the augmented IPW estimator for a population mean. For ploss="cal", calibrated estimation is performed as in Tan (2020a, 2020b), but without regularization. This method leads to model-assisted inference, in which confidence intervals are valid if the propensity score model is correctly specified, even if the outcome regression model is misspecified. With linear outcome models, the inference is also doubly robust (Kim and Haziza 2014; Vermeulen and Vansteelandt 2015). For ploss="ml", maximum likelihood estimation is used (Robins et al. 1994). In this case, standard errors are in general conservative if the propensity score model is correctly specified but the outcome regression model may be misspecified.
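
The two steps can be illustrated with a minimal base-R sketch on simulated data. This is not the RCAL implementation: it assumes maximum likelihood fitting via `glm` and least squares via `lm` in place of `glm.nreg`, and the helper `aipw_mean` is a hypothetical name introduced here for illustration. Its standard error is a simple plug-in value that ignores estimation of the fitted models.

```r
# Hypothetical helper (not part of RCAL): augmented IPW estimate of a
# population mean from fitted propensity scores (fp) and fitted outcomes (fo).
aipw_mean <- function(y, tr, fp, fo) {
  # Missing outcomes never contribute; set them to 0 so that 0 * NA is not NA.
  y0 <- ifelse(tr == 1, y, 0)
  # Outcome-regression prediction plus an inverse-probability-weighted residual.
  phi <- fo + tr * (y0 - fo) / fp
  # Plug-in standard error; ignores uncertainty from the fitted models.
  list(est = mean(phi), se = sqrt(var(phi) / length(phi)))
}

# Simulated data (an assumption for illustration; the true mean of y is 1).
set.seed(1)
n <- 2000
x <- matrix(rnorm(n * 2), n, 2)
tr <- rbinom(n, 1, plogis(0.5 + x[, 1] - 0.5 * x[, 2]))
y_full <- 1 + x[, 1] + x[, 2] + rnorm(n)
y <- ifelse(tr == 1, y_full, NA)

# Step 1: fit the propensity score model (logistic, maximum likelihood)
# and the outcome regression model (least squares on observed outcomes).
ps_fit <- glm(tr ~ x, family = binomial())
fp <- fitted(ps_fit)
or_fit <- lm(y ~ x, subset = tr == 1)
fo <- as.vector(cbind(1, x) %*% coef(or_fit))

# Step 2: apply the augmented IPW estimator.
res <- aipw_mean(y, tr, fp, fo)
unlist(res)
```

The same helper could in principle be applied to the `fp` and `fo` components returned by `mn.nreg`, although it is not claimed to reproduce the contents of `est`.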

Value

`ps`	A list containing the results from fitting the propensity score model by `glm.nreg`.
`fp`	The `n` x `1` vector of fitted propensity scores.
`or`	A list containing the results from fitting the outcome regression model by `glm.nreg`.
`fo`	The `n` x `1` vector of fitted values from outcome regression.
`est`	A list containing the results from augmented IPW estimation by `mn.aipw`.

References

Kim, J.K. and Haziza, D. (2014) Doubly robust inference with missing data in survey sampling, Statistica Sinica, 24, 375-394.

Robins, J.M., Rotnitzky, A., and Zhao, L.P. (1994) Estimation of regression coefficients when some regressors are not always observed, Journal of the American Statistical Association, 89, 846-866.

Vermeulen, K. and Vansteelandt, S. (2015) Bias-reduced doubly robust estimation, Journal of the American Statistical Association, 110, 1024-1036.

Tan, Z. (2020a) Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data, Biometrika, 107, 137-158.

Tan, Z. (2020b) Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data, Annals of Statistics, 48, 811-837.

Examples

data(simu.data)
n <- dim(simu.data)[1]
p <- dim(simu.data)[2]-2

y <- simu.data[,1]
tr <- simu.data[,2]
x <- simu.data[,2+1:p]
x <- scale(x)

# missing data
y[tr==0] <- NA

# include only 10 covariates
x2 <- x[,1:10]

mn.cal <- mn.nreg(y, tr, x2, ploss="cal", yloss="gaus")
unlist(mn.cal$est)

[Package RCAL version 2.0 Index]