lmac,makeNA,coef.lmac,vcov.lmac,pcac,loglinac,tbltofakedf {regtools} | R Documentation |
Available Cases Method for Missing Data
Description
Various estimators that handle missing data via the Available Cases Method
Usage
lmac(xy,nboot=0)
makeNA(m,probna)
NAsTo0s(x)
ZerosToNAs(x,replaceVal=0)
## S3 method for class 'lmac'
coef(object,...)
## S3 method for class 'lmac'
vcov(object,...)
pcac(indata,scale=FALSE)
loglinac(x,margin)
tbltofakedf(tbl)
Arguments
replaceVal | Value to be replaced by NA.
xy | Matrix or data frame, X values in the first columns, Y in the last column.
indata | Matrix or data frame.
x | Matrix or data frame, one column per variable.
nboot | If positive, number of bootstrap samples to take.
probna | Probability that an element will be NA.
scale | If TRUE, call
tbl | An R table.
m | Number of synthetic NAs to insert.
object | Output from
... | Needed for consistency with generic function. Not used.
margin | A list of vectors specifying the model, as in
Details
The Available Cases (AC) approach applies to statistical methods that depend only on products of k of the variables, so that any case with non-NA values for those k variables can be used. This contrasts with the Complete Cases (CC) approach, which uses only cases that are fully intact in all variables. In the case of linear regression, for instance, the estimated coefficients depend only on covariances between the variables (both predictors and response). This approach assumes that the cases with missing values have the same distribution as the intact cases.
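The covariance argument above can be illustrated in base R. The following is a minimal sketch, not the regtools implementation: it assumes a single predictor, simulated data with true intercept 1 and slope 2, and NAs inserted completely at random. It relies on `cov()` with `use = "pairwise.complete.obs"`, which computes each entry from every row where both variables involved are observed.

```r
# Sketch only: AC-style simple regression from pairwise-complete moments
set.seed(1)
n <- 5000
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
x[sample(n, 500)] <- NA     # NAs inserted completely at random
y[sample(n, 1000)] <- NA
d <- cbind(x, y)

# Each covariance entry uses all rows where both variables are non-NA
s <- cov(d, use = "pairwise.complete.obs")
# Each mean uses all rows where that one variable is non-NA
mu <- colMeans(d, na.rm = TRUE)

slope <- s["x", "y"] / s["x", "x"]
intercept <- mu["y"] - slope * mu["x"]
ac_est <- c(intercept = unname(intercept), slope = unname(slope))
ac_est

# Rows available to each approach
n_cc <- sum(complete.cases(d))   # rows usable under CC
n_x  <- sum(!is.na(x))           # rows contributing to var(x) under AC
c(cc = n_cc, x_only = n_x)
```

In this sketch the variance of x is estimated from more rows than CC would retain, which is where the AC approach gains its extra information.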
The lmac function forms OLS estimates as with lm, but applying AC, in contrast to lm, which uses the CC method.
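The CC behavior of lm can be checked directly in base R. This is a small sketch on simulated data (the variable names are made up for illustration), assuming the default `na.action` option, under which rows with any NA in the model variables are dropped before fitting.

```r
# Sketch only: lm() fits on complete cases under the default na.action
set.seed(2)
n <- 1000
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- d$x1 + d$x2 + rnorm(n)
d$x1[sample(n, 100)] <- NA
d$x2[sample(n, 100)] <- NA

fit <- lm(y ~ x1 + x2, data = d)
n_used <- nobs(fit)                    # rows actually used in the fit
n_complete <- sum(complete.cases(d))   # rows with no NA in any column
c(used = n_used, complete = n_complete, total = n)
```

A row missing only x2 is discarded entirely here, even though its x1 and y values are intact.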
The pcac function is an AC substitute for prcomp. The data is centered, corresponding to a fixed value of center = TRUE in prcomp. It is also scaled if scale is TRUE, corresponding to scale = TRUE in prcomp. Due to AC, there is a small chance of negative eigenvalues, in which case stop will be called.
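The idea behind an AC-style principal components analysis, including the negative eigenvalue risk, can be sketched in base R. This is an illustration under assumptions (simulated data, NAs scattered completely at random) and is not the pcac source code. It eigen-decomposes a pairwise-complete covariance matrix, which, unlike an ordinary sample covariance matrix, is not guaranteed to be positive semidefinite.

```r
# Sketch only: PCA from a pairwise-complete covariance matrix
set.seed(3)
n <- 2000
z <- rnorm(n)
d <- cbind(a = z + rnorm(n), b = z + rnorm(n), c = rnorm(n))
d[sample(length(d), 300)] <- NA   # scatter NAs completely at random

# Centering is implicit in the covariance; a scaled version of this
# sketch would use cor() with the same 'use' argument
s <- cov(d, use = "pairwise.complete.obs")
e <- eigen(s, symmetric = TRUE)

# Entries of s come from different subsets of rows, so a negative
# eigenvalue is possible; treat it as an error
if (any(e$values < 0)) stop("negative eigenvalue in pairwise covariance matrix")

sdev <- sqrt(e$values)       # analogous to prcomp()$sdev
rotation <- e$vectors        # analogous to prcomp()$rotation, up to column signs
rownames(rotation) <- colnames(d)
sdev
```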
The loglinac function is an AC substitute for loglin. The latter takes tables as input, but loglinac takes the raw data. If you have just the table, use tbltofakedf to regenerate a usable data frame.
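Expanding a table back into one row per observation can be sketched in base R. The exact output format of tbltofakedf is not shown here, so the following is only an analogue of the idea, using the built-in `UCBAdmissions` table as sample input.

```r
# Sketch only: turn a contingency table into a one-row-per-case data frame
tbl <- UCBAdmissions                 # 3-way table from the datasets package
cells <- as.data.frame(tbl)          # one row per cell, counts in column Freq

# Repeat each cell's row as many times as its count
idx <- rep(seq_len(nrow(cells)), times = cells$Freq)
fake <- cells[idx, names(cells) != "Freq", drop = FALSE]
rownames(fake) <- NULL

# Round trip: tabulating the expanded rows reproduces the original counts
tbl2 <- table(fake)
nrow(fake)
```

The regenerated rows carry no information beyond the table itself, so they contain no NAs; such a data frame is mainly useful for feeding functions that expect raw data.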
The makeNA function is used to insert random NA values into data, for testing purposes.
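Random NA insertion for testing can be sketched in base R. The helper below, `insert_na`, is a hypothetical name introduced for illustration and is not the makeNA implementation. It assumes each element is independently set to NA with a given probability.

```r
# Hypothetical helper for illustration; not the regtools implementation
insert_na <- function(mat, p) {
  stopifnot(is.matrix(mat), p >= 0, p <= 1)
  hit <- runif(length(mat)) < p   # each element selected independently with probability p
  mat[hit] <- NA
  mat
}

set.seed(4)
m0 <- matrix(rnorm(20000), ncol = 4)
m1 <- insert_na(m0, 0.1)
mean(is.na(m1))   # proportion of NAs, close to 0.1
```

Because the NAs are placed independently of the data values, data generated this way satisfies the assumption that cases with missing values have the same distribution as intact cases.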
Value
For lmac, an object of class lmac, with components
coefficients, as with lm; accessible directly or by calling coef, as with lm
fitted.values, as with lm
residuals, as with lm
r2, (unadjusted) R-squared
cov, for nboot > 0 the estimated covariance matrix of the vector of estimated regression coefficients; accessible directly or by calling vcov, as with lm
For pcac, an R list, with components
sdev, as with prcomp
rotation, as with prcomp
For loglinac, an R list, with components
param, estimated coefficients, as in loglin
fit, estimated expected cell counts, as in loglin
Author(s)
Norm Matloff
Examples
n <- 25000
w <- matrix(rnorm(2*n),ncol=2) # x and epsilon
x <- w[,1]
y <- x + w[,2]
# insert some missing values
nmiss <- round(0.1*n)
x[sample(1:n,nmiss)] <- NA
nmiss <- round(0.2*n)
y[sample(1:n,nmiss)] <- NA
acout <- lmac(cbind(x,y))
coef(acout) # should be near pop. values 0 and 1