| lmac,makeNA,coef.lmac,vcov.lmac,pcac,loglinac,tbltofakedf {regtools} | R Documentation |
Available Cases Method for Missing Data
Description
Various estimators that handle missing data via the Available Cases Method
Usage
lmac(xy,nboot=0)
makeNA(m,probna)
NAsTo0s(x)
ZerosToNAs(x,replaceVal=0)
## S3 method for class 'lmac'
coef(object,...)
## S3 method for class 'lmac'
vcov(object,...)
pcac(indata,scale=FALSE)
loglinac(x,margin)
tbltofakedf(tbl)
Arguments
replaceVal |
Value to be replaced by NA. |
xy |
Matrix or data frame, X values in the first columns, Y in the last column. |
indata |
Matrix or data frame. |
x |
Matrix or data frame, one column per variable. |
nboot |
If positive, number of bootstrap samples to take. |
probna |
Probability that an element will be NA. |
scale |
If TRUE, call |
tbl |
An R table. |
m |
Number of synthetic NAs to insert. |
object |
Output from lmac. |
... |
Needed for consistency with generic function. Not used. |
margin |
A list of vectors specifying the model, as in loglin. |
Details
The Available Cases (AC) approach applies to statistical methods that depend only on products of k of the variables. For any such product, every case with non-NA values for those k variables can be used. This contrasts with the Complete Cases (CC) approach, which uses only cases that are fully intact in all variables. In linear regression, for instance, the estimated coefficients depend only on covariances between the variables (both predictors and response). This approach assumes that the cases with missing values have the same distribution as the intact cases.
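The difference between AC and CC can be seen with base R alone. The following is a minimal sketch on simulated data, not code from this package; the variable names (d, cov_cc, cov_ac and so on) are chosen only for illustration. It relies on the use argument of cov, where "pairwise.complete.obs" computes each covariance from the rows intact in that pair of variables.

```r
set.seed(1)
n <- 1000
d <- data.frame(u = rnorm(n), v = rnorm(n), w = rnorm(n))
# knock out 10% of each column, independently of the other columns
for (j in seq_along(d)) d[sample(n, 100), j] <- NA

# CC: only rows that are intact in all three variables
cc_rows <- sum(complete.cases(d))
cov_cc <- cov(d, use = "complete.obs")

# AC: each covariance uses the rows intact in that pair of variables
ac_rows_uv <- sum(complete.cases(d[, c("u", "v")]))
cov_ac <- cov(d, use = "pairwise.complete.obs")

c(cc_rows = cc_rows, ac_rows_uv = ac_rows_uv)
```

The pairwise row count is never smaller than the complete-case count, which is the source of the extra information AC uses.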
The lmac function forms OLS estimates as with lm, but
applying AC, in contrast to lm, which uses the CC method.
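As an illustration of the idea, OLS coefficients can be recovered from pairwise means and covariances. This is a sketch under stated assumptions, written in base R on simulated data; it is not claimed to be how lmac is implemented internally, and the names S, mu, ac_coef and cc_coef are hypothetical.

```r
set.seed(2)
n <- 5000
x1 <- rnorm(n); x2 <- rnorm(n)
y <- 1 + 2 * x1 - x2 + rnorm(n)
xy <- cbind(x1, x2, y)               # X in the first columns, Y in the last
# set 10% of all entries to NA, completely at random
xy[sample(length(xy), round(0.1 * length(xy)))] <- NA

p <- ncol(xy) - 1
S <- cov(xy, use = "pairwise.complete.obs")   # AC covariances
mu <- colMeans(xy, na.rm = TRUE)              # AC means

# slopes solve Sxx b = Sxy; the intercept follows from the means
slopes <- solve(S[1:p, 1:p], S[1:p, p + 1])
intercept <- mu[p + 1] - sum(mu[1:p] * slopes)
ac_coef <- c(intercept, slopes)
names(ac_coef) <- c("(Intercept)", colnames(xy)[1:p])

# CC fit for comparison: lm drops every row with any NA
cc_coef <- coef(lm(y ~ x1 + x2, data = as.data.frame(xy)))
rbind(ac_coef, cc_coef)
```

Both sets of estimates should be close to the population values 1, 2 and -1 here, since the NAs were inserted completely at random.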
The pcac function is an AC substitute for prcomp. The
data is centered, corresponding to a fixed value of center =
TRUE in prcomp. It is also scaled if scale is TRUE,
corresponding to scale = TRUE in prcomp. Due to AC,
there is a small chance of negative eigenvalues, in which case
stop will be called.
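The negative-eigenvalue issue arises because a covariance matrix assembled pair by pair is not guaranteed to be positive semidefinite. A rough base R sketch of an AC-style principal components computation follows; it is an assumption about the general technique, not the body of pcac, and the object names are illustrative.

```r
set.seed(3)
n <- 2000
z <- rnorm(n)
dat <- cbind(a = z + rnorm(n), b = z + rnorm(n), c = rnorm(n))
dat[sample(length(dat), round(0.1 * length(dat)))] <- NA

# pairwise covariance matrix; using cor() here would give the scaled analysis
S <- cov(dat, use = "pairwise.complete.obs")
eig <- eigen(S, symmetric = TRUE)

# a pairwise matrix may fail to be positive semidefinite
if (any(eig$values < 0)) stop("negative eigenvalue in available-cases covariance")

sdev <- sqrt(eig$values)        # analogous to the sdev component of prcomp
rotation <- eig$vectors         # analogous to the rotation component of prcomp
rownames(rotation) <- colnames(dat)
```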
The loglinac function is an AC substitute for loglin.
The latter takes tables as input, but loglinac takes the raw
data. If you have just the table, use tbltofakedf to
regenerate a usable data frame.
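To make the table-to-data-frame step concrete, here is a base R sketch that expands a table into one row per observation. It illustrates the idea only and is not the source of tbltofakedf; UCBAdmissions is a table shipped with R, and the names counts, idx and fakedf are hypothetical.

```r
tbl <- UCBAdmissions                  # a 3-way table from the datasets package
counts <- as.data.frame(tbl)          # one row per cell, counts in column Freq

# repeat each cell's row as many times as its count
idx <- rep(seq_len(nrow(counts)), counts$Freq)
fakedf <- counts[idx, names(counts) != "Freq"]
rownames(fakedf) <- NULL

head(fakedf)
```

Tabulating fakedf with table() reproduces the original cell counts, so no information is lost in the round trip.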
The makeNA function is used to insert random NA values into
data, for testing purposes.
Value
For lmac, an object of class lmac, with components
coefficients, as with lm; accessible directly or by calling coef, as with lm
fitted.values, as with lm
residuals, as with lm
r2, (unadjusted) R-squared
cov, for nboot > 0 the estimated covariance matrix of the vector of estimated regression coefficients; accessible directly or by calling vcov, as with lm
For pcac, an R list, with components
sdev, as with prcomp
rotation, as with prcomp
For loglinac, an R list, with components
param, estimated coefficients, as in loglin
fit, estimated expected cell counts, as in loglin
Author(s)
Norm Matloff
Examples
n <- 25000
w <- matrix(rnorm(2*n),ncol=2) # x and epsilon
x <- w[,1]
y <- x + w[,2]
# insert some missing values
nmiss <- round(0.1*n)
x[sample(1:n,nmiss)] <- NA
nmiss <- round(0.2*n)
y[sample(1:n,nmiss)] <- NA
acout <- lmac(cbind(x,y))
coef(acout) # should be near pop. values 0 and 1