
ebBHF {sae}

R Documentation

EB estimators of an indicator with non-sample values of auxiliary variables.

Description

Fits the unit level model of Battese, Harter and Fuller (1988) by REML to a Box-Cox or power transformation of the specified dependent variable, and obtains Monte Carlo approximations of the EB estimators of the specified small area indicators. It requires the values of the auxiliary variables for the out-of-sample units.
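The Monte Carlo approximation mentioned above can be illustrated with a toy sketch for a single domain. It follows the general idea in Molina and Rao (2010): simulate the out-of-sample responses from their conditional distribution given the sample, evaluate the indicator on sample plus simulated values, and average over replicates. This is not the `sae` implementation. All parameter values, the data and the indicator are hypothetical, the model parameters are treated as known rather than estimated by REML, and no transformation is applied.

```r
# Toy sketch, NOT the sae implementation: Monte Carlo approximation of an
# EB estimator for one domain under a nested-error model with known parameters.
set.seed(1)
beta    <- c(1, 0.5)   # hypothetical fixed effects (intercept, slope)
sigma2u <- 0.25        # hypothetical random effects variance
sigma2e <- 1           # hypothetical model error variance

xs <- c(0.2, 1.1, 0.7)        # auxiliary values of the sampled units
ys <- c(1.3, 1.9, 1.2)        # responses of the sampled units
xr <- c(0.5, 0.9, 1.4, 0.1)   # auxiliary values of the out-of-sample units

indicator <- function(y) mean(y < 1.5)   # hypothetical indicator

# Conditional mean and variance of the domain effect given the sample
n      <- length(ys)
gamma  <- sigma2u / (sigma2u + sigma2e / n)
v_mean <- gamma * mean(ys - (beta[1] + beta[2] * xs))
v_var  <- sigma2u * (1 - gamma)

MC <- 200
draws <- replicate(MC, {
  v  <- rnorm(1, v_mean, sqrt(v_var))                      # domain effect
  yr <- beta[1] + beta[2] * xr + v +
        rnorm(length(xr), 0, sqrt(sigma2e))                # non-sample responses
  indicator(c(ys, yr))   # indicator over sample plus simulated units
})
eb_hat <- mean(draws)    # Monte Carlo approximation of the EB estimator
eb_hat
```

A larger `MC` reduces the Monte Carlo error of `eb_hat` at the cost of more computation.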

Usage

ebBHF(formula, dom, selectdom, Xnonsample, MC = 100, data,
      transform = "BoxCox", lambda = 0, constant = 0, indicator)

Arguments

`formula`	an object of class `formula` (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under Details.
`dom`	`n*1` vector or factor (same size as `y` in `formula`) with domain codes.
`selectdom`	`I*1` optional vector or factor with the domain codes for which we want to estimate the indicators. It must be a subset of the domain codes in `dom`. If this parameter is not included, the unique domain codes included in `dom` are considered.
`Xnonsample`	matrix or data frame containing the domain codes in the first column and the values of each of the `p` auxiliary variables for the out-of-sample units of each selected domain in the remaining columns. The domains in `Xnonsample` must include at least those specified in `selectdom`.
`MC`	number of Monte Carlo replicates for the empirical approximation of the EB estimator. Default value is `MC=100`.
`data`	optional data frame containing the variables named in `formula` and `dom`. By default the variables are taken from the environment from which `ebBHF` is called.
`transform`	type of transformation applied to the dependent variable in `formula`, either `"BoxCox"` or `"power"`, chosen so that the transformed variable is approximately Normal. Default value is `"BoxCox"`.
`lambda`	value for the parameter of the family of transformations specified in `transform`. Default value is `0`, which gives the log transformation for the two possible families.
`constant`	constant added to the dependent variable before doing the transformation, to achieve a distribution close to Normal. Default value is `0`.
`indicator`	function of the (untransformed) variable on the left hand side of `formula` that we want to estimate in each domain.
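
As an illustration of how `transform`, `lambda` and `constant` interact, the sketch below uses the usual textbook definitions of the Box-Cox and power families. The helper `transform_sketch` is hypothetical and not part of `sae`. It assumes that `constant` is added before transforming and that `lambda = 0` gives the logarithm in both families, as stated above.

```r
# Hypothetical helper, not part of sae: textbook Box-Cox and power transforms.
transform_sketch <- function(y, lambda = 0, constant = 0,
                             type = c("BoxCox", "power")) {
  type <- match.arg(type)
  z <- y + constant                  # shift before transforming
  if (lambda == 0) return(log(z))    # log transformation for both families
  if (type == "BoxCox") (z^lambda - 1) / lambda else z^lambda
}

y <- c(1000, 5000, 20000)
transform_sketch(y, lambda = 0, constant = 3600)     # log(y + 3600)
transform_sketch(y, lambda = 0.5, type = "BoxCox")   # (sqrt(y) - 1) / 0.5
transform_sketch(y, lambda = 0.5, type = "power")    # sqrt(y)
```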

Details

This function uses random number generation. To obtain reproducible results, call `set.seed` before calling `ebBHF`.
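
A minimal illustration of the effect of fixing the seed, using base R's `rnorm` in place of `ebBHF` (the variable names are only for illustration):

```r
set.seed(123)
first <- rnorm(3)

set.seed(123)            # same seed, so the same random numbers follow
second <- rnorm(3)

identical(first, second) # TRUE
```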

A typical model has the form `response ~ terms`, where `response` is the (numeric) response vector and `terms` is a series of terms which specifies a linear predictor for `response`. A terms specification of the form `first + second` indicates all the terms in `first` together with all the terms in `second`, with duplicates removed.

A formula has an implied intercept term. To remove it, use either `y ~ x - 1` or `y ~ 0 + x`. See `formula` for more details of allowed formulae.
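
These formula rules can be checked with base R's `model.matrix` and `terms`. The data frame `d` below is a made-up toy example:

```r
d <- data.frame(y = c(1, 2, 4), x = c(0, 1, 2))   # toy data

with_int   <- colnames(model.matrix(y ~ x, d))       # "(Intercept)" "x"
minus_one  <- colnames(model.matrix(y ~ x - 1, d))   # "x"
zero_plus  <- colnames(model.matrix(y ~ 0 + x, d))   # "x"

# Duplicated terms are removed from the specification
dedup <- attr(terms(y ~ x + x), "term.labels")       # "x"
```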

Value

The function returns a list with the following objects:

eb

data frame with as many rows as selected domains, containing in its columns the domain codes (`domain`), the EB estimators of `indicator` (`eb`) and the sample sizes (`sampsize`). For domains with zero sample size, the EB estimators are based on the synthetic regression. For domains in `selectdom` not included in `Xnonsample`, the EB estimators are `NA`.

fit

a list containing the following objects:

summary: summary of the unit level model fitting.
fixed: vector with the estimated values of the fixed regression coefficients.
random: vector with the predicted random effects.
errorvar: estimated model error variance.
refvar: estimated random effects variance.
loglike: log-likelihood.
residuals: vector with raw residuals from the model fit.

Cases with `NA` values in the variables of `formula` or in `dom` are ignored.

References

- Molina, I. and Rao, J.N.K. (2010). Small Area Estimation of Poverty Indicators. The Canadian Journal of Statistics 38, 369-385.

Examples

data(incomedata)         # Load data set
attach(incomedata)

# Construct design matrix for sample elements
Xs <- cbind(age2, age3, age4, age5, nat1, educ1, educ3, labor1, labor2)

# Select the domains to compute EB estimators. 
data(Xoutsamp)
domains <- unique(Xoutsamp[,"domain"])

# Poverty gap indicator
povertyline <- 0.6*median(income)
povertyline                         # 6477.484
povgap <- function(y)
{
   z <- 6477.484                       # poverty line computed above, hard-coded
   result <- mean((y<z) * (z-y) / z)   # mean relative shortfall below z
   return (result)
}

# Compute EB predictors of poverty gap. The value constant=3600 is selected
# to achieve approximately symmetric residuals.
set.seed(123)
result <- ebBHF(income ~ Xs, dom=prov, selectdom=domains,
                Xnonsample=Xoutsamp, MC=10, constant=3600, indicator=povgap)
result$eb
result$fit$summary
result$fit$fixed
result$fit$random[,1]
result$fit$errorvar
result$fit$refvar
result$fit$loglike
result$fit$residuals[1:10]

detach(incomedata)

[Package sae version 1.3 Index]