ei_gce {EIEntropy}    R Documentation

Ecological Inference applying entropy

Description

The function ei_gce defines the Kullback-Leibler function, which minimises the distance between the probability distribution P and the distribution Q. The distribution Q is based on prior information about the variable of interest that is available before the analysis. The function sets the optimization parameters and obtains an optimal solution using the "optim" function.

The function defines the independent variables in the two databases needed, called datahp (with "n_hp" observations) and datahs (with "n_hs" observations), and the function of the binary variable of interest y. The weights of each observation in the two databases are then defined; if no weights are available, they are set to 1 by default. The errors are calculated by weighting the support vector (var, 0, -var). The default support vector is based on the variance, but it can be specified by the user. We recommend a wider interval, with (-1, 0, 1) as the maximum. The restrictions are defined in order to guarantee consistency.

The minimization of the Kullback-Leibler distance is solved with the "optim" function using the method "BFGS", a maximum of 100 iterations, and a tolerance defined by the user. If the user does not define a tolerance, it is 1e-24 by default. For additional details about the methodology, see Fernández-Vázquez et al. (2020).
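As a rough illustration of the idea, the following sketch minimises a Kullback-Leibler divergence with "optim" and the "BFGS" method. It is not the internal code of ei_gce: the helper names kl_divergence and to_prob, the softmax reparametrisation, the prior value and the tolerance are all assumptions made for this example. Because no data restrictions are imposed here, the solution is expected to collapse to the prior Q.

```r
# Kullback-Leibler divergence between two discrete distributions p and q
# (hypothetical helper, not part of EIEntropy)
kl_divergence <- function(p, q) {
  sum(p * log(p / q))
}

# Map unconstrained parameters to probabilities that sum to 1 (softmax);
# this reparametrisation is an assumption of the sketch
to_prob <- function(theta) {
  e <- exp(theta - max(theta))
  e / sum(e)
}

q <- c(0.35, 0.65)  # hypothetical prior distribution Q

# Objective: distance between P (built from theta) and the prior Q
objective <- function(theta) kl_divergence(to_prob(theta), q)

# BFGS with at most 100 iterations; the tolerance value is illustrative
fit <- optim(par = c(0, 0), fn = objective, method = "BFGS",
             control = list(maxit = 100, reltol = 1e-12))

p_hat <- to_prob(fit$par)
print(round(p_hat, 4))
```

In ei_gce the restrictions built from datahp and datahs pull P away from Q; the sketch only shows the objective and the optimiser settings.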

Usage

ei_gce(fn, datahp, datahs, q, w, tol, method, v)

Arguments

fn

The formula that represents the dependent variable in the optimization. In this function, 'fn' defines the dependent variable to be optimized by the entropy function.

datahp

The data where the variable of interest y and the independent variables are available. Note: The variables and weights used as independent variables must have the same names in 'datahp' and in 'datahs'.

datahs

The data with the information on the independent variables at a disaggregated level. Note: The variables and weights used as independent variables must have the same names in 'datahp' and in 'datahs'. The variables in both databases need to match in content.
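Before calling the function, it can help to check that the independent variables appear under the same names in both databases. The sketch below uses two small hypothetical data frames (not the package data) and base R only; the object names rhs_vars, missing_hp and missing_hs are illustrative.

```r
# Hypothetical toy data; the real databases come from the user
datahp <- data.frame(poor_liq = c(1, 0, 1),
                     Dcollege = c(0, 1, 0),
                     Totalincome = c(10, 30, 12),
                     w = 1)
datahs <- data.frame(Dcollege = c(1, 0),
                     Totalincome = c(25, 11),
                     w = 1)

fn <- poor_liq ~ Dcollege + Totalincome

# Names of the independent variables on the right-hand side of the formula
rhs_vars <- all.vars(fn[[3]])

# Variables that are missing from either database
missing_hp <- setdiff(rhs_vars, names(datahp))
missing_hs <- setdiff(rhs_vars, names(datahs))

if (length(missing_hp) > 0 || length(missing_hs) > 0) {
  stop("Independent variables must exist in both datahp and datahs")
}
```

This only checks names; whether the variables match in content (units, coding of dummies) still has to be verified by the user.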

q

The prior distribution Q

w

The weights to be used in this function

tol

The tolerance to be applied in the optimization function. If the tolerance is not specified, the default tolerance is 1e-24.

method

The method used in the function optim. The user can select one of: "Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN", "Brent". The default and recommended method is "BFGS".

v

The support vector
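A minimal sketch of how a three-point support vector can be built and how an error term follows from it. How ei_gce computes the variance internally is not stated here, so the use of var(y) on a hypothetical binary variable is an assumption, as are the names s, v_default, w0 and e0.

```r
# Hypothetical binary variable of interest
y <- c(0, 1, 1, 0, 1)

# Variance-based support (var, 0, -var); using var(y) is an assumption
s <- var(y)
v_default <- matrix(c(s, 0, -s))

# A user-specified support, in the same form as in the Examples section
v_user <- matrix(c(-0.2, 0, 0.2))

# The error is the support weighted by probabilities that sum to 1.
# With equal weights of 1/3 on a symmetric support the error is zero.
w0 <- rep(1 / 3, 3)
e0 <- sum(v_default * w0)
print(e0)
```

A wider support allows larger errors for each observation; a narrower one keeps the fitted values closer to the restrictions.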

Details

To solve the optimization, upper and lower bounds are set for p and w: both must be greater than 0 and less than 1. In addition, the initial values of p are set to the defined prior, and the initial values of the errors (w) are set to 1/3.
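The starting values and bounds described above can be sketched as follows. The object names and the small offset eps that keeps the bounds strictly inside (0, 1) are assumptions of this example, not the package's internal code.

```r
q <- c(0.5, 0.5)   # prior, as in the Examples section
n_support <- 3     # number of points in the support vector

# Initial values: p starts at the prior, each error weight at 1/3
p_start <- q
w_start <- rep(1 / n_support, n_support)
start <- c(p_start, w_start)

# Bounds strictly inside (0, 1); eps is an illustrative choice
eps <- 1e-8
lower <- rep(eps, length(start))
upper <- rep(1 - eps, length(start))

# The starting point must lie inside the bounds
stopifnot(all(start > lower), all(start < upper))
```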

Value

The function returns a data frame called table with the following information:

References

Fernandez-Vazquez, E., Diaz-Dapena, A., Rubiera-Morollon, F., Viñuela, A. (2020). Spatial Disaggregation of Social Indicators: An Info-Metrics Approach. Social Indicators Research, 152(2), 809–821. https://doi.org/10.1007/s11205-020-02455-z

Examples

# In this example we use the data of this package
datahp <- financial()
datahs <- social()
# Setting up our function for the dependent variable.
fn <- datahp$poor_liq ~ Dcollege + Totalincome + Dunemp
# In this case we know that the mean probability of being poor is 0.35. With this function
# we can add this information as a priori information. The a priori information corresponds
# to the Q distribution, which in this function is called q for the sake of simplicity:
q <- c(0.5, 0.5)
v <- matrix(c(-0.2, 0, 0.2))
# Applying the function ei_gce to our databases. In this case datahp is the
# data where we have our variable of interest and
# datahs is the data where we have the information for the disaggregation.
# w can be included if we have weights in both surveys.
# Tolerance in this example is fixed at 1e-20.
result <- ei_gce(fn, datahp, datahs, q = q, w = w, tol = 1e-20, method = "BFGS", v = v)

[Package EIEntropy version 0.0.1.1 Index]