estGF {GFE} R Documentation

Gross Flows estimation

Description

Estimates gross flows under complex electoral surveys.

Usage

estGF(
  sampleBase = NULL,
  niter = 100,
  model = NULL,
  colWeights = NULL,
  nonrft = FALSE
)

Arguments

sampleBase

An object of class "data.frame" containing the information on the electoral candidates. The data must contain the sampling weights.

niter

The number of iterations used to estimate the model parameters $\eta_{i}$ and $p_{ij}$.

model

A character string indicating the model to be used in estimating the gross flows. The available models are: "I", "II", "III", "IV" (see also "Details").

colWeights

The column name containing the sampling weights to be used in the fitting process.

nonrft

A logical value indicating whether there is nonresponse at the first time.

Details

The population size $N$ must satisfy the condition:

$$N = \sum_{j}\sum_{i} N_{ij} + \sum_{j} C_{j} + \sum_{i} R_{i} + M$$

where $N_{ij}$ is the number of people interviewed who have classification $i$ at the first time and classification $j$ at the second time, $R_{i}$ is the number of people who responded at the first time but not at the second, $C_{j}$ is the number of people who responded at the second time but not at the first, and $M$ is the number of people who did not respond at either time or could not be reached. Let $\eta_{i}$ be the initial probability that a person has classification $i$ at the first time, and let $p_{ij}$ be the vote transition probability for the cell $(i,j)$, where $\sum_{i} \eta_{i} = 1$ and $\sum_{j} p_{ij} = 1$. Four possible models for the gross flows are then given by:

  1. Model I: This model assumes that a person's initial probability of responding at the first time is the same for everyone, that is, $\psi(i,j) = \psi$. In addition, the transition probabilities between response and nonresponse do not depend on the classification $(i,j)$, that is, $\rho_{MM}(i,j) = \rho_{MM}$ and $\rho_{RR}(i,j) = \rho_{RR}$.

  2. Model II: Unlike Model I, this model assumes that the initial probability $\psi(i,j)$ depends only on the person's classification at the first time, that is, $\psi(i,j) = \psi(i)$.

  3. Model III: Unlike Model I, this model assumes that the transition probabilities between response and nonresponse depend only on the classification at the first time, that is, $\rho_{MM}(i,j) = \rho_{MM}(i)$ and $\rho_{RR}(i,j) = \rho_{RR}(i)$.

  4. Model IV: Unlike Model I, this model assumes that the transition probabilities between response and nonresponse depend only on the classification at the second time, that is, $\rho_{MM}(i,j) = \rho_{MM}(j)$ and $\rho_{RR}(i,j) = \rho_{RR}(j)$.
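
As a rough illustration of Model I, the sketch below builds the expected counts for a hypothetical two-category population in base R and checks that they satisfy the population-size condition. It mirrors the construction used in the Examples section and reads $\psi$ as the probability of responding at the first time, $\rho_{RR}$ as the probability that a respondent responds again, and $\rho_{MM}$ as the probability that a nonrespondent stays a nonrespondent. The category names and all numeric values are assumptions made for illustration only.

```r
# Hypothetical two-category setting (values are illustrative, not from GFE)
N   <- 1000
eta <- c(A = 0.6, B = 0.4)                    # initial probabilities, sum to 1
P   <- matrix(c(0.7, 0.3,
                0.2, 0.8),
              nrow = 2, byrow = TRUE,
              dimnames = list(names(eta), names(eta)))  # rows sum to 1

psi   <- 0.9   # assumed: probability of responding at the first time
rhoRR <- 0.9   # assumed: respondent at first time responds again
rhoMM <- 0.5   # assumed: nonrespondent at first time stays nonrespondent

# Expected counts under Model I (eta * P multiplies row i of P by eta[i])
Nij <- N * psi * rhoRR * eta * P                       # responded at both times
Ri  <- N * psi * (1 - rhoRR) * eta                     # responded only at first time
Cj  <- N * (1 - psi) * (1 - rhoMM) * colSums(eta * P)  # responded only at second time
M   <- N * (1 - psi) * rhoMM                           # never responded

# Population-size condition: N = sum(N_ij) + sum(C_j) + sum(R_i) + M
total <- sum(Nij) + sum(Cj) + sum(Ri) + M
isTRUE(all.equal(total, N))
```

Under Models II to IV the scalars `psi`, `rhoRR`, or `rhoMM` would be replaced by quantities indexed by $i$ or $j$; that extension is not shown here.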

Value

estGF returns a list containing:

  1. Est.CIV: a data.frame containing the gross flows estimation.

  2. Params.Model: a list that contains the $\hat{\eta}_{i}$, $\hat{p}_{ij}$, $\hat{\psi}(i,j)$, $\hat{\rho}_{RR}(i,j)$, $\hat{\rho}_{MM}(i,j)$ parameters for the estimated model.

  3. Sam.Est: a list containing the sampling estimators $\hat{N}_{ij}$, $\hat{R}_{i}$, $\hat{C}_{j}$, $\hat{M}$, $\hat{N}$.
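
To give a feel for what the sampling estimators in Sam.Est represent, here is a minimal base-R sketch that computes weighted totals for each cell of the two-way table. It assumes Horvitz-Thompson weights $1/\pi_k$, in the spirit of Sarndal et al. (1992); how estGF computes these quantities internally is not documented here, so this is an illustration under that assumption. The data frame, its column names (`t0`, `t1`, `pik`), and the label `Non_Resp` for nonrespondents are hypothetical.

```r
# Hypothetical sample: classification at both times and inclusion probability
sam <- data.frame(
  t0  = c("A", "A", "B", "B", "A", "Non_Resp"),
  t1  = c("A", "B", "B", "Non_Resp", "A", "B"),
  pik = c(0.10, 0.20, 0.25, 0.50, 0.10, 0.20)
)
sam$w <- 1 / sam$pik   # assumed Horvitz-Thompson weight

# Weighted two-way table of first-time by second-time classification
tab <- xtabs(w ~ t0 + t1, data = sam)

cats    <- c("A", "B")
Nij_hat <- tab[cats, cats]              # responded at both times
Ri_hat  <- tab[cats, "Non_Resp"]        # responded only at first time
Cj_hat  <- tab["Non_Resp", cats]        # responded only at second time
M_hat   <- tab["Non_Resp", "Non_Resp"]  # never responded
N_hat   <- sum(tab)                     # estimated population size
```

By construction, `N_hat` equals `sum(Nij_hat) + sum(Ri_hat) + sum(Cj_hat) + M_hat`, which is the sample analogue of the population-size condition in Details.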

References

Stasny, E. (1987), ‘Some Markov-chain models for nonresponse in estimating gross labor force flows’, Journal of Official Statistics 3, pp. 359-373.
Särndal, C.-E., Swensson, B. & Wretman, J. (1992), Model Assisted Survey Sampling, Springer-Verlag, New York, USA.
Gutierrez, A., Trujillo, L. & Silva, N. (2014), ‘The estimation of gross flows in complex surveys with random nonresponse’, Survey Methodology 40(2), pp. 285-321.

Examples

library(GFE)
library(TeachingSampling)
library(data.table)
# Colombia's electoral candidates in 2014
candidates_t0 <- c("Clara","Enrique","Santos","Martha","Zuluaga","WhiteVote", "NoVote")
candidates_t1 <- c("Santos","Zuluaga","WhiteVote", "NoVote")

N <- 100000
nCanT0 <- length(candidates_t0)
nCanT1 <- length(candidates_t1)
# Initial probabilities
eta <- matrix(c(0.10, 0.10, 0.20, 0.17, 0.28, 0.10, 0.05),
              byrow = TRUE, nrow = nCanT0)
# Transition probabilities (one row per option at the first time; each row sums to 1)
P <- matrix(c(0.10, 0.60, 0.15, 0.15,
              0.30, 0.10, 0.25, 0.35,
              0.34, 0.25, 0.16, 0.25,
              0.25, 0.05, 0.35, 0.35,
              0.10, 0.25, 0.45, 0.20,
              0.12, 0.36, 0.22, 0.30,
              0.10, 0.15, 0.30, 0.45),
            byrow = TRUE, nrow = nCanT0)
# Simulated counts by classification at the first and second time
citaMod <- matrix(NA, ncol = nCanT1, nrow = nCanT0)
row.names(citaMod) <- candidates_t0
colnames(citaMod) <- candidates_t1

for (ii in seq_len(nCanT0)) {
  citaMod[ii, ] <- c(rmultinom(1, size = N * eta[ii, ], prob = P[ii, ]))
}

# Model I
psiI   <- 0.9
rhoRRI <- 0.9
rhoMMI <- 0.5

citaModI <- matrix(nrow = nCanT0 + 1, ncol = nCanT1 + 1)
rownames(citaModI) <- c(candidates_t0, "Non_Resp")
colnames(citaModI) <- c(candidates_t1, "Non_Resp")
citaModI[1:nCanT0, 1:nCanT1] <- P * c(eta) * rhoRRI * psiI
citaModI[(nCanT0 + 1), (nCanT1 + 1)] <- rhoMMI * (1-psiI)
citaModI[1:nCanT0, (nCanT1 + 1)] <- (1-rhoRRI) * psiI * rowSums(P * c(eta))
citaModI[(nCanT0 + 1), 1:nCanT1 ] <- (1-rhoMMI) * (1-psiI) * colSums(P * c(eta))
citaModI <- round_preserve_sum(citaModI * N)
DBcitaModI <- createBase(citaModI)

# Creating auxiliary information
DBcitaModI[,AuxVar := rnorm(nrow(DBcitaModI), mean = 45, sd = 10)]

# Selects a sample with unequal probabilities
res <- S.piPS(n = 3200, as.data.frame(DBcitaModI)[,"AuxVar"])
sam <- res[,1]
pik <- res[,2]
DBcitaModISam <- copy(DBcitaModI[sam,])
DBcitaModISam[,Pik := pik]

# Gross Flows estimation
estima <- estGF(sampleBase = DBcitaModISam, niter = 500, model = "I", colWeights = "Pik")
estima

[Package GFE version 0.1.1 Index]