estGF {GFE} | R Documentation |
Gross Flows estimation
Description
Gross Flows under complex electoral surveys.
Usage
estGF(
sampleBase = NULL,
niter = 100,
model = NULL,
colWeights = NULL,
nonrft = FALSE
)
Arguments
sampleBase |
An object of class "data.frame" containing the information of electoral candidates. The data must contain the samplings weights. |
niter |
The number of iterations for the |
model |
A character indicating the model to be used in estimating estimated gross flows. The models available are: "I","II","III","IV" (see also "Details"). |
colWeights |
The column name containing the sampling weights to be used in the fitting process. |
nonrft |
A logical value indicating a non response for first time. |
Details
The population size must satisfy the condition:
where, is the amount of people interviewed who have classification
at first time and classification
at second time,
is the amount of people who did not respond at second time, but did at first time,
is the amount of people who did not respond at first time, but they did at second time and
is the number of people who did not respond at any time or could not be reached.
Let
the initial probability that a person has classification
in the first time, and let
the vote transition probability for the cell
, where
and
.
Thus, four possibles models for the gross flows are given by:
-
Model I: This model assumes that a person's initial probability of being classified as
at first time is the same for everyone, that is,
. Besides, transition probabilities between respond and non response not depend of the classification
, that is
and
.
-
Model II: Unlike 'Model I', this model assumes that person initial probability that person has classification
, only depends of his classification at first time, that is
.
-
Model III: Unlike 'Model I', this model assumes that transition probabilities between response and non response only depends of probability classification at first time, that is
and
.
-
Model IV: Unlike 'Model I', this model assumes that transition probabilities between response and non response only depends of probability classification at second time, that is
and
.
Value
estGF
returns a list containing:
-
Est.CIV: a data.frame containing the gross flows estimation.
-
Params.Model: a list that contains the
,
,
,
,
parameters for the estimated model.
-
Sam.Est: a list containing the sampling estimators
,
,
,
,
.
References
Stasny, E. (1987), ‘Some markov-chain models for nonresponse in estimating gross’, Journal of Oficial Statistics 3, pp. 359-373.
Sarndal, C.-E., Swensson, B. & Wretman, J. (1992), Model Assisted Survey Sampling, Springer-Verlag, New York, USA.
Gutierrez, A., Trujillo, L. & Silva, N. (2014), ‘The estimation of gross ows in complex surveys with random nonresponse’, Survey Methodology 40(2), pp. 285-321.
Examples
library(TeachingSampling)
library(data.table)
# Colombia's electoral candidates in 2014
candidates_t0 <- c("Clara","Enrique","Santos","Martha","Zuluaga","WhiteVote", "NoVote")
candidates_t1 <- c("Santos","Zuluaga","WhiteVote", "NoVote")
N <- 100000
nCanT0 <- length(candidates_t0)
nCanT1 <- length(candidates_t1)
# Initial probabilities
eta <- matrix(c(0.10, 0.10, 0.20, 0.17, 0.28, 0.1, 0.05),
byrow = TRUE, nrow = nCanT0)
# Transition probabilities
P <- matrix(c(0.10, 0.60, 0.15, 0.15,
0.30, 0.10, 0.25,0.35,
0.34, 0.25, 0.16, 0.25,
0.25,0.05, 0.35,0.35,
0.10, 0.25, 0.45,0.20,
0.12, 0.36, 0.22, 0.30,
0.10,0.15, 0.30,0.45),
byrow = TRUE, nrow = nCanT0)
citaMod <- matrix(, ncol = nCanT1, nrow = nCanT0)
row.names(citaMod) <- candidates_t0
colnames(citaMod) <- candidates_t1
for(ii in 1:nCanT0){
citaMod[ii,] <- c(rmultinom(1, size = N * eta[ii,], prob = P[ii,]))
}
# # Model I
psiI <- 0.9
rhoRRI <- 0.9
rhoMMI <- 0.5
citaModI <- matrix(nrow = nCanT0 + 1, ncol = nCanT1 + 1)
rownames(citaModI) <- c(candidates_t0, "Non_Resp")
colnames(citaModI) <- c(candidates_t1, "Non_Resp")
citaModI[1:nCanT0, 1:nCanT1] <- P * c(eta) * rhoRRI * psiI
citaModI[(nCanT0 + 1), (nCanT1 + 1)] <- rhoMMI * (1-psiI)
citaModI[1:nCanT0, (nCanT1 + 1)] <- (1-rhoRRI) * psiI * rowSums(P * c(eta))
citaModI[(nCanT0 + 1), 1:nCanT1 ] <- (1-rhoMMI) * (1-psiI) * colSums(P * c(eta))
citaModI <- round_preserve_sum(citaModI * N)
DBcitaModI <- createBase(citaModI)
# Creating auxiliary information
DBcitaModI[,AuxVar := rnorm(nrow(DBcitaModI), mean = 45, sd = 10)]
# Selects a sample with unequal probabilities
res <- S.piPS(n = 3200, as.data.frame(DBcitaModI)[,"AuxVar"])
sam <- res[,1]
pik <- res[,2]
DBcitaModISam <- copy(DBcitaModI[sam,])
DBcitaModISam[,Pik := pik]
# Gross Flows estimation
estima <- estGF(sampleBase = DBcitaModISam, niter = 500, model = "I", colWeights = "Pik")
estima