reSamGF {GFE}    R Documentation
Gross flows variance estimation.
Description
Estimates the variance of the gross flows estimates by means of a resampling method (Bootstrap or Jackknife).
Usage
reSamGF(
  sampleBase = NULL,
  nRepBoot = 500,
  model = "I",
  niter = 100,
  type = "Bootstrap",
  colWeights = NULL,
  nonrft = FALSE
)
Arguments
sampleBase
An object of class data.frame or data.table containing the sample selected to estimate the gross flows.
nRepBoot
The number of replicates for the bootstrap method.
model
A character indicating the model that will be used to estimate the gross flows. The available models are: 'I', 'II', 'III', 'IV'.
niter
The number of iterations for the gross flows estimation performed on each replicate.
type
A character indicating the resampling method ("Bootstrap" or "Jackknife").
colWeights
The name of the data column containing the sampling weights to be used in the fitting process.
nonrft
A logical value indicating non-response at the first measurement time.
Details
The resampling methods for variance estimation are:
- Bootstrap:
This technique estimates the sampling distribution of almost any statistic by random resampling. Bootstrapping estimates properties of a statistic (such as its variance) by measuring those properties on samples drawn with replacement from the observed data.
- Jackknife:
The jackknife estimate of a parameter is obtained by systematically leaving out each observation from the dataset, recomputing the estimate on the remaining data, and averaging these estimates. Given a sample of size n, the jackknife estimate therefore aggregates the estimates from the n subsamples of size n-1. A minimal sketch of both schemes is given after this list.
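The sketch below is not part of GFE; it uses only base R and a toy statistic (the sample mean) to illustrate the two resampling schemes in their generic form. The object names (x, theta, B) are illustrative.
# Toy data and statistic (illustrative only)
set.seed(1)
x <- rnorm(200, mean = 10, sd = 2)
theta <- function(s) mean(s)
# Bootstrap: recompute the statistic on B samples drawn with replacement
B <- 500
bootEst <- replicate(B, theta(sample(x, replace = TRUE)))
varBoot <- var(bootEst)
# Jackknife: recompute the statistic leaving out one observation at a time
n <- length(x)
jackEst <- vapply(seq_len(n), function(i) theta(x[-i]), numeric(1))
varJack <- (n - 1) / n * sum((jackEst - mean(jackEst))^2)
c(bootstrap = varBoot, jackknife = varJack)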
Value
reSamGF returns a list that contains the variance of each parameter of the selected model.
References
Efron, B. (1979), ‘Computers and the theory of statistics: Thinking the unthinkable’, SIAM Review 21(4), pp. 460-480.
Quenouille, M. H. (1949), ‘Problems in plane sampling’, The Annals of Mathematical Statistics, pp. 355-375.
Tukey, J. W. (1958), ‘Bias and confidence in not-quite large samples’, The Annals of Mathematical Statistics 29, p. 614.
Examples
library(GFE)
library(TeachingSampling)
library(data.table)
# Colombia's electoral candidates in 2014
candidates_t0 <- c("Clara","Enrique","Santos","Martha","Zuluaga","Blanco", "NoVoto")
candidates_t1 <- c("Santos","Zuluaga","Blanco", "NoVoto")
N <- 100000
nCanT0 <- length(candidates_t0)
nCanT1 <- length(candidates_t1)
# Initial probabilities
eta <- matrix(c(0.10, 0.10, 0.20, 0.17, 0.28, 0.1, 0.05),
              byrow = TRUE, nrow = nCanT0)
# Transition probabilities
P <- matrix(c(0.10, 0.60, 0.15, 0.15,
              0.30, 0.10, 0.25, 0.35,
              0.34, 0.25, 0.16, 0.25,
              0.25, 0.05, 0.35, 0.35,
              0.10, 0.25, 0.45, 0.20,
              0.12, 0.36, 0.22, 0.30,
              0.10, 0.15, 0.30, 0.45),
            byrow = TRUE, nrow = nCanT0)
citaMod <- matrix(NA, ncol = nCanT1, nrow = nCanT0)
row.names(citaMod) <- candidates_t0
colnames(citaMod) <- candidates_t1
for(ii in 1:nCanT0){
  citaMod[ii,] <- c(rmultinom(1, size = N * eta[ii,], prob = P[ii,]))
}
# Model I
psiI <- 0.9
rhoRRI <- 0.9
rhoMMI <- 0.5
citaModI <- matrix(nrow = nCanT0 + 1, ncol = nCanT1 + 1)
rownames(citaModI) <- c(candidates_t0, "Non_Resp")
colnames(citaModI) <- c(candidates_t1, "Non_Resp")
citaModI[1:nCanT0, 1:nCanT1] <- P * c(eta) * rhoRRI * psiI
citaModI[(nCanT0 + 1), (nCanT1 + 1)] <- rhoMMI * (1-psiI)
citaModI[1:nCanT0, (nCanT1 + 1)] <- (1-rhoRRI) * psiI * rowSums(P * c(eta))
citaModI[(nCanT0 + 1), 1:nCanT1 ] <- (1-rhoMMI) * (1-psiI) * colSums(P * c(eta))
citaModI <- round_preserve_sum(citaModI * N)
DBcitaModI <- createBase(citaModI)
# Creating auxiliary information
DBcitaModI[,AuxVar := rnorm(nrow(DBcitaModI), mean = 45, sd = 10)]
# Selects a sample with unequal probabilities
res <- S.piPS(n = 1200, as.data.frame(DBcitaModI)[,"AuxVar"])
sam <- res[,1]
pik <- res[,2]
DBcitaModISam <- copy(DBcitaModI[sam,])
DBcitaModISam[,Pik := pik]
# Gross flows estimation
estima <- estGF(sampleBase = DBcitaModISam, niter = 500, model = "II", colWeights = "Pik")
# Gross flows variance estimation (Bootstrap)
varEstima <- reSamGF(sampleBase = DBcitaModISam, type = "Bootstrap", nRepBoot = 100,
model = "II", niter = 101, colWeights = "Pik")
varEstima
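# A jackknife-based estimate can be requested through the same interface.
# The call below is a sketch assuming the remaining arguments behave as in
# the bootstrap example above; the niter value is illustrative.
varEstimaJK <- reSamGF(sampleBase = DBcitaModISam, type = "Jackknife",
                       model = "II", niter = 101, colWeights = "Pik")
varEstimaJK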