lsmi_cv {snowboot} | R Documentation |
Cross-validation to Select an Optimal Combination of n.seed and n.wave
Description
From the vector of specified n.seeds and the possible waves 1:n.wave around
each seed, the function selects a single n.seed and a single n.wave (the
optimal seed-wave combination) that produce a labeled snowball with multiple
inclusions (LSMI) sample with the desired bootstrap confidence interval for a
parameter of interest. Here, ‘desired’ means that the interval (and the
corresponding seed-wave combination) is selected as having the best coverage
(closest to the specified level prob), based on a cross-validation procedure
with proxy estimates of the parameter.
See Algorithm 2 by Gel et al. (2017) and Details below.
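The selection rule described above (pick the seed-wave combination whose coverage is closest to prob) can be illustrated with a minimal base-R sketch. The coverage values below are hypothetical numbers invented for illustration, and the variable names are assumptions; this is not the internal code of lsmi_cv, which estimates coverage through cross-validation with proxy samples.

```r
# Candidate settings, mirroring n.seeds = c(10, 20, 30) and n.wave = 3
n.seeds <- c(10, 20, 30)
waves <- 1:3
prob <- 0.95

# Hypothetical coverage estimates (illustrative values only):
# rows correspond to n.seeds, columns to waves
coverage <- matrix(c(0.80, 0.88, 0.91,
                     0.85, 0.93, 0.97,
                     0.90, 0.96, 0.99),
                   nrow = 3, byrow = TRUE)

# Locate the cell whose coverage is closest to the nominal level
pos <- arrayInd(which.min(abs(coverage - prob)), dim(coverage))
best_combination <- c(n.seed = n.seeds[pos[1, 1]], n.wave = waves[pos[1, 2]])
print(best_combination)
```

If several cells tie, which.min keeps the first one in column-major order; how the package itself resolves ties is not stated here.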
Usage
lsmi_cv(
net,
n.seeds,
n.wave,
seeds = NULL,
B = 100,
prob = 0.95,
cl = 1,
param = c("mu"),
method = c("percentile", "basic"),
proxyRep = 19,
proxySize = 30
)
Arguments
net |
a network object that is a list containing:
The network object can be simulated by |
n.seeds |
an integer vector of numbers of seeds for snowball sampling
(cf. a single integer |
n.wave |
an integer defining the number of waves (order of the neighborhood)
to be recorded around the seed in the LSMI. For example, |
seeds |
a vector of numeric IDs of pre-specified seeds. If specified, LSMIs are constructed around each such seed. |
B |
a positive integer, the number of bootstrap replications to perform. Default is 100. |
prob |
confidence level for the intervals. Default is 0.95 (i.e., 95% confidence). |
cl |
parameter to specify computer cluster for bootstrapping, passed to
the package
|
param |
The parameter of interest for which to run a cross-validation
and select optimal |
method |
method for calculating the bootstrap intervals. Default is
|
proxyRep |
The number of times to repeat proxy sampling. Default is 19. |
proxySize |
The size of the proxy sample. Default is 30. |
Details
Currently, the bootstrap intervals can be calculated with two alternative
methods: "percentile" or "basic". The "percentile" intervals correspond to
Efron's 100\cdot prob% intervals
(see Efron 1979, also Equation 5.18 by Davison and Hinkley 1997 and Equation 3 by Gel et al. 2017, Chen et al. 2018):
(\theta^*_{[B\alpha/2]}, \theta^*_{[B(1-\alpha/2)]}),
where \theta^*_{[B\alpha/2]} and \theta^*_{[B(1-\alpha/2)]} are empirical
quantiles of the bootstrap distribution with B bootstrap replications for
parameter \theta (\theta can be f(k) or \mu), and \alpha = 1 - prob.
The "basic" method produces intervals
(see Equation 5.6 by Davison and Hinkley 1997):
(2\hat{\theta} - \theta^*_{[B(1-\alpha/2)]}, 2\hat{\theta} - \theta^*_{[B\alpha/2]}),
where \hat{\theta} is the sample estimate of the parameter.
Note that this method can lead to negative confidence bounds, especially
when \hat{\theta} is close to 0.
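As a rough illustration of the two formulas, the sketch below computes both intervals for a mean in base R. It assumes a simple i.i.d. resampling of a hypothetical degree sample x, which is a stand-in and not the LSMI-based bootstrap that snowboot performs; all variable names are illustrative. It also uses quantile() with its default interpolation, which may differ slightly from the order statistics written above.

```r
set.seed(1)

# Hypothetical sample of node degrees (simulated for illustration only)
x <- rpois(50, lambda = 3)

B <- 100            # number of bootstrap replications
prob <- 0.95        # confidence level
alpha <- 1 - prob

# Sample estimate of the parameter (here the mean degree, mu)
theta_hat <- mean(x)

# Bootstrap replicates of the estimate (plain i.i.d. resampling, an assumption)
theta_star <- replicate(B, mean(sample(x, replace = TRUE)))

# Empirical quantiles of the bootstrap distribution
q <- quantile(theta_star, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)

# "percentile" interval: the quantiles themselves
ci_percentile <- q

# "basic" interval: quantiles reflected around the sample estimate
ci_basic <- c(2 * theta_hat - q[2], 2 * theta_hat - q[1])

print(ci_percentile)
print(ci_basic)
```

The basic interval mirrors the percentile interval around \hat{\theta}, which is why its lower bound can fall below 0 when \hat{\theta} is small.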
Value
A list consisting of:
bci |
A numeric vector of length 2 with the bootstrap confidence interval
(lower bound, upper bound) for the parameter of interest. This interval is
obtained by bootstrapping node degrees in an LSMI with the optimal combination
of |
estimate |
Point estimate of the parameter of interest
(based on the LSMI with |
best_combination |
An integer vector of length 2 containing the optimal
|
seeds |
A vector of numeric IDs of the seeds that were used
in the LSMI with the optimal combination of |
References
Chen Y, Gel YR, Lyubchich V, Nezafati K (2018).
“Snowboot: bootstrap methods for network inference.”
The R Journal, 10(2), 95–113.
doi: 10.32614/RJ-2018-056.
Davison AC, Hinkley DV (1997).
Bootstrap Methods and Their Application.
Cambridge University Press, Cambridge.
Efron B (1979).
“Bootstrap methods: Another look at the jackknife.”
The Annals of Statistics, 7(1), 1–26.
doi: 10.1214/aos/1176344552.
Gel YR, Lyubchich V, Ramirez Ramirez LL (2017).
“Bootstrap quantification of estimation uncertainties in network degree distributions.”
Scientific Reports, 7, 5807.
doi: 10.1038/s41598-017-05885-x.
See Also
lsmi, lsmi_union, boot_dd, boot_ci
Examples
library(snowboot)
net <- artificial_networks[[1]]
# Candidate seed counts 10, 20, 30; waves 1:5; 100 bootstrap replications
a <- lsmi_cv(net, n.seeds = c(10, 20, 30), n.wave = 5, B = 100)