MA.estimates {RDS} | R Documentation |
MA Estimates
Description
This function computes the model-assisted (MA) estimates for a categorical or numeric variable.
Usage
MA.estimates(
rds.data,
trait.variable,
seed.selection = "degree",
number.of.seeds = NULL,
number.of.coupons = NULL,
number.of.iterations = 3,
N = NULL,
M1 = 25,
M2 = 20,
seed = 1,
initial.sampling.probabilities = NULL,
MPLE.samplesize = 50000,
SAN.maxit = 5,
SAN.nsteps = 2^19,
sim.interval = 10000,
number.of.cross.ties = NULL,
max.degree = NULL,
parallel = 1,
parallel.type = "PSOCK",
full.output = FALSE,
verbose = TRUE
)
Arguments
rds.data |
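An rds.data.frame that indicates recruitment patterns by a pair of attributes named “id” and “recruiter.id”. |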
trait.variable |
A string giving the name of the variable in rds.data to be analyzed; it may be categorical or numeric. |
seed.selection |
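An estimate of the mechanism guiding the choice of seeds. The choices are "allwithtrait" (all the seeds had the trait), "random" (the seeds were, as if, a simple random sample of individuals from the population), "sample" (the seeds are taken as those in the sample, and resampled for the population with that composition if necessary), "degree" (the probability of being a seed is proportional to the degree of the individual), and "allwithtraitdegree" (all the seeds had the trait and the probability of being a seed is proportional to the degree of the respondent). |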
number.of.seeds |
The number of seeds chosen to initiate the sampling. |
number.of.coupons |
The number of coupons given to each respondent. |
number.of.iterations |
The number of iterations used at the core of the algorithm. |
N |
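An estimate of the number of members of the population being sampled. If NULL, it is read as the pop.size.mid attribute of the rds.data frame; if that is missing, it defaults to 1000. |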
M1 |
The number of networked populations generated at each iteration. |
M2 |
The number of (full) RDS samples generated for each networked population at each iteration. |
seed |
The random number seed used to initiate the computations. |
initial.sampling.probabilities |
Initial sampling probabilities for the algorithm. If missing, they are taken as proportional to degree, which is almost always the best starting value. |
MPLE.samplesize |
Number of samples to take in the computation of the maximum pseudolikelihood estimator (MPLE) of the working model parameter. The default is almost always sufficient. |
SAN.maxit |
A ceiling on the number of simulated annealing iterations. |
SAN.nsteps |
Number of MCMC proposals for all the annealing runs combined. |
sim.interval |
Number of MCMC steps between each of the M1 sampled networks per iteration. |
number.of.cross.ties |
The expected number of ties between those with
the trait and those without. If missing, it is computed based on the
respondent's reports of the number of ties they have to population members
who have the trait (i.e. |
max.degree |
Impose a ceiling on degree size. |
parallel |
Number of processors to use in the computations. The default is 1, that is, no parallel processing. |
parallel.type |
The type of cluster to start, e.g. 'PSOCK' or 'MPI'. |
full.output |
Should the full output be returned? If FALSE, an object of class rds.interval.estimate is returned. |
verbose |
Should verbose diagnostics be printed while the algorithm is running? |
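The arguments number.of.iterations, M1 and M2 together determine how much simulation the algorithm performs. The following is a rough sketch in base R, assuming the counts simply multiply as the argument descriptions suggest; the names total.networks and total.samples are illustrative and not part of the package.

```r
# Defaults taken from the Usage section.
number.of.iterations <- 3
M1 <- 25  # networked populations generated at each iteration
M2 <- 20  # full RDS samples generated per networked population

# Assumption: every iteration generates M1 networks and M1 * M2 samples.
total.networks <- number.of.iterations * M1
total.samples  <- number.of.iterations * M1 * M2

cat("networks simulated:", total.networks, "\n")
cat("RDS samples simulated:", total.samples, "\n")
```

Under this assumption, lowering M1 or M2 should reduce the amount of simulation roughly in proportion, presumably at the cost of noisier estimates.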
Value
If trait.variable is numeric, the model-assisted estimate of the mean is returned; otherwise a vector of proportion estimates is returned. If full.output=TRUE this leads to:
If full.output=FALSE this leads to an object of class rds.interval.estimate, which is a list with components
- estimate
the numerical point estimate of the proportion of the trait.variable.
- interval
a matrix with six columns and one row per category of trait.variable:
  - point estimate
  The HT estimate of the population mean.
  - 95% Lower Bound
  Lower 95% confidence bound
  - 95% Upper Bound
  Upper 95% confidence bound
  - Design Effect
  The design effect of the RDS
  - s.e.
  standard error
  - n
  count of the number of sample values with that value of the trait
- rds.data
an rds.data.frame that indicates recruitment patterns by a pair of attributes named “id” and “recruiter.id”.
- N
an estimate of the number of members of the population being sampled. If NULL it is read as the pop.size.mid attribute of the rds.data frame. If that is missing it defaults to 1000.
- M1
the number of networked populations generated at each iteration.
- M2
the number of (full) RDS samples generated for each networked population at each iteration.
- seed
the random number seed used to initiate the computations.
- seed.selection
an estimate of the mechanism guiding the choice of seeds. The choices are
  - "allwithtrait"
  indicating that all the seeds had the trait;
  - "random"
  meaning they were, as if, a simple random sample of individuals from the population;
  - "sample"
  indicating that the seeds are taken as those in the sample (and resampled for the population with that composition if necessary);
  - "degree"
  indicating that the probability of being a seed is proportional to the degree of the individual;
  - "allwithtraitdegree"
  indicating that all the seeds had the trait and the probability of being a seed is proportional to the degree of the respondent.
- number.of.seeds
The number of seeds chosen to initiate the sampling.
- number.of.coupons
The number of coupons given to each respondent.
- number.of.iterations
The number of iterations used at the core of the algorithm.
- outcome.variable
The name of the outcome variable
- weight.type
The type of weighting used (i.e. MA)
- uncertainty
The type of weighting used (i.e. MA)
- details
A list of other diagnostic output from the computations.
- varestBS
Output from the bootstrap procedure. A list with two elements: var is the bootstrap variance, and BSest is the vector of bootstrap estimates themselves.
- coefficient
estimate of the parameter of the ERGM for the network.
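To make the difference between the "random" and "degree" choices of seed.selection concrete, the following base R sketch draws seeds from a made-up population. It is an illustration only, not the package's internal code; the degree vector and all variable names are hypothetical.

```r
set.seed(1)

# Hypothetical population of 1000 people with made-up network degrees.
pop.size <- 1000
degree <- rpois(pop.size, lambda = 5) + 1  # +1 keeps every degree positive
number.of.seeds <- 10

# "random": seeds are a simple random sample of the population.
seeds.random <- sample(pop.size, number.of.seeds)

# "degree": the probability of being a seed is proportional to degree.
seeds.degree <- sample(pop.size, number.of.seeds, prob = degree)

# Compare the average degree of the seeds under the two mechanisms.
mean(degree[seeds.random])
mean(degree[seeds.degree])
```

Over repeated draws, degree-proportional selection tends to pick better-connected individuals as seeds than simple random selection does.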
Author(s)
Krista J. Gile with help from Mark S. Handcock
References
Gile, Krista J., 2011. Improved Inference for Respondent-Driven Sampling Data with Application to HIV Prevalence Estimation, Journal of the American Statistical Association, 106, 135-146.
Gile, Krista J., Handcock, Mark S., 2010. Respondent-driven Sampling: An Assessment of Current Methodology, Sociological Methodology, 40, 285-327. <doi:10.1111/j.1467-9531.2010.01223.x>
Gile, Krista J., Beaudry, Isabelle S. and Handcock, Mark S., 2018. Methods for Inference from Respondent-Driven Sampling Data, Annual Review of Statistics and Its Application. <doi:10.1146/annurev-statistics-031017-100704>
See Also
RDS.I.estimates, RDS.II.estimates
Examples
## Not run:
data(faux)
MA.estimates(rds.data=faux,trait.variable='X')
## End(Not run)