cont_analysis {spsurvey} | R Documentation |
Continuous variable analysis
Description
This function organizes input and output for the analysis of continuous
variables. The analysis data, dframe, can be either a data frame or a
simple features (sf) object. If an sf object is used, coordinates are
extracted from the geometry column in the object, arguments xcoord and
ycoord are assigned values "xcoord" and "ycoord", respectively, and the
geometry column is dropped from the object.
Usage
cont_analysis(
  dframe,
  vars,
  subpops = NULL,
  siteID = NULL,
  weight = "weight",
  xcoord = NULL,
  ycoord = NULL,
  stratumID = NULL,
  clusterID = NULL,
  weight1 = NULL,
  xcoord1 = NULL,
  ycoord1 = NULL,
  sizeweight = FALSE,
  sweight = NULL,
  sweight1 = NULL,
  fpc = NULL,
  popsize = NULL,
  vartype = "Local",
  jointprob = "overton",
  conf = 95,
  pctval = c(5, 10, 25, 50, 75, 90, 95),
  statistics = c("CDF", "Pct", "Mean", "Total"),
  All_Sites = FALSE
)
Arguments
dframe |
Data to be analyzed (analysis data). A data frame or sf object.
|
vars |
Vector composed of character values that identify the
names of response variables in |
subpops |
Vector composed of character values that identify the
names of subpopulation (domain) variables in |
siteID |
Character value providing name of the site ID variable in
the |
weight |
Character value providing name of the design weight
variable in |
xcoord |
Character value providing name of the x-coordinate variable in
the |
ycoord |
Character value providing name of the y-coordinate variable in
the |
stratumID |
Character value providing name of the stratum ID variable in
the |
clusterID |
Character value providing the name of the cluster
(stage one) ID variable in |
weight1 |
Character value providing name of the stage one weight
variable in |
xcoord1 |
Character value providing the name of the stage one
x-coordinate variable in |
ycoord1 |
Character value providing the name of the stage one
y-coordinate variable in |
sizeweight |
Logical value that indicates whether size weights should be
used during estimation, where |
sweight |
Character value providing the name of the size weight variable
in |
sweight1 |
Character value providing name of the stage one size weight
variable in |
fpc |
Object that specifies values required for calculation of the finite
population correction factor used during variance estimation. The object
must match the survey design in terms of stratification and whether the
design is single-stage or two-stage.
For an unstratified design, the object is a vector. The vector is composed
of a single numeric value for a single-stage design. For a two-stage
unstratified design, the object is a named vector containing one more than
the number of clusters in the sample, where the first item in the vector
specifies the number of clusters in the population and each subsequent item
specifies the number of stage two units for the cluster. The name for the
first item in the vector is arbitrary. Subsequent names in the vector
identify clusters and must match the cluster IDs.
For a stratified design, the object is a named list of vectors, where names
must match the strata IDs. For each stratum, the format of the vector is
identical to the format described for unstratified single-stage and
two-stage designs.
Note that the finite population correction factor is not used with the
local mean variance estimator.
Example fpc for a single-stage unstratified survey design:
Example fpc for a single-stage stratified survey design:
Example fpc for a two-stage unstratified survey design:
Example fpc for a two-stage stratified survey design:
|
popsize |
Object that provides values for the population argument of the
Example popsize for calibration:
Example popsize for post-stratification using a data frame:
Example popsize for post-stratification using a table:
Example popsize for post-stratification using an xtabs object:
|
vartype |
Character value providing the choice of the variance
estimator, where |
jointprob |
Character value providing the choice of joint inclusion
probability approximation for use with Horvitz-Thompson and Yates-Grundy
variance estimators, where |
conf |
Numeric value providing the Gaussian-based confidence level. The default
value is 95. |
pctval |
Vector of the set of values at which percentiles are
estimated. The default set is: c(5, 10, 25, 50, 75, 90, 95). |
statistics |
Character vector specifying desired estimates, where
|
All_Sites |
A logical variable used when |
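The four fpc layouts described for the fpc argument can be sketched in base R as follows. This is an illustration only: every number, stratum name, and cluster name below is hypothetical, and in real use the names would have to match the stratum and cluster IDs in dframe.

```r
# Hypothetical values throughout; names must match the IDs in the data.

# Single-stage, unstratified design: a single numeric value.
fpc_single <- 15000

# Single-stage, stratified design: a named list with one value per stratum.
fpc_single_strat <- list(Stratum1 = 9000, Stratum2 = 6000)

# Two-stage, unstratified design: a named vector. The first item is the
# number of clusters in the population (its name is arbitrary); each
# remaining item is the number of stage two units for a sampled cluster.
fpc_two_stage <- c(
  Ncluster = 200, Cluster1 = 100, Cluster2 = 75, Cluster3 = 50
)

# Two-stage, stratified design: a named list holding one such vector
# per stratum.
fpc_two_stage_strat <- list(
  Stratum1 = c(Ncluster = 120, Cluster1 = 100, Cluster2 = 75),
  Stratum2 = c(Ncluster = 80, Cluster3 = 50, Cluster4 = 60)
)
```

Because the finite population correction factor is not used with the local mean variance estimator, an object like these would only matter when vartype is set to something other than "Local".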
Value
The analysis results. A list composed of one, two, three, or four
data frames that contain population estimates for all combinations of
subpopulations, categories within each subpopulation, and response
variables, where the number of data frames is determined by argument
statistics. The possible data frames in the output list are:
CDF: a data frame containing CDF estimates
Pct: a data frame containing percentile estimates
Mean: a data frame containing mean estimates
Total: a data frame containing total estimates
The CDF data frame contains the following variables:
- Type: subpopulation (domain) name
- Subpopulation: subpopulation name within a domain
- Indicator: response variable
- Value: value of response variable
- nResp: sample size at or below Value
- Estimate.P: CDF proportion estimate (in %)
- StdError.P: standard error of CDF proportion estimate
- MarginofError.P: margin of error of CDF proportion estimate
- LCBxxPct.P: xx% (default 95%) lower confidence bound of CDF proportion estimate
- UCBxxPct.P: xx% (default 95%) upper confidence bound of CDF proportion estimate
- Estimate.U: CDF total estimate
- StdError.U: standard error of CDF total estimate
- MarginofError.U: margin of error of CDF total estimate
- LCBxxPct.U: xx% (default 95%) lower confidence bound of CDF total estimate
- UCBxxPct.U: xx% (default 95%) upper confidence bound of CDF total estimate
The Pct data frame contains the following variables:
- Type: subpopulation (domain) name
- Subpopulation: subpopulation name within a domain
- Indicator: response variable
- Statistic: value of percentile
- nResp: sample size at or below Value
- Estimate: percentile estimate
- StdError: standard error of percentile estimate
- MarginofError: margin of error of percentile estimate
- LCBxxPct: xx% (default 95%) lower confidence bound of percentile estimate
- UCBxxPct: xx% (default 95%) upper confidence bound of percentile estimate
The Mean data frame contains the following variables:
- Type: subpopulation (domain) name
- Subpopulation: subpopulation name within a domain
- Indicator: response variable
- nResp: sample size
- Estimate: mean estimate
- StdError: standard error of mean estimate
- MarginofError: margin of error of mean estimate
- LCBxxPct: xx% (default 95%) lower confidence bound of mean estimate
- UCBxxPct: xx% (default 95%) upper confidence bound of mean estimate
The Total data frame contains the following variables:
- Type: subpopulation (domain) name
- Subpopulation: subpopulation name within a domain
- Indicator: response variable
- nResp: sample size
- Estimate: total estimate
- StdError: standard error of total estimate
- MarginofError: margin of error of total estimate
- LCBxxPct: xx% (default 95%) lower confidence bound of total estimate
- UCBxxPct: xx% (default 95%) upper confidence bound of total estimate
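The "xx" in the bound column names stands for the value of conf, and conf is described as a Gaussian-based confidence level. The sketch below shows how such columns could relate to one another for a mean or total, assuming the bounds are simply the estimate plus or minus a standard normal quantile times the standard error. That relationship is an assumption made for illustration, not something taken from the package source, and percentile and CDF bounds may be derived differently. The estimate and standard error are made-up numbers.

```r
conf <- 95         # confidence level, as passed to cont_analysis
estimate <- 10.2   # hypothetical mean estimate
std_error <- 0.15  # hypothetical standard error

# Two-sided standard normal multiplier for the chosen confidence level.
mult <- qnorm(0.5 + (conf / 100) / 2)
margin_of_error <- mult * std_error

bounds <- data.frame(
  Estimate = estimate,
  StdError = std_error,
  MarginofError = margin_of_error,
  LCB = estimate - margin_of_error,
  UCB = estimate + margin_of_error
)

# Put the confidence level into the bound column names,
# mirroring the LCBxxPct and UCBxxPct naming pattern.
names(bounds)[4:5] <- paste0(c("LCB", "UCB"), conf, "Pct")
print(bounds)
```

With conf = 90 the same code would produce columns named LCB90Pct and UCB90Pct, so any code that reads the output by column name needs to account for the confidence level used.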
Author(s)
Tom Kincaid Kincaid.Tom@epa.gov
See Also
cat_analysis for categorical variable analysis
Examples
library(spsurvey)

# Simulated analysis data: 100 sites in two strata with one continuous
# response variable and two subpopulation (domain) variables
dframe <- data.frame(
  siteID = paste0("Site", 1:100),
  wgt = runif(100, 10, 100),
  xcoord = runif(100),
  ycoord = runif(100),
  stratum = rep(c("Stratum1", "Stratum2"), 50),
  ContVar = rnorm(100, 10, 1),
  All_Sites = rep("All Sites", 100),
  Resource_Class = rep(c("Good", "Poor"), c(55, 45))
)
myvars <- c("ContVar")
mysubpops <- c("All_Sites", "Resource_Class")

# Population totals for each level of Resource_Class
mypopsize <- data.frame(
  Resource_Class = c("Good", "Poor"),
  Total = c(4000, 1500)
)

# Estimate the mean of ContVar for every subpopulation
cont_analysis(dframe,
  vars = myvars, subpops = mysubpops, siteID = "siteID",
  weight = "wgt", xcoord = "xcoord", ycoord = "ycoord",
  stratumID = "stratum", popsize = mypopsize, statistics = "Mean"
)
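The popsize argument lists a data frame, a table, and an xtabs object as post-stratification formats. As a sketch, the data frame used in the example above could be expressed in the other two forms with base R alone. The object names mypopsize_xtabs and mypopsize_table are hypothetical, and whether cont_analysis treats the three forms identically is an assumption that has not been checked here.

```r
# Population totals as a data frame (same values as the example above).
mypopsize <- data.frame(
  Resource_Class = c("Good", "Poor"),
  Total = c(4000, 1500)
)

# The same totals as an xtabs object, built from the data frame.
mypopsize_xtabs <- xtabs(Total ~ Resource_Class, data = mypopsize)

# The same totals as a plain table with a named dimension.
mypopsize_table <- as.table(array(
  c(4000, 1500),
  dim = 2,
  dimnames = list(Resource_Class = c("Good", "Poor"))
))

print(mypopsize_xtabs)
print(mypopsize_table)
```

In both derived objects the dimension name carries the subpopulation variable name, which plays the role of the Resource_Class column in the data frame.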