diffrisk_analysis {spsurvey} | R Documentation |
Risk difference analysis
Description
This function organizes input and output for risk difference analysis (of
categorical variables). The analysis data, dframe, can be either a data frame
or a simple features (sf) object. If an sf object is used, coordinates are
extracted from the geometry column in the object, arguments xcoord and ycoord
are assigned values "xcoord" and "ycoord", respectively, and the geometry
column is dropped from the object.
Usage
diffrisk_analysis(
  dframe,
  vars_response,
  vars_stressor,
  response_levels = NULL,
  stressor_levels = NULL,
  subpops = NULL,
  siteID = NULL,
  weight = "weight",
  xcoord = NULL,
  ycoord = NULL,
  stratumID = NULL,
  clusterID = NULL,
  weight1 = NULL,
  xcoord1 = NULL,
  ycoord1 = NULL,
  sizeweight = FALSE,
  sweight = NULL,
  sweight1 = NULL,
  fpc = NULL,
  popsize = NULL,
  vartype = "Local",
  conf = 95,
  All_Sites = FALSE
)
Arguments
dframe |
Data to be analyzed (analysis data). A data frame or sf object. |
vars_response |
Vector composed of character values that identify the
names of response variables in dframe. |
vars_stressor |
Vector composed of character values that identify the
names of stressor variables in dframe. |
response_levels |
List providing the category values (levels) for each
element in the vars_response argument. |
stressor_levels |
List providing the category values (levels) for each
element in the vars_stressor argument. |
subpops |
Vector composed of character values that identify the
names of subpopulation (domain) variables in dframe. |
siteID |
Character value providing the name of the site ID variable in
dframe. |
weight |
Character value providing the name of the design weight
variable in dframe. |
xcoord |
Character value providing the name of the x-coordinate variable in
dframe. |
ycoord |
Character value providing the name of the y-coordinate variable in
dframe. |
stratumID |
Character value providing the name of the stratum ID
variable in dframe. |
clusterID |
Character value providing the name of the cluster
(stage one) ID variable in dframe. |
weight1 |
Character value providing the name of the stage one weight
variable in dframe. |
xcoord1 |
Character value providing the name of the stage one
x-coordinate variable in dframe. |
ycoord1 |
Character value providing the name of the stage one
y-coordinate variable in dframe. |
sizeweight |
Logical value that indicates whether size weights should be
used during estimation, where TRUE uses size weights and FALSE does not.
The default is FALSE. |
sweight |
Character value providing the name of the size weight variable
in dframe. |
sweight1 |
Character value providing the name of the stage one size
weight variable in dframe. |
fpc |
Object that specifies values required for calculation of the finite population correction factor used during variance estimation. The object must match the survey design in terms of stratification and whether the design is single-stage or two-stage.
For an unstratified design, the object is a vector. For a single-stage design, the vector is composed of a single numeric value. For a two-stage unstratified design, the object is a named vector containing one more than the number of clusters in the sample, where the first item in the vector specifies the number of clusters in the population and each subsequent item specifies the number of stage two units for the cluster. The name for the first item in the vector is arbitrary. Subsequent names in the vector identify clusters and must match the cluster IDs.
For a stratified design, the object is a named list of vectors, where names must match the strata IDs. For each stratum, the format of the vector is identical to the format described for unstratified single-stage and two-stage designs.
Note that the finite population correction factor is not used with the local mean variance estimator.
Example fpc for a single-stage unstratified survey design:
Example fpc for a single-stage stratified survey design:
Example fpc for a two-stage unstratified survey design:
Example fpc for a two-stage stratified survey design:
|
popsize |
Object that provides values for the population argument of the
calibrate or postStratify functions in the survey package.
Example popsize for calibration:
Example popsize for post-stratification using a data frame:
Example popsize for post-stratification using a table:
Example popsize for post-stratification using an xtabs object:
|
vartype |
Character value providing the choice of the variance
estimator, where "Local" (the default) selects the local mean variance estimator. |
conf |
Numeric value providing the Gaussian-based confidence level. The default value
is 95. |
All_Sites |
A logical variable used when |
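The fpc formats described for the fpc argument can be sketched in base R as follows. This is only an illustration of the object shapes: every name and number is hypothetical, and reading the single-stage value as a population size is an assumption, since the description only says "a single numeric value".

```r
# All names and numbers below are hypothetical, chosen only to show the shapes.

# Single-stage unstratified design: a single numeric value
# (assumed here to represent the population size).
fpc_single <- 15000

# Single-stage stratified design: a named list with one element per stratum;
# the names must match the stratum IDs in the analysis data.
fpc_stratified <- list(Stratum1 = 9000, Stratum2 = 6000)

# Two-stage unstratified design: a named vector. The first item (arbitrary
# name) is the number of clusters in the population; each later item is the
# number of stage two units for a sampled cluster, named by its cluster ID.
fpc_two_stage <- c(Ncluster = 200, Cluster1 = 100, Cluster2 = 75, Cluster3 = 50)

# Two-stage stratified design: a named list of vectors, one per stratum,
# each following the two-stage unstratified format.
fpc_two_stage_stratified <- list(
  Stratum1 = c(Ncluster = 100, Cluster1 = 125, Cluster2 = 100),
  Stratum2 = c(Ncluster = 50, Cluster3 = 75, Cluster4 = 50)
)
```

Under these assumptions, the two-stage vector has length 4 because three clusters were sampled, matching the "one more than the number of clusters in the sample" rule.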
Value
The analysis results. A data frame of risk difference estimates for all combinations of subpopulations, categories within each subpopulation, response variables, and stressor variables. Each estimate is accompanied by its standard error, margin of error, and confidence bounds, along with the counts and weighted proportions of observations in each response and stressor group. The data frame contains the following variables:
- Type
subpopulation (domain) name
- Subpopulation
subpopulation name within a domain
- Response
response variable
- Stressor
stressor variable
- nResp
sample size
- Estimate
risk difference estimate
- Estimate_StressPoor
risk estimate for poor condition stressor
- Estimate_StressGood
risk estimate for good condition stressor
- StdError
risk difference standard error
- MarginofError
risk difference margin of error
- LCBxxPct
xx% (default 95%) lower confidence bound
- UCBxxPct
xx% (default 95%) upper confidence bound
- WeightTotal
sum of design weights
- Count_RespPoor_StressPoor
number of observations in the poor response and poor stressor group
- Count_RespPoor_StressGood
number of observations in the poor response and good stressor group
- Count_RespGood_StressPoor
number of observations in the good response and poor stressor group
- Count_RespGood_StressGood
number of observations in the good response and good stressor group
- Prop_RespPoor_StressPoor
weighted proportion of observations in the poor response and poor stressor group
- Prop_RespPoor_StressGood
weighted proportion of observations in the poor response and good stressor group
- Prop_RespGood_StressPoor
weighted proportion of observations in the good response and poor stressor group
- Prop_RespGood_StressGood
weighted proportion of observations in the good response and good stressor group
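How the interval columns might relate to one another can be sketched in base R. This assumes a symmetric, normal-theory interval built from Estimate and StdError, which is consistent with the "Gaussian-based confidence level" wording for conf but is not confirmed here as the package's exact computation. The input values are hypothetical.

```r
# Hypothetical values, for illustration only.
estimate  <- 0.12  # risk difference estimate (Estimate)
std_error <- 0.05  # its standard error (StdError)
conf      <- 95    # confidence level in percent, as in the conf argument

# Two-sided standard normal multiplier for the requested confidence level.
z <- qnorm(0.5 + conf / 200)

# Assumed symmetric normal-theory margin of error and confidence bounds.
margin_of_error <- z * std_error
lcb <- estimate - margin_of_error
ucb <- estimate + margin_of_error

# Column names follow the LCBxxPct / UCBxxPct pattern with xx = conf.
bounds <- setNames(c(lcb, ucb), paste0(c("LCB", "UCB"), conf, "Pct"))
print(bounds)
```

With conf = 95 the bounds are labelled LCB95Pct and UCB95Pct, matching the default column names listed above.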
Details
Risk difference measures the absolute strength of association between conditional probabilities defined for a response variable and a stressor variable, where the response and stressor variables are classified as either good (i.e., reference condition) or poor (i.e., different from reference condition). Risk difference is defined as the difference between two conditional probabilities: the probability that the response variable is in poor condition given that the stressor variable is in poor condition and the probability that the response variable is in poor condition given that the stressor variable is in good condition. Risk difference values close to zero indicate that the stressor variable has little or no impact on the probability that the response variable is in poor condition. Risk difference values much greater than zero indicate that the stressor variable has a significant impact on the probability that the response variable is in poor condition.
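The definition above can be illustrated with a small base R sketch that computes a weighted risk difference by hand. The data, column names, and weights are hypothetical, and the sketch covers only the point estimate: it is not the package's internal code and does no variance estimation.

```r
# Hypothetical data: eight sites with a response class, a stressor class,
# and a design weight.
dat <- data.frame(
  response = c("Poor", "Poor", "Good", "Good", "Poor", "Good", "Poor", "Good"),
  stressor = c("Poor", "Poor", "Poor", "Good", "Good", "Good", "Poor", "Good"),
  wgt      = c(10, 20, 10, 30, 10, 20, 20, 40)
)

# Weighted totals for each response-by-stressor cell.
cells <- xtabs(wgt ~ response + stressor, data = dat)

# P(response poor | stressor poor) and P(response poor | stressor good).
risk_stress_poor <- unname(cells["Poor", "Poor"] / sum(cells[, "Poor"]))
risk_stress_good <- unname(cells["Poor", "Good"] / sum(cells[, "Good"]))

# Risk difference: the difference between the two conditional probabilities.
risk_diff <- risk_stress_poor - risk_stress_good
print(c(risk_stress_poor, risk_stress_good, risk_diff))
```

In this made-up data set the weighted risk is 50/60 when the stressor is poor and 10/100 when it is good, so the risk difference is about 0.73.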
Author(s)
Tom Kincaid Kincaid.Tom@epa.gov
See Also
attrisk_analysis for attributable risk analysis
relrisk_analysis for relative risk analysis
Examples
library(spsurvey)

# Simulated analysis data: 100 sites with design weights, coordinates,
# a stratum variable, two responses, one stressor, and two subpopulations
dframe <- data.frame(
  siteID = paste0("Site", 1:100),
  wgt = runif(100, 10, 100),
  xcoord = runif(100),
  ycoord = runif(100),
  stratum = rep(c("Stratum1", "Stratum2"), 50),
  RespVar1 = sample(c("Poor", "Good"), 100, replace = TRUE),
  RespVar2 = sample(c("Poor", "Good"), 100, replace = TRUE),
  StressVar = sample(c("Poor", "Good"), 100, replace = TRUE),
  All_Sites = rep("All Sites", 100),
  Resource_Class = rep(c("Agr", "Forest"), c(55, 45))
)

# Names of the response, stressor, and subpopulation variables in dframe
myresponse <- c("RespVar1", "RespVar2")
mystressor <- c("StressVar")
mysubpops <- c("All_Sites", "Resource_Class")

# Risk difference estimates for each response and stressor pair
# within each subpopulation
diffrisk_analysis(
  dframe,
  vars_response = myresponse,
  vars_stressor = mystressor,
  subpops = mysubpops,
  siteID = "siteID",
  weight = "wgt",
  xcoord = "xcoord",
  ycoord = "ycoord",
  stratumID = "stratum"
)