optimal_survey_scheme {surveyvoi} | R Documentation |
Optimal survey scheme
Description
Find the optimal survey scheme that maximizes value of information.
This function uses the exact method for
calculating the expected value of the decision given a survey scheme.
Usage
optimal_survey_scheme(
site_data,
feature_data,
site_detection_columns,
site_n_surveys_columns,
site_probability_columns,
site_management_cost_column,
site_survey_cost_column,
feature_survey_column,
feature_survey_sensitivity_column,
feature_survey_specificity_column,
feature_model_sensitivity_column,
feature_model_specificity_column,
feature_target_column,
total_budget,
survey_budget,
site_management_locked_in_column = NULL,
site_management_locked_out_column = NULL,
site_survey_locked_out_column = NULL,
prior_matrix = NULL,
n_threads = 1,
verbose = FALSE
)
Arguments
site_data |
sf::sf() object with site data.
|
feature_data |
base::data.frame() object with feature data.
|
site_detection_columns |
character names of numeric
columns in the argument to site_data that contain the proportion of
surveys conducted within each site that detected each feature.
Each column should correspond to a different feature, and contain
a proportion value (between zero and one). If a site has
not previously been surveyed, a value of zero should be used.
|
site_n_surveys_columns |
character names of numeric
columns in the argument to site_data that contain the total
number of surveys conducted for each feature within each site.
Each column should correspond to a different feature, and contain
a non-negative integer number (e.g. 0, 1, 2, 3). If a site has
not previously been surveyed, a value of zero should be used.
|
site_probability_columns |
character names of numeric
columns in the argument to site_data that contain modeled
probabilities of occupancy for each feature in each site.
Each column should correspond to a different feature, and contain
probability data (values between zero and one). No missing (NA)
values are permitted in these columns.
|
site_management_cost_column |
character name of column in the
argument to site_data that contains costs for managing each
site for conservation. This column should have numeric values that
are equal to or greater than zero. No missing (NA) values are
permitted in this column.
|
site_survey_cost_column |
character name of column in the
argument to site_data that contains costs for surveying each
site. This column should have numeric values that are equal to
or greater than zero. No missing (NA) values are permitted in this
column.
|
feature_survey_column |
character name of the column in the
argument to feature_data that contains logical (TRUE/FALSE)
values indicating if the feature will be surveyed in
the planned surveys or not. Note that considering additional features will
rapidly increase computational burden, and so it is only recommended to
consider features that are of specific conservation interest.
No missing (NA) values are permitted in this column.
|
feature_survey_sensitivity_column |
character name of the
column in the argument to feature_data that contains
probability of future surveys correctly detecting a presence of each
feature in a given site (i.e. the sensitivity of the survey methodology).
This column should have numeric values that are between zero and
one. No missing (NA) values are permitted in this column.
|
feature_survey_specificity_column |
character name of the
column in the argument to feature_data that contains
probability of future surveys correctly detecting an absence of each
feature in a given site (i.e. the specificity of the survey methodology).
This column should have numeric values that are between zero and
one. No missing (NA) values are permitted in this column.
|
feature_model_sensitivity_column |
character name of the
column in the argument to feature_data that contains
probability of the initial models correctly predicting a presence of each
feature in a given site (i.e. the sensitivity of the models).
This column should have numeric values that are between zero and
one. No missing (NA) values are permitted in this column.
This should ideally be calculated using
fit_xgb_occupancy_models() or
fit_hglm_occupancy_models().
|
feature_model_specificity_column |
character name of the
column in the argument to feature_data that contains
probability of the initial models correctly predicting an absence of each
feature in a given site (i.e. the specificity of the models).
This column should have numeric values that are between zero and
one. No missing (NA) values are permitted in this column.
This should ideally be calculated using
fit_xgb_occupancy_models() or
fit_hglm_occupancy_models().
|
feature_target_column |
character name of the column in the
argument to feature_data that contains the target
values used to parametrize the conservation benefit of managing each
feature.
This column should have numeric values that
are equal to or greater than zero. No missing (NA) values are
permitted in this column.
|
total_budget |
numeric maximum expenditure permitted
for conducting surveys and managing sites for conservation.
|
survey_budget |
numeric maximum expenditure permitted
for conducting surveys.
|
site_management_locked_in_column |
character name of the column
in the argument to site_data that contains logical
(TRUE/FALSE) values indicating which sites should
be locked in (TRUE) for being managed for conservation or
not (FALSE). No missing (NA) values are permitted in this
column. This is useful if some sites have already been earmarked for
conservation, or if some sites are already being managed for conservation.
Defaults to NULL such that no sites are locked in.
|
site_management_locked_out_column |
character name of the column
in the argument to site_data that contains logical
(TRUE/FALSE) values indicating which sites should
be locked out (TRUE) from being managed for conservation or
not (FALSE). No missing (NA) values are permitted in this
column. This is useful if some sites could potentially be surveyed
to improve model predictions even if they cannot be managed for
conservation. Defaults to NULL such that no sites are locked out.
|
site_survey_locked_out_column |
character name of the column
in the argument to site_data that contains logical
(TRUE/FALSE) values indicating which sites should
be locked out (TRUE) from being selected for future surveys or
not (FALSE). No missing (NA) values are permitted in this
column. This is useful if some sites will never be considered for future
surveys (e.g. because they are too costly to survey, or have a
low chance of containing the target species).
Defaults to NULL such that no sites are locked out.
|
prior_matrix |
numeric matrix containing
the prior probability of each feature occupying each site.
Rows correspond to features, and columns correspond to sites.
Defaults to NULL such that prior data is calculated automatically
using prior_probability_matrix().
|
n_threads |
integer number of threads to use for computation. Defaults to 1.
|
verbose |
logical indicating if information should be
printed during processing. Defaults to FALSE.
|
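The column requirements described above can be checked before calling the function. The following is a minimal sketch in base R, assuming a small made-up table (toy_sites) in place of real site data. It uses a plain data frame rather than an sf::sf() object to stay self-contained, and the helper is_proportion() is an illustrative name that is not part of surveyvoi.

```r
# hypothetical site data for one feature (all names and values are made up)
toy_sites <- data.frame(
  f1 = c(0.5, 0, 1),         # proportion of past surveys that detected feature 1
  n1 = c(2, 0, 3),           # number of past surveys for feature 1
  p1 = c(0.7, 0.2, 0.9),     # modeled probability of occupancy
  management_cost = c(10, 5, 8),
  survey_cost = c(1, 1, 2)
)

# illustrative helper (not part of surveyvoi): TRUE if all values are
# numeric, non-missing, and between zero and one
is_proportion <- function(x) {
  is.numeric(x) && !anyNA(x) && all(x >= 0 & x <= 1)
}

# checks that mirror the argument descriptions
checks <- c(
  detection = is_proportion(toy_sites$f1),
  n_surveys = !anyNA(toy_sites$n1) &&
    all(toy_sites$n1 >= 0 & toy_sites$n1 == round(toy_sites$n1)),
  probability = is_proportion(toy_sites$p1),
  management_cost = !anyNA(toy_sites$management_cost) &&
    all(toy_sites$management_cost >= 0),
  survey_cost = !anyNA(toy_sites$survey_cost) &&
    all(toy_sites$survey_cost >= 0)
)
print(checks)
```

Any element of checks that is FALSE points to a column that needs attention before it is passed to the function.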
Details
The optimal survey scheme is determined using a brute-force algorithm.
Initially, all feasible (valid) survey schemes are identified given the
survey costs and the survey budget (using feasible_survey_schemes()).
Next, the expected value of each feasible survey scheme is computed
(using evdsi()). Finally, the greatest expected value is identified, and
all survey schemes that share this greatest expected value are returned.
Because every feasible survey scheme is evaluated, this function can take
a very long time to complete.
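The three steps of the brute-force procedure can be illustrated with a self-contained sketch in base R. It does not call feasible_survey_schemes() or evdsi(). Instead, it enumerates schemes with expand.grid() and scores them with a placeholder function (toy_value()). The costs, the toy_gain values, and the scoring rule are assumptions made purely for illustration and do not reflect how the package computes expected values.

```r
# hypothetical survey costs for four sites, and a survey budget
survey_cost <- c(2, 3, 4, 1)
survey_budget <- 5

# step 1: enumerate every combination of surveyed (TRUE) and
# not surveyed (FALSE) sites, then keep those within the budget
all_schemes <- as.matrix(
  expand.grid(rep(list(c(FALSE, TRUE)), length(survey_cost)))
)
dimnames(all_schemes) <- NULL
scheme_cost <- apply(all_schemes, 1, function(s) sum(survey_cost[s]))
feasible <- all_schemes[scheme_cost <= survey_budget, , drop = FALSE]

# step 2: score each feasible scheme with a placeholder value function
# (a stand-in for the expected value calculation; values are made up)
toy_gain <- c(3, 5, 6, 2)
toy_value <- function(scheme) sum(toy_gain[scheme])
ev <- apply(feasible, 1, toy_value)

# step 3: keep all schemes that share the greatest value
is_best <- ev == max(ev)
best <- feasible[is_best, , drop = FALSE]
attr(best, "ev") <- ev[is_best]
print(best)
```

With four sites there are only 16 candidate schemes, but the count doubles with each additional site, which is why exhaustive evaluation becomes slow for larger problems.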
Value
A matrix of logical (TRUE/FALSE) values indicating if a site is selected
in the scheme or not. Columns correspond to sites, and rows correspond to
different schemes. If there is only one optimal survey scheme then the
matrix will only contain a single row. This matrix also has a numeric
"ev" attribute that contains the expected value of each scheme.
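As a sketch of how a result with this structure could be inspected, the following builds a mock matrix by hand. The values are made up; a real result would come from optimal_survey_scheme().

```r
# mock result: two equally good schemes across four sites
# (made-up values standing in for real output)
opt_survey <- matrix(
  c(TRUE, TRUE, FALSE, FALSE,
    FALSE, FALSE, TRUE, TRUE),
  nrow = 2, byrow = TRUE
)
attr(opt_survey, "ev") <- c(8, 8)

# number of optimal schemes (one per row)
n_schemes <- nrow(opt_survey)

# indices of the sites selected for surveying in the first scheme
sites_in_first <- which(opt_survey[1, ])

# expected value of each scheme
scheme_ev <- attr(opt_survey, "ev")

print(n_schemes)
print(sites_in_first)
print(scheme_ev)
```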
Dependencies
Please note that this function requires the Gurobi optimization software
(https://www.gurobi.com/) and the gurobi R package if different
sites have different survey costs. Installation instructions are available
online for Linux, Windows, and Mac OS
(see https://support.gurobi.com/hc/en-us/articles/4534161999889-How-do-I-install-Gurobi-Optimizer).
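One way to check whether the gurobi R package is available before running the function is sketched below, using base R's requireNamespace(). This only tests that the R package can be loaded; it does not verify the Gurobi software installation or its license.

```r
# check whether the gurobi R package can be loaded, without attaching it
has_gurobi <- requireNamespace("gurobi", quietly = TRUE)
if (!has_gurobi) {
  message("The gurobi R package is not installed.")
}
```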
Examples
# set seeds for reproducibility
set.seed(123)
# load example site data
data(sim_sites)
print(sim_sites)
# load example feature data
data(sim_features)
print(sim_features)
# set total budget for surveying sites and managing them for conservation
# (i.e. 50% of the cost of managing all sites)
total_budget <- sum(sim_sites$management_cost) * 0.5
# set budget for surveying sites
# (i.e. 40% of the cost of surveying all sites)
survey_budget <- sum(sim_sites$survey_cost) * 0.4
## Not run:
# find optimal survey scheme using exact method
opt_survey <- optimal_survey_scheme(
  sim_sites, sim_features,
  c("f1", "f2", "f3"), c("n1", "n2", "n3"), c("p1", "p2", "p3"),
  "management_cost", "survey_cost",
  "survey", "survey_sensitivity", "survey_specificity",
  "model_sensitivity", "model_specificity",
  "target", total_budget, survey_budget)
# print result
print(opt_survey)
## End(Not run)
[Package surveyvoi version 1.0.6]