generate_multimethod_data {emery} | R Documentation |
Create data sets which simulate paired measurements of multiple methods
Description
generate_multimethod_data() is a general function for creating a data set which simulates the results one might see when using several different methods to measure a set of objects.
Usage
generate_multimethod_data(
  type = c("binary", "ordinal", "continuous"),
  n_method = 3,
  n_obs = 100,
  prev = 0.5,
  D = NULL,
  method_names = NULL,
  obs_names = NULL,
  ...
)

generate_multimethod_binary(
  n_method = 3,
  n_obs = 100,
  prev = 0.5,
  D = NULL,
  se = rep(0.9, n_method),
  sp = rep(0.9, n_method),
  method_names = NULL,
  obs_names = NULL,
  n_method_subset = n_method,
  first_reads_all = FALSE
)

generate_multimethod_ordinal(
  n_method = 3,
  n_obs = 100,
  prev = 0.5,
  D = NULL,
  n_level = 5,
  pmf_pos = matrix(rep(1:n_level - 1, n_method), nrow = n_method, byrow = TRUE),
  pmf_neg = matrix(rep(n_level:1 - 1, n_method), nrow = n_method, byrow = TRUE),
  method_names = NULL,
  level_names = NULL,
  obs_names = NULL,
  n_method_subset = n_method,
  first_reads_all = FALSE
)

generate_multimethod_continuous(
  n_method = 2,
  n_obs = 100,
  prev = 0.5,
  D = NULL,
  mu_i1 = rep(12, n_method),
  sigma_i1 = diag(n_method),
  mu_i0 = rep(10, n_method),
  sigma_i0 = diag(n_method),
  method_names = NULL,
  obs_names = NULL,
  n_method_subset = n_method,
  first_reads_all = FALSE
)
Arguments
type |
A string specifying the data type of the methods being simulated. |
n_method |
An integer representing the number of methods to simulate. |
n_obs |
An integer representing the number of observations to simulate. |
prev |
A value between 0 and 1 representing the proportion of "positive" observations in the target population (the prevalence). |
D |
Optional binary vector representing the true classification of each observation. |
method_names |
Optional vector of names used to identify each method. |
obs_names |
Optional vector of names used to identify each observation. |
... |
Additional parameters passed on to the type-specific generator, e.g. se and sp for binary methods. |
se , sp |
Used for binary methods. Vectors of length n_method, with values between 0 and 1, giving the sensitivity (se) and specificity (sp) of each method. |
n_method_subset |
Accepted by the binary, ordinal, and continuous generators. An integer defining how many methods to select at random to produce a result for each observation. |
first_reads_all |
Accepted by the binary, ordinal, and continuous generators. A logical which forces method 1 to have a result for every observation. |
n_level |
Used for ordinal methods. An integer representing the number of ordinal levels each method has. |
pmf_pos , pmf_neg |
Used for ordinal methods. An n_method by n_level matrix representing the probability mass functions for positive (pmf_pos) and negative (pmf_neg) results. |
level_names |
Used for ordinal methods. Optional vector of names used to identify each level. |
mu_i1 , mu_i0 |
Used for continuous methods. Vectors of length n_method giving each method's mean value for positive (mu_i1) and negative (mu_i0) observations. |
sigma_i1 , sigma_i0 |
Used for continuous methods. n_method by n_method covariance matrices of the methods' measurements for positive (sigma_i1) and negative (sigma_i0) observations. |
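The default pmf_pos and pmf_neg expressions are easy to misread, because `:` binds more tightly than `-` in R, so `1:n_level - 1` means `(1:n_level) - 1`. The snippet below evaluates the defaults copied from the Usage section with n_method = 3 and n_level = 5. The resulting rows are unnormalised weights rather than probabilities that sum to 1. How the package rescales them internally is not stated on this page, so the final normalisation step is an assumption made only for illustration.

```r
n_method <- 3
n_level <- 5

# Defaults copied from the Usage section
pmf_pos <- matrix(rep(1:n_level - 1, n_method), nrow = n_method, byrow = TRUE)
pmf_neg <- matrix(rep(n_level:1 - 1, n_method), nrow = n_method, byrow = TRUE)

pmf_pos  # each row is 0 1 2 3 4: higher levels carry more weight
pmf_neg  # each row is 4 3 2 1 0: lower levels carry more weight

# Assumption for illustration only: rescale each row so it sums to 1
pmf_pos_prob <- pmf_pos / rowSums(pmf_pos)
```

Under these defaults, positive observations lean towards the highest level and negative observations towards the lowest, with the same weights for every method.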
Details
The function supports binary measurement methods, e.g., Pass/Fail; ordinal measurement methods, e.g., the Likert scale; and continuous measurement methods, e.g., height. The data are generated under the assumption that the underlying population consists of a mixture of two groups. The primary application of this is to simulate a sample from a population which has some prevalence of disease.
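The two-group mixture described above can be sketched in base R for the binary case. This is a minimal illustration under stated assumptions, not the package's implementation: it assumes each observation's true class is a Bernoulli draw with probability prev, and that the methods err independently of one another given the true class. The helper name sketch_binary_mixture is hypothetical.

```r
# Hypothetical helper, not part of emery:
# simulate binary results for several methods from a two-group mixture
sketch_binary_mixture <- function(n_obs = 100, prev = 0.5,
                                  se = c(0.9, 0.9, 0.9),
                                  sp = c(0.9, 0.9, 0.9)) {
  n_method <- length(se)
  # True class of each observation: 1 = positive, 0 = negative
  D <- rbinom(n_obs, size = 1, prob = prev)
  # Probability that method j reports 1 for observation i:
  # se[j] when D[i] == 1, and 1 - sp[j] when D[i] == 0
  p_pos <- outer(D, se) + outer(1 - D, 1 - sp)
  # Assumption: methods are conditionally independent given D
  results <- matrix(rbinom(n_obs * n_method, size = 1, prob = p_pos),
                    nrow = n_obs, ncol = n_method)
  list(D = D, results = results)
}

set.seed(1)
sim <- sketch_binary_mixture(n_obs = 1000, prev = 0.3)
mean(sim$D)                              # observed prevalence, near prev
colMeans(sim$results[sim$D == 1, ])      # empirical sensitivity of each method
1 - colMeans(sim$results[sim$D == 0, ])  # empirical specificity of each method
```

With a large n_obs, the empirical prevalence, sensitivity, and specificity should sit close to the values passed in, which is a quick sanity check for any simulated data set of this kind.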
Value
A list containing a simulated data set (generated_data) and the parameters used to create it (params).
Examples
# Set seed for this example
set.seed(11001101)
# Generate data for 4 binary methods
my_sim <- generate_multimethod_data(
  "binary",
  n_obs = 75,
  n_method = 4,
  se = c(0.87, 0.92, 0.79, 0.95),
  sp = c(0.85, 0.93, 0.94, 0.80),
  method_names = c("alpha", "beta", "gamma", "delta")
)
# View the data
my_sim$generated_data
# View the parameters used to generate the data
my_sim$params
# Estimate ML accuracy values using the EM algorithm
my_result <- estimate_ML(
  "binary",
  data = my_sim$generated_data,
  save_progress = FALSE # this reduces the data stored in the resulting object
)
# View results of ML estimate
my_result@results