generate_multimethod_data {emery}    R Documentation

Create data sets which simulate paired measurements of multiple methods

Description

generate_multimethod_data() is a general function for creating a data set which simulates the results one might see when using several different methods to measure a set of objects.

Usage

generate_multimethod_data(
  type = c("binary", "ordinal", "continuous"),
  n_method = 3,
  n_obs = 100,
  prev = 0.5,
  D = NULL,
  method_names = NULL,
  obs_names = NULL,
  ...
)

generate_multimethod_binary(
  n_method = 3,
  n_obs = 100,
  prev = 0.5,
  D = NULL,
  se = rep(0.9, n_method),
  sp = rep(0.9, n_method),
  method_names = NULL,
  obs_names = NULL,
  n_method_subset = n_method,
  first_reads_all = FALSE
)

generate_multimethod_ordinal(
  n_method = 3,
  n_obs = 100,
  prev = 0.5,
  D = NULL,
  n_level = 5,
  pmf_pos = matrix(rep(1:n_level - 1, n_method), nrow = n_method, byrow = TRUE),
  pmf_neg = matrix(rep(n_level:1 - 1, n_method), nrow = n_method, byrow = TRUE),
  method_names = NULL,
  level_names = NULL,
  obs_names = NULL,
  n_method_subset = n_method,
  first_reads_all = FALSE
)

generate_multimethod_continuous(
  n_method = 2,
  n_obs = 100,
  prev = 0.5,
  D = NULL,
  mu_i1 = rep(12, n_method),
  sigma_i1 = diag(n_method),
  mu_i0 = rep(10, n_method),
  sigma_i0 = diag(n_method),
  method_names = NULL,
  obs_names = NULL,
  n_method_subset = n_method,
  first_reads_all = FALSE
)

Arguments

type

A string specifying the data type of the methods being simulated: one of "binary", "ordinal", or "continuous".

n_method

An integer representing the number of methods to simulate.

n_obs

An integer representing the number of observations to simulate.

prev

A value between 0 and 1 representing the proportion of "positive" observations in the target population, i.e., the prevalence.

D

Optional binary vector representing the true classification of each observation.

method_names

Optional vector of names used to identify each method.

obs_names

Optional vector of names used to identify each observation.

...

Additional parameters for the chosen data type, e.g., se and sp for binary methods.

se, sp

Used for binary methods. Vectors of length n_method, with values between 0 and 1, giving the sensitivity (se) and specificity (sp) of each method.

n_method_subset

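An integer defining how many methods are selected at random to produce a result for each observation. Described for binary methods; per the Usage section, the ordinal and continuous generators accept the same argument.

first_reads_all

A logical which, when TRUE, forces method 1 to have a result for every observation. Described for binary methods; per the Usage section, the ordinal and continuous generators accept the same argument.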

n_level

Used for ordinal methods. An integer representing the number of ordinal levels each method has.

pmf_pos, pmf_neg

Used for ordinal methods. n_method by n_level matrices whose rows give each method's probability mass function over the ordinal levels for positive (pmf_pos) and negative (pmf_neg) observations.

level_names

Used for ordinal methods. Optional vector of names used to identify each level.

mu_i1, mu_i0

Used for continuous methods. Vectors of length n_method giving each method's mean value for positive (mu_i1) and negative (mu_i0) observations.

sigma_i1, sigma_i0

Used for continuous methods. n_method by n_method covariance matrices of the method results for positive (sigma_i1) and negative (sigma_i0) observations.
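
As a rough illustration of how the ordinal and continuous arguments could drive a simulation, the base R sketch below draws results conditional on a known true class D. It is not the emery source code. Two assumptions are made here: the rows of pmf_pos and pmf_neg are treated as relative weights (the default rows, 0 to n_level - 1, do not sum to 1), and continuous results are taken to be multivariate normal. The helper names draw_level and rmvn are hypothetical.

```r
# Illustrative sketch only: base R, not the emery implementation.
set.seed(2)

n_obs    <- 6
n_method <- 2
n_level  <- 5
D        <- c(1, 1, 1, 0, 0, 0)  # true class: 1 = positive, 0 = negative

## Ordinal results ------------------------------------------------------
# Default weight matrices from the Usage section (one row per method)
pmf_pos <- matrix(rep(1:n_level - 1, n_method), nrow = n_method, byrow = TRUE)
pmf_neg <- matrix(rep(n_level:1 - 1, n_method), nrow = n_method, byrow = TRUE)

# Hypothetical helper: draw one level for true class d and method j.
# ASSUMPTION: rows are relative weights; sample() rescales them to sum to 1.
draw_level <- function(d, j) {
  w <- if (d == 1) pmf_pos[j, ] else pmf_neg[j, ]
  sample(seq_len(n_level), size = 1, prob = w)
}

# n_obs by n_method matrix of ordinal levels
sim_ordinal <- sapply(seq_len(n_method), function(j) {
  vapply(D, draw_level, integer(1), j = j)
})

## Continuous results ---------------------------------------------------
mu_i1 <- rep(12, n_method); sigma_i1 <- diag(n_method)
mu_i0 <- rep(10, n_method); sigma_i0 <- diag(n_method)

# Hypothetical helper: n multivariate normal draws via the Cholesky factor.
# ASSUMPTION: results are multivariate normal within each group.
rmvn <- function(n, mu, sigma) {
  z <- matrix(rnorm(n * length(mu)), nrow = n)
  sweep(z %*% chol(sigma), 2, mu, "+")
}

sim_continuous <- matrix(NA_real_, nrow = n_obs, ncol = n_method)
sim_continuous[D == 1, ] <- rmvn(sum(D == 1), mu_i1, sigma_i1)
sim_continuous[D == 0, ] <- rmvn(sum(D == 0), mu_i0, sigma_i0)

sim_ordinal
sim_continuous
```

With the default weights, positive observations tend toward the higher levels and negative observations toward the lower ones, while the continuous results for the two groups are centred on 12 and 10, respectively.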

Details

The function supports binary measurement methods, e.g., Pass/Fail; ordinal measurement methods, e.g., a Likert scale; and continuous measurement methods, e.g., height. The data are generated under the assumption that the underlying population is a mixture of two groups, "positive" and "negative" observations, with the proportion of positives given by prev. The primary application is to simulate a sample from a population which has some prevalence of disease.
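
The two-group mixture can be sketched in base R for the binary case. This is a minimal illustration, not the emery source code. It assumes that each observation's true class is a Bernoulli draw with probability prev, and that each method then reports a positive with probability se for true positives and 1 - sp for true negatives. The object names p_positive and sim_binary are hypothetical.

```r
# Illustrative sketch only: base R, not the emery implementation.
set.seed(1)

n_obs    <- 100
n_method <- 3
prev     <- 0.5
se       <- rep(0.9, n_method)
sp       <- rep(0.9, n_method)

# True class of each observation: 1 = positive, 0 = negative
D <- rbinom(n_obs, size = 1, prob = prev)

# Probability that method j reports a positive result for observation i:
# se[j] when D[i] == 1, and 1 - sp[j] when D[i] == 0
p_positive <- outer(D, se) + outer(1 - D, 1 - sp)

# One binary result per observation (row) and method (column)
sim_binary <- matrix(
  rbinom(n_obs * n_method, size = 1, prob = p_positive),
  nrow = n_obs,
  dimnames = list(
    paste0("obs", seq_len(n_obs)),
    paste0("method", seq_len(n_method))
  )
)

head(sim_binary)
```

Supplying D directly, rather than drawing it from prev, corresponds to fixing the true classification of each observation in advance.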

Value

A list containing the simulated data set and the parameters used to create it. In the example below, these are accessed as generated_data and params.

Examples

# Load the package and set the seed for this example
library(emery)
set.seed(11001101)

# Generate data for 4 binary methods
my_sim <- generate_multimethod_data(
  "binary",
  n_obs = 75,
  n_method = 4,
  se = c(0.87, 0.92, 0.79, 0.95),
  sp = c(0.85, 0.93, 0.94, 0.80),
  method_names = c("alpha", "beta", "gamma", "delta"))

# View the data
my_sim$generated_data

# View the parameters used to generate the data
my_sim$params

# Estimate ML accuracy values by EM algorithm
my_result <- estimate_ML(
  "binary",
  data = my_sim$generated_data,
  save_progress = FALSE # this reduces the data stored in the resulting object
)

# View results of ML estimate
my_result@results
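
The type-specific generators can also be called directly. The sketch below uses only the arguments listed in the Usage section; the parameter values are arbitrary choices for illustration. Because the element names of the returned list are not documented for the ordinal and continuous cases, the result is inspected with str() rather than by name.

```r
library(emery)
set.seed(2024)

# Ordinal data: 3 methods, 4 levels, default probability mass functions
sim_ord <- generate_multimethod_ordinal(
  n_method = 3,
  n_obs = 50,
  prev = 0.3,
  n_level = 4
)

# Continuous data: 2 methods with method-specific means in each group
sim_con <- generate_multimethod_continuous(
  n_method = 2,
  n_obs = 50,
  prev = 0.3,
  mu_i1 = c(12, 13),
  mu_i0 = c(10, 10.5)
)

# Inspect the top-level structure of each returned list
str(sim_ord, max.level = 1)
str(sim_con, max.level = 1)
```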


[Package emery version 0.5.1 Index]