sim_dataset {diverse}    R Documentation

A procedure to simulate datasets

Description

Simulates a dataset in which each entity has a given variety (number of categories) and each category of an entity has a value of abundance.

Usage

sim_dataset(n_categ, category_prefix = "", entity_prefix = "",
  values = "log-normal", size = -1, mean = 0, sd = 1,
  category_random = FALSE)

Arguments

n_categ

a vector with the number of categories for each entity. The length of this vector defines the number of entities to create.

category_prefix

a prefix to be used as part of the category label

entity_prefix

a prefix to be used as part of the entity label

values

values of abundance. This argument can be either a distribution name or a vector of integers. If a distribution name is given, the distribution is used to simulate individuals, which are then aggregated into frequencies (values of abundance): use 'log-normal' for a log-normal distribution or 'normal' for a normal distribution. If an integer or a vector of integers is given, it is taken as the set of possible values of abundance, which are used randomly. Default value is 'log-normal'.

size

number of individuals. A single number, or a vector with one number per entity. The default, -1, sets the size to 7 times the variety.

mean

mean parameter for the normal or log-normal distribution. Default value is 0.

sd

standard deviation parameter for the normal or log-normal distribution. Default value is 1.

category_random

boolean argument that determines whether categories are taken randomly (TRUE) or sequentially (FALSE). Default is FALSE.
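
The description of the values argument mentions simulating individuals and aggregating them into frequencies. The base R sketch below illustrates that general idea only. It is an assumption about the approach, not the package's actual implementation: the helper simulate_abundance and its equal-width binning rule are hypothetical.

```r
# Hypothetical helper, NOT part of the diverse package.
# Draws `size` individuals from a log-normal distribution, bins them into
# `n_categ` equal-width intervals, and returns the count per category.
# Note: cut() needs at least 2 intervals, so n_categ must be >= 2 here.
simulate_abundance <- function(n_categ, size, mean = 0, sd = 1) {
  draws <- rlnorm(size, meanlog = mean, sdlog = sd)
  # An integer `breaks` splits the range of the draws into equal-width bins
  bins <- cut(draws, breaks = n_categ, labels = FALSE)
  # tabulate() counts the draws in each bin and keeps empty bins as 0
  tabulate(bins, nbins = n_categ)
}

set.seed(1)
# 5 categories and 7 * 5 = 35 individuals, mirroring the documented default size
abund <- simulate_abundance(n_categ = 5, size = 35)
abund
```

Each element of abund is the abundance of one category, and the elements sum to the number of simulated individuals.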

Value

A data frame with three columns: entity, category and value of abundance.
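
To show how a three-column layout of this kind can be summarised, the sketch below uses a small hand-built data frame as a stand-in for the function's output. The column names (entity, category, value) and the labels are assumptions chosen for illustration; check the names of the object actually returned before relying on them.

```r
# Hand-built stand-in for a three-column data frame of the kind described above.
# Column names and labels are assumptions, not taken from the package.
d <- data.frame(
  entity   = c("e1", "e1", "e2", "e2", "e2"),
  category = c("c1", "c2", "c1", "c2", "c3"),
  value    = c(10, 5, 3, 8, 1),
  stringsAsFactors = FALSE
)

# Variety: number of distinct categories observed for each entity
variety <- tapply(d$category, d$entity, function(x) length(unique(x)))

# Total abundance per entity
total <- tapply(d$value, d$entity, sum)

variety
total
```

The same two summaries can be applied to a simulated dataset once its column names are known.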

Examples

# Equal values, just one entity
sim_dataset(n_categ = 50, category_prefix = 'ctg', values = 1)

# Several entities with random values
n_entities <- 50
# Number of categories for each entity
v_n_c <- sample(1:100, size = n_entities, replace = TRUE)
# Possible values of abundance
v_v <- sample(10:5000, size = n_entities, replace = TRUE)
d <- sim_dataset(n_categ = v_n_c, values = v_v, category_random = TRUE)

[Package diverse version 0.1.5 Index]