generateBatchDataLogPoisson {batchmix}    R Documentation

Generate batch data

Description

Generate data from K multivariate normal or multivariate t distributions with additional noise from batches. Assumes independence across columns. In each column the parameters are randomly permuted for both the groups and the batches.

Usage

generateBatchDataLogPoisson(
  N,
  P,
  group_rates,
  batch_rates,
  group_weights,
  batch_weights,
  frac_known = 0.2,
  permute_variables = TRUE,
  scale_data = FALSE
)

Arguments

N

The number of items (rows) to generate.

P

The number of columns in the generated dataset.

group_rates

A vector of the group rates for the classes within a column.

batch_rates

A vector of the batch rates for the classes within a column. It is used to create a variable whose rate is the sum of the appropriate batch and class rates, so it may be better interpreted as the batch effect on the observed rate.

group_weights

Either a K x B matrix of the expected proportion of each batch in each group, or a K-vector of the expected proportion of the entire dataset in each group.

batch_weights

A vector of the expected proportion of N in each batch.

frac_known

The fraction of items with known labels (defaults to 0.2).

permute_variables

Logical indicating whether the group and batch parameters should be permuted in each column (defaults to TRUE).

scale_data

Logical indicating if data should be mean centred and standardised (defaults to FALSE).
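
The roles of the arguments above can be illustrated with a base R sketch for a single column. This is not the package's implementation: the Poisson draws, the log(1 + count) transform, the sampling of labels and every variable name are assumptions chosen for illustration only.

```r
set.seed(1)

N <- 200                            # number of items
group_rates   <- c(10, 20, 30)      # one rate per group (K = 3)
batch_rates   <- c(1, 5)            # one rate per batch (B = 2)
group_weights <- c(0.5, 0.3, 0.2)   # K-vector form: share of the dataset in each group
batch_weights <- c(0.6, 0.4)        # expected share of N in each batch
frac_known    <- 0.2

K <- length(group_rates)
B <- length(batch_rates)

# Draw batch and group labels with the expected proportions.
batch_id <- sample(seq_len(B), N, replace = TRUE, prob = batch_weights)
group_id <- sample(seq_len(K), N, replace = TRUE, prob = group_weights)

# Assumption: counts are Poisson and the batch contributes additively,
# so the observed count has rate group rate + batch rate.
group_counts <- rpois(N, lambda = group_rates[group_id])
batch_counts <- rpois(N, lambda = batch_rates[batch_id])

# Assumed log transform; log1p avoids log(0) for zero counts.
batch_free <- log1p(group_counts)
observed   <- log1p(group_counts + batch_counts)

# Flag roughly frac_known of the items as having known labels.
known <- rbinom(N, size = 1, prob = frac_known)

# The effect described for scale_data = TRUE: mean centre and standardise.
observed_scaled <- as.numeric(scale(observed))
```

Because the sum of two independent Poisson variables is itself Poisson with the summed rate, adding the batch counts is equivalent to drawing once with the combined rate, which matches the description of batch_rates as an effect on the observed rate.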

Value

A list of 5 objects: the data generated from the groups with and without batch effects, the label indicating the generating group, the batch label, and the vector indicating training versus test.
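
A minimal call might look like the following sketch. The argument values (three groups, two batches and their rates and weights) are arbitrary choices made for illustration, not recommended settings, and the element names of the returned list are not assumed here.

```r
library(batchmix)

set.seed(1)

N <- 100  # items (rows)
P <- 4    # columns

sim <- generateBatchDataLogPoisson(
  N = N,
  P = P,
  group_rates = c(10, 20, 30),       # illustrative rates for K = 3 groups
  batch_rates = c(1, 5),             # illustrative rates for B = 2 batches
  group_weights = c(0.5, 0.3, 0.2),  # K-vector form of group_weights
  batch_weights = c(0.6, 0.4),
  frac_known = 0.2
)

# The returned list is described as holding 5 objects;
# str() shows their names and shapes without assuming them.
str(sim, max.level = 1)
```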


[Package batchmix version 2.2.0 Index]