generateTestData_2D {clusternomics}    R Documentation

Generate simulated 2D dataset for testing

Description

Generates a simple 2D dataset with two contexts, where the data are drawn from Gaussian distributions. The output contains two datasets, one per context. Each context has two local clusters, and the combinations of local clusters across the two contexts form 4 global clusters.

Usage

generateTestData_2D(groupCounts, means, variances = NULL)

Arguments

groupCounts

Number of data samples in each global cluster. It is assumed to be a vector of four elements: c(c11, c21, c12, c22) where cij is the number of samples coming from cluster i in context 1 and cluster j in context 2.

means

Means of the simulated clusters. It is assumed to be a vector of two elements: c(m1, m2) where m1 is the mean of the first cluster in both contexts, and m2 is the mean of the second cluster in both contexts. Because the data are two-dimensional, the mean is assumed to be the same in both dimensions.

variances

Optional. Specifies a different variance for each of the clusters. It is assumed to be a vector of two elements: c(v1, v2) where v1 is the variance of the first cluster in both contexts, and v2 is the variance of the second cluster in both contexts. Because the data are two-dimensional, the covariance matrix is diagonal, with the same variance in both dimensions.
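
The ordering of groupCounts can be easier to read as a 2 x 2 table. The following base R sketch is an illustration only and does not call clusternomics; the names countTable, perCluster1, perCluster2, context1, and context2 are chosen here for the example.

```r
# Counts in the documented order c(c11, c21, c12, c22).
groupCounts <- c(50, 10, 40, 60)

# matrix() fills column by column, so row i is the cluster in context 1
# and column j is the cluster in context 2 (dimension names are illustrative).
countTable <- matrix(
  groupCounts, nrow = 2,
  dimnames = list(context1 = c("1", "2"), context2 = c("1", "2"))
)

# Number of samples per local cluster in each context.
perCluster1 <- rowSums(countTable)  # context 1
perCluster2 <- colSums(countTable)  # context 2
```

With these counts, cluster 1 in context 1 receives 50 + 40 = 90 samples and cluster 1 in context 2 receives 50 + 10 = 60 samples, out of 160 in total.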

Value

Returns the simulated datasets together with the true assignments.

data

List of datasets for each context. This can be used as an input for the contextCluster function.

groups

True cluster assignments that were used to generate the data.
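
The generative process described above can be sketched in base R. This is not the package's implementation. simulateTwoContexts is a hypothetical helper, variances is passed explicitly because the behaviour for variances = NULL is not stated here, and returning groups as a two-column matrix of per-context cluster indices is an assumption made for the sketch.

```r
# Hypothetical helper, not part of clusternomics.
simulateTwoContexts <- function(groupCounts, means, variances) {
  # Cluster index in each context for the four global clusters,
  # in the documented order c(c11, c21, c12, c22).
  cluster1 <- c(1, 2, 1, 2)
  cluster2 <- c(1, 1, 2, 2)

  # Expand to one cluster index per sample.
  z1 <- rep(cluster1, times = groupCounts)
  z2 <- rep(cluster2, times = groupCounts)
  n <- sum(groupCounts)

  # Draw n two-dimensional points; both coordinates share the mean and
  # variance of the sample's cluster (diagonal covariance).
  drawContext <- function(z) {
    matrix(
      rnorm(2 * n,
            mean = rep(means[z], 2),
            sd = rep(sqrt(variances[z]), 2)),
      ncol = 2
    )
  }

  list(
    data = list(drawContext(z1), drawContext(z2)),
    groups = cbind(context1 = z1, context2 = z2)  # assumed layout
  )
}

set.seed(1)
sim <- simulateTwoContexts(c(50, 10, 40, 60), c(-1.5, 1.5), c(1, 1))
```

Each element of sim$data is a matrix with one row per sample and two columns, and the rows of the two matrices refer to the same samples in the same order.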

Examples

groupCounts <- c(50, 10, 40, 60)
means <- c(-1.5, 1.5)
testData <- generateTestData_2D(groupCounts, means)
# Use the dataset as an input for the contextCluster function for testing
datasets <- testData$data


[Package clusternomics version 0.1.1 Index]