GenerateClusterData {RHPCBenchmark}    R Documentation

Generates clusters from multivariate normal distributions

Description

GenerateClusterData generates clusters of feature vectors drawn from multivariate normal (MVN) distributions. The mean of the MVN distribution for the first cluster is always at the origin. The remaining clusters are generated from MVN distributions with means at v_i and -v_i, where v_i is the i-th unit vector. The clusters are generated in the following order of MVN means:

origin, v_1, -v_1, v_2, -v_2, ..., v_k, -v_k with k = (numberOfClusters-1)/2 (if numberOfClusters is odd)

origin, v_1, -v_1, v_2, -v_2, ..., v_k with k = numberOfClusters/2 (if numberOfClusters is even)
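The ordering of the cluster means described above can be sketched in base R. ClusterMeans is a hypothetical helper written for illustration; it is not part of RHPCBenchmark and is not necessarily how the package computes the means.

```r
# Hypothetical helper (not part of RHPCBenchmark): returns a matrix whose
# i-th row is the mean vector of the i-th cluster in the documented order.
ClusterMeans <- function(numberOfFeatures,
                         numberOfClusters = 2 * numberOfFeatures + 1) {
  stopifnot(numberOfClusters >= 1,
            numberOfClusters <= 2 * numberOfFeatures + 1)
  means <- matrix(0, nrow = numberOfClusters, ncol = numberOfFeatures)
  # Row 1 stays at the origin; rows 2, 3, 4, 5, ... are v_1, -v_1, v_2, -v_2, ...
  for (k in seq_len(numberOfClusters - 1)) {
    unitIndex <- (k + 1) %/% 2                # index i of the unit vector v_i
    direction <- if (k %% 2 == 1) 1 else -1   # +v_i first, then -v_i
    means[k + 1, unitIndex] <- direction
  }
  means
}

# Four clusters in two dimensions: origin, v_1, -v_1, v_2
ClusterMeans(numberOfFeatures = 2, numberOfClusters = 4)
```

With an even numberOfClusters, the last row is a positive unit vector without its negative counterpart, which matches the even case of the ordering.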

Usage

GenerateClusterData(numberOfFeatures, numberOfVectorsPerCluster,
  numberOfClusters = 2 * numberOfFeatures + 1)

Arguments

numberOfFeatures

the number of features, i.e. the dimension of the feature space

numberOfVectorsPerCluster

the number of vectors to randomly generate for each cluster

numberOfClusters

the number of clusters to be generated. The value of this parameter must be in the interval [1, 2*numberOfFeatures+1].

Value

a list containing featureVectors, a matrix whose rows are the feature vectors; numberOfFeatures, the number of features; numberOfFeatureVectors, the number of feature vectors; and numberOfClusters, the number of clusters.
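As a rough illustration of the shape of the returned list, the following base R sketch builds a list with the same four element names. SketchClusterData is a hypothetical stand-in, not the package function. It assumes an identity-scaled covariance with an arbitrary standardDeviation parameter (the covariance is not specified on this page), and it assumes that numberOfFeatureVectors equals numberOfClusters * numberOfVectorsPerCluster.

```r
# Hypothetical stand-in for GenerateClusterData, for illustration only.
SketchClusterData <- function(numberOfFeatures, numberOfVectorsPerCluster,
                              numberOfClusters = 2 * numberOfFeatures + 1,
                              standardDeviation = 0.1) {  # assumed spread
  stopifnot(numberOfClusters >= 1,
            numberOfClusters <= 2 * numberOfFeatures + 1)
  # Assumption: every cluster contributes the same number of rows.
  numberOfFeatureVectors <- numberOfClusters * numberOfVectorsPerCluster
  # Isotropic normal noise around the origin, one feature vector per row.
  featureVectors <- matrix(
    rnorm(numberOfFeatureVectors * numberOfFeatures, sd = standardDeviation),
    nrow = numberOfFeatureVectors, ncol = numberOfFeatures)
  # Shift each block of rows to its cluster mean: v_1, -v_1, v_2, -v_2, ...
  for (k in seq_len(numberOfClusters - 1)) {
    rows <- k * numberOfVectorsPerCluster + seq_len(numberOfVectorsPerCluster)
    unitIndex <- (k + 1) %/% 2
    direction <- if (k %% 2 == 1) 1 else -1
    featureVectors[rows, unitIndex] <- featureVectors[rows, unitIndex] + direction
  }
  list(featureVectors = featureVectors,
       numberOfFeatures = numberOfFeatures,
       numberOfFeatureVectors = numberOfFeatureVectors,
       numberOfClusters = numberOfClusters)
}

clusterData <- SketchClusterData(numberOfFeatures = 3,
                                 numberOfVectorsPerCluster = 10)
dim(clusterData$featureVectors)  # 70 rows (7 clusters x 10 vectors), 3 columns
```

The elements of the list returned by GenerateClusterData can be accessed the same way, for example with $featureVectors.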


[Package RHPCBenchmark version 0.1.0 Index]