pstats {minerva}    R Documentation

Compute pairwise statistics (MIC and normalized TIC) between variables (convenience function).

Description

For each statistic, the upper triangle of the pairwise matrix is stored by row (condensed matrix). If m is the number of variables, then for 0 <= i < j < m (0-based indices), the statistic between columns i and j is stored at position k = m*i - i*(i+1)/2 - i - 1 + j, also 0-based. The length of the vectors is n = m*(m-1)/2.
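
The index formula can be checked against an explicit row-wise enumeration of the upper triangle. The following is a minimal sketch in base R; the helper name condensed_index is hypothetical and not part of minerva, and the indices are taken to be 0-based, which is the only reading under which the formula yields positions 0 to m*(m-1)/2 - 1.

```r
# Hypothetical helper (not part of minerva): 0-based position k of the
# pair (i, j), with 0 <= i < j < m, in the row-wise condensed upper triangle.
condensed_index <- function(i, j, m) {
  stopifnot(i >= 0, i < j, j < m)
  m * i - i * (i + 1) / 2 - i - 1 + j
}

m <- 5
# Enumerate the upper triangle by row, using 0-based (i, j) pairs.
pairs <- do.call(rbind, lapply(0:(m - 2), function(i) cbind(i, (i + 1):(m - 1))))
k <- mapply(condensed_index, pairs[, 1], pairs[, 2], MoreArgs = list(m = m))

# TRUE if the positions run from 0 to m*(m-1)/2 - 1 in enumeration order.
all(k == 0:(m * (m - 1) / 2 - 1))
```

For example, with m = 4 the pair (i, j) = (1, 2) follows the three pairs that start with column 0, so it lands at position 3.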

Usage

pstats(x, alpha = 0.6, C = 15, est = "mic_approx")

Arguments

x

Numeric matrix of size m-by-n, with m samples (rows) and n variables (columns).

alpha

number in (0, 1.0] or >= 4. If alpha is in (0, 1], then B will be max(n^alpha, 4), where n here is the number of samples (rows of x). If alpha is >= 4, then alpha directly defines the B parameter. If alpha is greater than the number of samples n, it is limited to n, so B = min(alpha, n).
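
The rule for deriving B from alpha can be written out as a small function. This is only a sketch of the description above: the helper name alpha_to_B and its argument n_samples are hypothetical, and any rounding applied internally by the package is not modeled.

```r
# Hypothetical helper (not part of minerva): grid-size bound B implied by
# alpha, following the rule in the 'alpha' argument description.
# n_samples is the number of samples (rows of x).
alpha_to_B <- function(alpha, n_samples) {
  if (alpha > 0 && alpha <= 1) {
    max(n_samples^alpha, 4)      # exponent form, never below 4
  } else if (alpha >= 4) {
    min(alpha, n_samples)        # direct form, capped at the sample count
  } else {
    stop("alpha must be in (0, 1] or >= 4")
  }
}

alpha_to_B(0.6, 100)   # 100^0.6, the default alpha with 100 samples
alpha_to_B(9, 100)     # alpha >= 4 is used directly as B
alpha_to_B(500, 100)   # capped at the number of samples
```

Values of alpha strictly between 1 and 4 fall outside both documented ranges, so the sketch rejects them instead of guessing a behavior.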

C

number (> 0). Determines how many more clumps there will be than columns in every partition. The default value is 15, meaning that when trying to draw x grid lines on the x-axis, the algorithm will start with at most 15*x clumps.

est

string ("mic_approx" or "mic_e"), the estimator. With est="mic_approx" the original MINE statistics are computed; with est="mic_e" the equicharacteristic matrix is evaluated and MIC_e and TIC_e are returned.

Value

A matrix with n*(n-1)/2 rows and 4 columns, where n is the number of variables (columns of x). The first and second columns hold the indexes of the two columns of the input matrix x for which the statistic is computed. Column 3 contains the MIC statistic, while column 4 contains the normalized TIC statistic.
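
A pairwise result in this layout can be expanded into a symmetric matrix for inspection or plotting. The sketch below uses a hypothetical helper, pairs_to_matrix, and a small mock matrix with illustrative values in place of real pstats output; it assumes that columns 1 and 2 hold 1-based column indexes of x, which should be checked against the actual result.

```r
# Hypothetical helper (not part of minerva): expand a 4-column pairwise
# result into a symmetric n_vars x n_vars matrix.
# Assumes columns 1-2 hold 1-based column indexes of x and that
# 'value_col' (3 = MIC, 4 = normalized TIC) holds the statistic.
pairs_to_matrix <- function(res, n_vars, value_col = 3) {
  out <- matrix(NA_real_, n_vars, n_vars)
  idx <- cbind(res[, 1], res[, 2])
  out[idx] <- res[, value_col]                        # upper triangle
  out[idx[, 2:1, drop = FALSE]] <- res[, value_col]   # mirror to lower triangle
  out
}

# Mock result for 3 variables (illustrative values, not real pstats output).
mock <- cbind(c(1, 1, 2), c(2, 3, 3), c(0.2, 0.5, 0.3), c(0.1, 0.4, 0.2))
mic_mat <- pairs_to_matrix(mock, n_vars = 3)
mic_mat
```

The diagonal is left as NA because the result contains no entry for a variable paired with itself.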

Examples

## Create a matrix of random numbers
## 10 variables x 100 samples
x <- matrix(rnorm(1000), ncol=10)
res <- pstats(x)

head(res)


[Package minerva version 1.5.10 Index]