mictools {minerva}

R Documentation

Function that implements the `mictools` pipeline. In particular it computes the null and observed distribution of the `tic_e` measure

Description

Function that implements the mictools pipeline. In particular, it computes the null and observed distributions of the tic_e measure.

Usage

mictools(x, alpha = 9, C = 5, seed = 0, nperm = 2e+05, p.adjust.method = "BH")

Arguments

`x`	a numeric matrix with N samples on the rows and M variables on the columns (NxM).
`alpha`	float in (0, 1.0] or >= 4. If alpha is in (0, 1], then B will be max(n^alpha, 4), where n is the number of samples. If alpha is >= 4, then alpha directly defines the B parameter; if it is larger than the number of samples (n), it is capped at n, so B = min(alpha, n). The default value is 9 (see Usage).
`C`	a positive integer number, the `C` parameter of the `mine` statistic. See `mine` function for further details.
`seed`	seed for the random number generator, for reproducibility.
`nperm`	integer, number of permutations to perform.
`p.adjust.method`	method for p-value adjustment; see `p.adjust` for available methods.
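
The rule that maps `alpha` to the B parameter can be sketched in base R as follows. `b_from_alpha` is a hypothetical helper written only for illustration; it is not exported by minerva, and the package's internal handling (for example rounding) may differ.

```r
# Hypothetical helper: mirrors the rule described for the `alpha` argument.
b_from_alpha <- function(alpha, n) {
  if (alpha > 0 && alpha <= 1) {
    max(n^alpha, 4)   # exponent form: B = max(n^alpha, 4)
  } else if (alpha >= 4) {
    min(alpha, n)     # direct form: B = alpha, capped at the number of samples
  } else {
    stop("alpha must be in (0, 1] or >= 4")
  }
}

b_from_alpha(0.6, 100)  # 100^0.6, roughly 15.8
b_from_alpha(9, 100)    # alpha used directly as B
b_from_alpha(9, 6)      # capped at n = 6
```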

Details

This function implements the 'mictools' pipeline. Unlike the Python pipeline available on GitHub, it takes an NxM data matrix with N samples on the rows and M variables on the columns, as is standard in R.
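
If your data are stored the other way round, with variables on the rows and samples on the columns, a transpose brings them into the expected NxM layout. The sketch below uses a small made-up matrix; the names `vars_by_rows`, `g1` to `g3` and `s1` to `s5` are illustrative only.

```r
# Toy data laid out as variables x samples (3 variables, 5 samples).
set.seed(42)
vars_by_rows <- matrix(rnorm(15), nrow = 3,
                       dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:5)))

# Transpose to samples x variables, the layout described above.
x <- t(vars_by_rows)

dim(x)       # 5 rows (samples), 3 columns (variables)
colnames(x)  # variable names are now the column names
```

The resulting `x` could then be passed as the first argument of `mictools`.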

Value

A list of 5 named elements with the following information about the computed statistic:

tic: a vector with the null distribution of tic_e values obtained from the permutations.
nulldist: null distribution of the tic_e measure. It is a data.frame of 4 columns containing the histogram of the tic_e distribution: the limits of each bin (BinStart and BinEnd), the count for each bin (NullCount), and the cumulative distribution of the right-tail area (NullCumSum).
obstic: data.frame with the observed tic_e values and the indexes of the variables between which tic_e is computed. If the input matrix has column names, the names are reported in the data.frame; otherwise "Var<i>" is used for each variable.
obsdist: data.frame similar to nulldist, but with the observed values of tic_e.
pval: data.frame with the p-value computed for each comparison. The adjusted p-value is also reported, based on the method chosen with the parameter p.adjust.method.
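
To make the relationship between these elements concrete, the following base R sketch builds a binned null table with the four columns described for `nulldist` and derives right-tail empirical p-values from it. Everything here is an assumption for illustration: the null and observed values are random stand-ins rather than real tic_e values, the bin width is arbitrary, and the add-one estimator is a common choice that is not confirmed to be the one minerva uses.

```r
set.seed(1)
null_tic <- runif(1000)            # stand-in for a permutation null sample
obs_tic  <- c(0.999, 0.50, 0.99)   # stand-in for observed statistics

# Histogram table with the column layout described for `nulldist`.
breaks <- seq(0, 1, by = 0.1)
counts <- as.vector(table(cut(null_tic, breaks, include.lowest = TRUE)))
nulldist <- data.frame(
  BinStart   = head(breaks, -1),
  BinEnd     = breaks[-1],
  NullCount  = counts,
  NullCumSum = rev(cumsum(rev(counts)))  # right-tail cumulative count
)

# Right-tail empirical p-value with an add-one correction (assumed estimator).
pval <- sapply(obs_tic, function(v) {
  (sum(null_tic >= v) + 1) / (length(null_tic) + 1)
})

# Multiple-testing adjustment with base R's p.adjust.
adj <- p.adjust(pval, method = "BH")
data.frame(obs = obs_tic, pval = pval, adj = adj)
```

Changing `method` to any value accepted by `p.adjust` (for example "bonferroni") changes only the last step.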

References

D. Albanese, S. Riccadonna, C. Donati, P. Franceschi (2018). _A practical tool for Maximal Information Coefficient Analysis_. GigaScience, 7(4), doi: 10.1093/gigascience/giy032

Examples

## Load the Spellman data and run the pipeline on columns 10 to 20
data(Spellman)
Spellman <- as.matrix(Spellman)
spellress <- mictools(Spellman[, 10:20], nperm=1000)

## Use a different p-value correction method
spellressb <- mictools(Spellman[, 10:20], nperm=1000, seed=1234, p.adjust.method="bonferroni")

## Null distribution of tic_e
hist(spellress$tic, breaks=100, main="Tic_e null distribution")
barplot(spellress$nulldist$NullCount)

## Distribution of the observed tic_e
hist(spellress$obstic$TIC)
barplot(spellress$obsdist$Count)

## Distribution of empirical p-values
hist(spellress$pval$pval, breaks=50)

[Package minerva version 1.5.10 Index]
