mvMISE_e_perm {mvMISE}    R Documentation

A function to obtain permutation-based p-values for fixed-effects estimates in mvMISE_e

Description

This function calls mvMISE_e repeatedly, each time permuting the row indices (observations) of the covariate matrix X. Permutation can take a long time for high-dimensional outcomes, but it can be run in parallel on multiple nodes.
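The permutation scheme described here can be sketched in base R as follows. This is only an illustration under stated assumptions: an ordinary least-squares fit (`fit_slope`, a hypothetical stand-in) replaces the call to mvMISE_e, the data are simulated, and the way work is split across nodes with the `parallel` package is an assumption rather than the package's actual implementation.

```r
library(parallel)

set.seed(1)
n <- 60
design  <- cbind(1, rnorm(n))             # column 1 is the intercept
outcome <- 0.5 * design[, 2] + rnorm(n)   # toy single outcome

# Hypothetical stand-in for one model fit: the slope from least squares
fit_slope <- function(design, outcome) {
  unname(coef(lm(outcome ~ design - 1))[2])
}

# One permutation: shuffle the rows of the design matrix, keep the outcome fixed
one_perm <- function(i, design, outcome, fitter) {
  fitter(design[sample(nrow(design)), , drop = FALSE], outcome)
}

nperm  <- 100
nnodes <- 2
cl <- makeCluster(nnodes)
clusterSetRNGStream(cl, iseed = 2024)     # reproducible random streams on workers
null_slopes <- unlist(parLapply(cl, seq_len(nperm), one_perm,
                                design = design, outcome = outcome,
                                fitter = fit_slope))
stopCluster(cl)

length(null_slopes)   # one null statistic per permutation
```

Because whole rows are permuted, a leading column of 1s stays a column of 1s, so the intercept is unaffected by the shuffle.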

Usage

mvMISE_e_perm(nperm = 100, nnodes = 2, Y, X, id, Zidx = 1, maxIter = 100, tol = 0.001, 
    lambda = 0.05, cov_miss = NULL, miss_y = TRUE, sigma_diff = FALSE)

Arguments

nperm

the number of permutations.

nnodes

the number of nodes that will be used in parallel for permutations.

Y

an outcome matrix. Each row is a sample, and each column is an outcome variable, with potential missing values (NAs).

X

a covariate matrix. Each row is a sample, and each column is a covariate. The covariates can be common to all outcomes (e.g., age, gender) or outcome-specific. If a covariate is specific to the k-th outcome, set all of its values corresponding to the other outcomes to zero. If X is common across outcomes, the number of rows of X equals the number of rows of Y. If X is outcome-specific, the number of rows of X equals the number of elements in Y, i.e., the outcome-specific X is stacked across outcomes within each cluster. See the Examples for demonstration.

id

a vector of cluster/batch indices, matching the rows of Y, and also the rows of X if X is not outcome-specific.

Zidx

the column indices of the matrix X used as the design matrix of the random effects. The default is 1, i.e., a random intercept is included if the first column of X is a vector of 1s. If Zidx=c(1,2), the model estimates a random intercept and a random effect for the 2nd column of the covariate matrix X. The random effects in this model are assumed to be independent.

maxIter

the maximum number of iterations for the EM algorithm.

tol

the tolerance level for the relative change in the observed-data log-likelihood function.

lambda

the tuning parameter for the graphical lasso penalty of the error precision matrix. It can be selected by AIC (an output).

cov_miss

the covariate that can be used in the missing-data model. If it is NULL, the missingness is assumed to be independent of the covariates. See the Details of mvMISE_e for the missing-data model. If it is specified and the covariate is not outcome-specific, its length equals the length of id. If it is outcome-specific, the covariate is stacked across outcomes within each cluster.

miss_y

logical. If TRUE, the missingness depends on the outcome Y (see the Details of mvMISE_e). The default is TRUE. This outcome-dependent missing-data pattern was motivated by, and observed in, mass-spectrometry-based quantitative proteomics data.

sigma_diff

logical. If TRUE, the error variance of the first sample differs from that of the remaining samples within each cluster. This option is intended for batch-processed proteomics data in which the first sample in each cluster/batch is the common reference sample. The default is FALSE.
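The input layout described in the arguments above can be checked before a long permutation run. The following is a minimal sketch on simulated data; `x_layout` is a hypothetical helper, not part of the package, and it only encodes the row-count rules stated for X, Y, id and Zidx.

```r
set.seed(3)
n_clusters  <- 5
per_cluster <- 4
n <- n_clusters * per_cluster
K <- 3                                        # number of outcomes

Y <- matrix(rnorm(n * K), nrow = n, ncol = K)
Y[sample(length(Y), 6)] <- NA                 # missing outcomes are allowed
id <- rep(seq_len(n_clusters), each = per_cluster)

# Common covariates: one row per sample, first column of 1s for the intercept
X_common <- cbind(intercept = 1, age = rnorm(n))

# Hypothetical helper: classify X by its row count relative to Y
x_layout <- function(X, Y) {
  if (nrow(X) == nrow(Y)) {
    "common"
  } else if (nrow(X) == length(Y)) {
    "outcome-specific"
  } else {
    stop("nrow(X) must equal nrow(Y) or the number of elements in Y")
  }
}

x_layout(X_common, Y)
stopifnot(length(id) == nrow(Y))              # id matches the rows of Y

# Zidx selects the random-effects columns of X; Zidx = 1 is the column of 1s
Zidx <- 1
Z <- X_common[, Zidx, drop = FALSE]
all(Z == 1)
```

Using `drop = FALSE` keeps Z a one-column matrix, which avoids surprises if Zidx is later extended to several columns.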

Value

The permutation-based p-values for testing whether the fixed effects (excluding the intercept) are zero.
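As an illustration of how such p-values can be formed, the sketch below compares observed estimates with estimates from permuted fits. All numbers are simulated, the coefficient names are hypothetical, and the two-sided rule with a +1 correction is a common convention assumed here; the package may use a different test statistic or formula.

```r
set.seed(7)
nperm <- 200

# Hypothetical observed estimates: intercept plus two covariates
est_obs <- c(intercept = 1.20, age = 0.05, treatment = 0.90)

# Hypothetical null estimates from permuted fits, one column per permutation
est_null <- matrix(rnorm(3 * nperm, sd = 0.3), nrow = 3,
                   dimnames = list(names(est_obs), NULL))

keep <- names(est_obs) != "intercept"         # the intercept is not tested

# Count, per coefficient, the permutations at least as extreme as the observed fit
exceed <- abs(est_null[keep, , drop = FALSE]) >= abs(est_obs[keep])
p_perm <- (1 + rowSums(exceed)) / (nperm + 1)
p_perm
```

With this convention the smallest attainable p-value is 1 / (nperm + 1), so nperm bounds the resolution of the test.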

References

Jiebiao Wang, Pei Wang, Donald Hedeker, and Lin S. Chen. Using multivariate mixed-effects selection models for analyzing batch-processed proteomics data with non-ignorable missingness. Biostatistics. doi:10.1093/biostatistics/kxy022

Examples



data(sim_dat)

pval_perm = mvMISE_e_perm(nperm = 100, Y = sim_dat$Y, X = sim_dat$X, id = sim_dat$id)


[Package mvMISE version 1.0 Index]