cpg.perm {CpGassoc}    R Documentation

Perform a Permutation Test of the Association Between Methylation and a Phenotype of Interest

Description

Calls cpg.assoc to get the observed p-values from the study and then performs a user-specified number of permutations to calculate an empirical p-value. In addition to the test statistics computed by cpg.assoc, cpg.perm computes permutation p-values for the minimum observed p-value, the number of Holm significant sites, and the number of FDR significant sites.
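The idea behind a permutation p-value for the minimum observed p-value can be sketched in base R. This is a minimal illustration on simulated data, not the package's implementation: the helper site_pvalues, the simulated beta and pheno objects, and the choice of a simple per-site linear regression are all assumptions made for the example, and cpg.perm may compute its empirical p-value differently.

```r
# Illustrative sketch only: simulated data and a hypothetical helper,
# not the code used inside cpg.perm.
set.seed(1)
n_sites <- 20
n_samples <- 30
beta <- matrix(runif(n_sites * n_samples), nrow = n_sites)  # 1 row per CpG site
pheno <- rnorm(n_samples)                                   # phenotype of interest

# p-value for the slope of a per-site linear regression of beta on x
site_pvalues <- function(beta, x) {
  apply(beta, 1, function(y) summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"])
}

# Observed minimum p-value across all sites
obs_min_p <- min(site_pvalues(beta, pheno))

# Shuffle the phenotype, recompute the minimum p-value, and repeat
nperm <- 50
perm_min_p <- replicate(nperm, min(site_pvalues(beta, sample(pheno))))

# Empirical p-value: share of permutations at least as extreme as observed
emp_p <- mean(perm_min_p <= obs_min_p)
emp_p
```

Shuffling the phenotype breaks any real association while keeping the correlation structure among CpG sites, which is why the permuted minimum p-values serve as a reference distribution.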

Usage

cpg.perm(beta.values, indep, covariates = NULL, nperm, data = NULL, seed = NULL,
  logit.transform = FALSE, chip.id = NULL, subset = NULL, random = FALSE,
  fdr.cutoff = 0.05, fdr.method = "BH", large.data = TRUE)

Arguments

beta.values

A vector, matrix, or data frame containing the beta values of interest (1 row per CpG site, 1 column per individual).

indep

A vector containing the main variable of interest. cpg.assoc will evaluate the association between indep and the beta values.

covariates

A data frame consisting of the covariates of interest. covariates can also be a matrix if it is a model matrix minus the intercept column, a vector if there is only one covariate of interest, or a formula (e.g. ~cov1+cov2).

nperm

The number of permutations to be performed.

data

An optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from the environment from which cpg.perm is called.

seed

An optional seed for random number generation. If not supplied, R's internal seed is used.

logit.transform

logical. If TRUE, the logit transform of the beta values log(beta.val/(1-beta.val)) will be used. Any values equal to zero or one will be set to the next smallest or next largest value respectively; values <0 or >1 will be set to NA.

chip.id

An optional vector containing chip, batch identities, or other categorical factor of interest to the researcher. If specified, chip id will be included as a factor in the model.

subset

An optional logical vector specifying a subset of observations to be used in the fitting process.

random

logical. If TRUE, the chip.id will be processed as a random effect, and a random intercept model will be fitted.

fdr.cutoff

The threshold against which the FDR values are compared. The default is .05. Any FDR value less than the cutoff will be considered significant.

fdr.method

Character. Method used to calculate the False Discovery Rate. Can be any of the methods listed in p.adjust, or "qvalue" for John Storey's qvalue method (requires the qvalue package to be installed). The default method is "BH" for the Benjamini & Hochberg method.

large.data

Logical. Enables analyses of large datasets. When large.data=TRUE, cpg.assoc avoids memory problems by performing the analysis in chunks.
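The effect of logit.transform = TRUE on the beta values can be sketched in base R. The function logit_beta below is hypothetical and written only for illustration. It assumes that "next smallest" means the smallest observed value above zero and "next largest" means the largest observed value below one; the package's own handling may differ in detail.

```r
# Hypothetical helper illustrating the documented logit transform;
# not the code used inside cpg.assoc or cpg.perm.
logit_beta <- function(beta.val) {
  # Values outside [0, 1] are not valid beta values and become NA
  beta.val[which(beta.val < 0 | beta.val > 1)] <- NA
  # Values strictly between 0 and 1 define the replacement boundaries
  interior <- beta.val[which(beta.val > 0 & beta.val < 1)]
  if (length(interior) > 0) {
    beta.val[which(beta.val == 0)] <- min(interior)  # assumed "next smallest"
    beta.val[which(beta.val == 1)] <- max(interior)  # assumed "next largest"
  }
  log(beta.val / (1 - beta.val))
}

x <- c(0, 0.2, 0.5, 0.9, 1, 1.3)
out <- logit_beta(x)
out
```

Replacing exact zeros and ones before the transform avoids infinite values, which would otherwise break the downstream regression.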

Value

The item returned will be of class "cpg.perm". It will contain all of the values of class cpg (cpg.assoc) and a few more:

permutation.matrix

A matrix consisting of the minimum observed P-value, the number of Holm significant CpG sites, and the number of FDR significant sites for each permutation.

gc.permutation.matrix

Similar to permutation.matrix, but computed from the genomic control adjusted p-values.

perm.p.values

A data frame consisting of the permutation P-values, and the number of permutations performed.

perm.tstat

If one hundred or more permutations were performed and indep is a continuous variable, consists of the .025 and .975 quantiles of the observed t-statistics for each permutation, ordered from smallest to largest. perm.tstat is used by plot.cpg.perm to compute the confidence intervals for the QQ plot of t-statistics. Otherwise NULL.

perm.pval

If one hundred or more permutations were performed, consists of the observed p-values for each permutation, ordered from smallest to largest. perm.pval is used by plot.cpg.perm to compute the confidence intervals for the QQ plot of the p-values. Otherwise NULL.
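The three quantities recorded per permutation in permutation.matrix (minimum p-value, number of Holm significant sites, number of FDR significant sites) can be illustrated for a single set of p-values with base R's p.adjust. The p-values and the 0.05 cutoff below are made up for the example; this is a sketch of the kind of summary involved, not the package's internal code.

```r
# Made-up p-values standing in for one analysis (or one permutation)
p <- c(0.001, 0.008, 0.012, 0.02, 0.27, 0.6)
cutoff <- 0.05

holm_adj <- p.adjust(p, method = "holm")  # Holm step-down adjustment
fdr_adj  <- p.adjust(p, method = "BH")    # Benjamini & Hochberg FDR

min_p  <- min(p)                 # minimum observed p-value
n_holm <- sum(holm_adj < cutoff) # number of Holm significant sites
n_fdr  <- sum(fdr_adj < cutoff)  # number of FDR significant sites

c(min.p = min_p, holm.sig = n_holm, fdr.sig = n_fdr)
```

Holm controls the family-wise error rate and is the stricter of the two, so the Holm count can never exceed the BH count for the same p-values and cutoff.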

Author(s)

Barfield, R.; Conneely, K.; Kilaru, V.
Maintainer: R. Barfield: <rbarfield01@fas.harvard.edu>

See Also

cpg.assoc, cpg.work, plot.cpg, scatterplot, cpg.combine, manhattan, plot.cpg.perm, sort.cpg.perm, sort.cpg, cpg.qc

Examples

## Load the data
data(samplecpg, samplepheno, package = "CpGassoc")
## NOTE: If you are dealing with large data, do not specify large.data=FALSE.
## The default is TRUE, which partitions the data and calls gc() more often
## to free memory.
## Run the permutation test with 10 permutations
Testperm <- cpg.perm(samplecpg[1:200, ], samplepheno$weight, seed = 2314, nperm = 10, large.data = FALSE)
Testperm
## All the contents of the cpg.assoc output are included in Testperm

## The summary function works on objects of class cpg.perm
summary(Testperm)


[Package CpGassoc version 2.60 Index]