cpg.work {CpGassoc}

R Documentation

Does the analysis between the CpG sites and phenotype of interest

Description

Performs the association analysis between methylation beta values and the phenotype of interest. This function contains the code that does most of the work for cpg.assoc and cpg.perm.

Usage

cpg.work(beta.values, indep, covariates = NULL, data = NULL, logit.transform = FALSE,
         chip.id = NULL, subset = NULL, random = FALSE, fdr.cutoff = 0.05, callarge = FALSE,
         fdr.method = "BH", logitperm = FALSE, big.split = FALSE, return.data = FALSE)

Arguments

`beta.values`	A vector, matrix, or data frame containing the beta values of interest (1 row per CpG site, 1 column per individual).
`indep`	A vector containing the main variable of interest. `cpg.work` will evaluate the association between indep and the beta values.
`covariates`	A data frame consisting of the covariates of interest. It may instead be a matrix (a model matrix minus the intercept column), a vector (if there is only one covariate of interest), or a formula (e.g. `~cov1+cov2`).
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in data, the variables are taken from the environment from which `cpg.work` is called.
`logit.transform`	logical. If `TRUE`, the logit transform of the beta values log(beta.val/(1-beta.val)) will be used. Any values equal to zero or one will be set to the next smallest or next largest value, respectively; values <0 or >1 will be set to NA.
`chip.id`	An optional vector containing chip, batch identities, or other categorical factor of interest to the researcher. If specified, chip id will be included as a factor in the model.
`subset`	an optional logical vector specifying a subset of observations to be used in the fitting process.
`random`	logical. If `TRUE`, the chip.id will be included in the model as a random effect, and a random intercept model will be fitted. If `FALSE`, chip.id will be included in the model as an ordinary categorical covariate, for a much faster analysis.
`fdr.cutoff`	The threshold against which the FDR values are compared. The default is 0.05, in which case any FDR value less than 0.05 is considered significant.
`callarge`	logical. Used by `cpg.assoc` when it calls `cpg.work`. If `TRUE` it means that beta.values is actually split up from a larger data set and that `memory.limit` may be a problem. This tells `cpg.work` to perform more `rm()` and `gc()` to clear up space.
`fdr.method`	Character. Method used to calculate the False Discovery Rate. Can be any of the methods listed in `p.adjust`. The default method is "BH" for the Benjamini & Hochberg method.
`logitperm`	Passed from `cpg.perm` when a permutation test is performed. Prevents further checks involving the logit transformation.
`big.split`	Passed from `cpg.assoc`. Internal flag to inform `cpg.work` that the large data did not need to be split up.
`return.data`	Logical. If `TRUE`, the returned object includes data frames containing the variable of interest, the covariates, and the chip id (if present). Defaults to `FALSE`. Set to `TRUE` if you plan to use the downstream scatterplot functions.
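
The handling of boundary values described under `logit.transform` can be sketched in base R as follows. This is only an illustration of one reading of that description (zeros are replaced by the smallest observed value above 0, ones by the largest observed value below 1); the helper name `logit_beta` is hypothetical and is not part of CpGassoc, and the package's own code may differ in detail.

```r
# Hypothetical helper, not part of CpGassoc: mimics the documented behaviour
# of logit.transform = TRUE under the assumptions stated above.
logit_beta <- function(beta) {
  # Values outside [0, 1] are not valid beta values: set them to NA.
  beta[!is.na(beta) & (beta < 0 | beta > 1)] <- NA
  # Observed values strictly inside (0, 1); assumes at least one exists.
  inside <- beta[!is.na(beta) & beta > 0 & beta < 1]
  # Exact zeros and ones would give -Inf / Inf, so move them to the
  # nearest observed interior value.
  beta[!is.na(beta) & beta == 0] <- min(inside)
  beta[!is.na(beta) & beta == 1] <- max(inside)
  log(beta / (1 - beta))
}

x <- c(0, 0.2, 0.5, 0.9, 1, 1.3, -0.1)
z <- logit_beta(x)
z
```

Here the first two elements of `z` are equal (0 is treated as 0.2), the fourth and fifth are equal (1 is treated as 0.9), and the last two are `NA`.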

Details

cpg.work performs the association analysis between methylation and the phenotype of interest. It is called by cpg.assoc to do most of the work. It can also be called directly with the same input as cpg.assoc, but it cannot handle large data sets.
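
To give a sense of the kind of per-site test involved, the following base R sketch fits one linear model per CpG site, with methylation as the outcome, the phenotype as the predictor of interest, and a covariate plus chip as a categorical factor (corresponding to `random = FALSE`). It is an assumption-laden illustration on simulated data, not the package's implementation; the variable names (`beta`, `pheno`, `age`, `chip`, `per_site`) are made up for the example.

```r
set.seed(1)
n_sites <- 5
n_samples <- 30

# Simulated beta values: 1 row per CpG site, 1 column per individual.
beta <- matrix(runif(n_sites * n_samples, 0.05, 0.95), nrow = n_sites,
               dimnames = list(paste0("cg", seq_len(n_sites)), NULL))
pheno <- rnorm(n_samples)                         # variable of interest
age   <- rnorm(n_samples, mean = 50, sd = 10)     # one covariate
chip  <- factor(rep(c("A", "B", "C"), each = 10)) # batch as a fixed factor

# One ordinary linear model per site; keep the row for the phenotype.
per_site <- t(apply(beta, 1, function(b) {
  fit <- lm(b ~ pheno + age + chip)
  coef(summary(fit))["pheno", ]
}))

per_site
```

Each row of `per_site` holds the estimate, standard error, t statistic, and p-value for the phenotype at one site. Looping over `lm` like this is slow for array-scale data, which is part of why a dedicated function is useful.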

Value

cpg.work will return an object of class "cpg". The functions summary and plot can be called to get a summary of results and to create QQ plots. The output is in the same order as the original input. To sort it by p-value, use the sort function.

`results`	A data frame consisting of the statistics and P-values for each CpG site. Also has the adjusted p-value based on the fdr.method and whether the site was Holm significant.
`Holm.sig`	A list of sites that met criteria for Holm significance.
`FDR.sig`	A data.frame of the sites that were FDR significant by the fdr method.
`info`	A data frame consisting of the minimum P-value observed, the fdr method used, what the phenotype of interest was, and the number of covariates in the model.
`indep`	If `return.data=T`, the independent variable that was tested for association.
`covariates`	If `return.data=T`, data.frame or matrix of covariates, if specified (otherwise `NULL`).
`chip`	If `return.data=T`, chip.id vector, if specified (otherwise `NULL`).
`coefficients`	A data frame consisting of the degrees of freedom and, if `indep` is continuous, the intercept adjusted for any covariates in the model, the estimated effect size, and the standard error. The degrees of freedom are used in `plot.cpg` to compute the genomic inflation factors.
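
The multiple-testing adjustments mentioned above can be reproduced on a small vector of p-values with base R's `p.adjust`, which is what `fdr.method` refers to. The sketch below is illustrative only: the p-values are made up, and the 0.05 threshold used for Holm significance is an assumption, since this page does not state the level the package applies.

```r
# Made-up p-values for five CpG sites.
p <- c(cg1 = 0.001, cg2 = 0.01, cg3 = 0.02, cg4 = 0.03, cg5 = 0.5)

fdr  <- p.adjust(p, method = "BH")    # as with fdr.method = "BH"
holm <- p.adjust(p, method = "holm")  # Holm step-down adjustment

fdr.cutoff <- 0.05
fdr_sig  <- fdr < fdr.cutoff          # FDR values below the cutoff
holm_sig <- holm < 0.05               # assumed significance level

data.frame(P.value = p, FDR = fdr, Holm = holm,
           FDR.sig = fdr_sig, Holm.sig = holm_sig)
```

With these inputs four sites pass the FDR cutoff while only two remain significant after the Holm adjustment, which shows why the two lists in the returned object can differ.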

Author(s)

Barfield, R.; Kilaru, V.; Conneely, K.
Maintainer: R. Barfield: <barfieldrichard8@gmail.com>

Examples

## See the examples listed in cpg.assoc for ways in which to use cpg.work.
## Just replace cpg.assoc with cpg.work in those calls.
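
As a rough sketch of a direct call, the following uses simulated data rather than the package's example data sets. The argument names come from the Usage section above; the simulated objects (`beta`, `pheno`, `covs`) are invented for illustration, and the call is only attempted if CpGassoc is installed. Treat it as a starting point under those assumptions, not as a tested recipe.

```r
set.seed(42)
n_sites <- 50
n_samples <- 20

# Simulated beta values: 1 row per CpG site, 1 column per individual.
beta <- matrix(runif(n_sites * n_samples, 0.05, 0.95), nrow = n_sites,
               dimnames = list(paste0("cg", seq_len(n_sites)),
                               paste0("s", seq_len(n_samples))))
pheno <- rnorm(n_samples)                               # variable of interest
covs  <- data.frame(age = rnorm(n_samples, 50, 10))     # one covariate

# Only run the package call when CpGassoc is available.
if (requireNamespace("CpGassoc", quietly = TRUE)) {
  fit <- CpGassoc::cpg.work(beta, pheno, covariates = covs,
                            fdr.method = "BH")
  # 'results' is documented above as the per-site statistics and p-values.
  print(head(fit$results))
}
```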

[Package CpGassoc version 2.70 Index]