R: REGENT.model

REGENT.model {REGENT}

R Documentation

REGENT.model

Description

REGENT.model estimates the population distribution of disease risk, and the proportion of the population in each risk category, from genetic (SNP) and environmental risk factors.

Usage


REGENT.model(AnalysisName, LocusFile = NULL, EnvFile = NULL,
             prev = 0.001, cv = 0.05, alpha = 0.05, sims = 100000,
             indsims = 100000, SmallSampAdjust = 0.5, BaseRange = 0.01,
             PlotMax = 5, Block = 100)

Arguments

`AnalysisName`	String, must be provided. Output files will be named according to this argument. Running multiple analyses with the same name will cause previous files to be overwritten.
`LocusFile`	File path string. Location of the file containing the table of SNP input data. Required columns should have headers SNP, MAF, Ncase, Ncontrol. Risks should be provided either in one column with header RR, or in two columns with headers RR_het and RR_hom. Each SNP is a row. Additional columns may be provided but will be ignored.
`EnvFile`	File path string. Location of the file containing the table of environmental risk data. Required columns should have headers Factor, Exposure, RR, SE. If multiple exposure levels exist, the columns should be named Factor, RR1, Exposure1, SE1, RR2, Exposure2, SE2, etc. Each factor is a row. Additional columns may be provided but will be ignored.
`prev`	Prevalence of the disease or trait. Default 0.001.
`cv`	Coefficient of variation. Default 0.05.
`alpha`	One minus the desired confidence level of the intervals around multilocus risk estimates. Default 0.05, giving 95 percent confidence intervals.
`sims`	Number of simulations to perform for each single factor risk estimate, for obtaining confidence intervals. Default 100000.
`indsims`	Number of individuals in the simulated population, for obtaining multilocus genotype frequencies. Default 100000.
`SmallSampAdjust`	Adjustment for small sample sizes, applied when calculating the standard error of homozygous risk genotypes. Default 0.5.
`BaseRange`	Proportion of population used to calculate the baseline risk (the risk closest to the average in the population). This is to avoid choosing rare, uncertain risk estimates by chance. Default 0.01.
`PlotMax`	Value at which to truncate the Y-axis of risk distribution plots. High risks are typically rare and of less interest when assessing the distribution in the population. Default 5.
`Block`	Number of multilocus genotypes held in memory during confidence interval calculation. Higher values should decrease computation time. We advise increasing this substantially (10000+) on high performance systems. Default 100.
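As a rough illustration of the input tables described for LocusFile and EnvFile, the sketch below builds one table of each kind and writes them to disk. Only the column headers come from the argument descriptions above; every value, the SNP and factor names, and the file names are made-up placeholders. It uses the same write.table() defaults as the package example, on the assumption that this is the layout REGENT.model expects.

```r
# Placeholder values only; the column headers are the documented ones.
locus <- data.frame(
  SNP      = c("rs_example1", "rs_example2"),
  MAF      = c(0.25, 0.10),
  Ncase    = c(2000, 1500),
  Ncontrol = c(2000, 3000),
  RR       = c(1.20, 1.35)
)

env <- data.frame(
  Factor   = "ExampleFactor",
  Exposure = 0.30,
  RR       = 1.50,
  SE       = 0.10
)

# Hypothetical file names, written to a temporary directory.
locus_path <- file.path(tempdir(), "MyLoci.txt")
env_path   <- file.path(tempdir(), "MyEnv.txt")

# Same write.table() defaults as the package example.
write.table(locus, file = locus_path)
write.table(env, file = env_path)

# Read the files back to confirm the headers survived the round trip.
locus_check <- read.table(locus_path, header = TRUE)
env_check   <- read.table(env_path, header = TRUE)

required_locus <- c("SNP", "MAF", "Ncase", "Ncontrol")
required_env   <- c("Factor", "Exposure", "RR", "SE")
stopifnot(all(required_locus %in% names(locus_check)))
stopifnot(all(required_env %in% names(env_check)))
```

The two paths could then be passed as LocusFile and EnvFile. The read-back step only checks the headers; it does not confirm how REGENT.model itself parses the files.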

Details

REGENT.model creates four files: (A) the main output file, named after the argument provided to AnalysisName, to which all model details, inputs and log information are written; (B) a colour plot of the risk distribution; (C) a greyscale plot of the risk distribution; and (D) a text file containing the raw data used to create the plots.
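Because output files are named after AnalysisName and reusing a name overwrites earlier results, it may help to check for existing files before running an analysis. The helper below is a minimal sketch and not part of REGENT; it assumes that output file names begin with the analysis name and that files are written to the working directory, neither of which is stated explicitly here.

```r
# Hypothetical helper, not part of REGENT: list files in `dir` whose
# names begin with `analysis_name`.
existing_outputs <- function(analysis_name, dir = ".") {
  files <- list.files(dir)
  files[startsWith(files, analysis_name)]
}

# Report any files that a run named "Example" might overwrite.
found <- existing_outputs("Example")
if (length(found) > 0) {
  message("Files that may be overwritten: ", paste(found, collapse = ", "))
}
```

Matching on a prefix is deliberately broad, so it can also report unrelated files that happen to share the prefix.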

See the example folder included in this package for the correct input file format.

Value

A list including the following elements:

`categories`	Table giving upper and lower boundaries for each risk category: Reduced, Average, Elevated and High.
`baseline`	Single value specifying the baseline risk before rebasing. Required when passing the object to REGENT.predict.
`LocusFile`	Table of genetic data used for analysis. NULL if argument LocusFile was set to NULL.
`EnvFile`	Table of environmental data used for analysis. NULL if argument EnvFile was set to NULL.
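The returned list can be inspected with ordinary list indexing. The helper below is a hypothetical convenience, not part of REGENT; it reports which of the four documented elements are non-NULL, which is one way to see whether a model was fitted with genetic data, environmental data, or both.

```r
# Hypothetical helper, not part of REGENT: report which documented
# elements of the returned list are present (non-NULL).
describe_model <- function(x) {
  stopifnot(is.list(x))
  elements <- c("categories", "baseline", "LocusFile", "EnvFile")
  # [[ ]] matches names exactly and gives NULL for absent elements.
  vapply(elements, function(nm) !is.null(x[[nm]]), logical(1))
}

# With x returned by REGENT.model():
# describe_model(x)
# x[["categories"]]  # boundaries of the risk categories
# x[["baseline"]]    # baseline risk before rebasing
```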

Author(s)

Graham Goddard, Daniel Crouch and Cathryn Lewis. Email: djmcrouch@gmail.com

Examples


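library(REGENT)

# Load the example data sets shipped with the package
data("REGENT")

# Write the example tables to text files in the working directory
write.table(GeneticA, file = "GeneticA.txt")
write.table(GeneticB, file = "GeneticB.txt")
write.table(EnvironmentalA, file = "EnvironmentalA.txt")
write.table(EnvironmentalB, file = "EnvironmentalB.txt")

# Fit the model using both genetic and environmental risk factors
x <- REGENT.model(AnalysisName = "Example", LocusFile = "GeneticA.txt", EnvFile = "EnvironmentalA.txt")

# Print the returned list
x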

[Package REGENT version 1.0.6 Index]