power {pbatR} | R Documentation |
Power Exploration
Description
Power has been completely rewritten from scratch, and is all done via monte carlo simulation internally now. These routines do not require pbat, and should run on any machine.
Usage
pbat.power(mode="continuous")
pbat.powerCmd(
numOffspring=1, numParents=2, numFamilies=500,
additionalOffspringPhenos=TRUE,
ascertainment="affected",
modelGen="additive", modelTest=modelGen,
afreqMarker=NA,
penAA=0.8, penAB=0.5, penBB=0.3,
heritability=0.0, contsAscertainmentLower=0.0,
contsAscertainmentUpper=1.0,
pDiseaseAlleleGivenMarkerAllele=1.0, afreqDSL=0.1,
alpha=0.01,
offset="default",
numSim=1000,
ITERATION_KILLER=200 )
Arguments
mode |
"continuous" or "dichotomous" |
numOffspring |
Family - number of offspring |
numParents |
Family - number of parents (0,1,2) |
numFamilies |
Family - number of families |
additionalOffspringPhenos |
Only used when you have missing parents; additional offspring phenotypes. 1 for yes, 0 for no. |
ascertainment |
'unaffected', 'affected', or 'na' for anyone |
modelGen |
The model used when generating the simulated data - one of 'additive', 'dominant', 'recessive'. |
modelTest |
The model used to test the simulated data. |
afreqMarker |
allele frequency at the marker |
penAA |
penetrance of AA genotype |
penAB |
penetrance of AB genotype |
penBB |
penetrance of BB genotype |
heritability |
heritibility - when this is zero, a binary trait according to the previously defined parameters is used; when it is nonzero, a continuous trait is used. |
contsAscertainmentLower |
Lower bound for affected ascertainment with a continuous trait, this is a vector for each member after the proband, defaulting to ‘0’. It represents the quantiles, so 0.05 would indicate that the lower 5 percent of the phenotypes should be removed. |
contsAscertainmentUpper |
Upper bound, defaults to ‘1’. |
pDiseaseAlleleGivenMarkerAllele |
Pr(Disease allele A|marker allele A) |
afreqDSL |
allele frequency at DSL, defaults to marker frequency. |
alpha |
significance level |
offset |
"default" uses the population prevalence for dichotomous traits and the population mean for continuous traits. If a number is specified, then that number is used as the offset. |
numSim |
Number of monte-carlo simulations. I'd use a smaller number while starting out with it, and then turn it up to a much higher number of iterations later on. |
ITERATION_KILLER |
Controls how many times to try to draw data when simulating, before giving up. A value of 0 indicates to never stop. This is useful if you are playing around and are considering a situation that is too difficult for this program to be able to simulate. |
Details
pbat.powerCmd(...) does not really do any range checking, primarily because I don't expect most will use it directly, and instead will use the friendly GUI interface for power exploration.
Be careful with the number of simulations! When you are first exploring, you can keep this low, but you should turn this all the way up before doing your final computation.
Note that some values of ‘pDiseaseAlleleGivenMarkerAllele’ in combination with ‘afreqMarker’ are not possible. These will return negative values (these are error codes for the GUI, which will provide more helpful messages).
Lastly, you might want to look into something like set.seed(1) e.g., if you want the results to be reproducable (set it to any number, but make note of this number, see set.seed for more details).
References
Hoffmann, T. and Lange, C. (2006) P2BAT: a massive parallel implementation of PBAT for genome-wide association studies in R. Bioinformatics. Dec 15;22(24):3103-5.
Horvath, Steve, Xu, Xin, and Laird, Nan M. The family based association test method: computing means and variances for general statistics. Tech Report.