hapRun {Haplin} | R Documentation |
Simulates genetic data and runs Haplin for each simulation
Description
Calculates Haplin results by first simulating genetic data, allowing a variety of family designs, and then running Haplin on the simulations.
The simulated data may contain fetal effects, maternal effects and/or parent-of-origin effects.
The function allows for simulations and calculations on both autosomal and X-chromosome markers,
assuming Hardy-Weinberg equilibrium.
It enables simulation and calculation of gene-environment interaction effects, i.e., the input (relative risks, number of cases etc.) may vary across strata.
hapRun calls haplin, haplinStrat or haplinSlide to run on the simulated data files.
Usage
hapRun(nall, n.strata= 1, cases, controls, haplo.freq,
RR, RRcm, RRcf, RRstar, RR.mat, RRstar.mat, hapfunc = "haplin",
gen.missing.cases = NULL, gen.missing.controls = NULL,
n.sim = 1000, xchrom = FALSE, sim.comb.sex = "double", BR.girls, dire,
ask = TRUE, cpus = 1, slaveOutfile = "", ...)
Arguments
nall |
A vector of the number of alleles at each locus. |
n.strata |
The number of strata. |
cases |
A list of the number of case families. Each element is a vector of the number of families of the specified family design(s) in the corresponding stratum. The possible family designs, i.e., the possible names of the elements, are |
controls |
A list of the number of control families. Each element is a vector of the number of families of the specified family design(s) in the corresponding stratum. The possible family designs are |
haplo.freq |
A list of which each element is a numeric vector of the haplotype frequencies in each stratum. The frequencies are normalized and sum to one. The Details section shows how to implement this argument in agreement with the possible haplotypes. |
RR |
A list of which each element is a numeric vector of the relative risks in each stratum. The Details section shows how to implement this argument in agreement with the possible haplotypes. |
RRcm |
A list of numeric vectors. Each vector contains the relative risks associated with the haplotypes transmitted from the mother for this stratum. See Details for description of how to implement this argument in agreement with the possible haplotypes. |
RRcf |
A list of numeric vectors. Each vector contains the relative risks associated with the haplotypes transmitted from the father for this stratum. See Details for description of how to implement this argument in agreement with the possible haplotypes. |
RRstar |
A list of numeric vectors. Estimates how much double-dose children would deviate from the risk expected in a multiplicative dose-response relationship. |
RR.mat |
The interpretation is similar to |
RRstar.mat |
The interpretation is similar to |
hapfunc |
Defines which haplin function to run, the options being |
gen.missing.cases |
Generates missing values at random for the case families. Set to |
gen.missing.controls |
Generates missing values at random for the control families. Set to |
n.sim |
The number of simulations, i.e., the number of simulated data files. |
xchrom |
Logical. Equals |
sim.comb.sex |
To be used with |
BR.girls |
To be used with |
dire |
Gives the directory of the simulated data files. Missing by default, which means that none of the simulated files are saved to disk. |
ask |
Logical. If |
cpus |
Allows parallel processing of its analyses. The |
slaveOutfile |
Character. If |
... |
Arguments to be used by |
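As a rough illustration of how the lengths of haplo.freq and RR relate to nall: in the examples on this page, each vector has one entry per possible haplotype, which matches prod(nall). The sketch below is plain base R and does not call Haplin; the variable raw.freq and its values are hypothetical. Normalizing by hand here only makes the sum explicit, and the ordering of the haplotypes is described in hapSim, not assumed here.

```r
# Illustrative only: base R, no Haplin functions are called.
nall <- c(2, 2)                 # two diallelic markers
n.haplo <- prod(nall)           # number of possible haplotypes (4 here)

# Hypothetical unnormalized weights, one per haplotype
raw.freq <- c(2, 1, 1, 1)
haplo.freq <- raw.freq / sum(raw.freq)   # rescale so the frequencies sum to one

# Relative risks need one entry per haplotype as well
RR <- c(2, 1, 1, 1)
stopifnot(length(haplo.freq) == n.haplo, length(RR) == n.haplo)
print(haplo.freq)
```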
Details
hapRun applies haplin, haplinSlide or haplinStrat on each data file simulated by hapSim.
It provides simulations on various family designs, i.e., triads, case-control, the hybrid design, and all intermediate designs.
The simulated files may accommodate fetal effects, maternal effects and/or parent-of-origin effects.
hapRun allows simulation of both autosomal and X-chromosome markers, assuming Hardy-Weinberg equilibrium.
It also enables simulation and calculation of gene-environment interaction effects.
Details on how to implement the arguments listed above are provided by hapSim and the Examples section below.
The stratum specific arguments may be simplified if the number of strata is one, or if the arguments are equal across all strata.
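One way to picture this simplification: a single vector stands for the same value in every stratum, whereas a list gives one vector per stratum. The helper expand_strata below is hypothetical and written for illustration only; it is not part of Haplin and is not meant to describe how hapRun processes its arguments internally.

```r
# Hypothetical helper, NOT part of Haplin: expand a stratum-specific
# argument to a list with one element per stratum.
expand_strata <- function(arg, n.strata) {
  if (!is.list(arg)) {
    # A plain vector is taken to apply to every stratum
    return(rep(list(arg), n.strata))
  }
  if (length(arg) == 1) {
    # A one-element list is repeated for every stratum
    return(rep(arg, n.strata))
  }
  # Otherwise the list must already have one element per stratum
  stopifnot(length(arg) == n.strata)
  arg
}

# Same relative risks in both strata, given as a single vector
RR.same <- expand_strata(c(2, 1, 1, 1), n.strata = 2)
# Different relative risks per stratum, given as a list
RR.diff <- expand_strata(list(c(1.5, 1, 1, 1), c(1, 1, 1, 1)), n.strata = 2)
str(RR.same)
str(RR.diff)
```

The second example under Examples uses both forms in one call: cases is a single vector, while controls and RR are lists with one element per stratum.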
haplin, haplinStrat and haplinSlide will run with default values unless otherwise specified by hapRun.
For example, if hapfunc = "haplin", haplin will use response = "free" unless response = "mult" is explicitly given as an argument in hapRun.
Moreover, triads with missing data are only included in the haplin analysis if the argument use.missing equals TRUE (default in hapRun). Please see https://haplin.bitbucket.io/docu/Haplin_power.pdf for further details and examples.
For information on the arguments to be passed on to haplin, haplinStrat and haplinSlide, please consult their help pages.
Note that RR.mat and RRstar.mat are required for hapSim to simulate maternal effects, and RRcm and RRcf are required to simulate parent-of-origin effects.
To estimate these effects, however, the arguments maternal = TRUE and/or poo = TRUE must also be specified.
gen.missing.cases and gen.missing.controls are flexible arguments. By default, both equal NULL, which means that no missing data are generated at random.
If the arguments are single numbers, missing data are generated at random with this proportion for all cases and/or controls.
If the arguments are vectors of length equal to the number of loci, missing data are generated with the corresponding proportion for each locus.
The arguments can also be matrices with the number of rows equal to the number of loci and three columns.
Each row corresponds to a locus, and the columns correspond to mothers, fathers and children, respectively.
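The four forms described above can be written out as follows. This is a minimal base R sketch with made-up proportions; the object names and the column labels on the matrix are illustrative assumptions and are not required by hapRun as far as this page states.

```r
# Illustrative values only; base R, no Haplin functions are called.
n.loci <- 2

miss.none  <- NULL            # default: no missing data generated
miss.all   <- 0.05            # one proportion applied throughout
miss.locus <- c(0.05, 0.10)   # one proportion per locus

# One row per locus; columns are mothers, fathers and children, in that order.
# The dimnames are labels added for readability (an assumption, not a requirement).
miss.member <- matrix(c(0.00, 0.20, 0.00,
                        0.00, 0.20, 0.05),
                      nrow = n.loci, ncol = 3, byrow = TRUE,
                      dimnames = list(NULL, c("mother", "father", "child")))

stopifnot(length(miss.locus) == n.loci, nrow(miss.member) == n.loci)
print(miss.member)
```

Any one of these objects could then be supplied as gen.missing.cases or gen.missing.controls, or wrapped in a list with one element per stratum as in the second example under Examples.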
Value
If hapfunc = "haplin", hapRun returns a dataframe consisting of results from running haplin on each simulated file.
The first two columns are:
sim.no |
The name of the directory from which the results are calculated, i.e., the simulation number |
row.no |
The row number within each simulation |
See haptable for a detailed description of the remaining columns of the dataframe.
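A common use of such a dataframe is to summarize one quantity across simulations, for instance the share of simulations in which a test rejects at the 5% level. The sketch below uses a small mock data frame with made-up values rather than real hapRun output. The columns sim.no and row.no are documented above; the p-value column name pv.overall is an assumption here and should be checked against haptable.

```r
# Mock results standing in for hapRun output; all values are made up.
# pv.overall is an ASSUMED column name, see haptable for the actual columns.
res <- data.frame(
  sim.no     = rep(1:4, each = 2),
  row.no     = rep(1:2, times = 4),
  pv.overall = rep(c(0.01, 0.20, 0.03, 0.60), each = 2)
)

# Keep one p-value per simulation (the first row of each simulation)
pv <- res$pv.overall[res$row.no == 1]

# Share of simulations rejecting the null hypothesis at the 5% level
power.est <- mean(pv < 0.05)
print(power.est)
```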
If hapfunc = "haplinSlide", hapRun returns a list of which each element contains the results from a single run of haplinSlide.
Consult suest for a thorough description of the output. Note, however, that hapfunc = "haplinSlide" is currently only implemented for diallelic markers, and the reference category is always chosen to be the first haplotype (see hapSim for a description of the haplotype grid).
If hapfunc = "haplinStrat", haplinStrat is used to estimate gene-effects in each stratum of the exposure covariate, and the results from all strata are compared using gxe. hapRun returns a list, where each element is the result of a single run of gxe.
Additionally, if dire is specified, the simulated files from which the Haplin results are calculated are stored in the given directory.
Author(s)
Miriam Gjerdevik,
with Hakon K. Gjessing
Professor of Biostatistics
Division of Epidemiology
Norwegian Institute of Public Health
References
Web Site: https://haplin.bitbucket.io
See Also
haplin, haplinSlide, hapSim, haptable, suest, hapPower, hapPowerAsymp
Examples
## Not run:
## Simulate Haplin results from 100 files using the multiplicative model in haplin.
## The files consist of fetal effects at two diallelic markers,
## corresponding to haplo.freq = rep(0.25, 4), RR = c(2,1,1,1)
## and RRstar = c(1,1,1,1). That is, the first haplotype has a doubled risk
## relative to the rest. The data consists of a combination of
## 100 case triads and 100 control triads with no missing data.
## No environmental factors are considered, i.e. the number of strata is one.
hapRun(nall = c(2,2), n.strata = 1, cases = c(mfc=100), controls = c(mfc=100),
haplo.freq = rep(0.25,4), RR = c(2,1,1,1), RRstar = c(1,1,1,1),
hapfunc = "haplin", response = "mult", n.sim = 100, dire = "simfiles", ask = FALSE)
## Simulate power from 100 files applying haplinStrat.
## The files consist of fetal and maternal effects at two diallelic markers.
## The data is simulated for 500 case triads and 200 control families in the first stratum,
## and 500 case triads and 500 control triads in the second.
## The fetal effects vary across strata,
## whereas the maternal effects are the same.
## One percent of the case triads are missing at random in the second stratum.
hapRun(nall = c(2,2), n.strata = 2, cases = c(mfc=500),
controls = list(c(mfc=200),c(mfc=500)), haplo.freq = rep(0.25,4), maternal = TRUE,
RR = list(c(1.5,1,1,1),c(1,1,1,1)), RRstar = c(1,1,1,1),
RR.mat = c(1.5,1,1,1), RRstar.mat = c(1,1,1,1),
gen.missing.cases = list(NULL,0.01), use.missing = TRUE, hapfunc = "haplinStrat",
n.sim = 100, ask = FALSE)
## Simulate Haplin results from 100 files using haplin.
## The files consist of fetal effects at one diallelic locus,
## corresponding to haplo.freq = rep(0.5,2), RR = c(1.5,1) and RRstar = c(1,1).
## We have a combination of 100 case triads and
## 100 control triads with no missing data.
## No environmental effects are considered.
hapRun(nall = c(2), n.strata = 1, cases = c(mfc=100), controls = c(mfc=100),
haplo.freq = rep(0.5,2), RR = c(1.5,1), RRstar = c(1,1),
hapfunc = "haplin", n.sim = 100, dire = "simfiles", ask = FALSE)
## End(Not run)