genobaypass2pooldata {poolfstat}

R Documentation

Convert BayPass read count and haploid pool size input files into a pooldata object

Description

Convert BayPass read count and haploid pool size input files into a pooldata object

Usage

genobaypass2pooldata(
  genobaypass.file = "",
  poolsize.file = "",
  snp.pos = NA,
  poolnames = NA,
  min.cov.per.pool = -1,
  max.cov.per.pool = 1e+06,
  min.maf = -1,
  verbose = TRUE
)

Arguments

`genobaypass.file`	The name (or a path) of the BayPass read count file (see the BayPass manual https://forgemia.inra.fr/mathieu.gautier/baypass_public/)
`poolsize.file`	The name (or a path) of the BayPass (haploid) pool size file (see the BayPass manual https://forgemia.inra.fr/mathieu.gautier/baypass_public/)
`snp.pos`	An optional two-column matrix with nsnp rows containing the chromosome (or contig/scaffold) of origin and the position of each marker
`poolnames`	A character vector with the names of the pools
`min.cov.per.pool`	Minimal allowed read count (per pool). If at least one pool is covered by fewer than min.cov.per.pool reads, the position is discarded
`max.cov.per.pool`	Maximal allowed read count (per pool). If at least one pool is covered by more than max.cov.per.pool reads, the position is discarded
`min.maf`	Minimal allowed Minor Allele Frequency (computed from the ratio of the overall read count for the reference allele to the overall read coverage)
`verbose`	If TRUE extra information is printed on the terminal
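
The three filtering arguments can be illustrated on toy data with base R. This is only a sketch of the rules as worded above, not the package's implementation: the matrices, the threshold values, and all variable names are made up, and whether the comparisons are strict or non-strict at the boundaries is an assumption.

```r
# Toy read counts: 4 SNPs (rows) x 3 pools (columns); values are made up.
ref_count <- matrix(c(10,  5,  20,
                       0,  0,   0,
                      30, 40,  35,
                       8,  2, 150),
                    nrow = 4, byrow = TRUE)
coverage  <- matrix(c(20, 25,  40,
                      15, 30,  22,
                      60, 80,  70,
                      16,  4, 300),
                    nrow = 4, byrow = TRUE)

# Hypothetical thresholds, chosen only for this illustration.
min_cov <- 10
max_cov <- 200
min_maf <- 0.05

# A SNP is dropped as soon as one pool falls outside the coverage bounds.
cov_ok <- apply(coverage >= min_cov & coverage <= max_cov, 1, all)

# MAF from the overall reference read count over the overall read coverage.
ref_freq <- rowSums(ref_count) / rowSums(coverage)
maf <- pmin(ref_freq, 1 - ref_freq)
maf_ok <- maf >= min_maf  # assumption: non-strict comparison

keep <- cov_ok & maf_ok
print(keep)
```

In this toy case the second SNP fails the MAF filter (no reference reads in any pool) and the fourth fails both coverage bounds (one pool with 4 reads, another with 300). With the default values (-1, 1e+06 and -1), SNPs are in practice not filtered.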

Details

Information on SNP position is only required for some graphical displays or to carry out block-jackknife estimation of confidence intervals. If no mapping information is given (default), SNPs will be assumed to be ordered on the same chromosome and separated by 1 bp. As blocks are defined by a number of consecutive SNPs (rather than by a length), the latter assumption actually has no effect (except on the reported estimated block sizes in Mb).
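
As a rough base R illustration of the paragraph above, the sketch below builds a two-column matrix of the kind expected by snp.pos, then shows why blocks defined by SNP counts do not depend on the coordinates. Chromosome names, positions, the block size, and all variable names are invented, and the block assignment is deliberately simplified; it is not how the package computes its blocks.

```r
nsnp <- 10

# Explicit mapping information: chromosome name and position of each SNP
# (names and coordinates are invented for this illustration).
snp_pos <- cbind(rep(c("chr1", "chr2"), each = 5),
                 c(120, 560, 1045, 2300, 8000, 15, 410, 990, 1200, 4300))

# What the default amounts to, as described above: a single chromosome with
# SNPs 1 bp apart ("chr1" is a placeholder name, not the package's label).
default_pos <- cbind(rep("chr1", nsnp), seq_len(nsnp))

# Blocks made of a fixed number of consecutive SNPs depend only on SNP order,
# so they come out the same whichever coordinates are used.
snp_per_block <- 5
block_id <- ceiling(seq_len(nsnp) / snp_per_block)
print(table(block_id))
```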

Value

A pooldata object containing 7 elements:

"refallele.readcount": a matrix with nsnp rows and npools columns containing read counts for the reference allele (chosen arbitrarily) in each pool
"readcoverage": a matrix with nsnp rows and npools columns containing read coverage in each pool
"snp.info": a matrix with nsnp rows and four columns containing respectively the contig (or chromosome) name (1st column) and position (2nd column) of the SNP; the allele taken as reference in the refallele.readcount matrix (3rd column); and the alternative allele (4th column)
"poolsizes": a vector of length npools containing the haploid pool sizes
"poolnames": a vector of length npools containing the names of the pools
"nsnp": a scalar corresponding to the number of SNPs
"npools": a scalar corresponding to the number of pools
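
To make the relation between the input files and the first two elements concrete, here is a minimal base R sketch on toy data. It assumes the layout described in the BayPass manual (one row per SNP, two consecutive columns per pool holding the read counts of the two alleles) and that the first allele of each pair is the one taken as reference; the values and variable names are made up, and this is not the package's own code.

```r
# Toy BayPass-style read count table: 3 SNPs x 2 pools, i.e. 4 columns
# (allele 1 and allele 2 counts for pool 1, then for pool 2). Values are made up.
geno <- matrix(c(12,  8, 30, 10,
                  0, 25,  5, 45,
                 17, 17, 20,  0),
               nrow = 3, byrow = TRUE)
poolsizes <- c(50, 50)  # haploid pool sizes, one per pool

nsnp <- nrow(geno)
npools <- ncol(geno) / 2

# Assumption: the first column of each pair is the reference allele.
ref_cols <- seq(1, ncol(geno), by = 2)
refallele_readcount <- geno[, ref_cols, drop = FALSE]

# Read coverage is the sum of the two allele counts in each pool.
readcoverage <- geno[, ref_cols, drop = FALSE] + geno[, ref_cols + 1, drop = FALSE]

print(refallele_readcount)
print(readcoverage)
```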

Examples

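 library(poolfstat)
 # Write the package example files to a temporary directory
 make.example.files(writing.dir=tempdir())
 # Build a pooldata object from the example sync file (15 pools, haploid size 50 each)
 pooldata=popsync2pooldata(sync.file=paste0(tempdir(),"/ex.sync.gz"),poolsizes=rep(50,15))
 # Export it in BayPass format, then read the resulting files back in
 pooldata2genobaypass(pooldata=pooldata,writing.dir=tempdir())
 pooldata=genobaypass2pooldata(genobaypass.file=paste0(tempdir(),"/genobaypass"),
                                poolsize.file=paste0(tempdir(),"/poolsize"))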

[Package poolfstat version 2.2.0 Index]