calc.maps.pc {MDSMap} | R Documentation |
Estimate marker positions using Principal Curves
Description
Reads a text file of pairwise recombination fractions and LOD scores, reduces the data to 2 or 3 dimensions using weighted MDS (wMDS), and projects the result onto a single dimension using principal curves to estimate marker positions.
Usage
calc.maps.pc(fname, spar = NULL, n = NULL, ndim = 2,
weightfn = "lod2", mapfn = "haldane")
Arguments
fname |
Character string giving the name of the file of recombination fractions and LOD scores. It should not contain any suffixes (the file should be a .txt file as described below). |
spar |
Integer - the smoothing parameter for the principal curve. If NULL, it is chosen using leave-one-out cross-validation. |
n |
Vector of integers or character strings containing markers to be omitted from the analysis. |
ndim |
Number of dimensions in which to perform the wMDS and fit the curve - can be 2 or 3. |
weightfn |
Character string specifying the values to use for the weight matrix in the MDS (LOD or LOD^2; the default is "lod2"). |
mapfn |
Character string specifying the map function to apply to the recombination fractions: "haldane" (the default), "kosambi" or "none". |
Details
Reads a file of the form described below and casts the data into matrices of
pairwise recombination fractions and weights determined by the weightfn
parameter (LOD or LOD^2), then calculates a distance matrix using the map
function. Haldane is the default map function, "none" uses the recombination
fractions directly, and the other alternative is Kosambi (see dmap for
details).
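As a rough illustration of what the map functions do, the sketch below applies the standard textbook Haldane and Kosambi formulas in base R. The helper name `rf_to_dist` is hypothetical and is not the package's dmap; distances here are in Morgans, which is an assumption about units rather than something this page states. The cap at 0.499999 follows the file-format notes in this section.

```r
# Hypothetical helper (not MDSMap's dmap): convert recombination fractions
# to map distances using the standard Haldane and Kosambi formulas.
rf_to_dist <- function(rf, mapfn = c("haldane", "kosambi", "none")) {
  mapfn <- match.arg(mapfn)
  # Cap recombination fractions so the logarithms below stay finite.
  rf <- pmin(rf, 0.499999)
  switch(mapfn,
    haldane = -0.5 * log(1 - 2 * rf),
    kosambi = 0.25 * log((1 + 2 * rf) / (1 - 2 * rf)),
    none    = rf
  )
}

# Compare the three options on a few recombination fractions.
rf <- c(0.01, 0.10, 0.25, 0.50)
print(data.frame(
  rf      = rf,
  haldane = rf_to_dist(rf, "haldane"),
  kosambi = rf_to_dist(rf, "kosambi"),
  none    = rf_to_dist(rf, "none")
))
```

For small recombination fractions the three options nearly coincide; they diverge as the fraction approaches 0.5, where Haldane gives the largest distance.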
Performs a weighted MDS on the distance matrix using smacofSym and
smacofSphere (de Leeuw & Mair 2009), then fits a principal curve to map the
resulting configuration to an interval (see principal_curve for details).
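The two steps can be sketched on simulated data, assuming the smacof and princurve packages are installed. This is not the function's internal code: the positions, noise level and equal weights are made up for illustration, and only smacofSym is used.

```r
# Simulated stand-in data: 20 markers with hypothetical true positions.
set.seed(1)
n_markers <- 20
pos <- sort(runif(n_markers))

# Noisy symmetric distance matrix with a zero diagonal.
noise <- matrix(rnorm(n_markers^2, sd = 0.01), n_markers)
D <- abs(as.matrix(dist(pos)) + (noise + t(noise)) / 2)
diag(D) <- 0

# Equal weights as a placeholder for LOD or LOD^2 weights.
W <- matrix(1, n_markers, n_markers)
diag(W) <- 0

if (requireNamespace("smacof", quietly = TRUE) &&
    requireNamespace("princurve", quietly = TRUE)) {
  # Step 1: weighted MDS into 2 dimensions.
  mds <- smacof::smacofSym(D, ndim = 2, weightmat = W)
  # Step 2: fit a principal curve through the configuration.
  pc <- princurve::principal_curve(mds$conf)
  # Arc length along the curve serves as the estimated position.
  est <- pc$lambda - min(pc$lambda)
  # The direction of the curve is arbitrary, so compare via abs().
  print(abs(cor(est, pos)))
}
```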
File names should be of the form fname.txt, and the file is assumed to be tab
or space separated, in the format displayed below. The first entry on the
first row is the number of markers to be analysed. Underneath this is a table
in which the first two columns contain marker names, the third column
contains the pairwise recombination fractions between the markers, and the
fourth column contains the associated LOD score. Note that marker names in the
first column vary more slowly than in the second column. Missing recombination
pairs are acceptable. Recombination fractions greater than 0.499999 are set to
that value.
nmarkers | | | |
marker_1 | marker_2 | recombination fraction | LOD |
1 | 2 | . | . |
1 | 3 | . | . |
1 | 4 | . | . |
. | . | . | . |
. | . | . | . |
. | . | . | . |
2 | 3 | . | . |
2 | 4 | . | . |
. | . | . | . |
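A minimal base R sketch of producing a file in this layout and casting it back into symmetric matrices follows. The marker names and all values are made up, and the sketch assumes the table has no header row (the column labels above are taken to be descriptive only).

```r
# Hypothetical toy data: 4 markers, all 6 pairs, made-up values.
markers <- c("m1", "m2", "m3", "m4")
pairs <- t(combn(markers, 2))  # first column varies more slowly
toy <- data.frame(
  marker_1 = pairs[, 1],
  marker_2 = pairs[, 2],
  rf  = c(0.05, 0.12, 0.20, 0.08, 0.15, 0.07),
  lod = c(30.1, 18.4, 9.7, 25.3, 14.2, 27.8),
  stringsAsFactors = FALSE
)

# First row: number of markers. Then the four-column table, tab separated.
fname <- tempfile(fileext = ".txt")
writeLines(as.character(length(markers)), fname)
write.table(toy, fname, append = TRUE, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = FALSE)

# Read the file back.
nmarkers <- as.integer(readLines(fname, n = 1))
dat <- read.table(fname, skip = 1, stringsAsFactors = FALSE,
                  col.names = c("marker_1", "marker_2", "rf", "lod"))

# Cast into symmetric matrices; pairs absent from the file stay NA.
rfmat <- matrix(NA_real_, nmarkers, nmarkers,
                dimnames = list(markers, markers))
wmat <- rfmat
idx <- cbind(dat$marker_1, dat$marker_2)
rfmat[idx] <- pmin(dat$rf, 0.499999)       # cap as described above
rfmat[idx[, 2:1]] <- pmin(dat$rf, 0.499999)
wmat[idx] <- dat$lod^2                     # LOD^2 weights
wmat[idx[, 2:1]] <- dat$lod^2

print(rfmat)
```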
Value
A list (S3 class pcmap or pcmap3d depending on ndim) with the following elements:
smacofsym |
The unconstrained wMDS results. |
pc |
The results from the principal curve fit. |
distmap |
A symmetric matrix of pairwise distances between markers where the columns are in the estimated order. |
lodmap |
A symmetric matrix of LOD scores associated with the distances in distmap. |
locimap |
A data frame of the markers containing the name of each marker, the number in the configuration plot if that is being used, the position of each marker in order of increasing distance and the nearest neighbour fit of the marker. |
length |
Integer giving the total length of the segment. |
removed |
A vector of the names of markers removed from the analysis. |
locikey |
A data frame showing the number associated with each marker name for interpreting the wMDS configuration plots. |
meannnfit |
The mean across all markers of the nearest neighbour fits. |
References
de Leeuw J, Mair P (2009) Multidimensional scaling using majorization: SMACOF in R. J Stat Softw 31: 1-30 http://www.jstatsoft.org/v31/i03/
Hastie T, Weingessel A, Bengtsson H, Cannoodt R (1999) princurve: Fits a Principal Curve in Arbitrary Dimension. R package version 2.1.2. https://CRAN.R-project.org/package=princurve
See Also
calc.maps.sphere, calc.pair.rf.lod, smacofSym, smacofSphere, map.to.interval, dmap
Examples
# Estimate a map for the example linkage group shipped with MDSMap
map <- calc.maps.pc(system.file("extdata", "lgV.txt", package = "MDSMap"),
                    ndim = 2, weightfn = "lod2", mapfn = "kosambi")
# Plot the result
plot(map)
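The returned list can then be inspected element by element. The sketch below assumes the MDSMap package is installed and uses only the element names listed under Value; the column names of locimap are not documented on this page, so they are printed rather than assumed.

```r
# Element names taken from the Value section of this page.
value_elements <- c("smacofsym", "pc", "distmap", "lodmap", "locimap",
                    "length", "removed", "locikey", "meannnfit")

if (requireNamespace("MDSMap", quietly = TRUE)) {
  lgV <- system.file("extdata", "lgV.txt", package = "MDSMap")
  map <- MDSMap::calc.maps.pc(lgV, ndim = 2, weightfn = "lod2",
                              mapfn = "kosambi")

  # Check which documented elements are present in the result.
  print(value_elements %in% names(map))

  print(head(map$locimap))  # per-marker positions and nearest neighbour fits
  print(map$length)         # total length of the segment
  print(map$meannnfit)      # mean nearest neighbour fit across markers
  print(map$removed)        # markers omitted via the n argument
}
```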