fullscan {locStra} | R Documentation |
A full scan of the input data m using a collection of windows given by the two-column matrix windows
Description
A full scan of the input data m using a collection of windows given by the two-column matrix windows. For each window, the data is processed using the function matrixFunction (e.g., covMatrix), the processed data is summarized using the function summaryFunction (e.g., the eigenvector of the largest eigenvalue, computed with the function powerMethod), and the global and local summaries are compared using the function comparisonFunction (e.g., the vector correlation computed with R's function cor). The function returns a two-column matrix with one row per window. The first column contains the global summary statistic (e.g., the correlation between the global and local eigenvectors) and the second column contains the local summary statistic (e.g., the correlation between the local eigenvectors of the previous and current windows).
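The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the locStra source: the name fullscanSketch, the helper leadingEigenvector, the simulated matrix, and the choice to leave the local entry of the first window at 0 are all hypothetical. Base R's cov and eigen stand in for covMatrix and powerMethod.

```r
# Hypothetical base-R sketch of a windowed scan; not the locStra implementation.
fullscanSketch <- function(m, windows, matrixFunction, summaryFunction, comparisonFunction) {
  # Summary of the full data set (the "global" summary)
  globalSummary <- summaryFunction(matrixFunction(m))
  res <- matrix(0, nrow = nrow(windows), ncol = 2)
  previous <- NULL
  for (i in seq_len(nrow(windows))) {
    # Rows of m that belong to window i
    rows <- windows[i, 1]:windows[i, 2]
    localSummary <- summaryFunction(matrixFunction(m[rows, , drop = FALSE]))
    # Column 1: global summary versus the summary of the current window
    res[i, 1] <- comparisonFunction(globalSummary, localSummary)
    # Column 2: previous window versus current window (assumed 0 for the first window)
    if (!is.null(previous)) res[i, 2] <- comparisonFunction(previous, localSummary)
    previous <- localSummary
  }
  res
}

# Simulated data: 200 rows (e.g., variants) and 6 columns (individuals)
set.seed(1)
m <- matrix(rbinom(200 * 6, 1, 0.3), nrow = 200, ncol = 6)
# Four non-overlapping windows of 50 rows each, one window per row
windows <- cbind(seq(1, 151, by = 50), seq(50, 200, by = 50))
# Stand-in for powerMethod: eigenvector of the largest eigenvalue
leadingEigenvector <- function(x) eigen(x, symmetric = TRUE)$vectors[, 1]

res <- fullscanSketch(m, windows, cov, leadingEigenvector, cor)
print(res)
```

Because the sign of an eigenvector is arbitrary, the sign of each correlation in such a result is not meaningful on its own.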
Usage
fullscan(m, windows, matrixFunction, summaryFunction, comparisonFunction)
Arguments
m | A (sparse) matrix for which the full scan is sought. The input matrix is assumed to contain the data for one individual per column. |
windows | A two-column matrix containing one window per row on which the data is scanned. The windows may overlap. The windows can be computed using the function makeWindows. |
matrixFunction | Function of one matrix argument that processes the data for each window (e.g., computes the covariance matrix). |
summaryFunction | Function of one argument that summarizes the output of the function matrixFunction. |
comparisonFunction | Function of two arguments that computes a comparison measure for the output of the function summaryFunction. |
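To make the shape of the windows argument concrete, here is a minimal base-R sketch that builds such a matrix by hand. The helper makeWindowsSketch and its parameters len, size and offset are hypothetical names chosen for illustration; the sketch assumes each row holds the first and last row index of one window, and it makes no claim about how locStra constructs its windows internally.

```r
# Hypothetical helper: one window per row, given as first and last row index.
makeWindowsSketch <- function(len, size, offset) {
  starts <- seq(1, len, by = offset)
  # Truncate the last window(s) at the end of the data
  ends <- pmin(starts + size - 1, len)
  cbind(starts, ends)
}

# Non-overlapping windows: the offset equals the window size
wDisjoint <- makeWindowsSketch(10, 4, 4)
# Overlapping windows: the offset is smaller than the window size
wOverlap <- makeWindowsSketch(10, 4, 2)
print(wDisjoint)
print(wOverlap)
```

Choosing an offset smaller than the window size gives a smoother scan at the cost of more windows to process.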
Value
A two-column matrix containing per row the global and local summary statistics for each window. Plotting the correlation data of the returned matrix gives a figure like the one generated by the example code below.
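The example code plots the absolute value of the returned correlations. One reason to do so, assuming the summaries are eigenvectors, is that an eigenvector is only defined up to its sign, so the raw correlation can flip between positive and negative without any change in the underlying structure. A small base-R illustration with a made-up vector v:

```r
# Made-up vector standing in for an eigenvector
v <- c(0.5, -0.2, 0.7, 0.1)
# v and -v span the same direction, yet their correlation is negative
corSame <- cor(v, v)
corFlipped <- cor(v, -v)
# Taking the absolute value removes the dependence on the sign
corAbs <- abs(corFlipped)
print(c(corSame, corFlipped, corAbs))
```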
References
Dmitry Prokopenko, Julian Hecker, Edwin Silverman, Marcello Pagano, Markus Noethen, Christian Dina, Christoph Lange and Heide Fier (2016). Utilizing the Jaccard index to reveal population stratification in sequencing data: a simulation study and an application to the 1000 Genomes Project. Bioinformatics, 32(9):1366-1372.
Examples
library(locStra)
library(Matrix)
data(testdata)
# Correlation that returns 0 if either vector sums to zero
cor2 <- function(x, y) if (sum(x) == 0 || sum(y) == 0) 0 else cor(x, y)
# Windows over the rows of testdata
windowSize <- 10000
w <- makeWindows(nrow(testdata), windowSize, windowSize)
# Full scan with four different similarity matrices
resCov <- fullscan(testdata, w, covMatrix, powerMethod, cor2)
resJac <- fullscan(testdata, w, jaccardMatrix, powerMethod, cor2)
resSMx <- fullscan(testdata, w, sMatrix, powerMethod, cor2)
resGRM <- fullscan(testdata, w, grMatrix, powerMethod, cor2)
# Column 1 of each result holds the global-versus-local comparison
resAll <- cbind(resCov[, 1], resJac[, 1], resSMx[, 1], resGRM[, 1])
xlabel <- "SNP position"
ylabel <- "correlation between global and local eigenvectors"
mainlabel <- paste("window size", windowSize)
# Plot absolute values, since the sign of an eigenvector is arbitrary
matplot(w[, 1], abs(resAll), type = "b", xlab = xlabel, ylab = ylabel, ylim = c(0, 1), main = mainlabel)
legend("topright", legend = c("Cov", "Jaccard", "s-Matrix", "GRM"), pch = paste(1:ncol(resAll)))