CalcAndWriteDissimilarityMatrix {scellpam} | R Documentation |
CalcAndWriteDissimilarityMatrix
Description
Writes a binary symmetric matrix with the dissimilarities between ROWS of the data stored in a binary matrix in the scellpam package format.
Notice that, differently from the common practice in single cell, the rows represent cells. This is for efficiency reasons and it is transparent
to the user, as long as he/she has generated the binary matrix (with CsvToBinMat, dgCMatToBinMat or SceToBinMat) using the option transpose=TRUE.
The input matrix of vectors can be a full or a sparse matrix. Output matrix type can be float or double type (but look at the comments in 'Details').
Usage
CalcAndWriteDissimilarityMatrix(
ifname,
ofname,
distype = "L2",
restype = "float",
comment = "",
nthreads = 0L
)
Arguments
ifname |
A string with the name of the file containing the counts as a binary matrix, as written by CsvToBinMat, dgCMatToBinMat or SceToBinMat |
ofname |
A string with the name of the binary output file to contain the symmetric dissimilarity matrix. |
distype |
The dissimilarity to be calculated. It must be one of these strings: 'L1', 'L2', 'Pearson', 'Cos' or 'WEuc'. |
restype |
The data type of the result. It can be one of the strings 'float' or 'double'. Default: float (and don't change it unless you REALLY need to...). |
comment |
Comment to be added to the dissimilary matrix. Default: "" (no comment) |
nthreads |
Number of threads to be used for the parallel calculations with this meaning: |
Details
The parameter restype forces the output to be a matrix of either floats or doubles. Precision of float is normally good enough; but if you need
double precision (may be because you expect your results to be in a large range, two to three orders of magnitude), change it.
Nevertheless, notice that this at the expense of double memory usage, which is QUADRATIC with the number of individuals (rows) in your input matrix.
Value
No return value, called for side effects (creates a file)
Examples
Rf <- matrix(runif(50000),nrow=100)
tmpfile1=paste0(tempdir(),"/Rfullfloat.bin")
JWriteBin(Rf,tmpfile1,dtype="float",dmtype="full",
comment="Full matrix of floats, 100 rows, 500 columns")
JMatInfo(tmpfile1)
tmpdisfile1=paste0(tempdir(),"/RfullfloatDis.bin")
# Distance file calculated from the matrix stored as full
CalcAndWriteDissimilarityMatrix(tmpfile1,tmpdisfile1,distype="L2",
restype="float",comment="L2 distance matrix from full",nthreads=0)
JMatInfo(tmpdisfile1)
tmpfile2=paste0(tempdir(),"/Rsparsefloat.bin")
JWriteBin(Rf,tmpfile2,dtype="float",dmtype="sparse",
comment="Sparse matrix of floats, 100 rows, 500 columns")
JMatInfo(tmpfile2)
# Distance file calculated from the matrix stored as sparse
tmpdisfile2=paste0(tempdir(),"/RsparsefloatDis.bin")
CalcAndWriteDissimilarityMatrix(tmpfile2,tmpdisfile2,distype="L2",
restype="float",comment="L2 distance matrix from sparse",nthreads=0)
JMatInfo(tmpdisfile2)
# Read both versions
Dfu<-GetJManyRows(tmpdisfile1,c(1:nrow(Rf)))
Dsp<-GetJManyRows(tmpdisfile2,c(1:nrow(Rf)))
# and compare them
max(Dfu-Dsp)