corr_betw_matrices {lineup2} | R Documentation |
Calculate correlations between columns of two matrices
Description
For matrices x and y, calculate the correlation between columns of x and columns of y.
Usage
corr_betw_matrices(
x,
y,
what = c("paired", "bestright", "bestpairs", "all"),
corr_threshold = 0.9,
align_rows = TRUE,
cores = 1
)
Arguments
x |
A numeric matrix. |
y |
A numeric matrix with the same number of rows as x. |
what |
Indicates which correlations to calculate and return. See Value, below. |
corr_threshold |
Threshold on correlations if what="bestpairs". |
align_rows |
If TRUE, align the rows in the two matrices by the row names. |
cores |
Number of CPU cores to use, for parallel calculations.
(If |
Details
Missing values (NA) are ignored, and we calculate the correlation using all complete pairs, as in stats::cor() with use="pairwise.complete.obs".
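As a rough illustration of this missing-data handling, the following base-R sketch calls stats::cor() directly on two small made-up matrices. The toy data and object names are assumptions for illustration only and are not part of the package.

```r
# toy matrices (hypothetical data); x has one missing value in column "a"
x <- cbind(a = c(1, 2, 3, 4, NA), b = c(2, 1, 4, 3, 5))
y <- cbind(c = c(1, 2, 3, 4, 5), d = c(5, 3, 4, 1, 2))

# each correlation uses only the rows where both columns are observed
r_all <- stats::cor(x, y, use = "pairwise.complete.obs")

# for columns "a" and "c", that means rows 1-4 only
r_ac <- stats::cor(x[1:4, "a"], y[1:4, "c"])
```

Here r_all["a", "c"] matches r_ac, while the correlations involving column "b" use all five rows.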
Value
If what="paired", the return value is a vector of correlations, between columns of x and the corresponding column of y. x and y must have the same number of columns.
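To make the "paired" case concrete, here is a minimal base-R sketch that computes the same kind of column-by-column correlations with stats::cor(). It is not the package's implementation; the toy data are made up, and naming the result by column is an assumption.

```r
# toy matrices (hypothetical data) with the same number of columns
set.seed(1)
x <- matrix(rnorm(40), ncol = 4, dimnames = list(NULL, paste0("g", 1:4)))
y <- x + matrix(rnorm(40, sd = 0.5), ncol = 4)
colnames(y) <- colnames(x)

# the paired calculation needs matching column counts
stopifnot(ncol(x) == ncol(y))

# correlate column i of x with column i of y
paired <- vapply(seq_len(ncol(x)),
                 function(i) stats::cor(x[, i], y[, i],
                                        use = "pairwise.complete.obs"),
                 numeric(1))
names(paired) <- colnames(x)  # naming is an assumption of this sketch
```

The result has one value per column, equal to the diagonal of the full correlation matrix between x and y.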
If what="bestright", we return a data frame of size ncol(x) by 3, with the ith row being the maximum correlation between column i of x and a column of y, and then the y-column index and y-column name with that correlation. (In case of ties, we give the first one.)
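The shape of the "bestright" result can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the toy data are made up, and the data frame column names (corr, yindex, ycol) are hypothetical.

```r
# toy matrices (hypothetical data)
x <- cbind(a = c(1, 2, 3, 4, 5), b = c(5, 4, 3, 2, 1))
y <- cbind(u = c(5, 3, 4, 1, 2), v = c(1, 2, 3, 5, 4), w = c(2, 4, 6, 8, 10))

# all correlations: rows are columns of x, columns are columns of y
r <- stats::cor(x, y, use = "pairwise.complete.obs")

# which.max() returns the first maximum, so ties go to the first y column
best_idx <- unname(apply(r, 1, which.max))

# one row per column of x; column names here are hypothetical
bestright <- data.frame(corr = r[cbind(seq_len(nrow(r)), best_idx)],
                        yindex = best_idx,
                        ycol = colnames(y)[best_idx],
                        row.names = colnames(x))
```

In this toy example, column a of x is matched to column w of y, and column b is matched to column u.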
If what="bestpairs", we return a data frame with five columns, containing all pairs of columns (with one in x and one in y) with correlation ≥ corr_threshold. Each row corresponds to a column pair, and contains the correlation and then the x- and y-column indices followed by the x- and y-column names.
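A comparable base-R sketch of the "bestpairs" selection is shown below. It is only an approximation of the described output: the toy data, the threshold of 0.85, and the data frame column names are assumptions for illustration.

```r
# toy matrices (hypothetical data)
x <- cbind(a = c(1, 2, 3, 4, 5), b = c(5, 4, 3, 2, 1))
y <- cbind(u = c(5, 3, 4, 1, 2), v = c(1, 2, 3, 5, 4), w = c(2, 4, 6, 8, 10))

r <- stats::cor(x, y, use = "pairwise.complete.obs")
corr_threshold <- 0.85  # illustrative value, not the function's default

# (row, col) positions of all correlations at or above the threshold
hits <- which(r >= corr_threshold, arr.ind = TRUE)

# correlation, then x- and y-column indices, then x- and y-column names
bestpairs <- data.frame(corr = r[hits],
                        xindex = hits[, "row"],
                        yindex = hits[, "col"],
                        xcol = colnames(x)[hits[, "row"]],
                        ycol = colnames(y)[hits[, "col"]],
                        row.names = NULL)
```

Only the pairs (a, v) and (a, w) pass the threshold here, so bestpairs has two rows.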
If what="all", the output is a matrix of size ncol(x) by ncol(y), with all correlations between columns of x and columns of y.
See Also
dist_betw_matrices(), dist_betw_arrays()
Examples
# use the provided data, and first align the rows
aligned <- align_matrix_rows(lineup2ex$gastroc, lineup2ex$islet)
# correlations between each column in x and the corresponding column in y
result_pairs <- corr_betw_matrices(aligned[[1]], aligned[[2]], "paired")
# subset columns to those with correlation > 0.75
gastroc <- lineup2ex$gastroc[,result_pairs > 0.75]
islet <- lineup2ex$islet[,result_pairs > 0.75]
# similarity matrix for the two sets of rows
# (by transposing and using what="all")
corr_betw_samples <- corr_betw_matrices(t(gastroc), t(islet), "all")
# for each column in x, find most correlated column in y
# (max in each row of corr_betw_samples)
bestright <- corr_betw_matrices(t(gastroc), t(islet), "bestright")
# correlations that exceed a threshold
bestpairs <- corr_betw_matrices(t(gastroc), t(islet), "bestpairs", corr_threshold=0.8)