corr_betw_matrices {lineup2}    R Documentation

Calculate correlations between columns of two matrices

Description

For matrices x and y, calculate the correlation between columns of x and columns of y.

Usage

corr_betw_matrices(
  x,
  y,
  what = c("paired", "bestright", "bestpairs", "all"),
  corr_threshold = 0.9,
  align_rows = TRUE,
  cores = 1
)

Arguments

x

A numeric matrix.

y

A numeric matrix with the same number of rows as x.

what

Indicates which correlations to calculate and return. See Value, below.

corr_threshold

Minimum correlation for a pair of columns to be returned when what="bestpairs".

align_rows

If TRUE, align the rows in the two matrices by the row names.

cores

Number of CPU cores to use for parallel calculations. (If 0, use parallel::detectCores().) Alternatively, this can be a set of cluster sockets, as produced by parallel::makeCluster().

Details

Missing values (NA) are ignored, and we calculate the correlation using all complete pairs, as in stats::cor() with use="pairwise.complete.obs".
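
As a small illustration of the pairwise-complete rule, the sketch below uses only stats::cor() on two made-up vectors (a and b are hypothetical data, not part of the package). It shows the behavior that the text compares this function to; it is not the package's own code.

```r
# Two small vectors with missing values in different positions
a <- c(1, 2, NA, 4, 5, 6)
b <- c(2, 4, 6, NA, 10, 13)

# Correlation using only the positions where both values are observed
r_pairwise <- stats::cor(a, b, use = "pairwise.complete.obs")

# The same value, computed by dropping the incomplete pairs by hand
ok <- !is.na(a) & !is.na(b)
r_manual <- stats::cor(a[ok], b[ok])

# With the default use = "everything", any NA makes the result NA
r_default <- stats::cor(a, b)
```

Here r_pairwise and r_manual agree, while r_default is NA.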

Value

If what="paired", the return value is a vector of correlations between each column of x and the corresponding column of y. In this case, x and y must have the same number of columns.

If what="bestright", we return a data frame of size ncol(x) by 3. The ith row contains the maximum correlation between column i of x and any column of y, followed by the index and the name of the y column with that correlation. (In case of ties, we give the first one.)

If what="bestpairs", we return a data frame with five columns, containing all pairs of columns (with one in x and one in y) with correlation >= corr_threshold. Each row corresponds to a column pair, and contains the correlation and then the x- and y-column indices followed by the x- and y-column names.

If what="all", the output is a matrix of size ncol(x) by ncol(y), with all correlations between columns of x and columns of y.
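
The four return shapes can be related to one another with a rough base-R sketch. This assumes simulated matrices x and y, and the data-frame column names (corr, xindex, yindex, xcol, ycol) are illustrative choices, not names confirmed for the package's output. It derives everything from a full correlation matrix, which may differ from how the package computes each case.

```r
set.seed(1)
# Simulated data: y is a noisy copy of x (hypothetical example matrices)
x <- matrix(rnorm(40), nrow = 10, dimnames = list(NULL, paste0("x", 1:4)))
y <- x + matrix(rnorm(40, sd = 0.3), nrow = 10)
colnames(y) <- paste0("y", 1:4)

# Like what = "all": an ncol(x) by ncol(y) matrix of correlations
all_corr <- stats::cor(x, y, use = "pairwise.complete.obs")

# Like what = "paired": column i of x against column i of y (the diagonal)
paired <- diag(all_corr)

# Like what = "bestright": for each x column, the best-matching y column.
# which.max() returns the first maximum, so ties go to the first column.
best_idx <- unname(apply(all_corr, 1, which.max))
bestright <- data.frame(
  corr   = all_corr[cbind(seq_len(ncol(x)), best_idx)],
  yindex = best_idx,
  ycol   = colnames(y)[best_idx]
)

# Like what = "bestpairs": all (x, y) column pairs at or above the threshold
corr_threshold <- 0.9
hits <- which(all_corr >= corr_threshold, arr.ind = TRUE)
bestpairs <- data.frame(
  corr   = all_corr[hits],
  xindex = unname(hits[, "row"]),
  yindex = unname(hits[, "col"]),
  xcol   = colnames(x)[hits[, "row"]],
  ycol   = colnames(y)[hits[, "col"]]
)
```

Computing the full matrix first is the simplest way to show how the results relate, but it is wasteful when only the paired or best-match values are needed.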

See Also

dist_betw_matrices(), dist_betw_arrays()

Examples

# use the provided data, and first align the rows
aligned <- align_matrix_rows(lineup2ex$gastroc, lineup2ex$islet)

# correlation of each column in x with the corresponding column in y
result_pairs <- corr_betw_matrices(aligned[[1]], aligned[[2]], "paired")

# subset columns to those with correlation > 0.75
gastroc <- lineup2ex$gastroc[,result_pairs > 0.75]
islet <- lineup2ex$islet[,result_pairs > 0.75]

# similarity matrix for the two sets of rows
# (by transposing and using what="all")
corr_betw_samples <- corr_betw_matrices(t(gastroc), t(islet), "all")

# for each column in x, find most correlated column in y
# (max in each row of corr_betw_samples)
bestright <- corr_betw_matrices(t(gastroc), t(islet), "bestright")

# column pairs with correlation at or above a threshold
bestpairs <- corr_betw_matrices(t(gastroc), t(islet), "bestpairs", corr_threshold=0.8)


[Package lineup2 version 0.6 Index]