extractSubMatrix {NIMAA}    R Documentation

Extract the non-missing submatrices from a given matrix.

Description

This function arranges the input matrix and extracts the submatrices with non-missing values or with a specific proportion of missing values (except for the elements-max submatrix). The result is also shown as a plotly figure.

Usage

extractSubMatrix(
  x,
  shape = "All",
  verbose = FALSE,
  palette = "Greys",
  row.vars = NULL,
  col.vars = NULL,
  bar = 1,
  plot_weight = FALSE,
  print_skim = FALSE
)

Arguments

x

A matrix.

shape

A string array indicating the shape of the submatrix. The default is "All"; the other options are "Square", "Rectangular_row", "Rectangular_col", and "Rectangular_element_max".

verbose

A logical value. If TRUE, the plot is saved as a .png file in the working directory. By default, it is FALSE.

palette

A string or number. Color palette used for the visualization. By default, it is "Greys".

row.vars

A string, the name for the row variable.

col.vars

A string, the name for the column variable.

bar

A numeric value. The cut-off, expressed as the proportion of non-missing values required in the submatrices. By default, it is set to 1, indicating that no missing values are permitted in the submatrices. This argument is not applicable to the elements-max submatrix.

plot_weight

A logical value. If TRUE, the function prints the submatrices with their weights; otherwise it prints the submatrices with all weights set to 1. By default, it is FALSE.

print_skim

A logical value. If TRUE, the function prints skim information in the console. By default, it is FALSE.
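
To make the bar cut-off concrete, the following is a minimal base R sketch of what a "proportion of non-missing values" check could look like. The helper non_missing_share is hypothetical and not part of NIMAA, and the sketch assumes the cut-off is compared against the share of observed cells in a whole block; the package may apply it differently.

```r
# Hypothetical helper, not part of NIMAA: share of observed (non-NA) cells.
non_missing_share <- function(m) mean(!is.na(m))

# Toy block with one missing cell out of four (illustrative data only).
block <- matrix(c(1, NA, 3, 4), nrow = 2)

share <- non_missing_share(block)  # 3 of 4 cells are observed

# Assumed reading of the cut-off: keep a block if its share reaches bar.
passes_strict  <- share >= 1     # bar = 1: no missing values allowed
passes_relaxed <- share >= 0.75  # bar = 0.75: a quarter may be missing
```

Under this reading, the toy block would be rejected with bar = 1 and accepted with bar = 0.75.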

Details

This function performs row- and column-wise preprocessing in order to extract the largest submatrices. The two differ in their input: the row-wise pass uses the original input matrix, whereas the column-wise pass uses the transposed matrix. The function then performs a "three-step arrangement" on the matrix: first a row-by-row arrangement, then a column-by-column arrangement, and finally a total rearrangement. Then, using four strategies, namely "Square", "Rectangular_row", "Rectangular_col", and "Rectangular_element_max", the function finds the largest possible submatrix (with no missing values), outputs the result, and prints the visualization. "Square" denotes the square submatrix, which has the same number of rows and columns. "Rectangular_row" denotes the submatrix with the most rows. "Rectangular_col" denotes the submatrix with the most columns. "Rectangular_element_max" denotes the submatrix with the most elements, which is typically rectangular.
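
The idea of arranging a matrix and then reading off a complete "Square" submatrix can be illustrated with a small base R sketch. This is a simplified illustration under stated assumptions, not the algorithm NIMAA uses: it sorts rows and columns by their count of observed values and then takes the largest top-left square block without missing values. The toy matrix and the helper largest_complete_square are hypothetical.

```r
# Toy matrix with missing values (illustrative data, not from NIMAA).
m <- matrix(
  c( 1, 2,  3, NA,
     4, 5, NA, NA,
     7, 8,  9,  1,
    NA, 2, NA, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("r", 1:4), paste0("c", 1:4))
)

# Arrange rows and columns so those with the most observed values come first.
row_order <- order(rowSums(!is.na(m)), decreasing = TRUE)
col_order <- order(colSums(!is.na(m)), decreasing = TRUE)
arranged <- m[row_order, col_order, drop = FALSE]

# Hypothetical helper: largest k such that the top-left k x k block is complete.
largest_complete_square <- function(a) {
  best <- 0
  for (k in seq_len(min(dim(a)))) {
    if (!anyNA(a[seq_len(k), seq_len(k), drop = FALSE])) best <- k
  }
  a[seq_len(best), seq_len(best), drop = FALSE]
}

square <- largest_complete_square(arranged)
```

In this toy case the arranged matrix puts r3 and c2 first, and the resulting square block contains no missing values. A greedy top-left search like this is not guaranteed to find the largest complete submatrix in general, which is one reason a more careful multi-step arrangement is useful.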

Value

A matrix or a list of matrices containing no missing values (bar = 1) or only a few missing values. A heat map is also generated to visualize the topology of missing values and how the submatrices are subset from the original incidence matrix. Additionally, the nestedness temperature is included to indicate whether the original incidence matrix should be divided into several incidence matrices beforehand.

See Also

arrange, arrange_if

Examples

# load part of the beatAML data
beatAML_data <- NIMAA::beatAML[1:10000,]

# convert to incidence matrix
beatAML_incidence_matrix <- nominalAsBinet(beatAML_data)

# extract submatrices with non-missing values
sub_matrices <- extractSubMatrix(beatAML_incidence_matrix, col.vars = "patient_id",
 row.vars = "inhibitor")

[Package NIMAA version 0.2.1 Index]