dtangle {dtangle} | R Documentation |
Deconvolve cell type mixing proportions from gene expression data.
Description
Deconvolve cell type mixing proportions from gene expression data.
Usage
dtangle(Y, references = NULL, pure_samples = NULL, n_markers = NULL,
data_type = NULL, gamma = NULL, markers = NULL,
marker_method = "ratio", summary_fn = mean)
Arguments
Y |
Expression matrix. (Required) Two-dimensional numeric. Must implement Each row contains expression measurements for a particular sample. Each column contains the measurements of the same gene over all individuals. Can contain either just the mixture samples to be deconvolved or both the mixture samples and the reference samples. See |
references |
Cell-type reference expression matrix. (Optional) Two-dimensional numeric. Must implement Each row contains expression measurements for a reference profile of a particular cell type. Columns contain measurements of reference profiles of a gene. Optionally may merge this matrix with |
pure_samples |
The pure sample indices. (Optional) List of one-dimensional integer. Must implement The i-th element of the top-level list is a vector of indices (rows of |
n_markers |
Number of marker genes. (Optional) One-dimensional numeric. How many marker genes to use for deconvolution. Can be a single integer, a vector of integers (one for each cell type), or a single percentage or vector of percentages (numeric in 0 to 1). If a single integer, all cell types use that number of markers. If a vector, the i-th element determines how many marker genes are used for the i-th cell type. If a single percentage (in 0 to 1), that percentage of markers is used for all types. If a vector of percentages, the i-th percentage is used for the i-th type. If not specified, the top 10% of genes are used. |
data_type |
Type of expression measurements. (Optional) One-dimensional string. An optional string indicating the type of the expression measurements. This is used to set gamma to a pre-determined value based upon the data type. Valid values are for probe-level microarray as “microarray-probe”, gene-level microarray as “microarray-gene” or rna-seq as “rna-seq”. Alternatively can set |
gamma |
Expression adjustment term. (Optional) One-dimensional positive numeric. If provided as a single positive number then that value will be used for |
markers |
Marker gene indices. (Optional) List of one-dimensional integer. Top-level list should be same length as |
marker_method |
Method used to rank marker genes. (Optional) One-dimensional string. The method used to rank genes as markers. If not supplied defaults to “ratio”. Only used if markers are not provided to argument “markers”. Options are
|
summary_fn |
What summary statistic to use when aggregating expression measurements. (Optional) Function that takes a one-dimensional vector of numeric and returns a single numeric. Defaults to the mean. Other good options include the median. |
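The accepted forms of n_markers described above can be illustrated with a small base-R sketch. The helper resolve_n_markers and its n_candidates argument (the number of candidate genes available per cell type) are hypothetical and not part of dtangle. The sketch also assumes that values below 1 are read as fractions and values of 1 or more as counts, which this page does not state explicitly.

```r
# Hypothetical helper (not part of dtangle): turn an n_markers
# specification into one integer count per cell type.
resolve_n_markers <- function(n_markers, n_candidates) {
  K <- length(n_candidates)
  if (is.null(n_markers)) {
    n_markers <- 0.1                 # documented default: top 10% of genes
  }
  if (length(n_markers) == 1) {
    n_markers <- rep(n_markers, K)   # a single value applies to every type
  }
  stopifnot(length(n_markers) == K)
  # Assumption: values below 1 are fractions, values >= 1 are counts.
  is_frac <- n_markers < 1
  counts <- ifelse(is_frac, floor(n_markers * n_candidates), n_markers)
  as.integer(counts)
}

n_candidates <- c(200, 100, 40)                 # illustrative numbers only
resolve_n_markers(20, n_candidates)             # single integer
resolve_n_markers(c(10, 11, 12), n_candidates)  # one count per cell type
resolve_n_markers(0.25, n_candidates)           # single percentage
resolve_n_markers(NULL, n_candidates)           # unspecified, falls back to 10%
```

How dtangle itself rounds fractional counts may differ; the sketch only shows how the documented forms map to per-type counts.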
Value
List.
'estimates' a matrix of estimated mixing proportions. One row for each sample, one column for each cell type.
'markers' list of vectors of markers used for each cell type. The i-th element of the list is a vector of columns of Y used as markers for the i-th cell type.
'n_markers' vector of the number of markers used for each cell type.
'gamma' value of the sensitivity parameter gamma used by dtangle.
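As a rough illustration of how the returned list might be inspected, the sketch below builds a toy list with the four documented components and reads from it. The object fit and all numbers in it are made up for illustration and are not dtangle output.

```r
# Toy stand-in for the documented return value; the numbers are invented
# for illustration and do not come from dtangle.
fit <- list(
  estimates = matrix(c(0.7, 0.2, 0.1,
                       0.1, 0.3, 0.6),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("sample1", "sample2"),
                                     c("typeA", "typeB", "typeC"))),
  markers   = list(c(1, 4), c(2, 5), c(3, 6)),
  n_markers = c(2, 2, 2),
  gamma     = 0.8
)

# One row per sample, one column per cell type.
dim(fit$estimates)

# Cell type with the largest estimated proportion in each sample.
dominant <- colnames(fit$estimates)[apply(fit$estimates, 1, which.max)]
names(dominant) <- rownames(fit$estimates)
dominant

# Number of marker columns of Y recorded for each cell type.
lengths(fit$markers)
```

With a real result, the same accessors ($estimates, $markers, $n_markers, $gamma) apply to the list returned by dtangle.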
See Also
Examples
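# Known mixing proportions for the example data
truth <- shen_orr_ex$annotation$mixture
# For each of the 3 cell types, the rows whose proportion is exactly 1
pure_samples <- lapply(1:3, function(i) {
  which(truth[, i] == 1)
})
Y <- shen_orr_ex$data$log
# Same number of markers for every cell type; gamma set from data_type
n_markers <- 20
dtangle(Y, pure_samples = pure_samples,
        n_markers = n_markers, data_type = 'microarray-gene', marker_method = 'ratio')
# One marker count per cell type; gamma given explicitly
n_markers <- c(10, 11, 12)
dtangle(Y, pure_samples = pure_samples,
        n_markers = n_markers, gamma = 0.8, marker_method = 'regression')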
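A further sketch, reusing the example data above, shows how a different summary statistic might be passed through summary_fn. The helper trimmed_mean is hypothetical and not part of dtangle; the only requirement taken from this page is that the function maps a numeric vector to a single number. The dtangle calls run only if the package is installed, and the sketch assumes shen_orr_ex is available once the package is attached, as in the examples above.

```r
# A summary function must map a numeric vector to a single number.
# trimmed_mean is a hypothetical helper, not part of dtangle.
trimmed_mean <- function(x) mean(x, trim = 0.25)

trimmed_mean(c(1, 2, 3, 100))  # drops the extremes before averaging

# Only run the deconvolution when dtangle is installed.
if (requireNamespace("dtangle", quietly = TRUE)) {
  library(dtangle)
  truth <- shen_orr_ex$annotation$mixture
  pure_samples <- lapply(1:3, function(i) which(truth[, i] == 1))
  Y <- shen_orr_ex$data$log

  # Median instead of the default mean
  fit_median <- dtangle(Y, pure_samples = pure_samples, n_markers = 20,
                        data_type = "microarray-gene", summary_fn = median)
  # User-supplied summary function
  fit_trim <- dtangle(Y, pure_samples = pure_samples, n_markers = 20,
                      data_type = "microarray-gene", summary_fn = trimmed_mean)
  print(head(fit_median$estimates))
}
```

A robust statistic such as the median or a trimmed mean may be less sensitive to outlying expression values than the mean, at the cost of using less of the data.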