ratio_feature {tempted}

R Documentation

Take log ratio of the abundance of top features over bottom features

Description

Top and bottom ranking features are picked based on feature loadings (and their contrasts). The main result is the log ratio of the abundance of the top ranking features over that of the bottom ranking features. This function and its result are designed for longitudinal microbiome data, and may not be meaningful for other types of temporal data.
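
The idea can be illustrated on toy data in base R. This is a minimal sketch and not the package's implementation: the count matrix, the loading values, the rounding of the number of picked features, and the pseudo count of 0.5 are all assumptions made for the illustration.

```r
# Toy data: 3 samples (rows) x 10 features (columns) of raw read counts.
set.seed(1)
counts <- matrix(rpois(30, lambda = 20), nrow = 3,
                 dimnames = list(paste0("s", 1:3), paste0("f", 1:10)))

# Hypothetical feature loadings for a single component.
loading <- setNames(seq(-0.9, 0.9, length.out = 10), colnames(counts))

pct <- 0.2                                # fraction of features picked at each end
n_pick <- ceiling(pct * length(loading))  # assumption: round up
ranked <- names(sort(loading, decreasing = TRUE))
top <- head(ranked, n_pick)               # largest loadings, used as numerator
bottom <- tail(ranked, n_pick)            # smallest loadings, used as denominator

pseudo <- 0.5                             # assumption: guards against log(0)
log_ratio <- log((rowSums(counts[, top, drop = FALSE]) + pseudo) /
                 (rowSums(counts[, bottom, drop = FALSE]) + pseudo))
log_ratio
```

Each sample gets one log ratio value; in the longitudinal setting, one such value is obtained per subject and time point.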

Usage

ratio_feature(
  res_tempted,
  datlist,
  pct = 0.05,
  absolute = FALSE,
  contrast = NULL
)

Arguments

`res_tempted`	Output of `tempted`.
`datlist`	Output of `format_tempted(, transform="none")`, the temporal tensor that includes the raw read counts.
`pct`	The percent of features to sum up. Default is 0.05, i.e. 5%.
`absolute`	`absolute = TRUE` means features are ranked by the absolute value of feature loadings, and the top `pct` percent of features are picked. `absolute = FALSE` means features are ranked by the original value of feature loadings, and the top and bottom `pct` percent of features are picked. The ratio is then taken as the abundance of the features with positive loading over the abundance of the features with negative loading.
`contrast`	A matrix specifying how components are combined. Each column is a contrast of length r, used to calculate a linear combination of the feature loadings of the r components.
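
To make `contrast` and `absolute` concrete, the following base R sketch combines hypothetical loadings with a contrast matrix and then picks features under both ranking schemes. The loading values and the number of picked features are made up, and the selection logic only follows the argument descriptions above; it is not taken from the package source.

```r
# Hypothetical loadings: 6 features (rows) x r = 2 components (columns).
loadings <- matrix(c(0.5,  0.1, -0.3, 0.4, -0.6,  0.2,
                     0.3, -0.4,  0.1, 0.0,  0.5, -0.9),
                   ncol = 2,
                   dimnames = list(paste0("f", 1:6), c("PC1", "PC2")))

# Each column is a contrast of length r: PC1 + PC2 and PC1 - PC2.
contrast <- cbind(c(1, 1), c(1, -1))

# Linear combination of the loadings, one column per contrast.
combined <- loadings %*% contrast
colnames(combined) <- c("PC1+PC2", "PC1-PC2")

n_pick <- 2  # assumed number of features picked

# absolute = FALSE style: rank by signed value and take both ends.
signed_order <- order(combined[, 1], decreasing = TRUE)
top_signed <- rownames(combined)[head(signed_order, n_pick)]
bottom_signed <- rownames(combined)[tail(signed_order, n_pick)]

# absolute = TRUE style: rank by magnitude, then split the picks by sign.
abs_order <- order(abs(combined[, 1]), decreasing = TRUE)
picked <- rownames(combined)[head(abs_order, n_pick)]
numerator <- picked[combined[picked, 1] > 0]
denominator <- picked[combined[picked, 1] < 0]
```

With signed ranking, the same number of features lands in the numerator and the denominator. With absolute ranking, the split depends on the signs of the largest-magnitude loadings.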

Value

A list of results:

metafeature_ratio: The log ratio abundance of the top over bottom ranking features. It is a data.frame with four columns: "value" for the log ratio values, "subID" for the subject ID, "timepoint" for the time points, and "PC" indicating which component was used to construct the meta feature.
contrast: The contrast used to linearly combine the components from input.
toppct: A matrix of TRUE/FALSE indicating which features are ranked top in each component (and contrast) and used as the numerator of the log ratio.
bottompct: A matrix of TRUE/FALSE indicating which features are ranked bottom in each component (and contrast) and used as the denominator of the log ratio.
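
The TRUE/FALSE matrices can be turned into lists of feature names per component. The sketch below uses small hand-made matrices as stand-ins for `toppct` and `bottompct`; that the rows carry feature names and the columns are named by component is an assumption here, not something stated above.

```r
# Hypothetical stand-ins for the toppct and bottompct matrices:
# features in rows, components (or contrasts) in columns.
toppct <- matrix(c(TRUE,  FALSE, FALSE, TRUE,
                   FALSE, TRUE,  FALSE, FALSE),
                 ncol = 2,
                 dimnames = list(paste0("f", 1:4), c("PC1", "PC2")))
bottompct <- matrix(c(FALSE, TRUE,  FALSE, FALSE,
                      TRUE,  FALSE, FALSE, TRUE),
                    ncol = 2,
                    dimnames = list(paste0("f", 1:4), c("PC1", "PC2")))

# Names of the features flagged TRUE in each column of a logical matrix.
picked_names <- function(flag) {
  setNames(lapply(seq_len(ncol(flag)),
                  function(k) rownames(flag)[flag[, k]]),
           colnames(flag))
}

numerator_features <- picked_names(toppct)
denominator_features <- picked_names(bottompct)
```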

References

Shi P, Martino C, Han R, Janssen S, Buck G, Serrano M, Owzar K, Knight R, Shenhav L, Zhang AR. (2023) Time-Informed Dimensionality Reduction for Longitudinal Microbiome Studies. bioRxiv. doi: 10.1101/550749. https://www.biorxiv.org/content/10.1101/550749.

Examples

# Take a subset of the samples so the example runs faster

# Here we keep samples from every other 12-day window of day_of_life
sub_sample <- rownames(meta_table)[(meta_table$day_of_life%/%12)%%2==1]
count_table_sub <- count_table[sub_sample,]
processed_table_sub <- processed_table[sub_sample,]
meta_table_sub <- meta_table[sub_sample,]

datlist <- format_tempted(count_table_sub,
                          meta_table_sub$day_of_life,
                          meta_table_sub$studyid,
                          pseudo=0.5,
                          transform="clr")

mean_svd <- svd_centralize(datlist, r=1)

res_tempted <- tempted(mean_svd$datlist, r=2, smooth=1e-5)

datalist_raw <- format_tempted(count_table_sub,
                               meta_table_sub$day_of_life,
                               meta_table_sub$studyid,
                               transform="none")

contrast <- cbind(c(1,1), c(1,-1))

res_ratio <- ratio_feature(res_tempted, datalist_raw, pct=0.1,
                           absolute=FALSE, contrast=contrast)

group <- unique(meta_table[, c("studyid", "delivery")])

# plot the log ratios

plot_metafeature(res_ratio$metafeature_ratio, group, bws=30)

[Package tempted version 0.1.1 Index]