R: Aggregate features using feature loadings

aggregate_feature {tempted}

R Documentation

Aggregate features using feature loadings

Description

This function aggregates features into "meta features" by computing a weighted sum of the features, using the feature loadings of each component as weights. It can also aggregate features based on several components at once, by ranking the features by a linear combination of the feature loadings of those components.
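The weighted sum described above can be sketched in base R. This is only an illustration on made-up numbers, not the package's implementation: the matrix `x`, the vector `loading`, and all values are hypothetical.

```r
# Toy data for one subject: 4 features (rows) at 3 time points (columns).
# All values are made up for illustration.
x <- matrix(c(1, 2, 3,
              0, 1, 0,
              2, 2, 2,
              1, 0, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("f", 1:4), paste0("t", 1:3)))

# Hypothetical feature loading of one component (one weight per feature)
loading <- c(f1 = 0.5, f2 = -0.5, f3 = 0.1, f4 = 0.7)

# Meta feature: weighted sum over features at each time point
meta_feature <- as.vector(crossprod(loading, x))
names(meta_feature) <- colnames(x)
meta_feature
```

Each entry of `meta_feature` is the sum over features of loading times observed value at that time point.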

Usage

aggregate_feature(
  res_tempted,
  mean_svd = NULL,
  datlist,
  pct = 1,
  contrast = NULL
)

Arguments

`res_tempted`	Output of `tempted`.
`mean_svd`	Output of `svd_centralize`.
`datlist`	Output of `format_tempted`, the original temporal tensor that will be aggregated.
`pct`	The proportion of features to aggregate, with features ranked by the absolute value of their feature loading in each component. The default, 1, aggregates 100% of the features. Setting `pct=0.01` aggregates only the top 1% of features.
`contrast`	A matrix specifying how components are combined. Each column is a contrast of length r, used to calculate a linear combination of the feature loadings of the r components.
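How `pct` and `contrast` interact can be sketched in base R as follows. The loading matrix, the helper `top_features`, and the quantile-based cutoff are assumptions made for illustration; the package may select features differently.

```r
# Hypothetical loading matrix: 5 features (rows) x r = 2 components (columns)
loadings <- matrix(c( 0.6,   0.1,
                     -0.5,   0.2,
                      0.1,  -0.7,
                      0.3,   0.3,
                     -0.05,  0.6),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(paste0("f", 1:5), c("PC1", "PC2")))

# One contrast of length r = 2, stored as a single column
contrast <- matrix(c(1/2, 1), nrow = 2, ncol = 1)

# Linear combination of loadings across components: one score per feature
combined <- loadings %*% contrast

# Hypothetical helper: flag the top `pct` proportion of features
# by absolute score, using a quantile cutoff (an assumed rule)
top_features <- function(score, pct) {
  abs(score) >= quantile(abs(score), 1 - pct)
}

sel <- top_features(combined[, 1], pct = 0.4)
names(sel)[sel]
```

With `pct = 1` the cutoff is the minimum absolute score, so every feature is kept, which matches the documented default of aggregating all features.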

Value

A list of results.

metafeature_aggregate: The meta feature obtained by aggregating the observed temporal tensor. It is a data.frame with four columns: "value" for the meta feature values, "subID" for the subject ID, "timepoint" for the time points, and "PC" indicating which component was used to construct the meta feature.
metafeature_aggregate_est: The meta feature obtained by aggregating the denoised temporal tensor. It has the same structure as metafeature_aggregate.
contrast: The contrast used to linearly combine the components from input.
toppct: A matrix of TRUE/FALSE indicating which features are aggregated in each component and contrast.
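Because `metafeature_aggregate` is a long-format data.frame, it can be subset and summarized with base R. The sketch below uses a mock data frame with the four documented columns. The values, the subject IDs, and the `"PC1"`/`"PC2"` label format are assumptions, not output of the package.

```r
# Mock data frame mimicking the documented columns; all values are made up
mf <- data.frame(
  value     = c(0.2, 0.5, 0.1, 0.4, -0.3, -0.1, 0.0, 0.2),
  subID     = rep(c("s1", "s1", "s2", "s2"), times = 2),
  timepoint = rep(c(10, 40), times = 4),
  PC        = rep(c("PC1", "PC2"), each = 4)  # label format is an assumption
)

# Keep the rows of one component, then average the meta feature per subject
pc1 <- mf[mf$PC == "PC1", ]
subject_mean <- tapply(pc1$value, pc1$subID, mean)
subject_mean
```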

References

Shi P, Martino C, Han R, Janssen S, Buck G, Serrano M, Owzar K, Knight R, Shenhav L, Zhang AR. (2023) Time-Informed Dimensionality Reduction for Longitudinal Microbiome Studies. bioRxiv. doi: 10.1101/550749. https://www.biorxiv.org/content/10.1101/550749.

Examples

# Take a subset of the samples so the example runs faster

# Here we keep samples whose 12-day interval index (day_of_life %/% 12) is odd
sub_sample <- rownames(meta_table)[(meta_table$day_of_life %/% 12) %% 2 == 1]
count_table_sub <- count_table[sub_sample,]
processed_table_sub <- processed_table[sub_sample,]
meta_table_sub <- meta_table[sub_sample,]

datlist <- format_tempted(count_table_sub,
                          meta_table_sub$day_of_life,
                          meta_table_sub$studyid,
                          pseudo=0.5,
                          transform="clr")

mean_svd <- svd_centralize(datlist, r=1)

res_tempted <- tempted(mean_svd$datlist, r=2, smooth=1e-5)

contrast <- matrix(c(1/2,1), 2, 1)

res_aggregate <- aggregate_feature(res_tempted,
                                   mean_svd,
                                   datlist,
                                   pct=1,
                                   contrast=contrast)

# plot the aggregated features


group <- unique(meta_table[, c("studyid", "delivery")])

plot_metafeature(res_aggregate$metafeature_aggregate, group, bws=30)

[Package tempted version 0.1.1 Index]