export_feature_expressions {familiar}R Documentation

Extract and export feature expressions.

Description

Extract and export feature expressions for the features in a familiarCollection.

Usage

export_feature_expressions(
  object,
  dir_path = NULL,
  evaluation_time = waiver(),
  export_collection = FALSE,
  ...
)

## S4 method for signature 'familiarCollection'
export_feature_expressions(
  object,
  dir_path = NULL,
  evaluation_time = waiver(),
  export_collection = FALSE,
  ...
)

## S4 method for signature 'ANY'
export_feature_expressions(
  object,
  dir_path = NULL,
  evaluation_time = waiver(),
  export_collection = FALSE,
  ...
)

Arguments

object

A familiarCollection object, or other other objects from which a familiarCollection can be extracted. See details for more information.

dir_path

Path to folder where extracted data should be saved. NULL will allow export as a structured list of data.tables.

evaluation_time

One or more time points that are used to create the outcome columns in expression plots. If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarData objects. Only used for survival outcomes.

export_collection

(optional) Exports the collection if TRUE.

...

Arguments passed on to extract_feature_expression, as_familiar_collection

feature_similarity

Table containing pairwise distance between sample. This is used to determine cluster information, and indicate which samples are similar. The table is created by the extract_sample_similarity method.

data

A dataObject object, data.table or data.frame that constitutes the data that are assessed.

evaluation_times

One or more time points that are used for in analysis of survival problems when data has to be assessed at a set time, e.g. calibration. If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects. Only used for survival outcomes.

feature_cluster_method

The method used to perform clustering. These are the same methods as for the cluster_method configuration parameter: none, hclust, agnes, diana and pam.

none cannot be used when extracting data regarding mutual correlation or feature expressions.

If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.

feature_linkage_method

The method used for agglomerative clustering in hclust and agnes. These are the same methods as for the cluster_linkage_method configuration parameter: average, single, complete, weighted, and ward.

If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.

feature_similarity_metric

Metric to determine pairwise similarity between features. Similarity is computed in the same manner as for clustering, and feature_similarity_metric therefore has the same options as cluster_similarity_metric: mcfadden_r2, cox_snell_r2, nagelkerke_r2, spearman, kendall and pearson.

If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.

sample_cluster_method

The method used to perform clustering based on distance between samples. These are the same methods as for the cluster_method configuration parameter: hclust, agnes, diana and pam.

none cannot be used when extracting data for feature expressions.

If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.

sample_linkage_method

The method used for agglomerative clustering in hclust and agnes. These are the same methods as for the cluster_linkage_method configuration parameter: average, single, complete, weighted, and ward.

If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.

sample_similarity_metric

Metric to determine pairwise similarity between samples. Similarity is computed in the same manner as for clustering, but sample_similarity_metric has different options that are better suited to computing distance between samples instead of between features: gower, euclidean.

The underlying feature data is scaled to the [0, 1] range (for numerical features) using the feature values across the samples. The normalisation parameters required can optionally be computed from feature data with the outer 5% (on both sides) of feature values trimmed or winsorised. To do so append ⁠_trim⁠ (trimming) or ⁠_winsor⁠ (winsorising) to the metric name. This reduces the effect of outliers somewhat.

If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.

verbose

Flag to indicate whether feedback should be provided on the computation and extraction of various data elements.

message_indent

Number of indentation steps for messages shown during computation and extraction of various data elements.

familiar_data_names

Names of the dataset(s). Only used if the object parameter is one or more familiarData objects.

collection_name

Name of the collection.

Details

Data is usually collected from a familiarCollection object. However, you can also provide one or more familiarData objects, that will be internally converted to a familiarCollection object. It is also possible to provide a familiarEnsemble or one or more familiarModel objects together with the data from which data is computed prior to export. Paths to the previous files can also be provided.

All parameters aside from object and dir_path are only used if object is not a familiarCollection object, or a path to one.

Feature expressions are computed by standardising each feature, i.e. sample mean is 0 and standard deviation is 1.

Value

A data.table (if dir_path is not provided), or nothing, as all data is exported to csv files.


[Package familiar version 1.4.6 Index]