R: (re)Normalise a "nacho" object

normalise {NACHO}

R Documentation

(re)Normalise a "nacho" object

Description

This function creates a list that stores your settings, the raw counts and the normalised counts, starting from the result of a call to load_rcc() or of a previous call to normalise().

Usage

normalise(
  nacho_object,
  housekeeping_genes = nacho_object[["housekeeping_genes"]],
  housekeeping_predict = nacho_object[["housekeeping_predict"]],
  housekeeping_norm = nacho_object[["housekeeping_norm"]],
  normalisation_method = nacho_object[["normalisation_method"]],
  n_comp = nacho_object[["n_comp"]],
  remove_outliers = nacho_object[["remove_outliers"]],
  outliers_thresholds = nacho_object[["outliers_thresholds"]]
)
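
Every default in the signature above is read from `nacho_object` itself, so an omitted argument reuses the setting already stored in the object. The following minimal sketch illustrates this R idiom with a hypothetical toy function, `toy_normalise()`, and hypothetical field names; it is not NACHO's implementation.

```r
# Hypothetical toy function (not part of NACHO): each default is looked up
# in the object passed as first argument, mirroring the signature above.
toy_normalise <- function(
  obj,
  method = obj[["method"]],
  remove_outliers = obj[["remove_outliers"]]
) {
  # Store the settings actually used so that a later call can reuse them.
  obj[["method"]] <- method
  obj[["remove_outliers"]] <- remove_outliers
  obj
}

toy <- list(method = "GEO", remove_outliers = FALSE)

# Only remove_outliers is overridden; method falls back to the stored value.
res <- toy_normalise(toy, remove_outliers = TRUE)
str(res)
```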

Arguments

`nacho_object`	[list] A list object of class `"nacho"` obtained from `load_rcc()` or `normalise()`.
`housekeeping_genes`	[character] A vector of names of the miRNAs/mRNAs that should be used as housekeeping genes. Defaults to the value stored in `nacho_object`.
`housekeeping_predict`	[logical] Whether the housekeeping genes should be predicted (`TRUE`) or not (`FALSE`). Defaults to the value stored in `nacho_object`.
`housekeeping_norm`	[logical] Whether the housekeeping normalisation should be performed. Defaults to the value stored in `nacho_object`.
`normalisation_method`	[character] Either `"GEO"` or `"GLM"`: normalisation using the geometric mean (`"GEO"`) or a generalized linear model (`"GLM"`). Defaults to the value stored in `nacho_object`.
`n_comp`	[numeric] Number of principal components to compute. Cannot exceed n - 1, where n is the number of samples. Defaults to the value stored in `nacho_object`.
`remove_outliers`	[logical] Whether outliers should be excluded. Defaults to the value stored in `nacho_object`.
`outliers_thresholds`	[list] List of quality-control thresholds used to exclude outliers. Defaults to the value stored in `nacho_object`.
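
As an illustration of what normalising with the geometric mean (`"GEO"`) can look like, the sketch below rescales each sample so that the geometric mean of its positive-control counts matches the average across samples. This follows the usual NanoString-style approach and is an assumption about the general technique, not a copy of NACHO's code; `geo_mean()`, `pos_counts` and the probe and sample names are hypothetical.

```r
# Geometric mean of strictly positive counts.
geo_mean <- function(x) exp(mean(log(x)))

# Hypothetical positive-control counts: probes in rows, samples in columns.
# matrix() fills column by column, so each line below is one sample.
pos_counts <- matrix(
  c(100, 400, 1600,   # sample S1
    200, 800, 3200,   # sample S2
     50, 200,  800),  # sample S3
  nrow = 3,
  dimnames = list(c("POS_A", "POS_B", "POS_C"), c("S1", "S2", "S3"))
)

# One geometric mean per sample, then one scaling factor per sample.
sample_geo <- apply(pos_counts, 2, geo_mean)
scaling_factor <- mean(sample_geo) / sample_geo

# Multiply every column (sample) by its scaling factor.
scaled_counts <- sweep(pos_counts, 2, scaling_factor, `*`)
round(scaling_factor, 3)
```

After scaling, every sample has the same positive-control geometric mean, which is the property this kind of factor is designed to enforce.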

Details

Outlier definition (applied when remove_outliers = TRUE). A sample is treated as an outlier if it meets any of the following criteria:

Binding Density (BD) < 0.1
Binding Density (BD) > 2.25
Field of View (FoV) < 75
Positive Control Linearity (PCL) < 0.95
Limit of Detection (LoD) < 2
Positive normalisation factor (Positive_factor) < 0.25
Positive normalisation factor (Positive_factor) > 4
Housekeeping normalisation factor (house_factor) < 1/11
Housekeeping normalisation factor (house_factor) > 11
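
The thresholds listed above can be sketched as a single logical test per sample. The example below assumes that a sample is an outlier as soon as one criterion is met; the data frame `qc`, its column names and its values are hypothetical and only serve to show the comparisons.

```r
# Hypothetical QC metrics, one row per sample (column names are illustrative).
qc <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  BD = c(0.50, 2.40, 1.10, 0.05),
  FoV = c(98, 90, 70, 99),
  PCL = c(0.99, 0.98, 0.97, 0.96),
  LoD = c(10, 5, 8, 3),
  Positive_factor = c(1.0, 0.9, 1.2, 1.1),
  house_factor = c(1.0, 2.0, 0.5, 1.5)
)

# Flag a sample when at least one metric falls outside its allowed range.
qc$is_outlier <- with(
  qc,
  BD < 0.1 | BD > 2.25 |
    FoV < 75 |
    PCL < 0.95 |
    LoD < 2 |
    Positive_factor < 0.25 | Positive_factor > 4 |
    house_factor < 1 / 11 | house_factor > 11
)

qc[, c("sample", "is_outlier")]
```

Here S2 and S4 fail on Binding Density and S3 fails on Field of View, while S1 passes every check.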

Value

[list] A list containing parameters and data.

access: [character] Value passed to load_rcc() in id_colname.
housekeeping_genes: [character] Value passed to load_rcc() or normalise().
housekeeping_predict: [logical] Value passed to load_rcc() or normalise().
housekeeping_norm: [logical] Value passed to load_rcc() or normalise().
normalisation_method: [character] Value passed to load_rcc() or normalise().
remove_outliers: [logical] Value passed to normalise().
n_comp: [numeric] Value passed to load_rcc() or normalise().
data_directory: [character] Value passed to load_rcc().
pc_sum: [data.frame] A data.frame with n_comp rows and four columns: "Standard deviation", "Proportion of Variance", "Cumulative Proportion" and "PC".
nacho: [data.frame] A data.frame with all columns from the sample sheet ssheet_csv and all computed columns, i.e., quality-control metrics and counts, with one sample per row.
outliers_thresholds: [list] A list of the quality-control thresholds used.
raw_counts: [data.frame] Raw counts with probes as rows and samples as columns. The first column, "CodeClass", gives the type of each probe and the second column, "Name", gives its name.
normalised_counts: [data.frame] Normalised counts with probes as rows and samples as columns. The first column, "CodeClass", gives the type of each probe and the second column, "Name", gives its name.
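
To give an idea of what a table shaped like `pc_sum` contains, the sketch below derives the same four columns from a principal component analysis with `stats::prcomp()`. It is an assumption about how such a summary can be built, not NACHO's own code; the matrix `expr` is simulated and `pc_table` is a hypothetical name.

```r
set.seed(1)

# Hypothetical expression matrix: 6 samples (rows) by 20 probes (columns).
expr <- matrix(rnorm(6 * 20), nrow = 6)
n_comp <- 3  # must not exceed nrow(expr) - 1

pca <- stats::prcomp(expr)

# summary()$importance has one column per component and three named rows:
# "Standard deviation", "Proportion of Variance", "Cumulative Proportion".
importance <- summary(pca)$importance[, seq_len(n_comp), drop = FALSE]

# Transpose to get one row per component, then add the component label.
pc_table <- as.data.frame(t(importance))
pc_table[["PC"]] <- rownames(pc_table)
rownames(pc_table) <- NULL
pc_table
```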

Examples


data(GSE74821)
GSE74821_norm <- normalise(
  nacho_object = GSE74821,
  housekeeping_norm = TRUE,
  normalisation_method = "GEO",
  remove_outliers = TRUE
)

if (interactive()) {
  library(GEOquery)
  library(NACHO)

  # Import data from GEO
  gse <- GEOquery::getGEO(GEO = "GSE74821")
  targets <- Biobase::pData(Biobase::phenoData(gse[[1]]))
  GEOquery::getGEOSuppFiles(GEO = "GSE74821", baseDir = tempdir())
  utils::untar(
    tarfile = file.path(tempdir(), "GSE74821", "GSE74821_RAW.tar"),
    exdir = file.path(tempdir(), "GSE74821")
  )
  targets$IDFILE <- list.files(
    path = file.path(tempdir(), "GSE74821"),
    pattern = "\\.RCC\\.gz$"
  )
  targets[] <- lapply(X = targets, FUN = iconv, from = "latin1", to = "ASCII")
  utils::write.csv(
    x = targets,
    file = file.path(tempdir(), "GSE74821", "Samplesheet.csv")
  )

  # Read RCC files and format
  nacho <- load_rcc(
    data_directory = file.path(tempdir(), "GSE74821"),
    ssheet_csv = file.path(tempdir(), "GSE74821", "Samplesheet.csv"),
    id_colname = "IDFILE"
  )

  # (re)Normalise data by removing outliers
  nacho_norm <- normalise(
    nacho_object = nacho,
    remove_outliers = TRUE
  )

  # (re)Normalise data with "GLM" method and removing outliers
  nacho_norm <- normalise(
    nacho_object = nacho,
    normalisation_method = "GLM",
    remove_outliers = TRUE
  )
}

[Package NACHO version 2.0.6 Index]