R: Reads Affymetrix probe data (APD) as units (probesets)

readApdUnits {aroma.apd}

R Documentation

Reads Affymetrix probe data (APD) as units (probesets)

Description

Reads Affymetrix probe data (APD) as units (probesets) by using the unit and group definitions in the corresponding Affymetrix CDF file.

If more than one APD file is read, all files are assumed to be of the same chip type, and have the same read map, if any. It is not possible to read APD files of different types at the same time.

Usage

## Default S3 method:
readApdUnits(filenames, units=NULL, ..., transforms=NULL, cdf=NULL,
  stratifyBy=c("nothing", "pmmm", "pm", "mm"), addDimnames=FALSE, readMap="byMapType",
  dropArrayDim=TRUE, verbose=FALSE)

Arguments

`filenames`	The filenames of the APD files. All APD files must be of the same chip type.
`units`	An `integer` `vector` of unit indices specifying which units to read. If `NULL`, all units are read.
`...`	Additional arguments passed to `readApd()`.
`transforms`	A `list` of exactly `length(filenames)` `function`s. If `NULL`, no transformation is performed. Values read are passed through the corresponding transform function before being returned.
`cdf`	A `character` filename of a CDF file, or a CDF `list` structure. If `NULL`, the CDF file is searched for by `findCdf`, first starting from the current directory and then from the directory containing the first APD file.
`stratifyBy`	Argument passed to low-level method `readCdfCellIndices`.
`addDimnames`	If `TRUE`, dimension names are added to arrays, otherwise not. The size of the returned APD structure in bytes increases by 30-40% with dimension names.
`readMap`	A `vector` remapping cell indices to file indices. If `"byMapType"`, the read map of the type given in the APD header is searched for and read. Specifying the read map explicitly is much faster than searching for it on every call. If `NULL`, no map is used.
`dropArrayDim`	If `TRUE` and only one array is read, the elements of the group field do not have an array dimension.
`verbose`	See `Verbose`.
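
As a rough illustration of what `readMap` and `transforms` mean for the values that are returned, the following base R sketch mimics the two steps on made-up data. It does not call `readApdUnits` and does not reflect its internals; `fileValues`, `cells`, and the particular permutation in `readMap` are hypothetical.

```r
# Hypothetical stand-in for the values stored in one APD file (8 cells).
fileValues <- c(120, 340, 95, 800, 410, 60, 275, 530)

# Hypothetical read map: element i holds the file index of cell i.
readMap <- c(3, 1, 4, 2, 8, 6, 5, 7)

# Hypothetical cell indices belonging to the requested unit (probeset).
cells <- c(2, 5, 7)

# Step 1: remap cell indices to file indices, then pick the stored values.
fileIdx <- readMap[cells]
values <- fileValues[fileIdx]

# Step 2: one transform function per file, as in transforms=list(log2).
transforms <- list(log2)
transformed <- transforms[[1]](values)

print(fileIdx)
print(values)
print(transformed)
```

With `readMap=NULL` the first step would be skipped, and with `transforms=NULL` the second.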

Value

A named list where the names correspond to the names of the units read. Each element of the list is in turn a list structure with groups (aka blocks).
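
The nested structure can be walked with ordinary list tools. The sketch below builds a small mock of such a result by hand; the unit and group names, and the assumption that each group carries an `intensities` field, are illustrative only and not taken from a real APD file.

```r
# Mock of a units -> groups -> fields structure (all names are hypothetical).
res <- list(
  unitA = list(
    groupA1 = list(intensities = c(100, 200, 300)),
    groupA2 = list(intensities = c(50, 150))
  ),
  unitB = list(
    groupB1 = list(intensities = c(10, 20, 30, 40))
  )
)

# Names of the units read.
unitNames <- names(res)

# Average intensity per group, computed unit by unit.
groupMeans <- lapply(res, function(unit) {
  vapply(unit, function(group) mean(group$intensities), numeric(1))
})

print(unitNames)
print(groupMeans)
```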

Speed

Since the cell indices are semi-randomized across the array and within units (probesets), it is very unlikely that a read will consist of consecutive cells (which would be faster to read). However, the speed of this method, which uses FileVector to read data, is comparable to the speed of readCelUnits, which uses the Fusion SDK (readCel) to read data.
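
One way to check the comparison on your own data is to time both readers on the same units. The sketch below is only a starting point: `timeRead` is a hypothetical helper, the file names are placeholders, and the timing part runs only if both files and both packages are available.

```r
# Hypothetical helper (not part of aroma.apd): elapsed seconds of one call.
timeRead <- function(fcn, ...) {
  system.time(fcn(...))[["elapsed"]]
}

# Placeholder file names; replace with a CEL file and its APD counterpart.
celFile <- "sample.CEL"
apdFile <- "sample.apd"

if (file.exists(celFile) && file.exists(apdFile) &&
    requireNamespace("affxparser", quietly=TRUE) &&
    requireNamespace("aroma.apd", quietly=TRUE)) {
  units <- 1:200
  tCel <- timeRead(affxparser::readCelUnits, celFile, units=units)
  tApd <- timeRead(aroma.apd::readApdUnits, apdFile, units=units)
  print(c(cel=tCel, apd=tApd))
}
```

A single run is noisy because of file caching, so repeating the measurement a few times gives a fairer picture.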

Author(s)

Henrik Bengtsson

Examples


library("R.utils") # Arguments

verbose <- Arguments$getVerbose(TRUE)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# 1. Scan for existing CEL files
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# a) Scan current directory for CEL files
files <- list.files(pattern="[.](cel|CEL)$")
files <- files[!file.info(files)$isdir]

if (length(files) > 0 && require("affxparser")) {
  # b) Corresponding APD filenames
  celNames <- files
  apdNames <- gsub("[.](cel|CEL)$", ".apd", files)
 
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  # 1. Copy the probe intensities from a CEL to an APD file
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  for (kk in 1) {
    verbose && enter(verbose, "Reading CEL file #", kk)
    cel <- readCel(celNames[kk])
    verbose && exit(verbose)
 
    if (!file.exists(apdNames[kk])) {
      verbose && enter(verbose, "Creating APD file #", kk)
      chipType <- cel$header$chiptype
      writeApd(apdNames[kk], data=cel$intensities, chipType=chipType)
      verbose && exit(verbose)
    }
 
    verbose && enter(verbose, "Verifying APD file #", kk)
    apd <- readApd(apdNames[kk])
    verbose && exit(verbose)
    stopifnot(identical(apd$intensities, cel$intensities))
  }
 
 
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  # 2. Read a subset of the units
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  units <- c(1, 20:205)
  cel <- readCelUnits(celNames[1], units=units)
  apd <- readApdUnits(apdNames[1], units=units)
  stopifnot(identical(apd, cel))
 
 
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  # 3. The same, but stratified on PMs and MMs
  # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  apd <- readApdUnits(apdNames[1], units=units, stratifyBy="pmmm",
                                                addDimnames=TRUE)
} # if (length(files) > 0)

[Package aroma.apd version 0.7.0 Index]