colBinnedSmoothing.matrix {aroma.core}

R Documentation

Binned smoothing of a matrix column by column

Description

Binned smoothing of a matrix column by column.

Usage

## S3 method for class 'matrix'
colBinnedSmoothing(Y, x=seq_len(nrow(Y)), w=NULL, xOut=NULL, xOutRange=NULL,
  from=min(x, na.rm = TRUE), to=max(x, na.rm = TRUE), by=NULL, length.out=length(x),
  na.rm=TRUE, FUN="median", ..., verbose=FALSE)

Arguments

`Y`	A `numeric` JxI `matrix` (or a `vector` of length J).
`x`	An optional `numeric` `vector` specifying the positions of the J entries. The default is to assume uniformly distributed positions.
`w`	An optional `numeric` `vector` of prior weights for each of the J entries.
`xOut`	Optional `numeric` `vector` of K bin center locations.
`xOutRange`	Optional Kx2 `matrix` specifying the boundary locations for K bins, where each row represents a bin `[x0,x1)`. If not specified, the boundaries are set to the midpoints between adjacent bin centers, such that the bins have maximum lengths without overlapping. Vice versa, if `xOut` is not specified, then `xOut` is set to the midpoints of the `xOutRange` boundaries.
`from`, `to`, `by`, `length.out`	If neither `xOut` nor `xOutRange` is specified, `xOut` is generated uniformly from these arguments using the `seq`() function. `from` and `to` specify the center locations of the first and the last bin, and `by` the distance between adjacent center locations. Argument `length.out` can be used as an alternative to `by`, in which case it specifies the total number of bins.
`FUN`	A `function`, or the name of one, applied to the data points in each bin. The default is `"median"`.
`na.rm`	If `TRUE`, missing values are excluded, otherwise not.
`...`	Not used.
`verbose`	See `Verbose`.
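
The relationship between `xOut` and `xOutRange` described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper names `centersToRanges()` and `rangesToCenters()` are hypothetical, and the treatment of the two outermost boundaries (mirroring the nearest interior half-width) is an assumption that the documentation does not confirm.

```r
# Hypothetical helper: derive [x0, x1) boundaries from sorted bin centers.
centersToRanges <- function(xOut) {
  n <- length(xOut)
  stopifnot(n >= 2, !is.unsorted(xOut))
  # Interior boundaries: midpoints between adjacent centers
  mids <- (xOut[-n] + xOut[-1]) / 2
  # ASSUMPTION: outer boundaries mirror the nearest interior half-width
  lower <- c(2 * xOut[1] - mids[1], mids)
  upper <- c(mids, 2 * xOut[n] - mids[n - 1])
  cbind(x0 = lower, x1 = upper)
}

# Hypothetical helper: bin centers as midpoints of the boundaries.
rangesToCenters <- function(xOutRange) {
  rowMeans(xOutRange)
}

# Uniform centers, generated the way seq() would from from/to/by
xOut <- seq(from = 2, to = 100, by = 3)
xOutRange <- centersToRanges(xOut)

# Each bin starts where the previous one ends, so bins never overlap
print(head(xOutRange, 3))
```

Under these assumptions the two helpers are inverses of each other for uniformly spaced centers, which mirrors the "vice versa" rule stated for `xOut` and `xOutRange`.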

Details

Note that every zero-length bin [x0,x1) results in an NA value, because such a bin contains no data points. This also means that calling colBinnedSmoothing(Y, x=x, xOut=xOut) with duplicated values in xOut produces some zero-length bins and hence NA values.
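
A minimal base R sketch of why a zero-length bin yields NA. It does not call `colBinnedSmoothing()`. It only applies the half-open membership rule `x0 <= x < x1` and a median to made-up data, and the variable names are illustrative.

```r
# Illustrative data: positions and values for a single track
x <- 1:10
y <- c(2.1, 1.9, 2.0, 5.2, 4.8, 5.0, 5.1, 3.0, 2.9, 3.1)

# Three bins [x0, x1); the second one has zero length
bins <- rbind(c(1, 4), c(4, 4), c(4, 11))

binMedian <- apply(bins, 1, function(b) {
  # Half-open interval: x0 is included, x1 is excluded
  keep <- (x >= b[1] & x < b[2])
  # The median of an empty vector is NA
  median(y[keep])
})

print(binMedian)
```

The position `x = 4` belongs to the third bin, not the second, because no value can satisfy `4 <= x < 4`.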

Value

Returns a numeric KxI matrix (or a vector of length K) where K is the total number of bins. The following attributes are also returned:

xOut: The center locations of each bin.
xOutRange: The bin boundaries.
count: The number of data points within each bin (based solely on argument x).
binWidth: The average bin width.
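
As a rough illustration of the last two attributes, the following base R sketch counts the positions falling in each bin and averages the bin widths, using the custom bins from the Examples section. It is an assumption that the package computes these quantities in exactly this way. The sketch only follows the descriptions above, and `count` and `binWidth` here are ordinary local variables, not attributes read from a result.

```r
# Positions only; the values in Y play no role in the counts
x <- 1:100

# Custom bins [x0, x1), one row per bin
xOutRange <- rbind(
  c( 1,  11),
  c(11,  31),
  c(31,  41),
  c(41,  51),
  c(51,  81),
  c(81,  91),
  c(91, 101)
)

# Number of positions inside each half-open bin
count <- apply(xOutRange, 1, function(b) sum(x >= b[1] & x < b[2]))

# Average bin width across all bins
binWidth <- mean(xOutRange[, 2] - xOutRange[, 1])

print(count)
print(binWidth)
```

Because the count depends only on `x`, a bin can report a positive count even if every corresponding entry of `Y` is missing.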

Author(s)

Henrik Bengtsson

Examples

# Number of tracks
I <- 4

# Number of data points per track
J <- 100

# Simulate data with a gain in track 2 and 3
x <- 1:J
Y <- matrix(rnorm(I*J, sd=1/2), ncol=I)
Y[30:50,2:3] <- Y[30:50,2:3] + 3

# Uniformly distributed equal-sized bins
Ys3 <- colBinnedSmoothing(Y, x=x, from=2, by=3)
Ys5 <- colBinnedSmoothing(Y, x=x, from=3, by=5)

# Custom bins
xOutRange <- t(matrix(c(
  1, 11,
 11, 31,
 31, 41,
 41, 51,
 51, 81,
 81, 91,
 91,101
), nrow=2))
YsC <- colBinnedSmoothing(Y, x=x, xOutRange=xOutRange)

# Custom bins specified by center locations with
# maximized width relative to the neighboring bins.
xOut <- c(6, 21, 36, 46, 66, 86, 96)
YsD <- colBinnedSmoothing(Y, x=x, xOut=xOut)

xlim <- range(x)
ylim <- c(-3,5)
layout(matrix(1:I, ncol=1))
par(mar=c(3,3,1,1)+0.1, pch=19)
for (ii in 1:I) {
  plot(NA, xlim=xlim, ylim=ylim)
  points(x, Y[,ii], col="#999999")

  xOut <- attr(Ys3, "xOut")
  lines(xOut, Ys3[,ii], col=2)
  points(xOut, Ys3[,ii], col=2)

  xOut <- attr(Ys5, "xOut")
  lines(xOut, Ys5[,ii], col=3)
  points(xOut, Ys5[,ii], col=3)

  xOut <- attr(YsC, "xOut")
  lines(xOut, YsC[,ii], col=4)
  points(xOut, YsC[,ii], col=4, pch=15)

  xOut <- attr(YsD, "xOut")
  lines(xOut, YsD[,ii], col=5)
  points(xOut, YsD[,ii], col=5, pch=15)

  if (ii == 1) {
    legend("topright", pch=c(19,19,15,15), col=c(2,3,4,5),
           c("by=3", "by=5", "Custom #1", "Custom #2"), horiz=TRUE, bty="n")
  }
}


# Sanity checks
xOut <- x
YsT <- colBinnedSmoothing(Y, x=x, xOut=xOut)
stopifnot(all(YsT == Y))
stopifnot(all(attr(YsT, "counts") == 1))

xOut <- attr(YsD, "xOut")
YsE <- colBinnedSmoothing(YsD, x=xOut, xOut=xOut)
stopifnot(all(YsE == YsD))
stopifnot(all(attr(YsE, "xOutRange") == attr(YsD, "xOutRange")))
stopifnot(all(attr(YsE, "counts") == 1))

# Scramble ordering of loci
idxs <- sample(x)
x2 <- x[idxs]
Y2 <- Y[idxs,,drop=FALSE]
Y2s <- colBinnedSmoothing(Y2, x=x2, xOut=x2)
stopifnot(all(attr(Y2s, "xOut") == x2))
stopifnot(all(attr(Y2s, "counts") == 1))
stopifnot(all(Y2s == Y2))

xOut <- x[seq(from=2, to=J, by=3)]
YsT <- colBinnedSmoothing(Y, x=x, xOut=xOut)
stopifnot(all(YsT == Ys3))
stopifnot(all(attr(YsT, "counts") == 3))

xOut <- x[seq(from=3, to=J, by=5)]
YsT <- colBinnedSmoothing(Y, x=x, xOut=xOut)
stopifnot(all(YsT == Ys5))
stopifnot(all(attr(YsT, "counts") == 5))

[Package aroma.core version 3.3.1 Index]