DDCpredict {cellWise}

R Documentation

DDCpredict

Description

Based on a DDC fit on an initial (training) data set X, this function analyzes a new (test) data set Xnew.

Usage

DDCpredict(Xnew, InitialDDC, DDCpars = NULL)

Arguments

`Xnew`	The new data (test data), which must be a matrix or a data frame. It must always be provided. Its columns (variables) should correspond to those of `InitialDDC$remX`.
`InitialDDC`	The output of the `DDC` function on the initial (training) data set. Must be provided.
`DDCpars`	The input options to be used for the prediction. By default the options of `InitialDDC` are used.
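The sketch below illustrates how `DDCpars` might be used to change an option for the prediction step only. The simulated data and the object names (`train`, `test`, `fit`) are hypothetical, and the sketch assumes that a partial list such as `list(tolProb = 0.999)` is accepted and combined with the options stored in `InitialDDC`; check the package source before relying on that behavior.

```r
library(cellWise)
library(MASS)

set.seed(1)
d <- 5
Sigma <- matrix(0.9, d, d); diag(Sigma) <- 1

# Hypothetical training and test data with strongly correlated columns
train <- mvrnorm(100, rep(0, d), Sigma)
test  <- mvrnorm(20, rep(0, d), Sigma)
test[1, 2] <- 10   # plant one outlying cell in the test data

# Fit on the training data; silent = TRUE suppresses progress messages
fit <- DDC(train, DDCpars = list(silent = TRUE))

# Prediction with the options stored in the initial fit
pred_default <- DDCpredict(test, fit)

# Prediction with a different tolerance probability
# (assumption: the partial list overrides only tolProb)
pred_strict <- DDCpredict(test, fit, DDCpars = list(tolProb = 0.999))

# The options actually used are returned in the output
pred_default$DDCpars$tolProb
pred_strict$DDCpars$tolProb
```

Comparing `pred_default$indcells` with `pred_strict$indcells` shows how the choice of `tolProb` affects which cells are flagged.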

Value

A list with components:

`DDCpars`	the options used in the call, see `DDC`.
`locX`	the locations of the columns, from `InitialDDC`.
`scaleX`	the scales of the columns, from `InitialDDC`.
`Z`	`Xnew` standardized by `locX` and `scaleX`.
`nbngbrs`	predictions use a combination of `nbngbrs` columns.
`ngbrs`	for each column, the list of its neighbors, from `InitialDDC`.
`robcors`	for each column, the correlations with its neighbors, from `InitialDDC`.
`robslopes`	slopes to predict each column by its neighbors, from `InitialDDC`.
`deshrinkage`	for each connected column, its deshrinkage factor used in `InitialDDC`.
`Xest`	predicted values for every cell of `Xnew`.
`scalestres`	scale estimate of the residuals (`Xnew` - `Xest`), from `InitialDDC`.
`stdResid`	columnwise standardized residuals of `Xnew`.
`indcells`	positions of cellwise outliers in `Xnew`.
`Ti`	outlyingness of rows in `Xnew`.
`medTi`	median of the `Ti` in `InitialDDC`.
`madTi`	median absolute deviation (MAD) of the `Ti` in `InitialDDC`.
`indrows`	row numbers of the outlying rows in `Xnew`.
`indNAs`	positions of the `NA`'s in `Xnew`.
`indall`	positions of `NA`'s and outlying cells in `Xnew`.
`Ximp`	`Xnew` where all cells in `indall` are imputed by their prediction.
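As a rough illustration of how the components listed above can be inspected, the following sketch fits `DDC` on simulated training data and looks at a few elements of the `DDCpredict` output. The data, seed and object names (`train`, `test`, `fit`, `pred`) are assumptions made for this example only.

```r
library(cellWise)
library(MASS)

set.seed(1)
d <- 5
Sigma <- matrix(0.9, d, d); diag(Sigma) <- 1

# Hypothetical training and test data
train <- mvrnorm(100, rep(0, d), Sigma)
test  <- mvrnorm(20, rep(0, d), Sigma)
test[1, 2] <- 10   # plant one outlying cell
test[3, 4] <- NA   # and one missing cell

fit  <- DDC(train, DDCpars = list(silent = TRUE))
pred <- DDCpredict(test, fit)

pred$indcells      # positions of cells flagged as outlying in the test data
pred$indNAs        # positions of the NA's in the test data
pred$indall        # positions of NA's and flagged cells
pred$indrows       # row numbers flagged as outlying

# Standardized residuals and imputed data have the shape of the test data
dim(pred$stdResid)
dim(pred$Ximp)

# Compare original, predicted and imputed values for the planted cell
c(original = test[1, 2], predicted = pred$Xest[1, 2], imputed = pred$Ximp[1, 2])
```

The location and scale estimates (`locX`, `scaleX`) and the neighbor information (`ngbrs`, `robcors`, `robslopes`) are taken from the initial fit, so they should match the corresponding components of `fit`.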

Author(s)

Rousseeuw P.J., Van den Bossche W.

References

Hubert, M., Rousseeuw, P.J., Van den Bossche, W. (2019). MacroPCA: An all-in-one PCA method allowing for missing values as well as cellwise and rowwise outliers. Technometrics, 61(4), 459-473.

Examples

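library(MASS)
library(cellWise)
set.seed(12345)
n <- 100; d <- 10
# Covariance matrix with unit variances and all correlations equal to 0.9
A <- matrix(0.9, d, d); diag(A) <- 1
# Training data: n draws from a multivariate normal distribution
x <- mvrnorm(n, rep(0, d), A)
# Contaminate: set 50 random cells to NA, then set 50 random cells to 10
x[sample(1:(n * d), 50, FALSE)] <- NA
x[sample(1:(n * d), 50, FALSE)] <- 10
# Prepend a column of case numbers
x <- cbind(1:n, x)
DDCx <- DDC(x)
# Test data: 50 new rows with d columns, of which 50 random cells are set to 10
xnew <- mvrnorm(50, rep(0, d), A)
xnew[sample(1:(50 * d), 50, FALSE)] <- 10
predict.out <- DDCpredict(xnew, DDCx)
# Draw a cell map of the standardized residuals of the test data
cellMap(D = xnew, R = predict.out$stdResid,
        columnlabels = 1:d, rowlabels = 1:50)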

# For more examples, we refer to the vignette:
## Not run: 
vignette("DDC_examples")

## End(Not run)

[Package cellWise version 2.5.3 Index]