bnnSurvival {bnnSurvival}

R Documentation

Bagged k-nearest neighbors survival prediction

Description

Bootstrap aggregated (bagged) version of the k-nearest neighbors survival probability prediction method (Lowsky et al. 2013). In addition to the bootstrapping of training samples, the features can be subsampled in each base learner.
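As a rough illustration of the two resampling steps described above, the following base R sketch draws the training subset for a single base learner. It is not the package's internal code: the toy data frame, the feature names `x1` to `x4`, and the choice of 2 features are assumptions made for the example.

```r
# Illustrative sketch only (hypothetical toy data, not the package internals)
set.seed(1)
toy <- data.frame(time = rexp(20), status = rbinom(20, 1, 0.7),
                  x1 = rnorm(20), x2 = rnorm(20),
                  x3 = rnorm(20), x4 = rnorm(20))
features <- c("x1", "x2", "x3", "x4")

# Bootstrap the observations: sample row indices with replacement
row_idx <- sample(nrow(toy), size = nrow(toy), replace = TRUE)

# Subsample the features: pick 2 of the 4 without replacement
feat_idx <- sample(features, size = 2)

# Data seen by this one base learner: response columns plus chosen features
learner_data <- toy[row_idx, c("time", "status", feat_idx)]
dim(learner_data)  # 20 rows, 4 columns
```

In the bagged method this draw would be repeated once per base learner and the resulting predictions aggregated.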

Usage

bnnSurvival(formula, data, k = max(1, nrow(data)/10),
  num_base_learners = 50, num_features_per_base_learner = NULL,
  metric = "mahalanobis", weighting_function = function(x) x * 0 + 1,
  replace = TRUE, sample_fraction = NULL)

Arguments

`formula`	Object of class formula or character describing the model to fit.
`data`	Training data of class data.frame.
`k`	Number of nearest neighbors to use. If a vector is given, the optimal k among these values is selected using 5-fold cross-validation.
`num_base_learners`	Number of base learners to use for bootstrapping.
`num_features_per_base_learner`	Number of features randomly selected in each base learner. Default: all.
`metric`	Metric d(x,y) used to measure the distance between observations. Currently only "mahalanobis" is supported.
`weighting_function`	Weighting function w(d(x,y)) used to weight the observations based on their distance.
`replace`	Logical. If TRUE, observations are sampled with replacement; if FALSE, without replacement.
`sample_fraction`	Fraction of observations to sample in [0,1]. Default is 1 for `replace=TRUE`, and 0.6321 for `replace=FALSE`.
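The default of 0.6321 for `replace=FALSE` matches 1 - exp(-1), the limiting expected fraction of distinct observations in a bootstrap sample drawn with replacement. That this is the reason for the default is an assumption here, not something stated on this page; the short check below only verifies the arithmetic.

```r
# Expected fraction of distinct observations in a bootstrap sample of size n
# drawn with replacement is 1 - (1 - 1/n)^n, which tends to 1 - exp(-1).
n <- 1000
expected_fraction <- 1 - (1 - 1/n)^n
limit_fraction <- 1 - exp(-1)

round(limit_fraction, 4)  # 0.6321
```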

Details

For a description of the k-nearest neighbors survival probability prediction method, see Lowsky et al. (2013). Note that parallel processing, as currently implemented, does not work on Microsoft Windows platforms.

The weighting function needs to be defined for all distances >= 0. The default function is the constant 1; a possible alternative is w(x) = 1/(1+x).
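The two weighting functions mentioned above can be written in R as follows. The names `w_const` and `w_inverse` and the distance vector `d` are hypothetical, chosen only for this illustration.

```r
# Default weighting, as in the Usage section: constant 1 for every distance
w_const <- function(x) x * 0 + 1

# Alternative from the text: weights decay as the distance grows
w_inverse <- function(x) 1 / (1 + x)

d <- c(0, 0.5, 1, 3)  # hypothetical distances, all >= 0

w_const(d)    # 1 1 1 1
w_inverse(d)  # 1.0000000 0.6666667 0.5000000 0.2500000
```

Both functions are vectorized and defined at distance 0. An alternative such as `w_inverse` would be supplied through the `weighting_function` argument.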

To use the non-bagged version as in Lowsky et al. (2013), set `num_base_learners = 1`, `replace = FALSE` and `sample_fraction = 1`.
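A call using these settings might look like the sketch below. The `veteran` data from the survival package, the chosen covariates, and `k = 20` are assumptions for illustration; only the three arguments named above are what make the fit non-bagged.

```r
library(survival)
library(bnnSurvival)

## Use only 1 core
options(mc.cores = 1)

## Single base learner fitted on all observations, without resampling
model <- bnnSurvival(Surv(time, status) ~ karno + diagtime + age, veteran,
                     k = 20,
                     num_base_learners = 1,
                     replace = FALSE,
                     sample_fraction = 1)
```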

Value

bnnSurvivalEnsemble object. Use predict() with a new data set to predict survival probabilities.

Author(s)

Marvin N. Wright

References

Lowsky, D.J. et al. (2013). A K-nearest neighbors survival probability prediction method. Stat Med, 32(12), 2062-2069.

Examples

require(bnnSurvival)

## Use only 1 core
options(mc.cores = 1)

## Load a dataset and split it into training and test data
require(survival)
n <- nrow(veteran)
idx <- sample(n, 2/3*n)
train_data <- veteran[idx, ]
test_data <- veteran[-idx, ]

## Create model with training data and predict for test data
model <- bnnSurvival(Surv(time, status) ~ trt + karno + diagtime + age + prior, train_data, 
                     k = 20, num_base_learners = 10, num_features_per_base_learner = 3)
result <- predict(model, test_data)

## Plot survival curve for the first observation
plot(timepoints(result), predictions(result)[1, ])

[Package bnnSurvival version 0.1.5 Index]