nft2 {nftbart} | R Documentation |
Fit NFT BART models.
Description
The nft2()/nft() function is for fitting NFT BART (Nonparametric Failure Time Bayesian Additive Regression Tree) models with different train/test matrices for the f and sd functions.
Usage
nft2(
## data
xftrain, xstrain, times, delta=NULL,
xftest=matrix(nrow=0, ncol=0),
xstest=matrix(nrow=0, ncol=0),
rm.const=TRUE, rm.dupe=TRUE,
## multi-threading
tc=getOption("mc.cores", 1),
## MCMC
nskip=1000, ndpost=2000, nadapt=1000, adaptevery=100,
chvf=NULL, chvs=NULL,
method="spearman", use="pairwise.complete.obs",
pbd=c(0.7, 0.7), pb=c(0.5, 0.5),
stepwpert=c(0.1, 0.1), probchv=c(0.1, 0.1),
minnumbot=c(5, 5),
## BART and HBART prior parameters
ntree=c(50, 10), numcut=100,
xifcuts=NULL, xiscuts=NULL,
power=c(2, 2), base=c(0.95, 0.95),
## f function
fmu=NA, k=5, tau=NA, dist='weibull',
## s function
total.lambda=NA, total.nu=10, mask=NULL,
## survival analysis
K=100, events=NULL, TSVS=FALSE,
## DPM LIO
drawDPM=1L,
alpha=1, alpha.a=1, alpha.b=0.1, alpha.draw=1,
neal.m=2, constrain=1,
m0=0, k0.a=1.5, k0.b=7.5, k0=1, k0.draw=1,
a0=3, b0.a=2, b0.b=1, b0=1, b0.draw=1,
## misc
na.rm=FALSE, probs=c(0.025, 0.975), printevery=100,
transposed=FALSE, pred=FALSE
)
nft(
## data
x.train, times, delta=NULL, x.test=matrix(nrow=0, ncol=0),
rm.const=TRUE, rm.dupe=TRUE,
## multi-threading
tc=getOption("mc.cores", 1),
## MCMC
nskip=1000, ndpost=2000, nadapt=1000, adaptevery=100,
chv=NULL,
method="spearman", use="pairwise.complete.obs",
pbd=c(0.7, 0.7), pb=c(0.5, 0.5),
stepwpert=c(0.1, 0.1), probchv=c(0.1, 0.1),
minnumbot=c(5, 5),
## BART and HBART prior parameters
ntree=c(50, 10), numcut=100, xicuts=NULL,
power=c(2, 2), base=c(0.95, 0.95),
## f function
fmu=NA, k=5, tau=NA, dist='weibull',
## s function
total.lambda=NA, total.nu=10, mask=NULL,
## survival analysis
K=100, events=NULL, TSVS=FALSE,
## DPM LIO
drawDPM=1L,
alpha=1, alpha.a=1, alpha.b=0.1, alpha.draw=1,
neal.m=2, constrain=1,
m0=0, k0.a=1.5, k0.b=7.5, k0=1, k0.draw=1,
a0=3, b0.a=2, b0.b=1, b0=1, b0.draw=1,
## misc
na.rm=FALSE, probs=c(0.025, 0.975), printevery=100,
transposed=FALSE, pred=FALSE
)
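Both signatures default tc to getOption("mc.cores", 1). The following is a minimal base R sketch of how that default resolves, assuming nothing about the package beyond the Usage above; the value 4 is an arbitrary illustration, not a recommendation.

```r
## Start from an unset option so the fallback is visible.
old <- options(mc.cores = NULL)

## With the option unset, getOption() returns its second argument.
tc_default <- getOption("mc.cores", 1)

## After setting the option, the same call picks up the new value.
options(mc.cores = 4)   # arbitrary illustrative value
tc_set <- getOption("mc.cores", 1)

## Restore whatever was set before.
options(old)

print(c(default = tc_default, set = tc_set))
```

Setting options(mc.cores=...) once per session therefore changes the default thread count without passing tc explicitly on every call.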
Arguments
xftrain |
n x pf matrix of predictor variables for the training data. |
xstrain |
n x ps matrix of predictor variables for the training data. |
x.train |
n x p matrix of predictor variables for the training data. |
times |
n x 1 vector of the observed times for the training data. |
delta |
n x 1 vector of the time type for the training data: 0 for right-censoring, 1 for an event, and 2 for left-censoring. |
xftest |
m x pf matrix of predictor variables for the test set. |
xstest |
m x ps matrix of predictor variables for the test set. |
x.test |
m x p matrix of predictor variables for the test set. |
rm.const |
Whether to remove constant variables. |
rm.dupe |
Whether to remove duplicate variables. |
tc |
Number of OpenMP threads to use. |
nskip |
Number of MCMC iterations to burn-in and discard. |
ndpost |
Number of MCMC iterations kept after burn-in. |
nadapt |
Number of MCMC iterations for adaptation prior to burn-in. |
adaptevery |
Adapt MCMC proposal distributions every adaptevery iterations. |
chvf , chvs , chv |
Predictor correlation matrix used as a pre-conditioner for MCMC change-of-variable proposals. |
method , use |
Correlation options for change-of-variable proposal pre-conditioner. |
pbd |
Probability of performing a birth/death proposal; otherwise, a rotate proposal is performed. |
pb |
Probability of performing a birth proposal given that we choose to perform a birth/death proposal. |
stepwpert |
Initial width of proposal distribution for perturbing cut-points. |
probchv |
Probability of performing a change-of-variable proposal. Otherwise, only do a perturb proposal. |
minnumbot |
Minimum number of observations required in leaf (terminal) nodes. |
ntree |
Vector of length two for the number of trees used for the mean model and the number of trees used for the variance model. |
numcut |
Number of cutpoints to use for each predictor variable. |
xifcuts , xiscuts , xicuts |
More detailed construction of cut-points can be specified
by the |
power |
Power parameter in the tree depth penalizing prior. |
base |
Base parameter in the tree depth penalizing prior. |
fmu |
Prior parameter for the center of the mean model. |
k |
Prior parameter for the mean model. |
tau |
Desired |
dist |
Distribution to be passed to intercept-only AFT model to center |
total.lambda |
A rudimentary estimate of the process standard deviation. Used in calibrating the variance prior. |
total.nu |
Shape parameter for the variance prior. |
mask |
If a proportion is provided, then said quantile
of |
K |
Number of grid points for which to estimate survival probability. |
events |
Grid points for which to estimate survival probability. |
TSVS |
Setting to |
drawDPM |
Whether to utilize DPM or not. |
alpha |
Initial value of DPM concentration parameter. |
alpha.a |
Gamma prior parameter setting for DPM concentration parameter
where E[ |
alpha.b |
See |
alpha.draw |
Whether to draw |
neal.m |
The number of additional atoms for Neal 2000 DPM algorithm 8. |
constrain |
Whether to perform constrained or unconstrained DPM. |
m0 |
Center of the error distribution: defaults to zero. |
k0.a |
First Gamma prior argument for |
k0.b |
Second Gamma prior argument for |
k0 |
Initial value of |
k0.draw |
Whether to fix k0 or draw it from the DPM LIO prior
hierarchy: |
a0 |
First Gamma prior argument for |
b0.a |
First Gamma prior argument for |
b0.b |
Second Gamma prior argument for |
b0 |
Initial value of |
b0.draw |
Whether to fix b0 or draw it from the DPM LIO prior
hierarchy: |
na.rm |
Value to be passed to the |
probs |
Value to be passed to the |
printevery |
Outputs MCMC algorithm status every printevery iterations. |
transposed |
Specify |
pred |
Specify |
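The descriptions of pbd and pb above imply a two-stage choice of tree proposal: first birth/death versus rotate, then birth versus death. The sketch below works out the resulting marginal probabilities for the default values. It uses only the first element of each default vector and is an arithmetic illustration of the argument descriptions, not the package's internal sampler.

```r
## Default values from the Usage section (first element of each vector).
pbd <- 0.7   # P(birth/death proposal); otherwise rotate
pb  <- 0.5   # P(birth | birth/death proposal)

## Marginal probability of each tree-structure proposal.
move_probs <- c(birth  = pbd * pb,
                death  = pbd * (1 - pb),
                rotate = 1 - pbd)
print(move_probs)
```

With the defaults, birth and death are each proposed with probability 0.35 and rotate with probability 0.3.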
Details
nft2()/nft() is the function to fit time-to-event data. The most general form of the model allowed is
Y(\mathbf{x}) = \mu + f(\mathbf{x}) + sd(\mathbf{x}) Z
where Z follows a nonparametric error distribution by default.
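To make the roles of the mean function and the sd function concrete, here is a small base R simulation from a model of this form. Everything in it is hypothetical: the functions f_fun and sd_fun, the constant mu, and the standard normal error are chosen for illustration only (the package's default error distribution is nonparametric, not normal).

```r
set.seed(42)
n  <- 200
x  <- runif(n)                               # one hypothetical predictor
mu <- 1                                      # hypothetical constant
f_fun  <- function(x) sin(2 * pi * x)        # hypothetical mean function f
sd_fun <- function(x) 0.2 + 0.5 * x          # hypothetical sd function, always > 0
Z  <- rnorm(n)                               # normal error, for illustration only

## Response generated as constant + mean function + heteroskedastic error.
Y <- mu + f_fun(x) + sd_fun(x) * Z

## The spread of Y around mu + f(x) grows with x because sd_fun() does.
resid <- Y - (mu + f_fun(x))
print(tapply(abs(resid), x > 0.5, mean))
```

In nft2(), the f and sd parts can depend on different predictor matrices (xftrain and xstrain), whereas this sketch uses a single x for both.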
The nft2()/nft() function returns a fit object of S3 class type nft2/nft that is essentially a list containing the following items.
Value
ots , oid , ovar , oc , otheta |
These are |
sts , sid , svar , sc , stheta |
Similarly, these are |
fmu |
The constant |
f.train , s.train |
The trained |
f.train.mean , s.train.mean |
The posterior mean of the trained
|
f.trees , s.trees |
Character strings representing the trained fits
of |
dpalpha |
The draws of the DPM concentration parameter
|
dpn , dpn. |
The number of atom clusters per DPM, |
dpmu |
The draws of the DPM parameter |
dpmu. |
The draws of the DPM parameter |
dpwt. |
The weights for efficient DPM calculations by atom clusters
(as opposed to subjects) for use with |
dpsd , dpsd. |
Similarly, the draws of the DPM parameter |
dpC |
The indices |
z.train |
The data values/augmentation draws of |
f.tmind/f.tavgd/f.tmaxd |
The min/average/max tier degree of trees in the |
s.tmind/s.tavgd/s.tmaxd |
The min/average/max tier degree of trees in the |
f.varcount , s.varcount |
Variable importance counts of branch
decision rules for each |
f.varcount.mean , s.varcount.mean |
Similarly, the posterior mean
of the variable importance counts for each |
f.varprob , s.varprob |
Similarly, re-weighting the posterior mean
of the variable importance counts as sum-to-one probabilities for each
|
LPML |
The log Pseudo-Marginal Likelihood as typically calculated for right-/left-censoring. |
pred |
The object returned from the |
soffset |
See |
aft |
The AFT model fit used to initialize NFT BART. |
elapsed |
The elapsed time of the run in seconds. |
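The f.varprob and s.varprob items are described above as the posterior mean variable counts re-weighted to sum to one. A minimal sketch of that kind of normalization follows; the counts and variable names are made up, and the package's exact computation may differ.

```r
## Hypothetical posterior mean counts of branch decision rules per variable.
varcount_mean <- c(age = 12.4, sex = 3.1, ph.ecog = 8.5)

## Re-weight the counts so that they sum to one.
varprob <- varcount_mean / sum(varcount_mean)

print(round(varprob, 3))
```

Read this way, each entry is the share of branch decision rules that used the given variable.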
Author(s)
Rodney Sparapani: rsparapa@mcw.edu
References
Sparapani R., Logan B., Maiers M., Laud P., McCulloch R. (2023) Nonparametric Failure Time: Time-to-event Machine Learning with Heteroskedastic Bayesian Additive Regression Trees and Low Information Omnibus Dirichlet Process Mixtures. Biometrics (ahead of print). <doi:10.1111/biom.13857>
See Also
Examples
library(nftbart)
data(lung)
N=length(lung$status)
##lung$status: 1=censored, 2=dead
##delta: 0=censored, 1=dead
delta=lung$status-1
## this study reports time in days rather than weeks or months
times=lung$time
times=times/7 ## weeks
## matrix of covariates
x.train=cbind(lung[ , -(1:3)])
## lung$sex: Male=1 Female=2
## token run just to test installation
post=nft2(x.train, x.train, times, delta, K=0,
          nskip=0, ndpost=10, nadapt=4, adaptevery=1)
set.seed(99)
post=nft2(x.train, x.train, times, delta, K=0)
XPtr=TRUE
## stack two copies of the training data and set sex (column 2)
## to 1 (male) in the first copy and 2 (female) in the second
x.test = rbind(x.train, x.train)
x.test[ , 2]=rep(1:2, each=N)
## grid of K time points (in weeks) at which to estimate survival
K=75
events=seq(0, 150, length.out=K+1)
pred = predict(post, x.test, x.test, K=K, events=events[-1],
               XPtr=XPtr, FPD=TRUE)
## males: elements 1:K; the leading 1 is survival at time 0
plot(events, c(1, pred$surv.fpd.mean[1:K]), type='l', col=4,
     ylim=0:1,
     xlab=expression(italic(t)), sub='weeks',
     ylab=expression(italic(S)(italic(t), italic(x))))
lines(events, c(1, pred$surv.fpd.upper[1:K]), lty=2, lwd=2, col=4)
lines(events, c(1, pred$surv.fpd.lower[1:K]), lty=2, lwd=2, col=4)
## females: elements K+1:K
lines(events, c(1, pred$surv.fpd.mean[K+1:K]), lwd=2, col=2)
lines(events, c(1, pred$surv.fpd.upper[K+1:K]), lty=2, lwd=2, col=2)
lines(events, c(1, pred$surv.fpd.lower[K+1:K]), lty=2, lwd=2, col=2)
legend('topright', c('Adv. lung cancer\nmortality example',
                     'M', 'F'), lwd=2, col=c(0, 4, 2), lty=1)
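
The plotting code above indexes the prediction vectors with 1:K and K+1:K. Because `:` binds more tightly than `+` in R, K+1:K means K+(1:K), that is, elements K+1 through 2K. The sketch below checks this in base R without needing a fitted model; the reading that the first K elements belong to the first stacked copy of x.test and the next K to the second is inferred from the example, not stated elsewhere on this page.

```r
K <- 75

first_block  <- 1:K        # 1, 2, ..., 75
second_block <- K + 1:K    # parsed as K + (1:K): 76, 77, ..., 150

## Same values as the explicitly parenthesized range.
same_values <- all(second_block == (K + 1):(2 * K))
print(same_values)
print(range(second_block))

## By contrast, (K + 1):K is the two-element descending sequence 76, 75.
print((K + 1):K)
```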