bassPCA {BASS}R Documentation

Bayesian Adaptive Spline Surfaces (BASS) with PCA decomposition of response

Description

Decomposes a multivariate or functional response onto a principal component basis and fits a BASS model to each basis coefficient.

Usage

bassPCA(
  xx = NULL,
  y = NULL,
  dat = NULL,
  n.pc = NULL,
  perc.var = 99,
  n.cores = 1,
  parType = "fork",
  center = T,
  scale = F,
  ...
)

Arguments

xx

a data frame or matrix of predictors with dimension n x p. Categorical predictors should be included as factors.

y

a response matrix (functional response) with dimension n x m.

dat

optional (for more control) list with elements xx (same as above), y (same as above), n.pc (number of principal components used), basis (principal components with dimension m x n.pc), newy (reduced dimension y with dimension n.pc x n), trunc.error (optional truncation error with dimension n x m), y.m (vector mean removed before PCA with dimension m), y.s (vector sd scaled before PCA with dimension m). If dat is specified, xx, y and n.pc do not need to be specified.

n.pc

number of principal components to use

perc.var

optionally specify percent of variance to explain instead of n.pc

n.cores

integer number of cores (threads) to use

parType

either "fork" or "socket". Forking is typically faster, but not compatible with Windows. If n.cores==1, parType is ignored.

center

logical whether to subtract the mean before getting the principal components, or else a numeric vector of dimension m for the center to be used

scale

logical whether to divide by the standard deviation before getting the principal components, or else a numeric vector of dimension m for the scale to be used

...

arguements to be passed to bass function calls.

Details

Gets the PCA decomposition of the response y, and fits a bass model to each PCA basis coefficient, bass(dat$xx,dat$newy[i,],...) for i in 1 to n.pc, possibly in parallel.

Value

An object of class 'bassBasis' with two elements:

mod.list

list (of length n.pc) of individual bass models

dat

same as dat above

See Also

predict.bassBasis for prediction and sobolBasis for sensitivity analysis.

Examples

## Not run: 
  ## simulate data (Friedman function)
f<-function(x){
  10*sin(pi*x[,1]*x[,2])+20*(x[,3]-.5)^2+10*x[,4]+5*x[,5]
}
## simulate data (Friedman function with first variable as functional)
sigma<-.1 # noise sd
n<-500 # number of observations
nfunc<-50 # size of functional variable grid
xfunc<-seq(0,1,length.out=nfunc) # functional grid
x<-matrix(runif(n*9),n,9) # 9 non-functional variables, only first 4 matter
X<-cbind(rep(xfunc,each=n),kronecker(rep(1,nfunc),x)) # to get y
y<-matrix(f(X),nrow=n)+rnorm(n*nfunc,0,sigma)

## fit BASS
library(parallel)
mod<-bassPCA(x,y,n.pc=5,n.cores=min(5,parallel::detectCores()))
plot(mod$mod.list[[1]])
plot(mod$mod.list[[2]])
plot(mod$mod.list[[3]])
plot(mod$mod.list[[4]])
plot(mod$mod.list[[5]])

hist(mod$dat$trunc.error)

## prediction
npred<-100
xpred<-matrix(runif(npred*9),npred,9)
Xpred<-cbind(rep(xfunc,each=npred),kronecker(rep(1,nfunc),xpred))
ypred<-matrix(f(Xpred),nrow=npred)
pred<-predict(mod,xpred,mcmc.use=1:1000) # posterior predictive samples (each is a curve)
matplot(ypred,apply(pred,2:3,mean),type='l',xlab='observed',ylab='mean prediction')
abline(a=0,b=1,col=2)
matplot(t(ypred),type='l') # actual
matplot(t(apply(pred,2:3,mean)),type='l') # mean prediction

## sensitivity
sens<-sobolBasis(mod,int.order = 2,ncores = max(parallel::detectCores()-2,1),
                 mcmc.use=1000) # for speed, only use a few samples
plot(sens)


## calibration
x.true<-runif(9,0,1) # what we are trying to learn
yobs<-f(cbind(xfunc,kronecker(rep(1,nfunc),t(x.true)))) +
  rnorm(nfunc,0,.1) # calibration data (with measurement error)
plot(yobs)

cal<-calibrate.bassBasis(y=yobs,mod=mod,
                         discrep.mean=rep(0,nfunc),
                         discrep.mat=diag(nfunc)[,1:2]*.0000001,
                         sd.est=.1,
                         s2.df=50,
                         s2.ind=rep(1,nfunc),
                         meas.error.cor=diag(nfunc),
                         bounds=rbind(rep(0,9),rep(1,9)),
                         nmcmc=10000,
                         temperature.ladder=1.05^(0:30),type=1)

nburn<-5000
uu<-seq(nburn,10000,5)

pairs(rbind(cal$theta[uu,1,],x.true),col=c(rep(1,length(uu)),2),ylim=c(0,1),xlim=c(0,1))

pred<-apply(predict(mod,cal$theta[uu,1,],nugget = T,trunc.error = T,
      mcmc.use = cal$ii[uu]),3,function(x) diag(x)+rnorm(length(uu),0,sqrt(cal$s2[uu,1,1])))
qq<-apply(pred,2,quantile,probs=c(.025,.975))
matplot(t(qq),col='lightgrey',type='l')
lines(yobs,lwd=3)



## End(Not run)
## minimal example for CRAN testing
mod<-bassPCA(1:10,matrix(1:20,10),n.pc=2,nmcmc=2,nburn=1)

[Package BASS version 1.3.1 Index]