probit_bartBMA {bartBMA} | R Documentation |

This function implements probit Bayesian Additive Regression Trees (Chipman et al. 2018) for binary outcomes, using Bayesian Model Averaging (Hernandez et al. 2018).

```
probit_bartBMA(x.train, ...)
## Default S3 method:
probit_bartBMA(
x.train,
y.train,
a = 3,
nu = 3,
sigquant = 0.9,
c = 1000,
pen = 12,
num_cp = 20,
x.test = matrix(0, 0, 0),
num_rounds = 5,
alpha = 0.95,
beta = 2,
split_rule_node = 0,
gridpoint = 0,
maxOWsize = 100,
num_splits = 5,
gridsize = 10,
zero_split = 1,
only_max_num_trees = 1,
min_num_obs_for_split = 2,
min_num_obs_after_split = 2,
exact_residuals = 1,
spike_tree = 0,
s_t_hyperprior = 1,
p_s_t = 0.5,
a_s_t = 1,
b_s_t = 3,
lambda_poisson = 10,
less_greedy = 0,
...
)
```

`x.train` |
Training data covariate matrix. |

`...` |
Further arguments. |

`y.train` |
Training data outcome vector. |

`a` |
This is a parameter that influences the variance of terminal node parameter values. Default value a=3. |

`nu` |
This is a hyperparameter in the distribution of the variance of the error term. The inverse of the variance is distributed as Gamma(nu/2, nu*lambda/2). Default value nu=3. |

`sigquant` |
Calibration quantile for the inverse chi-squared prior on the variance of the error term. Default value sigquant=0.9. |

`c` |
This determines the size of Occam's window. Default value c=1000. |

`pen` |
This is a parameter used by the Pruned Exact Linear Time Algorithm when finding changepoints. Default value pen=12. |

`num_cp` |
This is a number between 0 and 100 that determines the proportion of changepoints proposed by the changepoint detection algorithm to keep when growing trees. Default num_cp=20. |

`x.test` |
Test data covariate matrix. Default x.test=matrix(0.0,0,0). |

`num_rounds` |
Maximum number of trees in a sum-of-trees model. Default num_rounds=5. |

`alpha` |
Parameter in the prior probability of tree node splitting. Default alpha=0.95. |

`beta` |
Parameter in the prior probability of tree node splitting. Default beta=2. |

`split_rule_node` |
Binary variable. If equal to 1, then a new set of potential splitting points is found via a changepoint algorithm after adding each split to a tree. If equal to 0, the same set of potential split points is used for all splits in a tree. Default split_rule_node=0. |

`gridpoint` |
Binary variable. If equal to 1, then a grid search changepoint detection algorithm will be used. If equal to 0, then the Pruned Exact Linear Time (PELT) changepoint detection algorithm will be used (Killick et al. 2012). Default gridpoint=0. |

`maxOWsize` |
Maximum number of models to keep in Occam's window. Default maxOWsize=100. |

`num_splits` |
Maximum number of splits in a tree. Default num_splits=5. |

`gridsize` |
This integer determines the size of the grid to search when finding changepoints for constructing trees. It is only used if gridpoint=1. Default gridsize=10. |

`zero_split` |
Binary variable. If equal to 1, then zero-split trees can be included in a sum-of-trees model. If equal to 0, then only trees with at least one split can be included in a sum-of-trees model. Default zero_split=1. |

`only_max_num_trees` |
Binary variable. If equal to 1, then only sum-of-trees models containing the maximum number of trees, num_rounds, are selected. If equal to 0, then sum-of-trees models containing fewer than num_rounds trees can be selected. The default is only_max_num_trees=1. |

`min_num_obs_for_split` |
This integer determines the minimum number of observations in a (parent) tree node for the algorithm to consider potential splits of the node. Default min_num_obs_for_split=2. |

`min_num_obs_after_split` |
This integer determines the minimum number of observations in a child node resulting from a split in order for a split to occur. If the left or right child node has fewer than this number of observations, then the split cannot occur. Default min_num_obs_after_split=2. |

`exact_residuals` |
Binary variable. If equal to 1, then trees are added to sum-of-trees models within each round of the algorithm by detecting changepoints in the exact residuals. If equal to 0, then changepoints are detected in residuals that are constructed from approximate predictions. Default exact_residuals=1. |

`spike_tree` |
If equal to 1, then the Spike-and-Tree prior will be used, otherwise the standard BART prior will be used. Under the Spike-and-Tree prior, the number of splitting variables has a beta-binomial prior, the number of terminal nodes has a truncated Poisson prior, and a uniform prior is placed on the set of valid constructions of trees given the splitting variables and number of terminal nodes. Default spike_tree=0. |

`s_t_hyperprior` |
If equal to 1 and spike_tree=1, then a beta distribution hyperprior is placed on the variable inclusion probabilities for the Spike-and-Tree prior. The hyperprior parameters are a_s_t and b_s_t. |

`p_s_t` |
If spike_tree=1 and s_t_hyperprior=0, then p_s_t is the prior variable inclusion probability. |

`a_s_t` |
If spike_tree=1 and s_t_hyperprior=1, then a_s_t is a parameter of a beta distribution hyperprior. |

`b_s_t` |
If spike_tree=1 and s_t_hyperprior=1, then b_s_t is a parameter of a beta distribution hyperprior. |

`lambda_poisson` |
This is a parameter for the Spike-and-Tree prior. It is the parameter for the (truncated and conditional on the number of splitting variables) Poisson prior on the number of terminal nodes. |

`less_greedy` |
If equal to 1, then a less greedy model search algorithm is used. Default less_greedy=0. |
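
The roles of `alpha` and `beta` can be made concrete with a short sketch. It assumes the standard BART tree prior (the one used when spike_tree=0), under which a node at depth d splits with prior probability alpha*(1+d)^(-beta). The helper name `split_prob` is hypothetical and is not part of bartBMA.

```r
# Hypothetical helper, not part of bartBMA: prior probability that a node
# at depth d (root = 0) splits, assuming the standard BART tree prior.
split_prob <- function(d, alpha = 0.95, beta = 2) {
  alpha * (1 + d)^(-beta)
}

# Split probabilities for the first few depths under the documented defaults.
depths <- 0:4
probs <- split_prob(depths)
print(round(probs, 4))
```

With the default values, the root node splits with prior probability 0.95 and a node at depth 1 with probability 0.2375, so larger values of `beta` favour shallower trees.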

The following objects are returned by probit_bartBMA:

`fitted.values` |
The vector of predictions of the outcome for all training observations. |

`sumoftrees` |
This is a list of lists of matrices. The outer list corresponds to a list of sum-of-tree models, and each element of the outer list is a list of matrices describing the structure of the trees within a sum-of-tree model. See details. |

`obs_to_termNodesMatrix` |
This is a list of lists of matrices. The outer list corresponds to a list of sum-of-tree models, and each element of the outer list is a list of matrices describing the node to which each observation is allocated at all depths of each tree within a sum-of-tree model. See details. |

`bic` |
This is a vector containing the BIC of each sum-of-tree model. |

`test.preds` |
A vector of test data predictions. This output is only given if there is test data in the input. |

`sum_residuals` |
CURRENTLY INCORRECT OUTPUT. A list (over sum-of-tree models) of lists (over single trees in a model) of vectors of partial residuals. If the maximum number of trees in a model is one, the output is instead a list (over single-tree models) of vectors of partial residuals, which are all equal to the outcome vector. |

`numvars` |
This is the total number of variables in the input training data matrix. |

`call` |
The matched call, as returned by match.call, in which all of the specified arguments are given by their full names. |

`y_minmax` |
Range of the input training data outcome vector. |

`response` |
Input training data outcome vector. |

`nrowTrain` |
Number of observations in the input training data. |

`sigma` |
sd(y.train)/(max(y.train)-min(y.train)) |

`a` |
input parameter |

`nu` |
input parameter |

`lambda` |
parameter determined by the inputs sigma, sigquant, and nu |

`fitted.probs` |
In-sample fitted probabilities |

`fitted.classes` |
In-sample fitted classes |
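
The returned `lambda` is described above only as being determined by `sigma`, `sigquant`, and `nu`. The sketch below shows one way such a value can be obtained, assuming the standard BART calibration: `lambda` is chosen so that, under the Gamma(nu/2, nu*lambda/2) prior on the inverse variance, the prior probability that the error variance is below sigma^2 equals `sigquant`. Whether bartBMA uses exactly this rule is an assumption; the helper name `calibrate_lambda` and the value of `sigma` are hypothetical.

```r
# Hypothetical helper, not part of bartBMA. Assumes the standard BART
# calibration: with precision ~ Gamma(nu/2, rate = nu*lambda/2), pick lambda
# so that P(error variance < sigma^2) = sigquant.
calibrate_lambda <- function(sigma, sigquant = 0.9, nu = 3) {
  sigma^2 * qchisq(1 - sigquant, df = nu) / nu
}

sigma <- 0.5      # illustrative value, not computed from real data
nu <- 3
sigquant <- 0.9
lambda <- calibrate_lambda(sigma, sigquant, nu)

# Check the calibration analytically:
# variance < sigma^2 is equivalent to precision > 1 / sigma^2.
prior_mass <- pgamma(1 / sigma^2, shape = nu / 2, rate = nu * lambda / 2,
                     lower.tail = FALSE)
print(c(lambda = lambda, prior_mass = prior_mass))
```

Under this assumption, a larger `sigquant` yields a smaller `lambda` and so places more prior mass on small error variances.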

```
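# Example from the BART package (McCulloch et al. 2019)
library(bartBMA)

set.seed(99)
n <- 100
x <- sort(-2 + 4 * runif(n))             # one sorted covariate, uniform on (-2, 2)
X <- matrix(x, ncol = 1)
f <- function(x) (1 / 2) * x^3           # true latent function
FL <- function(x) exp(x) / (1 + exp(x))  # logistic function used to simulate outcomes
pv <- FL(f(x))                           # true success probabilities
y <- rbinom(n, 1, pv)                    # simulated binary outcomes

probit_bartBMA(x.train = X, y.train = y)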
```
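
The example above prints the whole fitted object. As a minimal sketch of how the returned elements might be inspected, the code below stores the fit and reads `fitted.probs` and `fitted.classes`, whose names are taken from the list of returned objects. It assumes that bartBMA is installed and that `fitted.classes` uses the same 0/1 coding as `y.train`; neither is guaranteed by this page, so the fit is skipped when the package is unavailable.

```r
# Simulate the same data as the example above.
set.seed(99)
n <- 100
x <- sort(-2 + 4 * runif(n))
X <- matrix(x, ncol = 1)
pv <- plogis((1 / 2) * x^3)   # plogis(z) equals exp(z) / (1 + exp(z))
y <- rbinom(n, 1, pv)

# Only fit if bartBMA is installed, so the sketch runs either way.
if (requireNamespace("bartBMA", quietly = TRUE)) {
  fit <- bartBMA::probit_bartBMA(x.train = X, y.train = y)
  # In-sample fitted probabilities for the first few observations.
  print(head(fit$fitted.probs))
  # Assumption: fitted.classes uses the same 0/1 coding as y.
  print(mean(fit$fitted.classes == y))
} else {
  message("bartBMA is not installed; skipping the fit.")
}
```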

[Package *bartBMA* version 1.0 Index]