stages_hclust {stagedtrees}    R Documentation

Learn a staged tree with hierarchical clustering

Description

Build a staged event tree with k stages for each variable by hierarchical clustering of the stage probabilities.

Usage

stages_hclust(
  object,
  distance = "totvar",
  k = NA,
  method = "complete",
  ignore = object$name_unobserved,
  limit = length(object$tree),
  scope = NULL,
  score = function(x) {
    return(-BIC(x))
  }
)

Arguments

object

an object of class sevt with fitted probabilities and data, as returned by full or sevt_fit.

distance

character, the distance measure to be used, either a possible method for dist or one of the following: "totvar", "hellinger".

k

integer or (named) vector: number of clusters, that is, stages per variable. Values will be recycled if needed. If NA (default), the number of stages is searched for by maximizing the score function. NA and integer values can be mixed to fix the number of stages for some variables and use the score to select it for the others.

method

the agglomeration method to be used in hclust.

ignore

vector of stages which will be ignored and left untouched. By default, the name of the unobserved stages, as stored in object$name_unobserved.

limit

the maximum number of variables to consider.

scope

names of the variables to consider.

score

A function. Score to maximize for automatic selection of the number of stages. Used if k=NA for some variables.
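
As a rough illustration of how k and score combine, the sketch below fixes the number of stages for one variable, lets the score select it for the others, and replaces the default -BIC score with -AIC. It assumes that full(Titanic) orders the variables as Class, Sex, Age, Survived, and the object names are illustrative only.

```r
library(stagedtrees)
data("Titanic")

# Saturated model; the variable order is assumed to follow the table's
# dimnames (Class, Sex, Age, Survived).
mod_full <- full(Titanic)

# Fix 2 stages for Sex; NA requests a score-based search for the others.
model_mixed <- stages_hclust(
  mod_full,
  k = c(Sex = 2, Age = NA, Survived = NA),
  score = function(x) -AIC(x)  # illustrative alternative to the default -BIC
)

# Number of distinct stages per variable after clustering.
sapply(model_mixed$stages, function(s) length(unique(s)))
```

Because the score is maximized, criteria for which lower is better (AIC, BIC) are passed with a minus sign.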

Details

stages_hclust performs hierarchical clustering of the initial stage probabilities in object and aggregates them into the specified number of stages (k). A different number of stages for different variables in the model can be specified by supplying a (named) vector via the argument k. If k is NA for some variables, all possible numbers of stages are checked and the one that maximizes the score is selected.
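
The clustering step for a single variable can be sketched in base R as follows. This is only a minimal illustration, not the package's internal code: the probability table is made up, and the total variation distance is assumed to be the textbook definition (half the L1 distance), which may differ from the definition used internally for "totvar".

```r
# Hypothetical conditional probabilities for four situations (rows)
# of a binary variable; each row sums to 1.
probs <- rbind(
  s1 = c(0.90, 0.10),
  s2 = c(0.85, 0.15),
  s3 = c(0.30, 0.70),
  s4 = c(0.25, 0.75)
)

# Textbook total variation distance between two probability vectors
# (an assumption; the package's own definition may differ).
tv <- function(p, q) 0.5 * sum(abs(p - q))

# Pairwise distance matrix between the rows of probs.
n <- nrow(probs)
D <- matrix(0, n, n, dimnames = list(rownames(probs), rownames(probs)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    D[i, j] <- tv(probs[i, ], probs[j, ])
  }
}

# Complete-linkage clustering, then cut the dendrogram into k = 2 groups;
# situations sharing a group label would be merged into one stage.
hc <- hclust(as.dist(D), method = "complete")
stage <- cutree(hc, k = 2)
stage
```

Here s1 and s2 end up in one group and s3 and s4 in the other, since their probability vectors are pairwise close.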

Value

A staged event tree object.

Examples

data("Titanic")
model <- stages_hclust(full(Titanic, join_unobserved = TRUE, lambda = 1), k = 2)
summary(model)

### or search k via BIC minimization
model1 <- stages_hclust(full(Titanic), k = NA)
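
To see what the score-based search changes, the two approaches can be compared on the same starting model. The sketch below is a hedged illustration with made-up object names; it only reports the BIC of each fitted model and makes no claim about the values obtained.

```r
library(stagedtrees)
data("Titanic")

mod_full <- full(Titanic)

# Fixed number of stages for every variable versus score-based selection.
m_fixed  <- stages_hclust(mod_full, k = 2)
m_search <- stages_hclust(mod_full, k = NA)

# Lower BIC indicates a better trade-off between fit and complexity.
bics <- c(fixed = BIC(m_fixed), searched = BIC(m_search))
bics
```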

[Package stagedtrees version 2.3.0 Index]