initialize {ddtlcm}    R Documentation

Initialize the MH-within-Gibbs algorithm for DDT-LCM

Description

Initialize the MH-within-Gibbs algorithm for DDT-LCM

Usage

initialize(
  K,
  data,
  item_membership_list,
  c = 1,
  c_order = 1,
  method_lcm = "random",
  method_dist = "euclidean",
  method_hclust = "ward.D",
  method_add_root = "min_cor",
  fixed_initials = list(),
  fixed_priors = list(),
  alpha = 0,
  theta = 0,
  ...
)

Arguments

K

number of classes (integer)

data

an NxJ matrix of multivariate binary responses, where N is the number of individuals, and J is the number of granular items

item_membership_list

a list of G elements, where the g-th element contains the column indices of data corresponding to items in major group g
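
As a rough illustration only, assuming a hypothetical layout of J = 7 items split into G = 3 major groups (the group sizes are invented for this sketch and are not taken from any ddtlcm data set), such a list could be built in base R like this:

```r
# Hypothetical layout: 7 items in 3 major groups (sizes chosen for illustration only)
group_sizes <- c(2, 3, 2)
J <- sum(group_sizes)
# assign consecutive column indices of the data matrix to each major group
item_membership_list <- split(seq_len(J), rep(seq_along(group_sizes), times = group_sizes))
# drop the names so the result is a plain list of G integer vectors
item_membership_list <- unname(item_membership_list)
item_membership_list[[2]]  # column indices of the items in major group 2
```

Each column index of data should appear in exactly one element of the list.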

c

hyperparameter of the divergence function a(t)

c_order

equals 1 (default) or 2 to choose divergence function a(t) = c/(1-t) or c/(1-t)^2.
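
The two forms of the divergence function can be written as a small helper. This is a minimal sketch for intuition; divergence_rate is a hypothetical name and is not a function exported by ddtlcm.

```r
# Divergence function a(t) = c / (1 - t)^c_order, defined for 0 <= t < 1.
# divergence_rate is a hypothetical helper, not part of the ddtlcm API.
divergence_rate <- function(t, c = 1, c_order = 1) {
  stopifnot(c_order %in% c(1, 2), all(t >= 0 & t < 1))
  c / (1 - t)^c_order
}
divergence_rate(0.5, c = 1, c_order = 1)  # 2
divergence_rate(0.5, c = 1, c_order = 2)  # 4
```

With c_order = 2 the rate grows faster as t approaches 1 than with c_order = 1.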

method_lcm

a character string. If "random" (default), the initial LCM parameters will be random values. If "poLCA", the initial LCM parameters will be EM algorithm estimates from the poLCA function.

method_dist

string specifying the distance measure to be used in dist(). This must be one of "euclidean" (default), "maximum", "manhattan", "canberra", "binary" or "minkowski". Any unambiguous substring can be given.

method_hclust

string specifying the agglomeration method to be used in hclust(). This should be (an unambiguous abbreviation of) one of "ward.D" (default), "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).
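
The values of method_dist and method_hclust are those accepted by the base R functions dist() and hclust(). The sketch below shows how the two calls combine on a toy matrix. The K x J matrix of class profiles is random and hypothetical; this page does not state exactly which quantity initialize() clusters internally.

```r
set.seed(1)
K <- 4
J <- 6
# hypothetical K x J matrix of class-specific profiles (random values for illustration)
profiles <- matrix(rnorm(K * J), nrow = K,
                   dimnames = list(paste0("class", seq_len(K)), NULL))
d <- dist(profiles, method = "euclidean")  # pairwise distances between the K rows
hc <- hclust(d, method = "ward.D")         # agglomerative clustering of the K rows
hc$merge                                   # (K - 1) x 2 matrix describing the merges
```

The resulting hclust object is unrooted in the DDT sense, which is why method_add_root is needed to attach an initial branch.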

method_add_root

string specifying the method to add the initial branch to the tree output from hclust(). This should be one of "min_cor" (the absolute value of the minimum between-class correlation; default) or "sample_ddt" (randomly sample a small divergence time from the DDT process with c = 100).

fixed_initials

a named list of fixed initial values, including the initial values for tree ("phylo4d"), response_prob, class_probability, class_assignments, Sigma_by_group, and c. Default is an empty list, list(). See

fixed_priors

a named list of fixed prior hyperparameters, including the Gamma prior for c, inverse-Gamma prior for sigma_g^2, and Dirichlet prior for pi. Moreover, we allow for a type III generalized logistic distribution such that f(eta; a_pg) = theta. This becomes a standard logistic distribution when a_pg = 1. See Dalla Valle, L., Leisen, F., Rossini, L., & Zhu, W. (2021). A Pólya–Gamma sampler for a generalized logistic regression. Journal of Statistical Computation and Simulation, 91(14), 2899-2916. An example input list is list("shape_c" = 1, "rate_c" = 1, "shape_sigma" = rep(2, G), "rate_sigma" = rep(2, G), "a_pg" = 1.0), where G is the number of major item groups. Default is an empty list, list().

alpha, theta

hyperparameters of the branching probability a(t) Gamma(m-alpha) / Gamma(m+1+theta). For DDT, alpha = theta = 0.
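
To see what the DDT special case looks like, the Gamma ratio in the expression above can be evaluated numerically. This is a sketch only: branching_factor is a hypothetical helper, not part of ddtlcm, and m is treated simply as a positive number because this page does not define it.

```r
# Gamma(m - alpha) / Gamma(m + 1 + theta), computed on the log scale for stability.
# branching_factor is a hypothetical helper, not part of the ddtlcm API.
branching_factor <- function(m, alpha = 0, theta = 0) {
  exp(lgamma(m - alpha) - lgamma(m + 1 + theta))
}
branching_factor(1:4)  # with alpha = theta = 0 this reduces to 1/m
```

With alpha = theta = 0 the ratio is Gamma(m) / Gamma(m+1) = 1/m, so the full expression becomes a(t) / m.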

...

optional arguments for the poLCA function

Value

phylo4d object of tree topology

See Also

ddtlcm_fit()

Other initialization functions: initialize_hclust(), initialize_poLCA()

Examples

# load the synthetic example data
data(data_synthetic)
# extract elements into the global environment
list2env(setNames(data_synthetic, names(data_synthetic)), envir = globalenv())
K <- 3
G <- length(item_membership_list)
fixed_initials <- list("c" = 5)
fixed_priors <- list("rate_sigma" = rep(3, G), "shape_c" = 2, "rate_c" = 2)
initials <- initialize(K, data = response_matrix, item_membership_list,
  c = 1, c_order = 1, fixed_initials = fixed_initials, fixed_priors = fixed_priors)

[Package ddtlcm version 0.2.1 Index]