R: Compute a Bayesian Consensus Clustering model for mixed-type...

BCC.multi {BCClong}

R Documentation

Compute a Bayesian Consensus Clustering model for mixed-type longitudinal data

Description

This function performs clustering on mixed-type (continuous, discrete and categorical) longitudinal markers using Bayesian consensus clustering method with MCMC sampling

Usage

BCC.multi(
  mydat,
  id,
  time,
  center = 1,
  num.cluster,
  formula,
  dist,
  alpha.common = 0,
  initials = NULL,
  sigma.sq.e.common = 1,
  hyper.par = list(delta = 1, a.star = 1, b.star = 1, aa0 = 0.001, bb0 = 0.001, cc0 =
    0.001, ww0 = 0, vv0 = 1000, dd0 = 0.001, rr0 = 4, RR0 = 3),
  c.ga.tunning = NULL,
  c.theta.tunning = NULL,
  adaptive.tunning = 0,
  tunning.freq = 20,
  initial.cluster.membership = "random",
  input.initial.local.cluster.membership = NULL,
  input.initial.global.cluster.membership = NULL,
  seed.initial = 2080,
  burn.in,
  thin,
  per,
  max.iter
)

Arguments

`mydat`	list of R longitudinal features (i.e., with a length of R), where R is the number of features. The data should be prepared in a long-format (each row is one time point per individual).
`id`	a list (with a length of R) of vectors of the study id of individuals for each feature. Single value (i.e., a length of 1) is recycled if necessary
`time`	a list (with a length of R) of vectors of time (or age) at which the feature measurements are recorded
`center`	1: center the time variable before clustering, 0: no centering
`num.cluster`	number of clusters K
`formula`	a list (with a length of R) of formula for each feature. Each formula is a twosided linear formula object describing both the fixed-effects and random effects part of the model, with the response (i.e., longitudinal feature) on the left of a ~ operator and the terms, separated by + operations, or the right. Random-effects terms are distinguished by vertical bars (\|) separating expressions for design matrices from grouping factors. See formula argument from the lme4 package
`dist`	a character vector (with a length of R) that determines the distribution for each feature. Possible values are "gaussian" for a continuous feature, "poisson" for a discrete feature (e.g., count data) using a log link and "binomial" for a dichotomous feature (0/1) using a logit link. Single value (i.e., a length of 1) is recycled if necessary
`alpha.common`	1 - common alpha, 0 - separate alphas for each outcome
`initials`	List of initials for: zz, zz.local ga, sigma.sq.u, sigma.sq.e, Default is NULL
`sigma.sq.e.common`	1 - estimate common residual variance across all groups, 0 - estimate distinct residual variance, default is 1
`hyper.par`	hyper-parameters of the prior distributions for the model parameters. The default hyper-parameters values will result in weakly informative prior distributions.
`c.ga.tunning`	tuning parameter for MH algorithm (fixed effect parameters), each parameter corresponds to an outcome/marker, default value equals NULL
`c.theta.tunning`	tuning parameter for MH algorithm (random effect), each parameter corresponds to an outcome/marker, default value equals NULL
`adaptive.tunning`	adaptive tuning parameters, 1 - yes, 0 - no, default is 1
`tunning.freq`	tuning frequency, default is 20
`initial.cluster.membership`	"mixAK" or "random" or "PAM" or "input" - input initial cluster membership for local clustering, default is "random"
`input.initial.local.cluster.membership`	if use "input", option input.initial.cluster.membership must not be empty, default is NULL
`input.initial.global.cluster.membership`	input initial cluster membership for global clustering default is NULL
`seed.initial`	seed for initial clustering (for initial.cluster.membership = "mixAK") default is 2080
`burn.in`	the number of samples disgarded. This value must be smaller than max.iter.
`thin`	the number of thinning. For example, if thin = 10, then the MCMC chain will keep one sample every 10 iterations
`per`	specify how often the MCMC chain will print the iteration number
`max.iter`	the number of MCMC iterations.

Value

Returns a BCC class model contains clustering information

Examples

# import dataframe
data(epil)
# example only, larger number of iteration required for accurate result
fit.BCC <-  BCC.multi (
       mydat = list(epil$anxiety_scale,epil$depress_scale),
       dist = c("gaussian"),
       id = list(epil$id),
       time = list(epil$time),
       formula =list(y ~ time + (1|id)),
       num.cluster = 2,
       burn.in = 3,
       thin = 1,
       per =1,
       max.iter = 8)

[Package BCClong version 1.0.3 Index]