est_changepoints {LDATS}    R Documentation

Use ptMCMC to estimate the distribution of change point locations

Description

This function executes ptMCMC-based estimation of the change point location distributions for multinomial Time Series analyses.

Usage

est_changepoints(
  data,
  formula,
  nchangepoints,
  timename,
  weights,
  control = list()
)

Arguments

data

data.frame including [1] the time variable (indicated in timename), [2] the predictor variables (required by formula), and [3] the multinomial response variable (indicated in formula), as verified by check_timename and check_formula. Note that the response variables should be formatted as a data.frame object named as indicated by the response entry in the control list, such as gamma for a standard TS analysis on LDA output.

formula

formula defining the regression relationship between the change points. Any predictor variable included must also be a column in data, and any (multinomial) response variable must be a set of columns in data, as verified by check_formula.

nchangepoints

integer corresponding to the number of change points to include in the model. 0 is a valid input (corresponding to no change points, so a single time series model), and the current implementation can reasonably include up to 6 change points. The number of change points determines how the time series is segmented into chunks, each of which is fit with a separate model specified by formula.
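The segmentation described above can be sketched in base R. This is only an illustration, not the LDATS internals: the time vector and change point locations are made up, and assigning each change point to the earlier segment is an assumption of the sketch.

```r
# Hypothetical illustration in base R; not the LDATS implementation.
time <- 1:20                # stand-in for the time variable (e.g. newmoon)
changepoints <- c(5, 12)    # two assumed change point locations

# Assumption: an observation at a change point belongs to the earlier segment,
# so the intervals are (-Inf, 5], (5, 12], (12, Inf)
segment <- cut(time, breaks = c(-Inf, changepoints, Inf), labels = FALSE)

# Two change points give three chunks; each would be fit with its own model
chunks <- split(time, segment)
lengths(chunks)
```

With k change points the series is always split into k + 1 chunks, which is why nchangepoints = 0 reduces to a single model over the whole series.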

timename

character element indicating the time variable used in the time series.

weights

Optional class numeric vector of weights for each document. Defaults to NULL, translating to an equal weight for each document. When using multinom_TS in a standard LDATS analysis, it is advisable to weight the documents by their total size, as the result of LDA is a matrix of proportions, which does not account for size differences among documents. For most models, a scaling of the weights (so that the average is 1) is most appropriate, and this is accomplished using document_weights.
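As a rough sketch of the scaling described above, the following base R code weights each document by its total size and rescales so that the weights average 1. The count matrix is made up for illustration, and the calculation is an assumption about what such a scaling looks like rather than a copy of document_weights.

```r
# Hypothetical document-term counts: 3 documents (rows), 2 terms (columns)
counts <- matrix(c(10, 20, 30,
                   10, 20, 30), ncol = 2)

totals <- rowSums(counts)     # total size of each document
w <- totals / mean(totals)    # rescale so the weights average to 1

w
mean(w)
```

Larger documents receive weights above 1 and smaller documents weights below 1, while the overall scale of the weights stays comparable to the unweighted (all ones) case.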

control

A list of parameters to control the fitting of the Time Series model including the parallel tempering Markov Chain Monte Carlo (ptMCMC) controls. Values not input assume defaults set by TS_control.
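The idea that unspecified values fall back to defaults can be illustrated with a generic base R pattern. This sketch does not show how TS_control is implemented; the entry names and default values below are placeholders assumed for illustration only.

```r
# Hypothetical defaults; the names and values are placeholders, not taken
# from the TS_control documentation
defaults <- list(nit = 1e4, ntemps = 6, quiet = FALSE)

# A partial control list supplied by the user
user_control <- list(nit = 100)

# Entries given by the user override the defaults; all others are retained
control <- modifyList(defaults, user_control)
str(control)
```

In practice, passing a partial list (or list()) as control should be enough, since the missing entries are filled in by TS_control.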

Value

List of saved data objects from the ptMCMC estimation of change point locations, or NULL if nchangepoints is 0.

Examples


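  # Load the example data and extract its two tables
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  # Fit a 2-topic LDA and keep the first model from the returned set
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  # Attach the topic proportions (gamma) to the covariates as the response
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  # Weight each document by its relative size
  weights <- document_weights(document_term_table)
  # Intercept-only model with one change point and default controls
  formula <- gamma ~ 1
  nchangepoints <- 1
  control <- TS_control()
  # Order the rows by the time variable before estimation
  data <- data[order(data[,"newmoon"]), ]
  rho_dist <- est_changepoints(data, formula, nchangepoints, "newmoon", 
                               weights, control)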



[Package LDATS version 0.3.0 Index]