prep_cpts {LDATS}

R Documentation

Initialize and update the change point matrix used in the ptMCMC algorithm

Description

prep_cpts initializes each chain with a draw of change points from the available times (i.e. assuming a uniform prior). The draw with the best fit (by likelihood) is placed in the focal chain, and each successively worse fit is placed in the next hotter chain. update_cpts updates the change points after every iteration of the ptMCMC algorithm.
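The initialization described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the LDATS implementation: the toy response `y`, the helper `toy_loglik`, and the list element names `changepts` and `lls` are all hypothetical stand-ins chosen for the example.

```r
# Toy illustration only: none of these helpers are LDATS functions.
set.seed(1)

times <- 1:50          # available time steps
nchangepoints <- 2     # change points per chain
ntemps <- 4            # number of chains (temperatures)

# Toy response whose mean shifts after times 15 and 35
y <- c(rnorm(15, 0), rnorm(20, 3), rnorm(15, -1))

# Hypothetical log-likelihood: one Gaussian mean per segment, unit variance
toy_loglik <- function(cpts) {
  segment <- findInterval(times, sort(cpts) + 1)
  sum(dnorm(y, mean = ave(y, segment), sd = 1, log = TRUE))
}

# One uniform draw of change points per chain (columns are chains)
draws <- replicate(ntemps, sort(sample(times, nchangepoints)))
draws <- matrix(draws, nrow = nchangepoints)
lls <- apply(draws, 2, toy_loglik)

# Best fit goes to the focal chain (column 1), worse fits to hotter chains
ord <- order(lls, decreasing = TRUE)
init <- list(changepts = draws[, ord, drop = FALSE], lls = lls[ord])
init
```

Sorting the columns by decreasing log-likelihood means column 1 plays the role of the focal chain and the last column the hottest chain.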

Usage

prep_cpts(data, formula, nchangepoints, timename, weights, control = list())

update_cpts(cpts, swaps)

Arguments

`data`	`data.frame` including [1] the time variable (indicated in `timename`), [2] the predictor variables (required by `formula`), and [3] the multinomial response variable (indicated in `formula`), as verified by `check_timename` and `check_formula`. Note that the response variables should be formatted as a `data.frame` object named as indicated by the `response` entry in the `control` list, such as `gamma` for a standard TS analysis on LDA output.
`formula`	`formula` defining the regression relationship fit within the segments between change points, see `formula`. Any predictor variable included must also be a column in `data`, and any (multinomial) response variable must be a set of columns in `data`, as verified by `check_formula`.
`nchangepoints`	`integer` corresponding to the number of change points to include in the model. 0 is a valid input (corresponding to no change points, i.e. a single time series model), and the current implementation can reasonably include up to 6 change points. The number of change points dictates the segmentation of the data for each continuous model and each LDA model.
`timename`	`character` element indicating the time variable used in the time series. Defaults to `"time"`. The variable must be integer-conformable or a `Date`. If the variable named is a `Date`, the input is converted to an integer, resulting in the timestep being 1 day, which is often not desired behavior.
`weights`	Optional class `numeric` vector of weights for each document. Defaults to `NULL`, translating to an equal weight for each document. When using `multinom_TS` in a standard LDATS analysis, it is advisable to weight the documents by their total size, as the result of `LDA` is a matrix of proportions, which does not account for size differences among documents. For most models, a scaling of the weights (so that the average is 1) is most appropriate, and this is accomplished using `document_weights`.
`control`	A `list` of parameters to control the fitting of the Time Series model including the parallel tempering Markov Chain Monte Carlo (ptMCMC) controls. Values not input assume defaults set by `TS_control`.
`cpts`	The existing matrix of change points.
`swaps`	Chain configuration after among-temperature swaps.
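Two points from the argument descriptions can be checked with base R alone. The sketch below shows what converting a `Date` time variable to an integer does to the timestep, and what scaling weights to an average of 1 looks like. The small `dtt` matrix and the `month_index` alternative are hypothetical, and the weight calculation mirrors the description above rather than the internals of `document_weights`.

```r
# Monthly sampling dates stored as Date
dates <- as.Date(c("2020-01-15", "2020-02-15", "2020-03-15"))

# Integer conversion counts days, so one "month" spans many timesteps
day_steps <- diff(as.integer(dates))
day_steps                    # 31 29

# One possible alternative (an assumption, suitable only for regular
# sampling): index the sampling events directly
month_index <- seq_along(dates)
diff(month_index)            # 1 1

# Hypothetical document-term counts: 3 documents (rows) x 2 terms (columns)
dtt <- matrix(c(10,  5,
                20, 15,
                40, 30), nrow = 3, byrow = TRUE)

# Weight each document by its total size, scaled so the average weight is 1
totals <- rowSums(dtt)       # 15 35 70
w <- totals / mean(totals)   # 0.375 0.875 1.750
mean(w)                      # 1
```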

Value

`list` of [1] a matrix of change points (rows) for each temperature (columns) and [2] a vector of log-likelihood values for each of the chains.
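As a rough picture of this return value, the following sketch builds a hand-made list with the same shape and shows how the focal chain could be read from it. The element names `changepts` and `lls`, the numbers, and the `toy_update` helper are assumptions made for illustration; they are not taken from the package source.

```r
# Hypothetical stand-in: 2 change points (rows) x 3 temperatures (columns)
cpts <- list(
  changepts = matrix(c(10, 30,
                       12, 28,
                       20, 40), nrow = 2),
  lls = c(-100, -105, -120)
)

# Column 1 holds the focal chain
focal_cpts <- cpts$changepts[, 1]   # 10 30
focal_ll <- cpts$lls[1]             # -100

# Toy "swap" result in which chains 2 and 3 exchanged configurations
swaps <- list(
  changepts = cpts$changepts[, c(1, 3, 2)],
  lls = cpts$lls[c(1, 3, 2)]
)

# Assumed update rule: carry the post-swap configuration forward
toy_update <- function(cpts, swaps) {
  list(changepts = swaps$changepts, lls = swaps$lls)
}

updated <- toy_update(cpts, swaps)
updated$lls                         # -100 -120 -105
```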

Examples


  library(LDATS)

  # Fit a 2-topic LDA to the rodents data and attach the topic proportions
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]

  # Prepare the ptMCMC objects for an intercept-only model with 1 change point
  control <- TS_control()
  saves <- prep_saves(1, control)
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights,
                               control)
  cpts <- prep_cpts(data, gamma ~ 1, 1, "newmoon", weights, control)
  ids <- prep_ids(control)

  # Each iteration: step within chains, swap among temperatures, then update
  for(i in seq_len(control$nit)){
    steps <- step_chains(i, cpts, inputs)
    swaps <- swap_chains(steps, inputs, ids)
    saves <- update_saves(i, saves, steps, swaps)
    cpts <- update_cpts(cpts, swaps)
    ids <- update_ids(ids, swaps)
  }

[Package LDATS version 0.3.0 Index]