est_regressors {LDATS}    R Documentation

Estimate the distribution of regressors, unconditional on the change point locations

Description

This function uses the marginal posterior distributions of the change point locations (estimated by est_changepoints) in combination with the conditional (on the change point locations) posterior distributions of the regressors (estimated by multinom_TS) to estimate the marginal posterior distribution of the regressors, unconditional on the change point locations.

Usage

est_regressors(rho_dist, data, formula, timename, weights, control = list())

Arguments

rho_dist

List of saved data objects from the ptMCMC estimation of change point locations, as returned by est_changepoints (NULL if nchangepoints is 0).

data

data.frame including [1] the time variable (indicated in timename), [2] the predictor variables (required by formula), and [3] the multinomial response variable (indicated in formula), as verified by check_timename and check_formula. Note that the response variables should be formatted as a data.frame object named as indicated by the response entry in the control list, such as gamma for a standard TS analysis on LDA output.

formula

formula defining the regression relationship between the change points. Any predictor variable included must also be a column in data, and any (multinomial) response variable must be a set of columns in data, as verified by check_formula.

timename

character element indicating the time variable used in the time series.

weights

Optional numeric vector of weights for each document. Defaults to NULL, which gives each document an equal weight. When using multinom_TS in a standard LDATS analysis, it is advisable to weight the documents by their total size, because the result of LDA is a matrix of proportions, which does not account for size differences among documents. For most models, scaling the weights so that their average is 1 is most appropriate, and this is accomplished using document_weights.
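The scaling described above can be sketched in base R. This is a minimal illustration on a made-up document-term table (the object toy_dtt and its counts are hypothetical), and it is not necessarily how document_weights is implemented internally.

```r
# Hypothetical document-term table: 3 documents (rows), 2 terms (columns).
toy_dtt <- data.frame(term_a = c(10, 5, 0), term_b = c(10, 5, 30))

# Total size of each document (row sums of the counts).
sizes <- rowSums(toy_dtt)

# Scale the sizes so that the average weight is 1.
toy_weights <- sizes / mean(sizes)
toy_weights
```

With weights scaled this way, a document twice the average size counts twice as much in the fit, while the total weight still equals the number of documents.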

control

A list of parameters to control the fitting of the Time Series model including the parallel tempering Markov Chain Monte Carlo (ptMCMC) controls. Values not input assume defaults set by TS_control.

Details

The general approach follows that of Western and Kleykamp (2004), although we note some important differences. Our regression models are fit independently for each chunk (segment of time), and therefore the variance-covariance matrix for the full model has 0 entries for covariances between regressors in different chunks of the time series. Further, because the regression model here is a standard (non-hierarchical) softmax (Ripley 1996, Venables and Ripley 2002, Bishop 2006), there is no error term in the regression (as there is in the normal model used by Western and Kleykamp 2004), and so the posterior distribution used here is a multivariate normal, as opposed to a multivariate t, as used by Western and Kleykamp (2004).
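The idea of combining the two sets of posteriors can be illustrated with a small base R sketch. Everything in it is assumed for illustration: rho_draws stands in for posterior draws of one change point location, cond_fits stands in for conditional model fits (a coefficient mean vector and a variance-covariance matrix per change point configuration), and draw_mvn is a hypothetical helper. It shows the general technique of drawing from each conditional multivariate normal in proportion to the posterior frequency of its configuration, and should not be read as the package's actual implementation.

```r
set.seed(1)

# Hypothetical posterior draws of a single change point location.
rho_draws <- c(10, 10, 12, 10, 12, 10, 10, 12)

# Hypothetical conditional fits, one per distinct change point location.
# Off-diagonal zeros reflect independent fits for the two chunks.
cond_fits <- list(
  "10" = list(mean = c(-0.5, 0.8), vcov = diag(c(0.04, 0.09))),
  "12" = list(mean = c(-0.3, 0.6), vcov = diag(c(0.05, 0.08)))
)

# Hypothetical helper: n multivariate normal draws via the Cholesky factor.
draw_mvn <- function(n, mean, vcov) {
  z <- matrix(rnorm(n * length(mean)), nrow = n)
  sweep(z %*% chol(vcov), 2, mean, "+")
}

# Draw from each conditional posterior as many times as its change point
# configuration appears among the draws of rho, then stack the draws.
counts <- table(rho_draws)
eta_draws <- do.call(rbind, lapply(names(counts), function(k) {
  fit <- cond_fits[[k]]
  draw_mvn(as.integer(counts[k]), fit$mean, fit$vcov)
}))
colnames(eta_draws) <- c("chunk_1_coef", "chunk_2_coef")
dim(eta_draws)
```

Stacking the draws this way yields a sample from the mixture of conditional posteriors, which no longer depends on any single change point location.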

Value

matrix of draws (rows) from the marginal posteriors of the coefficients across the segments (columns).
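A matrix of this shape can be summarized column by column. The sketch below uses a simulated stand-in (toy_eta, with made-up column names and values) rather than real est_regressors output, and reports a posterior mean and a 95% equal-tailed interval for each coefficient.

```r
set.seed(1)

# Hypothetical stand-in for the returned matrix: 1000 draws (rows) of
# two coefficients (columns).
toy_eta <- cbind(seg1_coef = rnorm(1000, -0.5, 0.2),
                 seg2_coef = rnorm(1000, 0.8, 0.3))

# Posterior mean and 2.5% / 97.5% quantiles for each coefficient.
eta_summary <- t(apply(toy_eta, 2, function(x) {
  c(mean = mean(x), quantile(x, probs = c(0.025, 0.975)))
}))
eta_summary
```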

References

Bishop, C. M. 2006. Pattern Recognition and Machine Learning. Springer, New York, NY, USA.

Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge, UK.

Venables, W. N. and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth Edition. Springer, New York, NY, USA.

Western, B. and M. Kleykamp. 2004. A Bayesian change point model for historical time series analysis. Political Analysis 12:354-374.

Examples


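  # Load the example data and fit a two-topic LDA model
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  # Attach the topic proportions (gamma) as the multinomial response
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  # Weight documents by size and set up an intercept-only model
  weights <- document_weights(document_term_table)
  formula <- gamma ~ 1
  nchangepoints <- 1
  control <- TS_control()
  # Order the observations by the time variable
  data <- data[order(data[,"newmoon"]), ]
  # Estimate the change point locations, then the marginal regressors
  rho_dist <- est_changepoints(data, formula, nchangepoints, "newmoon", 
                               weights, control)
  eta_dist <- est_regressors(rho_dist, data, formula, "newmoon", weights, 
                             control)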



[Package LDATS version 0.3.0 Index]