cnvrg_HMC {CNVRG}

R Documentation

Perform Hamiltonian Monte Carlo sampling

Description

This function uses a compiled Dirichlet multinomial model and performs Hamiltonian Monte Carlo sampling of posteriors using 'Stan'. After sampling, it is important to check convergence; the summary function and shinystan can be used for this. If you use this function, please credit 'Stan' and 'RStan' along with this package.

Usage

cnvrg_HMC(
  countData,
  starts,
  ends,
  algorithm = "NUTS",
  chains = 2,
  burn = 500,
  samples = 1000,
  thinning_rate = 2,
  cores = 1,
  params_to_save = c("pi", "p")
)

Arguments

`countData`	A matrix or data frame of counts. The first field should be sample names and the subsequent fields should be integer data. Data should be arranged so that the first n rows correspond to one treatment group, the next n rows correspond to the next treatment group, and so on. The row indices for the first and last sample in each group are passed to this function via 'starts' and 'ends'.
`starts`	A vector defining the indices that correspond to the first sample in each treatment group. The indexer function can help with this.
`ends`	A vector defining the indices that correspond to the last sample in each treatment group. The indexer function can help with this.
`algorithm`	The algorithm to use when sampling. One of 'NUTS', 'HMC', or 'Fixed_param'. If unsure, use 'NUTS' (the default), which stands for "No U turn sampling". The abbreviation is from 'Stan'.
`chains`	The number of chains to run.
`burn`	The number of warm-up ('burn-in') iterations.
`samples`	How many samples from the posterior to save.
`thinning_rate`	Thinning rate to use during sampling.
`cores`	The number of cores to use.
`params_to_save`	The parameters from which to save samples. Can be any of 'p', 'pi', or 'theta'.

Details

It can be helpful to use the indexer function to automatically identify the indices needed for the 'starts' and 'ends' parameters. See the vignette for an example.
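
As a rough illustration of what 'starts' and 'ends' contain, the sketch below derives them in base R from a vector of treatment labels. It is not the package's indexer function; the `treatments` vector is a hypothetical example, and the approach assumes that samples from the same group sit in consecutive rows.

```r
# Hypothetical treatment labels, one per row of the count table,
# already sorted so that each group occupies consecutive rows.
treatments <- c(rep("control", 5), rep("treated", 5))

# Run-length encoding gives the size of each consecutive group.
runs <- rle(treatments)

# The last row of each group is the running total of group sizes;
# the first row follows from the group size.
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

starts
ends
```

For the labels above, `starts` holds the rows 1 and 6 and `ends` holds the rows 5 and 10, which matches the values used in the Examples section.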

Warning: data must be organized in the format described for 'countData', or this function will not provide accurate results. See the vignette if you are unsure how to organize data.

Warning: depending on the size of the data to be analyzed, this function can take a very long time to run.
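
Because a badly organized table is not necessarily caught by the sampler, a quick check before a long run may be worthwhile. The helper below is a minimal sketch and not part of CNVRG: `check_count_layout` is a hypothetical name, it assumes 'countData' is a data frame whose first column holds sample names, and it assumes the groups defined by 'starts' and 'ends' should cover every row without gaps or overlaps.

```r
# Hypothetical helper, not part of CNVRG: stops with an error if the
# layout looks wrong, otherwise returns TRUE invisibly.
check_count_layout <- function(countData, starts, ends) {
  counts <- as.matrix(countData[, -1, drop = FALSE])  # drop the sample-name column
  stopifnot(
    is.numeric(counts),                         # count fields must be numeric
    all(counts >= 0),                           # no negative counts
    all(counts == round(counts)),               # whole numbers only
    length(starts) == length(ends),             # one start and one end per group
    all(starts <= ends),
    starts[1] == 1,                             # groups begin at the first row
    ends[length(ends)] == nrow(countData),      # and finish at the last row
    all(starts[-1] == ends[-length(ends)] + 1)  # with no gaps or overlaps
  )
  invisible(TRUE)
}

# Toy table: two groups of two samples each.
toy <- data.frame(sample = paste0("sample_", 1:4),
                  otu_1 = c(3, 7, 3, 7),
                  otu_2 = c(7, 3, 7, 3))
check_count_layout(toy, starts = c(1, 3), ends = c(2, 4))
```

The check only looks at the shape of the input; it cannot tell whether rows were assigned to the right treatment group.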

Value

A fitted 'Stan' object that includes the samples from the parameters designated.
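
One way to start the convergence check is to look at the Rhat column of the fit summary. The sketch below assumes the returned object is an 'RStan' stanfit, whose summary method returns a list containing a `summary` matrix with an `Rhat` column. The helper name `flag_high_rhat` and the cutoff of 1.1 are illustrative assumptions, not part of this package, and a toy matrix stands in for a real summary so that the code runs without sampling.

```r
# Hypothetical helper, not part of CNVRG: returns the names of parameters
# whose Rhat exceeds a cutoff. The default cutoff is an assumed rule of thumb.
flag_high_rhat <- function(summary_matrix, cutoff = 1.1) {
  # which() ignores NA values, which can occur for constant parameters
  rownames(summary_matrix)[which(summary_matrix[, "Rhat"] > cutoff)]
}

# Toy stand-in for the summary matrix of a fitted model.
toy_summary <- matrix(c(0.2, 0.8, 1.00, 1.35), nrow = 2,
                      dimnames = list(c("pi[1,1]", "pi[1,2]"),
                                      c("mean", "Rhat")))
flag_high_rhat(toy_summary)

# With a real fit (assuming a stanfit object named fitstan_HMC):
# fit_summary <- rstan::summary(fitstan_HMC)$summary
# flag_high_rhat(fit_summary)
# shinystan::launch_shinystan(fitstan_HMC)  # interactive diagnostics
```

An empty result does not by itself establish convergence; trace plots and effective sample sizes are worth inspecting as well.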

Examples

#Simulate an OTU table
com_demo <- matrix(0, nrow = 10, ncol = 10)
com_demo[1:5,] <- c(rep(3,5), rep(7,5)) #Columns alternate 3 and 7
com_demo[6:10,] <- c(rep(7,5), rep(3,5)) #Reverses alternation
#Build OTU and sample names
fornames <- paste0("otu_", seq_len(ncol(com_demo)))
sample_vec <- paste0("sample_", seq_len(nrow(com_demo)))
#The first field holds sample names, the remaining fields hold counts
com_demo <- data.frame(sample_vec, com_demo)
names(com_demo) <- c("sample", fornames)

#These are toy data. Many more samples, multiple chains, and a longer burn
#are likely advisable for real data.
fitstan_HMC <- cnvrg_HMC(com_demo,
                         starts = c(1, 6),
                         ends = c(5, 10),
                         chains = 1,
                         burn = 100,
                         samples = 150,
                         thinning_rate = 2)

[Package CNVRG version 1.0.0 Index]