gjamSimData {gjam}    R Documentation

Simulated data for gjam analysis

Description

Simulates data for analysis by gjam.

Usage

  gjamSimData(n = 1000, S = 10, Q = 5, x = NULL, nmiss = 0, typeNames, effort = NULL)

Arguments

n

Sample size

S

Number of response variables (columns) in y, typically less than n

Q

Number of predictors (columns) in design matrix x, much smaller than n

x

Design matrix; if supplied, n and Q are set to nrow(x) and ncol(x), respectively

nmiss

Number of missing values in x, much smaller than n

typeNames

Character vector of data types, see Details

effort

List containing 'columns' specifying columns to which effort applies, and 'values', a length-n vector of effort per observation.
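
As a minimal sketch of the effort structure described above, the following base R code builds such a list and checks it against n and S. The helper checkEffort is hypothetical and written only for illustration; it is not part of gjam, and gjam's own checks may differ. The requirement that effort be positive is an assumption.

```r
# Hypothetical helper (not part of gjam): check an effort list against n and S
checkEffort <- function(effort, n, S) {
  stopifnot(is.list(effort),
            all(c('columns', 'values') %in% names(effort)),
            all(effort$columns %in% seq_len(S)),   # columns must index columns of y
            length(effort$values) == n,            # one effort value per observation
            all(effort$values > 0))                # assumption: effort is positive
  invisible(TRUE)
}

n  <- 20
S  <- 4
ef <- list(columns = 1:S, values = round(runif(n, 0.5, 5), 1))
checkEffort(ef, n, S)   # stops with an error if the list is malformed
```

A list that passes this check has the shape expected by the effort argument, e.g. gjamSimData(n = n, S = S, typeNames = 'DA', effort = ef).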

Details

Generates simulated data and parameters for analysis by gjam. Because both parameters and data are stochastic, not all simulations will give good results.

typeNames can be 'PA' (presence-absence), 'CA' (continuous, censored at zero), 'DA' (discrete abundance), 'FC' (fractional composition), 'CC' (count composition), 'OC' (ordinal counts), and 'CAT' (categorical levels). If more than one 'CAT' is included, each defines a separate multilevel categorical response. One additional type, 'CON' (continuous), is not censored at zero by default.

If defined as a single character value, typeNames applies to all columns in y. Otherwise, typeNames is a length-S character vector identifying the type of each response column in y. If a 'CAT' column is included, a random number of levels will be generated, labeled a, b, c, ....
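
The rule for single versus length-S typeNames can be sketched in base R as follows. The helper expandTypeNames is hypothetical and only illustrates the rule stated above; it is not a gjam function, and the package's internal handling may differ.

```r
# Hypothetical helper (not part of gjam): one type per response column
expandTypeNames <- function(typeNames, S) {
  if (length(typeNames) == 1) {
    return(rep(typeNames, S))            # a single value applies to all S columns
  }
  stopifnot(length(typeNames) == S)      # otherwise one entry per column of y
  typeNames
}

expandTypeNames('CA', 4)                 # same type for every column

types <- c('DA', 'DA', 'CA', 'PA')       # mixed types, one per column
expandTypeNames(types, length(types))
```

In a call to gjamSimData, the second form corresponds to passing S = length(types) together with typeNames = types.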

A more detailed vignette can be obtained with:

browseVignettes('gjam')

See also the website 'http://sites.nicholas.duke.edu/clarklab/code/'.

Value

formula

R formula for model, e.g., ~ x1 + x2

xdata

data.frame that includes columns for predictors in the design matrix

ydata

data.frame for the simulated response

y

response as an n by S matrix, as assembled in gjam.

w

n by S latent states

typeY

vector of data types corresponding to columns in y, see Details

typeNames

vector of data types corresponding to columns in ydata

trueValues

list containing true parameter values beta (regression coefficients), sigma (covariance matrix), corSpec (correlation matrix corresponding to sigma), and cuts (partition matrix for ordinal data).

effort

see Arguments.
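
A minimal sketch for inspecting the returned list, assuming the gjam package is installed (the code is skipped otherwise). The vector documented simply restates the component names listed above; the argument values are arbitrary choices for illustration.

```r
# Component names as listed in this Value section
documented <- c('formula', 'xdata', 'ydata', 'y', 'w',
                'typeY', 'typeNames', 'trueValues', 'effort')

if (requireNamespace('gjam', quietly = TRUE)) {
  sim <- gjam::gjamSimData(n = 100, S = 5, Q = 3, typeNames = 'CA')

  print(setdiff(documented, names(sim)))   # documented components not returned, if any
  print(dim(sim$y))                        # response matrix, documented as n by S
  print(dim(sim$w))                        # latent states, documented as n by S
  print(names(sim$trueValues))             # true parameter values used to simulate
}
```

Comparing sim$w with sim$y for a censored type such as 'CA' shows where latent values differ from observed zeros.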

Author(s)

James S Clark, jimclark@duke.edu

References

Clark, J.S., D. Nemergut, B. Seyednasrollah, P. Turner, and S. Zhang. 2016. Generalized joint attribute modeling for biodiversity analysis: Median-zero, multivariate, multifarious data. Ecological Monographs 87, 34-56.

See Also

gjam

Examples

## Not run: 
library(gjam)

## ordinal data, show true parameter values
sim <- gjamSimData(S = 5, typeNames = 'OC')  
sim$ydata[1:5,]                              # example data
sim$trueValues$cuts                          # simulated partition
sim$trueValues$beta                          # coefficient matrix

## continuous data censored at zero, note latent w for obs y = 0
sim <- gjamSimData(n = 5, S = 5, typeNames = 'CA')  
sim$w
sim$y

## continuous and discrete data
types <- c(rep('DA',5), rep('CA',4))
sim   <- gjamSimData(n = 10, S = length(types), Q = 4, typeNames = types)
sim$typeNames
sim$ydata
                             
## composition count data  
sim <- gjamSimData(n = 10, S = 8, typeNames = 'CC')
totalCount <- rowSums(sim$ydata)
cbind(sim$ydata, totalCount)  # data with sample effort

## multiple categorical responses - compare matrix y and data.frame ydata
types <- rep('CAT',2)
sim   <- gjamSimData(S = length(types), typeNames = types)
head(sim$ydata)
head(sim$y)

## discrete abundance, heterogeneous effort 
S   <- 5
n   <- 1000
ef  <- list( columns = 1:S, values = round(runif(n,.5,5),1) )
sim <- gjamSimData(n = n, S = S, typeNames = 'DA', effort = ef)
sim$effort$values[1:20]

## combinations of scales, partition only for 'OC' columns
types <- c('OC','OC','OC','CC','CC','CC','CC','CC','CA','CA','PA','PA')
sim   <- gjamSimData(S = length(types), typeNames = types)
sim$typeNames                           
head(sim$ydata)
sim$trueValues$cuts

## End(Not run)

[Package gjam version 2.6.2 Index]