gen_traj_data {clustra}    R Documentation

Data Generators

Description

Generates a collection of longitudinal responses with possibly varying lengths and varying numbers of observations. The support of each trajectory is start ... 0 ... end, where start ~ uniform(s_range) and end ~ uniform(e_range), so that all trajectories are aligned at 0 but can start and end at different times. Time zero is the intervention time.

Usage

gen_traj_data(
  n_id,
  types,
  intercepts,
  m_obs,
  s_range,
  e_range,
  noise = c(0, abs(mean(intercepts)/20)),
  min_obs = 3
)

Arguments

n_id

Vector whose length is the number of clusters, giving the number of ids to generate in each cluster.

types

A vector of integers from c(1, 2, 3) of the same length as n_id, indicating curve type: constant, sine portion, sigmoid portion, respectively.

intercepts

A vector of the same length as n_id, giving the first response (at the minimum time) of each cluster's base curve. Each type-intercept combination should be unique so that clusters are distinct.

m_obs

Mean number of observations per id. Supplies the lambda parameter to rpois.

s_range

A vector of length 2, giving the min and max limits of uniformly generated start observation time.

e_range

A vector of length 2, giving the min and max limits of uniformly generated end observation time.

noise

Vector of length 2 giving the mean and sd of added N(mean, sd) noise.

min_obs

Minimum number of observations per id, in addition to the observation at time zero.
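
As a worked illustration of the default for noise shown in Usage, the expression c(0, abs(mean(intercepts)/20)) can be evaluated by hand in base R. The intercepts below are the ones from the Examples section; nothing here calls clustra itself.

```r
# Intercepts taken from the Examples section of this page.
intercepts <- c(100, 80)

# Default noise: mean 0, sd equal to 1/20 of the absolute mean intercept.
noise <- c(0, abs(mean(intercepts) / 20))
noise
# mean(intercepts) is 90, so the default sd is 90 / 20 = 4.5

# abs() keeps the sd non-negative even when the intercepts are negative.
noise_neg <- c(0, abs(mean(c(-100, -80)) / 20))
noise_neg
```

Note that the default sd scales with the average intercept, so it is zero when the intercepts average to zero; in that case a noise vector would presumably need to be supplied explicitly to get noisy responses.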

Value

A data table with one response per row and four columns: id, time, response, and true_group.

Details

Generates longitudinal data for a response variable. Each trajectory starts at a time uniformly distributed in s_range and ends at a time uniformly distributed in e_range. The number of observations in a trajectory is Poisson(m_obs). The result is a set of trajectories, all aligned at time 0, with different time spans and with independently varying numbers of observations within those spans. Each trajectory follows one of three possible response functions (constant, sine portion, or sigmoid portion), possibly at a different level (see intercepts), with added N(mean, sd) noise.
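
The generating mechanism described above can be sketched in base R for a single trajectory of the constant type. This is only an illustration under stated assumptions, not the package's implementation: sim_one_traj is a hypothetical helper, and both the way min_obs is enforced and the uniform placement of observation times within the span are guesses made for the sketch.

```r
# Hypothetical helper (not part of clustra): simulate one constant-type trajectory.
sim_one_traj <- function(intercept, m_obs, s_range, e_range,
                         noise = c(0, 1), min_obs = 3) {
  start <- runif(1, min = s_range[1], max = s_range[2])  # start ~ uniform(s_range)
  end   <- runif(1, min = e_range[1], max = e_range[2])  # end ~ uniform(e_range)
  # Assumption: min_obs is enforced by flooring the Poisson(m_obs) draw.
  n_obs <- max(rpois(1, lambda = m_obs), min_obs)
  # Assumption: times are uniform within [start, end]; time 0 is always observed.
  time <- sort(c(0, runif(n_obs, min = start, max = end)))
  # Constant curve at the intercept plus N(mean, sd) noise.
  response <- intercept + rnorm(length(time), mean = noise[1], sd = noise[2])
  data.frame(time = time, response = response)
}

set.seed(42)
traj <- sim_one_traj(intercept = 100, m_obs = 20,
                     s_range = c(-365, -14), e_range = c(0.5 * 365, 2 * 365))
head(traj)
```

Because s_range is entirely negative and e_range entirely positive in this call, time 0 always falls inside the simulated span, which matches the alignment at the intervention time described above.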

Examples

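library(clustra)

# Two clusters: 50 ids of type 1 (constant) and 100 ids of type 2 (sine portion),
# observed from up to a year before time 0 to between half a year and two years after.
data = gen_traj_data(n_id = c(50, 100), types = c(1, 2),
  intercepts = c(100, 80), m_obs = 20, s_range = c(-365, -14),
  e_range = c(0.5*365, 2*365))
head(data)  # columns: id, time, response, true_group
tail(data)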


[Package clustra version 0.2.1 Index]