sim_seq {clickb} | R Documentation
Simulate data
Description
This function simulates a sequential dataset from a mixture of first-order Markov models that generate categorical sequences. The output is a data frame with three columns: "id" identifies the subject/sequence, "y" holds a categorical observation belonging to that sequence, and "clus" gives the cluster label.
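The generative process described above can be sketched in a few lines of base R. This is a minimal illustration, not the code used by sim_seq: the helper name sim_mixture_sketch is hypothetical, states are assumed to be coded as integers 1..K, and each sequence length is assumed to be drawn uniformly between the two bounds.

```r
# Hypothetical sketch of a mixture of first-order Markov chains.
# NOT the clickb implementation; names and sampling details are assumptions.
sim_mixture_sketch <- function(ini.prob, trans.prob, clust.size, T.range) {
  rows <- list()
  id <- 0
  for (m in seq_along(clust.size)) {
    K <- length(ini.prob[[m]])
    for (i in seq_len(clust.size[[m]])) {
      id <- id + 1
      # assumption: sequence length is uniform on T.range[1]..T.range[2]
      len <- T.range[1] + sample.int(T.range[2] - T.range[1] + 1, 1) - 1
      y <- integer(len)
      # the first state is drawn from the component's initial distribution
      y[1] <- sample.int(K, 1, prob = ini.prob[[m]])
      # each later state depends only on the previous state
      for (t in seq_len(len - 1)) {
        y[t + 1] <- sample.int(K, 1, prob = trans.prob[[m]][y[t], ])
      }
      rows[[id]] <- data.frame(id = id, y = y, clus = m)
    }
  }
  do.call(rbind, rows)
}

# Toy usage: two components, two states, deterministic transitions
set.seed(123)
toy <- sim_mixture_sketch(
  ini.prob   = list(c(1, 0), c(0, 1)),
  trans.prob = list(matrix(c(0, 1,
                             1, 0), byrow = TRUE, nrow = 2),
                    diag(2)),
  clust.size = list(2, 3),
  T.range    = c(3, 4)
)
head(toy)
```

In this toy setting the first component alternates between the two states and the second never leaves state 2, which makes the role of the transition matrices easy to see in the output.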
Usage
sim_seq(M, K, ini.prob, trans.prob, clust.size, T.range)
Arguments
M            the number of mixture components
K            the number of Markov model states
ini.prob     a list of initial probability vectors, one per component
trans.prob   a list of transition matrices, one per component
clust.size   a list of component sizes, i.e. the number of sequences in each component
T.range      a vector of two elements: the minimum and maximum sequence length
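Because the arguments must agree with each other (M components, K states, probabilities that sum to one), it can help to check them before simulating. The sketch below is only an illustration: check_sim_inputs is a hypothetical helper, not part of clickb, and the requirement that the minimum length be at least 1 is an assumption.

```r
# Hypothetical input check; NOT provided by clickb.
check_sim_inputs <- function(M, K, ini.prob, trans.prob, clust.size, T.range) {
  stopifnot(
    length(ini.prob) == M,
    length(trans.prob) == M,
    length(clust.size) == M,
    length(T.range) == 2,
    T.range[1] >= 1,            # assumption: sequences are non-empty
    T.range[1] <= T.range[2]
  )
  for (m in seq_len(M)) {
    ini <- ini.prob[[m]]
    A <- trans.prob[[m]]
    stopifnot(
      length(ini) == K,
      all(ini >= 0),
      isTRUE(all.equal(sum(ini), 1)),                    # initial probs sum to 1
      is.matrix(A),
      all(dim(A) == c(K, K)),
      all(A >= 0),
      isTRUE(all.equal(unname(rowSums(A)), rep(1, K)))   # each row sums to 1
    )
  }
  invisible(TRUE)
}

# Toy usage with one component and two states
A_ok  <- matrix(c(0.9, 0.1,
                  0.4, 0.6), byrow = TRUE, nrow = 2)
A_bad <- matrix(c(0.9, 0.2,
                  0.4, 0.6), byrow = TRUE, nrow = 2)  # first row sums to 1.1
ok  <- check_sim_inputs(1, 2, list(c(0.5, 0.5)), list(A_ok), list(10), c(5, 30))
bad <- tryCatch(
  check_sim_inputs(1, 2, list(c(0.5, 0.5)), list(A_bad), list(10), c(5, 30)),
  error = function(e) FALSE
)
```

Using all.equal rather than == avoids spurious failures from floating-point rounding when the probabilities are written as decimals.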
Value
An object of class data.frame with the columns "id", "y" and "clus" described above.
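A data frame with these three columns can be summarised with base R alone. The example below uses a small hand-written data frame with the same column names, so it does not depend on sim_seq; the values are made up purely for illustration, and it assumes one row per observation.

```r
# Hand-written toy data with the same column names as the returned object
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  y    = c(2, 3, 1, 1, 1, 3, 2, 2, 1),
  clus = c(1, 1, 1, 1, 1, 2, 2, 2, 2)
)

# length of each sequence
seq_len_by_id <- tapply(toy$y, toy$id, length)

# cluster label of each sequence, then number of sequences per cluster
clus_by_id <- tapply(toy$clus, toy$id, function(x) x[1])
clus_sizes <- table(clus_by_id)

# one vector of observations per sequence
seqs <- split(toy$y, toy$id)
```

The same calls should apply to the simulated data as long as it has the three columns named in the Description.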
Author(s)
Furio Urso <furio.urso@unipa.it>
Examples
# Simulate dataset from a mixture of Markov models
M <- 3 # number of components
K <- 5 # number of states
# define initial and transition probabilities for each component
# component 1
ini1 <- c(0.35, 0, 0.3, 0.2, 0.15)
A1 <- matrix(c(0.15, 0.1,  0.5,  0,   0.25,
               0.2,  0,    0.1,  0.2, 0.5,
               0.6,  0.1,  0.1,  0.2, 0,
               0,    0.45, 0.35, 0.1, 0.1,
               0.15, 0.25, 0,    0.1, 0.5), byrow = TRUE, nrow = 5)
# component 2
ini2 <- c(0.25, 0, 0.2, 0.25, 0.3)
A2 <- matrix(c(0,   0.8, 0,   0,   0.2,
               0.2, 0,   0.8, 0,   0,
               0,   0.2, 0,   0.8, 0,
               0,   0,   0.2, 0,   0.8,
               0.8, 0,   0,   0.2, 0), byrow = TRUE, nrow = 5)
# component 3
ini3 <- c(0.3, 0, 0.25, 0.3, 0.15)
A3 <- matrix(c(0,   0.1, 0.2, 0,   0.7,
               0.7, 0,   0.2, 0.1, 0,
               0.1, 0.8, 0,   0.1, 0,
               0,   0.1, 0.7, 0,   0.2,
               0.2, 0,   0,   0.8, 0), byrow = TRUE, nrow = 5)
trans.prob <- list(A1, A2, A3)
ini.prob <- list(ini1, ini2, ini3)
# sizes i.e. number of sequences in each component
N.sim1 <- 20
N.sim2 <- 30
N.sim3 <- 50
clust.size <- list(N.sim1, N.sim2, N.sim3)
T.range <- c(5, 30) # minimum and maximum sequence length
data <- sim_seq(M, K, ini.prob, trans.prob, clust.size, T.range)
[Package clickb version 0.1 Index]