| adjlg {spaMM} | R Documentation |
Simulated data set for testing sparse-precision code
Description
This is used in tests/test-adjacency-long.R
Usage
data("adjlg")
Format
Includes an adjacency matrix adjlgMat. and a data frame adjlg with 5474 observations on the following 8 variables.
IDa factor with levels
1to1000monthsa numeric vector
GENDERa character vector
AGEa numeric vector
X1a numeric vector
X2a numeric vector
montha numeric vector
BUYa numeric vector
Source
The simulation code shown below is derived from an example produced by Jeroen van den Ochtend. Following a change incorporated in spaMM version 3.8.0, that implied stricter checks of the input matrix, it appeared that the precision matrix generated by this example had inappropriate (repeated) dimnames. This example was then updated to reproduce past fitting results with a correctly formatted matrix. Note that changing the names of an adjacency matrix (as below) is generally unwise as it generally changes the statistical model because these names are matched whenever possible to levels of the grouping factor in the data.
The code was also modified to compensate for changes in R's default random number generator.
Examples
data(adjlg)
## See further usage in tests/test-adjacency-long.R
## Not run:
# as produced by:
library(data.table) ## Included data produced using version 1.10.4.3
library(igraph) ## Included data produced using version 1.2.1
rsample <- function(N=100, ## size of implied adjacency matrix
month_max=10,seed) {
if (is.integer(seed)) set.seed(seed)
dt <- data.table(ID=factor(1:N))
dt$months <- sample(1:month_max,N,replace=T) ## # of liens for each level of ID
dt$GENDER <- sample(c("MALE","FEMALE"),N,replace=TRUE)
dt$AGE <- sample(18:99,N,replace=T)
dt$X1 <- sample(1000:9900,N,replace=T)
dt$X2 <- runif(N)
dt <- dt[, c(.SD, month=data.table(seq(from=1, to=months, by = 1))), by = ID]
dt[,BUY := 0]
dt[month.V1==months,BUY := sample(c(0,1),1),by=ID]
setnames(dt,"month.V1","month")
#### create adjacency matrix
Network <- data.table(OUT=sample(dt$ID,N*month_max*4/10))
Network$IN <- sample(dt$ID,N*month_max*4/10)
Network <- Network[IN != OUT]
Network <- unique(Network)
g <- graph.data.frame(Network,directed=F)
g <- add_vertices(g,sum(!unique(dt$ID) %in% V(g)),
name=unique(dt[!dt$ID %in% V(g),list(ID)])) # => improper names
Network <- as_adjacency_matrix(g,sparse = TRUE,type="both")
colnames(Network) <- rownames(Network) <- seq(nrow(Network)) # post-v3.8.0 names
return(list(data=dt,adjMatrix=Network))
}
RNGkind("Mersenne-Twister", "Inversion", "Rounding" )
set.seed(123)
adjlg_sam <- rsample(N=1000,seed=NULL)
RNGkind("Mersenne-Twister", "Inversion", "Rejection" )
#
adjlg <- as.data.frame(adjlg_sam$data)
adjlgMat <- adjlg_sam$adjMatrix
## End(Not run)