adjlg {spaMM} | R Documentation |
Simulated data set for testing sparse-precision code
Description
This is used in tests/test-adjacency-long.R
Usage
data("adjlg")
Format
Includes an adjacency matrix adjlgMat
. and a data frame adjlg
with 5474 observations on the following 8 variables.
ID
a factor with levels
1
to1000
months
a numeric vector
GENDER
a character vector
AGE
a numeric vector
X1
a numeric vector
X2
a numeric vector
month
a numeric vector
BUY
a numeric vector
Source
The simulation code shown below is derived from an example produced by Jeroen van den Ochtend. Following a change incorporated in spaMM version 3.8.0, that implied stricter checks of the input matrix, it appeared that the precision matrix generated by this example had inappropriate (repeated) dimnames. This example was then updated to reproduce past fitting results with a correctly formatted matrix. Note that changing the names of an adjacency matrix (as below) is generally unwise as it generally changes the statistical model because these names are matched whenever possible to levels of the grouping factor in the data.
The code was also modified to compensate for changes in R's default random number generator.
Examples
data(adjlg)
## See further usage in tests/test-adjacency-long.R
## Not run:
# as produced by:
library(data.table) ## Included data produced using version 1.10.4.3
library(igraph) ## Included data produced using version 1.2.1
rsample <- function(N=100, ## size of implied adjacency matrix
month_max=10,seed) {
if (is.integer(seed)) set.seed(seed)
dt <- data.table(ID=factor(1:N))
dt$months <- sample(1:month_max,N,replace=T) ## # of liens for each level of ID
dt$GENDER <- sample(c("MALE","FEMALE"),N,replace=TRUE)
dt$AGE <- sample(18:99,N,replace=T)
dt$X1 <- sample(1000:9900,N,replace=T)
dt$X2 <- runif(N)
dt <- dt[, c(.SD, month=data.table(seq(from=1, to=months, by = 1))), by = ID]
dt[,BUY := 0]
dt[month.V1==months,BUY := sample(c(0,1),1),by=ID]
setnames(dt,"month.V1","month")
#### create adjacency matrix
Network <- data.table(OUT=sample(dt$ID,N*month_max*4/10))
Network$IN <- sample(dt$ID,N*month_max*4/10)
Network <- Network[IN != OUT]
Network <- unique(Network)
g <- graph.data.frame(Network,directed=F)
g <- add_vertices(g,sum(!unique(dt$ID) %in% V(g)),
name=unique(dt[!dt$ID %in% V(g),list(ID)])) # => improper names
Network <- as_adjacency_matrix(g,sparse = TRUE,type="both")
colnames(Network) <- rownames(Network) <- seq(nrow(Network)) # post-v3.8.0 names
return(list(data=dt,adjMatrix=Network))
}
RNGkind("Mersenne-Twister", "Inversion", "Rounding" )
set.seed(123)
adjlg_sam <- rsample(N=1000,seed=NULL)
RNGkind("Mersenne-Twister", "Inversion", "Rejection" )
#
adjlg <- as.data.frame(adjlg_sam$data)
adjlgMat <- adjlg_sam$adjMatrix
## End(Not run)