generateData {covdepGE} | R Documentation |
Generate Covariate-Dependent Data
Description
Generate a 1-dimensional extraneous covariate and p-dimensional Gaussian data
with a precision matrix that varies as a continuous function of the extraneous
covariate. The data are distributed similarly to those used in the simulation
study from (1).
Usage
generateData(p = 5, n1 = 60, n2 = 60, n3 = 60, Z = NULL, true_precision = NULL)
Arguments
p | positive integer; number of variables in the data matrix.
n1 | positive integer; number of observations in the first interval.
n2 | positive integer; number of observations in the second interval.
n3 | positive integer; number of observations in the third interval.
Z |
true_precision |
Value
Returns a list with the following values:
X | a
Z | a
true_precision | list of
interval | vector of length
Extraneous Covariate
If Z = NULL, then Z is generated as follows:
The first n1 observations have z_i drawn from a uniform distribution on the interval (-3, -1) (the first interval).
Observations n1 + 1 to n1 + n2 have z_i drawn from a uniform distribution on the interval (-1, 1) (the second interval).
Observations n1 + n2 + 1 to n1 + n2 + n3 have z_i drawn from a uniform distribution on the interval (1, 3) (the third interval).
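
The scheme above can be sketched in base R. This is only an illustration of the description, not the package's source code; the sample sizes simply reuse the defaults shown under Usage.

```r
# Sketch only: mirrors the description above, not the package source.
set.seed(1)
n1 <- 60; n2 <- 60; n3 <- 60

# Draw z_i uniformly within each of the three intervals
Z <- c(runif(n1, -3, -1),   # first interval
       runif(n2, -1, 1),    # second interval
       runif(n3, 1, 3))     # third interval

# Label each observation with its interval (1, 2, or 3)
interval <- rep(1:3, times = c(n1, n2, n3))

# Inspect the range of z_i within each interval
tapply(Z, interval, range)
```

Because the observations are generated interval by interval, Z is ordered by interval but not sorted within an interval.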
Precision Matrices
If true_precision = NULL, then the true precision matrices are generated as follows:
All precision matrices have 2 on the diagonal and 1 in the (2, 3)/(3, 2) positions.
Observations in the first interval have a 1 in the (1, 2)/(2, 1) positions, while observations in the third interval have a 1 in the (1, 3)/(3, 1) positions.
Observations in the second interval have two off-diagonal entries that vary as a linear function of their extraneous covariate.
Let \beta = 1/2. Then, the (1, 2)/(2, 1) entries for the i-th observation in the second interval are \beta\cdot(1 - z_i), while the (1, 3)/(3, 1) entries are \beta\cdot(1 + z_i).
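
Written out, the leading 3 x 3 block of the precision matrix for the i-th observation in the second interval would look as follows. The symbol \Omega is introduced here only for illustration, and the display assumes that every entry not mentioned in the text is zero.

```latex
\Omega(z_i)_{1:3,\,1:3} =
\begin{pmatrix}
2                   & \beta\cdot(1 - z_i) & \beta\cdot(1 + z_i) \\
\beta\cdot(1 - z_i) & 2                   & 1                   \\
\beta\cdot(1 + z_i) & 1                   & 2
\end{pmatrix},
\qquad \beta = 1/2, \quad -1 < z_i < 1 .
```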
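Thus, as z_i approaches -1 from the right, the (1, 2)/(2, 1) entries approach 1 and the (1, 3)/(3, 1) entries approach 0, so the associated precision matrix becomes more similar to the matrix for observations in the first interval.
Similarly, as z_i approaches 1 from the left, the (1, 2)/(2, 1) entries approach 0 and the (1, 3)/(3, 1) entries approach 1, so the matrix becomes more similar to the matrix for observations in the third interval.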
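
The construction can be sketched in base R as below. The helper names precision_for_z and draw_obs are hypothetical and not part of covdepGE. The sketch assumes that entries not mentioned above are zero and that the data have mean zero. Sampling through a Cholesky factor is one possible approach and may differ from what the package does internally.

```r
# Sketch only: builds the matrices as described above, not the package source.
# `precision_for_z` and `draw_obs` are hypothetical helpers for illustration.
precision_for_z <- function(z, p = 5, beta = 1 / 2) {
  stopifnot(p >= 3, z > -3, z < 3)
  prec <- diag(2, p)                      # 2 on the diagonal, 0 elsewhere (assumed)
  prec[2, 3] <- prec[3, 2] <- 1           # shared (2, 3)/(3, 2) entries
  if (z < -1) {                           # first interval
    w12 <- 1; w13 <- 0
  } else if (z <= 1) {                    # second interval: linear in z
    w12 <- beta * (1 - z); w13 <- beta * (1 + z)
  } else {                                # third interval
    w12 <- 0; w13 <- 1
  }
  prec[1, 2] <- prec[2, 1] <- w12
  prec[1, 3] <- prec[3, 1] <- w13
  prec
}

# Draw one Gaussian observation with mean zero (assumed) and precision `prec`
draw_obs <- function(prec) {
  R <- chol(prec)                         # upper triangular, prec = t(R) %*% R
  drop(backsolve(R, rnorm(nrow(prec))))   # solves R x = e, so cov(x) = solve(prec)
}

# Usage: one observation per interval
set.seed(1)
z <- c(-2, 0, 2)
mats <- lapply(z, precision_for_z)
X <- t(sapply(mats, draw_obs))            # 3 x 5 data matrix
mats[[2]][1:3, 1:3]                       # interval-2 block at z = 0
```

Solving against the Cholesky factor of the precision matrix avoids inverting it explicitly for every observation.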
Examples
## Not run:
library(ggplot2)
# get the data
set.seed(12)
data <- generateData()
X <- data$X
Z <- data$Z
interval <- data$interval
prec <- data$true_precision
# get overall and within interval sample sizes
n <- nrow(X)
n1 <- sum(interval == 1)
n2 <- sum(interval == 2)
n3 <- sum(interval == 3)
# visualize the distribution of the extraneous covariate
ggplot(data.frame(Z = Z, interval = as.factor(interval))) +
  geom_histogram(aes(Z, fill = interval), color = "black", bins = n %/% 5)
# visualize the true precision matrices in each of the intervals
# interval 1
matViz(prec[[1]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 1, observations 1,...,", n1))
# interval 2 (varies continuously with Z)
cat("\nInterval 2, observations ", n1 + 1, ",...,", n1 + n2, sep = "")
int2_mats <- prec[interval == 2]
int2_inds <- c(5, n2 %/% 2, n2 - 5)
lapply(int2_inds, function(j) matViz(int2_mats[[j]], incl_val = TRUE) +
         ggtitle(paste("True precision matrix, interval 2, observation", j + n1)))
# interval 3
matViz(prec[[length(prec)]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 3, observations ",
                 n1 + n2 + 1, ",...,", n1 + n2 + n3))
# fit the model and visualize the estimated graphs
(out <- covdepGE(X, Z))
plot(out)
# visualize the posterior inclusion probabilities for the variable pairs (1, 2) and (1, 3)
inclusionCurve(out, 1, 2)
inclusionCurve(out, 1, 3)
## End(Not run)