generateData {covdepGE} | R Documentation |
Generate Covariate-Dependent Data
Description
Generate a 1-dimensional extraneous covariate and p-dimensional Gaussian data
with a precision matrix that varies as a continuous function of the
extraneous covariate. This data is distributed similarly to that used in the
simulation study from (1).
Usage
generateData(p = 5, n1 = 60, n2 = 60, n3 = 60, Z = NULL, true_precision = NULL)
Arguments
p |
positive integer; number of variables in the data matrix. |
n1 |
positive integer; number of observations in the first interval.
|
n2 |
positive integer; number of observations in the second interval.
|
n3 |
positive integer; number of observations in the third interval.
|
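Z |
NULL or numeric vector; extraneous covariate values for each observation. If NULL, Z is generated from a uniform distribution on each of the three intervals. |
true_precision |
NULL or list of matrices of dimension p x p; true precision matrix for each observation. If NULL, the true precision matrices are generated as a function of Z. |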
Value
Returns list with the following values:
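X |
a (n1 + n2 + n3) x p numeric matrix, where the i-th row is drawn from a p-dimensional Gaussian with mean 0 and precision matrix true_precision[[i]] |
Z |
a (n1 + n2 + n3) x 1 numeric matrix, where the i-th entry is the extraneous covariate z_i for observation i |
true_precision |
list of n1 + n2 + n3 matrices of dimension p x p; the i-th matrix is the precision matrix for the i-th observation |
interval |
vector of length n1 + n2 + n3; the i-th entry is the interval assignment for the i-th observation |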
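To illustrate how an observation relates to its precision matrix, the following base-R sketch draws from a Gaussian by inverting a precision matrix and using the Cholesky factor of the resulting covariance. The helper `sketch_draw` and the 3 x 3 matrix `prec` are hypothetical, and the mean-zero choice is an assumption of the sketch; this is not necessarily how generateData samples internally.

```r
# Hypothetical helper (not part of covdepGE): draw one observation from a
# mean-zero Gaussian whose precision (inverse covariance) matrix is `prec`
sketch_draw <- function(prec) {
  sigma <- solve(prec)                   # covariance = inverse of the precision
  R <- chol(sigma)                       # upper triangular, t(R) %*% R == sigma
  drop(crossprod(R, rnorm(nrow(prec))))  # t(R) %*% e has covariance sigma
}

# illustrative precision matrix, chosen only because it is positive definite
prec <- matrix(c(2, 1, 0,
                 1, 2, 1,
                 0, 1, 2), nrow = 3, byrow = TRUE)

set.seed(1)
draws <- t(replicate(20000, sketch_draw(prec)))

# the sample covariance should be close to solve(prec)
max(abs(cov(draws) - solve(prec)))
```

With many draws, the sample covariance approaches the inverse of `prec`, which is a quick sanity check that the draws follow the intended distribution.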
Extraneous Covariate
If Z = NULL, then Z is generated as follows:
The first n1 observations have z_i drawn from a uniform distribution on the
interval (-3, -1) (the first interval).
Observations n1 + 1 to n1 + n2 have z_i drawn from a uniform distribution on
the interval (-1, 1) (the second interval).
Observations n1 + n2 + 1 to n1 + n2 + n3 have z_i drawn from a uniform
distribution on the interval (1, 3) (the third interval).
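The three-interval scheme can be sketched in base R as below. This is a minimal illustration, not the package's code: `sketch_Z` is a hypothetical helper, and the interval endpoints (-3, -1), (-1, 1) and (1, 3) are assumed.

```r
# Hypothetical helper (not part of covdepGE): draw an extraneous covariate
# from three adjacent uniform intervals and record the interval labels
sketch_Z <- function(n1 = 60, n2 = 60, n3 = 60) {
  Z <- c(runif(n1, -3, -1),  # first interval (assumed endpoints)
         runif(n2, -1, 1),   # second interval
         runif(n3, 1, 3))    # third interval
  interval <- rep(1:3, times = c(n1, n2, n3))
  list(Z = Z, interval = interval)
}

set.seed(1)
zs <- sketch_Z()

# range of the covariate within each interval
tapply(zs$Z, zs$interval, range)
```

The ordering of observations matches the description: the first n1 values belong to interval 1, the next n2 to interval 2, and the final n3 to interval 3.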
Precision Matrices
If true_precision = NULL, then the true precision matrices are generated as
follows:
All precision matrices have 2 on the diagonal and 1 in the (2, 3)/(3, 2)
positions.
Observations in the first interval have a 1 in the (1, 2)/(2, 1) positions,
while observations in the third interval have a 1 in the (1, 3)/(3, 1)
positions.
Observations in the second interval have two entries that vary as a linear
function of their extraneous covariate. Let β = 1/2. Then, the (1, 2)/(2, 1)
positions for the i-th observation in the second interval are β(1 - z_i),
while the (1, 3)/(3, 1) entries are β(1 + z_i).
Thus, as z_i approaches -1 from the right, the associated precision matrix
becomes more similar to the matrix for observations in the first interval.
Similarly, as z_i approaches 1 from the left, the matrix becomes more similar
to the matrix for observations in the third interval.
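The construction can be sketched as a function of a single covariate value. The helper `sketch_precision` is hypothetical and is not the package's implementation. It assumes interval boundaries at -1 and 1, β = 1/2, entries of β(1 - z) and β(1 + z) inside the second interval, and zeros in every position not mentioned in the description.

```r
# Hypothetical helper (not part of covdepGE): build the p x p precision
# matrix associated with a covariate value z, under the assumptions above
sketch_precision <- function(z, p = 5) {
  stopifnot(p >= 3, z > -3, z < 3)
  beta <- 1 / 2
  # weights of the (1, 2) and (1, 3) entries; they sum to 1 for every z
  w12 <- if (z <= -1) 1 else if (z >= 1) 0 else beta * (1 - z)
  w13 <- 1 - w12
  prec <- diag(2, p)                 # 2 on the diagonal, 0 elsewhere
  prec[2, 3] <- prec[3, 2] <- 1      # shared by all three intervals
  prec[1, 2] <- prec[2, 1] <- w12    # dominant in the first interval
  prec[1, 3] <- prec[3, 1] <- w13    # dominant in the third interval
  prec
}

# upper-left 3 x 3 block in the first interval, at the midpoint of the
# second interval, and in the third interval
sketch_precision(-2)[1:3, 1:3]
sketch_precision(0)[1:3, 1:3]
sketch_precision(2)[1:3, 1:3]
```

Under these assumptions, the (1, 2) entry shrinks from 1 to 0 and the (1, 3) entry grows from 0 to 1 as z moves from -1 to 1, so the matrix changes continuously across the interval boundaries.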
Examples
## Not run:
library(ggplot2)
# get the data
set.seed(12)
data <- generateData()
X <- data$X
Z <- data$Z
interval <- data$interval
prec <- data$true_precision
# get overall and within interval sample sizes
n <- nrow(X)
n1 <- sum(interval == 1)
n2 <- sum(interval == 2)
n3 <- sum(interval == 3)
# visualize the distribution of the extraneous covariate
ggplot(data.frame(Z = Z, interval = as.factor(interval))) +
  geom_histogram(aes(Z, fill = interval), color = "black", bins = n %/% 5)
# visualize the true precision matrices in each of the intervals
# interval 1
matViz(prec[[1]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 1, observations 1,...,", n1))
# interval 2 (varies continuously with Z)
cat("\nInterval 2, observations ", n1 + 1, ",...,", n1 + n2, sep = "")
int2_mats <- prec[interval == 2]
int2_inds <- c(5, n2 %/% 2, n2 - 5)
lapply(int2_inds, function(j) matViz(int2_mats[[j]], incl_val = TRUE) +
         ggtitle(paste("True precision matrix, interval 2, observation", j + n1)))
# interval 3
matViz(prec[[length(prec)]], incl_val = TRUE) +
  ggtitle(paste0("True precision matrix, interval 3, observations ",
                 n1 + n2 + 1, ",...,", n1 + n2 + n3))
# fit the model and visualize the estimated graphs
(out <- covdepGE(X, Z))
plot(out)
# visualize the posterior inclusion probabilities for variables (1, 2) and (1, 3)
inclusionCurve(out, 1, 2)
inclusionCurve(out, 1, 3)
## End(Not run)