hull_sample {mvGPS} | R Documentation |
Sample Points Along a Convex Hull
Description
To define a proper estimable region for a multivariate exposure, we construct a convex hull of the data to maintain the positivity identifying assumption. We also provide options to trim the convex hull, further restricting the region to high-density areas of the multidimensional exposure space.
Usage
hull_sample(
X,
num_grid_pts = 500,
grid_type = "regular",
trim_hull = FALSE,
trim_quantile = NULL
)
Arguments
X |
numeric matrix of n by m dimensions. Each row corresponds to a point in m-dimensional space. |
num_grid_pts |
integer scalar denoting the number of grid points to sample over the convex hull. Default is 500. |
grid_type |
character value indicating the type of grid to sample from the convex hull. Default is "regular". |
trim_hull |
logical indicator of whether to trim the convex hull. Default is FALSE. |
trim_quantile |
numeric scalar in [0.5, 1] representing the quantile value used to trim the convex hull. Only used if trim_hull is set to TRUE. Default is NULL. |
Details
Assume that X is an n × m matrix representing the multivariate exposure of interest. We can define the convex hull of these observations as H. There are two distinct processes for defining H, depending on whether m=2 or m>2.
If m=2, we use the chull function to define the vertices of the convex hull. The algorithm implemented is described in Eddy (1977).
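The two-dimensional case can be sketched directly with chull from base R (grDevices). This is a minimal illustration on simulated data, not the internals of hull_sample; the object names hull_idx and hull_vertices are hypothetical.

```r
# Illustrative data: 100 simulated points in two dimensions (an assumption for this sketch)
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)

# chull() returns the row indices of the points on the hull, in clockwise order
hull_idx <- grDevices::chull(X)

# Extract the coordinates of the hull vertices
hull_vertices <- X[hull_idx, , drop = FALSE]
```

Each row of hull_vertices is one vertex of the hull, which is the kind of information described for hpts_vs.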
If m>2, we use the convhulln function. This algorithm for obtaining the convex hull in m-dimensional space uses Qhull, described in Barber et al. (1996). Currently, when m>2, hull_sample returns only the vertex set hpts_vs without the grid sample points. There are options to visualize the convex hull when m=3 using triangular facets, but there are no implementable solutions for sampling along convex hulls in higher dimensions.
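For m>2, a rough sketch of recovering hull vertices with convhulln is shown below. It assumes the geometry package is installed and uses simulated data; the names facets and vertex_idx are hypothetical, and how hull_sample post-processes the Qhull output is not shown here.

```r
library(geometry)

# Illustrative data: 100 simulated points in three dimensions (an assumption for this sketch)
set.seed(1)
X <- matrix(rnorm(300), ncol = 3)

# convhulln() returns a matrix of row indices into X; each row is one facet
# (a triangle when m = 3)
facets <- convhulln(X)

# The hull vertices are the distinct points that appear in any facet
vertex_idx <- sort(unique(as.vector(facets)))
hull_vertices <- X[vertex_idx, , drop = FALSE]
```

Only vertices are recovered here; no grid of points is sampled over the hull, in line with the behavior described for m>2.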
To restrict the convex hull to higher density regions of the exposure we can apply trimming. To apply trimming, set trim_hull=TRUE and specify trim_quantile=q, where q must be in [0.5, 1]. Along each exposure dimension we then calculate the upper and lower bounds using the quantile function, i.e., quantile(q) and quantile(1-q). Any observation with a value above the upper bound or below the lower bound in any dimension is excluded. The remaining observations, which fall within the sample quantiles across all dimensions, are used to estimate the convex hull. We return X, which represents the observations used.
If trim_hull=FALSE, then X is unchanged. However, if trimming is applied, then X contains only the remaining observations after trimming.
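The trimming rule can be sketched as follows. This is an illustration of the description above on simulated data, assuming bounds are inclusive; the names bounds, keep, and X_trim are hypothetical and the package's own implementation may differ in detail.

```r
# Illustrative data: 100 simulated points in two dimensions (an assumption for this sketch)
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
q <- 0.95

# Per-dimension bounds: row 1 holds the (1 - q) quantile, row 2 the q quantile
bounds <- apply(X, 2, quantile, probs = c(1 - q, q))

# Flag, for each entry, whether it lies within its column's bounds
inside <- sweep(X, 2, bounds[1, ], ">=") & sweep(X, 2, bounds[2, ], "<=")

# Keep only rows that are within bounds in every dimension
keep <- rowSums(inside) == ncol(X)
X_trim <- X[keep, , drop = FALSE]

# Compare the number of observations before and after trimming
dim(X)
dim(X_trim)
```

The convex hull would then be computed on X_trim rather than X, which is why the returned X can have fewer rows than the input.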
Value
- hpts_vs: vertices of the convex hull in m-dimensional space
- grid_pts: values of grid points sampled over the corresponding convex hull
- X: data used to generate the convex hull, which may be trimmed
References
Barber CB, Dobkin DP, Huhdanpaa H (1996). “The quickhull algorithm for convex hulls.” ACM Transactions on Mathematical Software (TOMS), 22(4), 469-483.
Eddy WF (1977). “A new convex hull algorithm for planar sets.” ACM Transactions on Mathematical Software (TOMS), 3(4), 398-403.
Examples
#generating exposure with m=3
D <- matrix(unlist(lapply(seq_len(3), function(m) rnorm(100))), nrow=100)
#first using only the first two variables we can return hpts_vs and grid_pts
D_hull <- hull_sample(D[, 1:2])
#when m>2 we only return hpts_vs and grid_pts is NULL
D_hull_large <- hull_sample(D)
is.null(D_hull_large$grid_pts)
#we can also apply trimming to the convex hull and return this reduced matrix
D_hull_trim <- hull_sample(D[, 1:2], trim_hull=TRUE, trim_quantile=0.95)
dim(D_hull$X)
dim(D_hull_trim$X)
#alternatively, we can also define the number of points to sample from for grid_pts
small_grid <- hull_sample(D[, 1:2], num_grid_pts=100)
length(D_hull$grid_pts)
length(small_grid$grid_pts)