hull_sample {mvGPS}    R Documentation

Sample Points Along a Convex Hull

Description

To define a proper estimable region for a multivariate exposure, we construct the convex hull of the data so that the positivity identifying assumption is maintained. Options are also provided to trim the convex hull, further restricting it to high-density regions of the multidimensional space.

Usage

hull_sample(
  X,
  num_grid_pts = 500,
  grid_type = "regular",
  trim_hull = FALSE,
  trim_quantile = NULL
)

Arguments

X

numeric matrix of n by m dimensions. Each row corresponds to a point in m-dimensional space.

num_grid_pts

integer scalar denoting the number of grid points to sample over the convex hull. Default is 500.

grid_type

character value indicating the type of grid to sample over the convex hull; passed to spsample. Default is "regular".

trim_hull

logical indicator of whether to trim the convex hull. Default is FALSE.

trim_quantile

numeric scalar in [0.5, 1] representing the quantile value used to trim the convex hull. Only used if trim_hull is set to TRUE.

Details

Assume that X is an n × m matrix representing the multivariate exposure of interest. We can define the convex hull of these observations as H. There are two distinct processes for defining H depending on whether m=2 or m>2.

If m=2, we use the chull function to define the vertices of the convex hull. The algorithm implemented is described in Eddy (1977).
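
As a minimal sketch of the m=2 case, the snippet below calls chull (from grDevices) directly on simulated data. The object names X2, hull_idx, and hull_vertices are illustrative assumptions and are not part of hull_sample; the package may post-process the vertices differently.

```r
# Simulated bivariate exposure (illustrative data only)
set.seed(1)
X2 <- matrix(rnorm(200), ncol = 2)

# chull returns the row indices of the points lying on the convex hull
hull_idx <- grDevices::chull(X2)

# Extract the hull vertices as a two-column matrix
hull_vertices <- X2[hull_idx, , drop = FALSE]
```

Each row of hull_vertices is one of the original observations, so the vertex set can be inspected or plotted alongside X2.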

If m>2, we use the convhulln function. This algorithm for obtaining the convex hull in m-dimensional space uses Qhull, described in Barber et al. (1996). In this case hull_sample currently returns only the vertex set hpts_vs, without grid sample points. There are options to visualize the convex hull when m=3 using triangular facets, but there are no implementable solutions for sampling along convex hulls in higher dimensions.
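
A rough sketch of the m>2 case, assuming the geometry package (which provides convhulln) is installed. The names X3, facets, and vertex_idx are hypothetical, and the way hull_sample derives hpts_vs from the Qhull output may differ from what is shown here.

```r
# Simulated exposure with m = 3 (illustrative data only)
set.seed(1)
X3 <- matrix(rnorm(300), ncol = 3)

if (requireNamespace("geometry", quietly = TRUE)) {
  # Each row of facets holds the row indices of one triangular facet
  facets <- geometry::convhulln(X3)

  # Unique indices across all facets identify the hull vertices
  vertex_idx <- sort(unique(as.vector(facets)))
  hull_vertices3 <- X3[vertex_idx, , drop = FALSE]
}
```

Collecting the unique indices across facets is one simple way to recover a vertex set from the facet matrix.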

To restrict the convex hull to higher density regions of the exposure we can apply trimming. To apply trimming, set trim_hull=TRUE and specify trim_quantile=q, where q must be in [0.5, 1]. Along each exposure dimension we then calculate the upper and lower bounds using the quantile function, i.e., quantile(q) and quantile(1-q). Any observation with a value above the upper bound or below the lower bound in any dimension is excluded. The remaining observations, which fall completely within the sample quantiles across all dimensions, are used to estimate the convex hull. We return X, which represents the observations used. If trim_hull=FALSE, then X is unchanged. However, if trimming is applied, then X contains only the observations remaining after trimming.
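
The trimming rule can be sketched in base R as follows. The helper trim_by_quantile is hypothetical and is not a function exported by mvGPS; treating values exactly equal to a bound as retained is also an assumption.

```r
# Hypothetical helper: keep rows lying within the [1-q, q] sample
# quantiles of every column (bounds treated as inclusive by assumption)
trim_by_quantile <- function(X, q) {
  stopifnot(is.matrix(X), q >= 0.5, q <= 1)
  upper <- apply(X, 2, stats::quantile, probs = q)
  lower <- apply(X, 2, stats::quantile, probs = 1 - q)
  keep <- rep(TRUE, nrow(X))
  for (j in seq_len(ncol(X))) {
    keep <- keep & X[, j] >= lower[j] & X[, j] <= upper[j]
  }
  X[keep, , drop = FALSE]
}

set.seed(1)
X_full <- matrix(rnorm(200), ncol = 2)
X_trim <- trim_by_quantile(X_full, q = 0.95)
```

Because a row is dropped if it falls outside the bounds in any single dimension, the share of observations removed generally grows with m for a fixed q.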

Value

hpts_vs

vertices of the convex hull in m-dimensional space.

grid_pts

grid points sampled over the convex hull. NULL when m>2.

X

observations used to construct the convex hull. Contains only the remaining observations if trimming is applied.

References

Barber CB, Dobkin DP, Huhdanpaa H (1996). “The quickhull algorithm for convex hulls.” ACM Transactions on Mathematical Software (TOMS), 22(4), 469-483.

Eddy WF (1977). “A new convex hull algorithm for planar sets.” ACM Transactions on Mathematical Software (TOMS), 3(4), 398-403.

Examples

library(mvGPS)

#generating exposure with m=3 (seed set for reproducibility)
set.seed(1)
D <- matrix(unlist(lapply(seq_len(3), function(m) rnorm(100))), nrow=100)

#first using only the first two variables we can return hpts_vs and grid_pts
D_hull <- hull_sample(D[, 1:2])

#when m>2 we only return hpts_vs and grid_pts is NULL
D_hull_large <- hull_sample(D)
is.null(D_hull_large$grid_pts)

#we can also apply trimming to the convex hull and return this reduced matrix
D_hull_trim <- hull_sample(D[, 1:2], trim_hull=TRUE, trim_quantile=0.95)
dim(D_hull$X)
dim(D_hull_trim$X)

#alternatively, we can also define the number of points to sample from for grid_pts
small_grid <- hull_sample(D[, 1:2], num_grid_pts=100)
length(D_hull$grid_pts)
length(small_grid$grid_pts)


[Package mvGPS version 1.2.2 Index]