sk_sample_vg {snapKrig} | R Documentation |
Sample point pair absolute differences for use in semi-variogram estimation
Description
Compute the absolute differences for point pairs in g, along with their separation
distances. If no sample point index is supplied (in idx), the function samples points
at random using sk_sample_pt.
Usage
sk_sample_vg(
  g,
  n_pp = 10000,
  idx = NULL,
  n_bin = 25,
  n_layer_max = NA,
  quiet = FALSE
)
Arguments
g |
any grid object accepted or returned by sk |
n_pp |
integer maximum number of point pairs to sample |
idx |
optional integer vector indexing the points to sample |
n_bin |
integer number of distance bins to assign (passed to sk_add_bins) |
n_layer_max |
integer, maximum number of layers to sample (for multi-layer g) |
quiet |
logical, suppresses console output |
Details
In a set of n points there are n_pp(n) = (n^2-n)/2 possible point pairs. This
expression is inverted to determine the maximum number of sample points in g to use
in order to satisfy the argument n_pp, the maximum number of point pairs to sample.
A random sub-sample of idx is taken as needed. By default n_pp=1e4, which results
in n=141.
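The inversion described above can be sketched in base R. This is only an illustration of the arithmetic, assuming the largest n is taken such that (n^2-n)/2 does not exceed n_pp; the variable names n_pp_max and n_max are hypothetical and are not part of the package.

```r
# maximum number of point pairs requested (the default value of n_pp)
n_pp_max = 1e4

# solve (n^2 - n)/2 = n_pp_max for n with the quadratic formula, then round down
n_max = floor( ( 1 + sqrt(1 + 8 * n_pp_max) ) / 2 )
n_max

# number of point pairs actually produced by n_max points
n_pairs = ( n_max^2 - n_max ) / 2
n_pairs

# one more point would exceed the requested maximum
( (n_max + 1)^2 - (n_max + 1) ) / 2 > n_pp_max
```

With the default n_pp=1e4 this reproduces the n=141 quoted above.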
Half the mean of the squared point pair absolute differences ('dabs') within a given
distance interval is the classical estimator of the semi-variogram. This and two other
robust methods are implemented in sk_plot_semi. These statistics are sensitive to the
choice of distance bins. Bins are assigned automatically by a call to sk_add_bins
(with n_bin), but users can also set up bins manually by adjusting the 'bin' column of
the output.
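To make the role of the bins concrete, the following base R sketch aggregates point pair differences by distance bin. It uses half the mean squared difference (the Matheron form of the classical estimator) and equal-width bins from cut, which is an assumption and may differ from what sk_add_bins and sk_plot_semi do internally. The data frame vg_toy is simulated stand-in data, not real output from sk_sample_vg.

```r
set.seed(1)

# toy stand-in for the function output: separation distances and absolute differences
vg_toy = data.frame(d = runif(200, 0, 10))
vg_toy$dabs = abs( rnorm(200, sd = sqrt(vg_toy$d)) )

# assign 5 equal-width distance bins (assumption: sk_add_bins may bin differently)
vg_toy$bin = cut(vg_toy$d, breaks = 5, labels = FALSE)

# per-bin statistic: half the mean squared difference
semi_classical = tapply(vg_toy$dabs, vg_toy$bin, function(x) mean(x^2) / 2)

# mean separation distance in each bin, for plotting against the statistic
d_mid = tapply(vg_toy$d, vg_toy$bin, mean)
data.frame(d = d_mid, semivariance = semi_classical)
```

Changing the number of breaks changes the estimates, which is the sensitivity to bin choice mentioned above.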
For multi-layer g, the function samples observed point locations once and re-uses this
selection in all layers. At most n_layer_max layers are sampled in this way (default is
the square root of the number of layers, rounded up).
Value
A data frame with a row for each sampled point pair. Fields include 'dabs' and 'd',
the absolute difference in point values and the separation distance, along with the vector
index, row and column numbers, and component (x, y) distances for each point pair. 'bin'
indicates membership in one of n_bin categories.
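As a rough picture of what each row represents, the base R sketch below builds a point pair table for a small matrix of values. It is not the package implementation: unit grid spacing is assumed, and apart from 'd' and 'dabs' the column names (p1, p2, dx, dy) are hypothetical.

```r
set.seed(2)

# toy 4 x 3 grid of values (hypothetical data, not a snapKrig grid object)
gdim = c(4, 3)
z = matrix(rnorm(prod(gdim)), gdim[1], gdim[2])

# row and column number of each grid point, in column-vectorized order
ij = cbind(i = as.vector(row(z)), j = as.vector(col(z)))

# all (n^2 - n)/2 unordered point pairs, one pair per column
pp = combn(prod(gdim), 2)

# component distances (unit spacing assumed) for each pair
dy = abs( ij[pp[1, ], 'i'] - ij[pp[2, ], 'i'] )
dx = abs( ij[pp[1, ], 'j'] - ij[pp[2, ], 'j'] )

# one row per point pair: vector indices, components, distance, absolute difference
pairs_toy = data.frame(p1 = pp[1, ], p2 = pp[2, ], dx = dx, dy = dy,
                       d = sqrt(dx^2 + dy^2),
                       dabs = abs( z[pp[1, ]] - z[pp[2, ]] ))
nrow(pairs_toy)
```

For the 12 points here the table has (12^2 - 12)/2 = 66 rows, one per pair.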
See Also
sk, sk_sample_pt, sk_add_bins
Examples
# make example grid and reference covariance model
gdim = c(22, 15)
n = prod(gdim)
g_empty = sk(gdim)
pars = sk_pars(g_empty, 'mat')
# generate sample data and sample semi-variogram
g_obs = sk_sim(g_empty, pars)
vg = sk_sample_vg(g_obs)
str(vg)
# pass to plotter and overlay the model that generated the data
sk_plot_semi(vg, pars)
# repeat with smaller sample sizes
sk_plot_semi(sk_sample_vg(g_obs, 1e2), pars)
sk_plot_semi(sk_sample_vg(g_obs, 1e3), pars)
# use a set of specific points
n_sp = 10
( n_sp^2 - n_sp ) / 2 # the number of point pairs
vg = sk_sample_vg(g_obs, idx=sample.int(n, n_sp))
sk_plot_semi(vg, pars)
# non-essential examples skipped to stay below 5s exec time on slower machines
# repeat with all point pairs sampled (not recommended for big data sets)
vg = sk_sample_vg(g_obs, n_pp=Inf)
sk_plot_semi(vg, pars)
( n^2 - n ) / 2 # the number of point pairs
## example with multiple layers
# generate five layers
g_obs_multi = sk_sim(g_empty, pars, n_layer=5)
# by default, a sub-sample of ceiling(sqrt(n_layers)) layers is selected
vg = sk_sample_vg(g_obs_multi)
sk_plot_semi(vg, pars)
# change this behaviour with n_layer_max
vg = sk_sample_vg(g_obs_multi, n_layer_max=5)
sk_plot_semi(vg, pars)