set_spatial_grid {SeaVal} | R Documentation |
Set Spatial Grid Attributes to a Data Table
Description
This function creates the spatial grid attribute for a data table. If the data table already has such an attribute, missing information is filled in. In particular, the function checks whether a grid is regular, allowing for rounding errors in the grid coordinates (see Details). By default, the grid coordinates are rounded to a regular grid if they are very close to being regular. While this sounds dangerous, it is almost always desirable to treat coordinates like that when working with data tables.
Usage
set_spatial_grid(
dt,
coor_cns = NULL,
check_regular = TRUE,
regular_tolerance = 1,
verbose = FALSE
)
Arguments
dt |
A data table object. |
coor_cns |
Optional character vector of length two indicating the names of the spatial coordinates
within the data table in order |
check_regular |
A logical indicating whether to check for regularity of the grid. This should essentially always be done but can be suppressed for speed.
Defaults to TRUE. |
regular_tolerance |
Value >= 0 specifying the amount of rounding error we allow for still recognizing a grid as regular.
Given in percent of the minimum of |
verbose |
Logical. If |
Details
The grid attribute is a named list with (some of) the following elements:
coor_cns: Character vector of length two specifying the names of the data-table-columns containing the spatial grids (in order x,y).
x,y: Numeric vectors of all unique x- and y-coordinates in increasing order (NAs not included).
regular: Logical. Is the grid regular? See details below.
dx,dy: Step sizes of the regular grid (only contained if regular = TRUE). By convention we set dx to 9999 if only one x-coordinate is present, likewise for dy.
complete: Logical. Is the regular grid complete? See details below.
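To make the layout of this list concrete, here is a minimal base-R sketch that assembles a list with some of the same fields from a small set of coordinates. It is an illustration under assumptions, not SeaVal's implementation: the names coords, step_or_9999 and grid_sketch are hypothetical, the regular and complete fields are left out, and the step size is simply taken as the smallest gap between neighbouring coordinates.

```r
# Hypothetical sketch, NOT SeaVal's code: build a list shaped like the
# grid attribute described above from a plain data frame.
coords <- data.frame(lon = c(1, 2, 2, NA), lat = c(5, 5, 5, 5))

# Smallest gap between sorted coordinates; 9999 by convention if there
# is only one coordinate along this axis.
step_or_9999 <- function(u) if (length(u) > 1) min(diff(u)) else 9999

x <- sort(unique(coords$lon))   # sort() drops NAs by default
y <- sort(unique(coords$lat))

grid_sketch <- list(
  coor_cns = c("lon", "lat"),   # column names, in order x, y
  x = x,
  y = y,
  dx = step_or_9999(x),
  dy = step_or_9999(y)
)
str(grid_sketch)
```

In this sketch there is only one lat value, so dy takes the placeholder value 9999.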
We call a grid regular if there is a coordinate (x0,y0) and positive values dx, dy, such that each coordinate of the grid can be written as (x0 + n*dx,y0 + m*dy) for integers n, m.
Importantly, a regular grid does not need to be "a complete rectangle", we allow for missing coordinates, see details below.
We call it a regular complete grid if the grid contains these numbers for all integers n, m between some limits n_min and n_max, respectively m_min, m_max.
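The completeness condition can be sketched in base R as follows. This is only an illustration, assuming the coordinates already lie exactly on a regular grid; the helper is_complete_grid is hypothetical and is not a SeaVal function.

```r
# Hypothetical helper, NOT part of SeaVal: is a set of (x, y) coordinates
# a complete regular grid? Assumes the coordinates are exactly regular.
is_complete_grid <- function(x, y) {
  ux <- sort(unique(x))
  uy <- sort(unique(y))
  # No gaps along an axis: all neighbouring differences are equal.
  no_gaps <- function(u) {
    length(u) < 3 ||
      isTRUE(all.equal(diff(u), rep(diff(u)[1], length(u) - 1)))
  }
  # Every (x, y) combination must be present.
  n_pairs <- nrow(unique(data.frame(x = x, y = y)))
  no_gaps(ux) && no_gaps(uy) && n_pairs == length(ux) * length(uy)
}

full <- expand.grid(x = 1:3, y = 1:2)
complete_full <- is_complete_grid(full$x, full$y)            # TRUE
complete_hole <- is_complete_grid(full$x[-1], full$y[-1])    # FALSE, one cell missing
```

The second call drops one grid cell, which gives a grid that is still regular but no longer complete.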
Checking regularity properly is a difficult problem, because we allow for missing coordinates
in the grid and allow for rounding errors.
For the treatment of rounding errors, it is not recommended to set regular_tolerance to NULL or to a very small value (e.g. 0.1 or smaller). In this case, grids that are regular in practice are frequently not recognized as regular: Take for example the three x-coordinates 1, 1.5001, 2.4999. They are meant to be rounded to one digit after the decimal point, and then the grid is regular with dx = 0.5. However, if regular_tolerance is NULL, the grid will be marked as irregular. Similarly, if regular_tolerance is too small, the function is not allowed to round away deviations of 0.0001 and the grid will also not be recognized as regular.
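The example with the coordinates 1, 1.5001, 2.4999 can be played through in a small base-R sketch. It is not SeaVal's algorithm: snap_to_grid is a hypothetical helper, dx is passed in by hand, and the tolerance is interpreted as a percentage of dx, which is an assumption made for this illustration.

```r
# Hypothetical helper, NOT part of SeaVal: snap 1-D coordinates to the grid
# x0 + n * dx if every coordinate lies within tol_percent percent of dx
# of a grid point; otherwise return NULL.
snap_to_grid <- function(x, dx, tol_percent) {
  x0 <- min(x)
  n <- round((x - x0) / dx)        # nearest integer grid index
  snapped <- x0 + n * dx
  err <- abs(x - snapped)          # rounding needed per coordinate
  if (all(err <= tol_percent / 100 * dx)) snapped else NULL
}

x <- c(1, 1.5001, 2.4999)
loose  <- snap_to_grid(x, dx = 0.5, tol_percent = 1)      # 1.0 1.5 2.5
strict <- snap_to_grid(x, dx = 0.5, tol_percent = 0.001)  # NULL
```

With a tolerance of 1 percent the allowed deviation is 0.005, so the deviations of 0.0001 are rounded away. With 0.001 percent the allowed deviation is 0.000005 and the sketch refuses to snap the coordinates.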
When it comes to the issue of missing values in the grid, we are (deliberately) a bit sloppy and only check whether the coordinates are part of a grid with dx being the minimum x-difference between two coordinates, and similarly for dy. This may fail to detect regularity when the data is sparse on a regular grid. An example would be the three lon/lat coordinates c(0,0), c(2,0), c(5,0). They clearly lie on the regular integer-lon/lat-grid. However, the grid would show as not regular, because dx is not checked for smaller values than 2. This choice is on purpose, since for most applications grids with many (or mostly) holes should be treated as irregular (e.g. plotting, upscaling, etc.).
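A minimal one-dimensional sketch of this check, assuming dx is simply the smallest gap between neighbouring coordinates. The helper regular_with_min_diff and its tol argument are hypothetical and only illustrate the idea; they are not SeaVal's code.

```r
# Hypothetical helper, NOT part of SeaVal: test whether 1-D coordinates lie
# on x0 + n * dx, with dx taken as the smallest gap between neighbours.
regular_with_min_diff <- function(x, tol = 1e-8) {
  ux <- sort(unique(x))
  if (length(ux) < 2) return(TRUE)
  dx <- min(diff(ux))
  steps <- (ux - ux[1]) / dx       # grid index of each coordinate
  all(abs(steps - round(steps)) < tol)
}

sparse <- regular_with_min_diff(c(0, 2, 5))   # FALSE: dx = 2, 5 is at index 2.5
dense  <- regular_with_min_diff(c(0, 1, 5))   # TRUE: dx = 1
```

The longitudes 0, 2, 5 lie on the integer grid, but since the smallest gap is 2, no step size below 2 is tried and the set is reported as irregular.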
The most important case of regular but not complete grids is gridded data that is restricted to a certain region, e.g. a country
or restricted to land. This is what we think of when we think of a regular incomplete grid, and for such data the check works perfectly.
Note that, fundamentally, it is the definition of regularity itself that is a bit tricky: If we allow dx, dy to go all the way down to machine precision, then pretty much any set of coordinates represented in a computer is part of a regular grid. This means that testing for and detecting regularity actually depends on how small you are willing to make your dx, dy.
An example in 1 dimension: consider the three 1-dimensional coordinates 0, 1, and m/n, with m and n integers without common divisors and m>n. It is not difficult to see that these coordinates are part of a regular grid and that the largest dx for detecting this is 1/n. This shows that you can have very small coordinate sets that are in theory regular, but whose regularity can be arbitrarily hard to detect. An example of a grid that is truly not regular is given by the three x-coordinates 0,1,a with a irrational.
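The claim about the largest dx can be verified with integer arithmetic. The sketch below picks m = 7 and n = 5 as arbitrary example values and uses a hand-written gcd helper, since base R does not provide one; all names are hypothetical.

```r
# Greatest common divisor via Euclid's algorithm (hypothetical helper).
gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)

m <- 7L
n <- 5L                          # coprime, with m > n
stopifnot(gcd(m, n) == 1L)

# Work in units of 1/n so that all coordinates are integers:
# 0, 1, m/n become 0, n, m.
scaled <- c(0L, n, m)

# Largest step (in units of 1/n) that divides all scaled coordinates.
step_scaled <- Reduce(gcd, scaled)
dx <- step_scaled / n            # largest dx in original units
```

Because m and n have no common divisor, step_scaled is 1 and dx equals 1/n, so a larger n pushes the detectable step size arbitrarily close to zero.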
Value
Nothing, the attributes of dt are set in the parent environment. Moreover, the grid coordinates may be rounded If regular
Examples
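library(data.table)
library(SeaVal)
dt = data.table(lon = 1:4, lat = rep(1:2,each = 2), some_data = runif(4))
print(dt)
attr(dt,'grid') # no grid attribute has been set yet
set_spatial_grid(dt) # sets the grid attribute of dt, returns nothing
attr(dt,'grid')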