MazamaLocationUtils {MazamaLocationUtils} | R Documentation |
Manage Spatial Metadata for Known Locations
Description
A suite of utility functions for discovering and managing metadata associated with sets of spatially unique "known locations".
This package is intended to be used in support of data management activities associated with fixed locations in space. The motivating fields include both air and water quality monitoring, where fixed sensors report at regular time intervals.
Details
When working with environmental monitoring time series, one of the first things you have to do is create unique identifiers for each individual time series. In an ideal world, each environmental time series would have both a locationID and a deviceID that uniquely identify the specific instrument making measurements and the physical location where measurements are made. A unique timeseriesID could be produced as locationID_deviceID. Metadata associated with each timeseriesID would contain basic information needed for downstream analysis including at least:
timeseriesID, locationID, deviceID, longitude, latitude, ...
An extended time series for an occasionally re-positioned sensor would group by deviceID.
Multiple sensors placed at a single location could be grouped by locationID.
Maps would be created using longitude, latitude.
Time series would be accessed from a secondary data table with timeseriesID.
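As a minimal sketch of the ideal scheme described above, the base R snippet below builds a small metadata table, derives timeseriesID as locationID_deviceID, and groups records by deviceID and by locationID. All identifier values, coordinates and variable names are invented for illustration and are not taken from the package.

```r
# Hypothetical metadata for three time series (all values are made up)
meta <- data.frame(
  locationID = c("loc_a", "loc_a", "loc_b"),
  deviceID   = c("dev_1", "dev_2", "dev_1"),
  longitude  = c(-122.3321, -122.3321, -120.5059),
  latitude   = c(47.6062, 47.6062, 46.6021),
  stringsAsFactors = FALSE
)

# A unique timeseriesID built as locationID_deviceID
meta$timeseriesID <- paste(meta$locationID, meta$deviceID, sep = "_")

# Extended record for a re-positioned sensor: group by deviceID
byDevice <- split(meta$timeseriesID, meta$deviceID)

# Sensors placed at a single location: group by locationID
byLocation <- split(meta$timeseriesID, meta$locationID)

# Secondary data table: one column per timeseriesID, accessed by name
tsData <- data.frame(
  datetime = as.POSIXct("2024-01-01 00:00", tz = "UTC") + 3600 * 0:2
)
for (id in meta$timeseriesID) tsData[[id]] <- NA_real_
```

Here dev_1 appears at two locations, so byDevice collects both of its time series, while byLocation collects the two devices sharing loc_a.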
Unfortunately, we are rarely supplied with a truly unique and truly spatial locationID. Instead we often use deviceID or an associated non-spatial identifier as a stand-in for locationID.
Complications we have seen include:
GPS-reported longitude and latitude can have jitter in the fourth or fifth decimal place making it challenging to use them to create a unique locationID.
Sensors are sometimes re-positioned in what the scientist considers the "same location".
Data for a single sensor goes through different processing pipelines using different identifiers and is later brought together as two separate time series.
The spatial scale of what constitutes a "single location" depends on the instrumentation and scientific question being asked.
Deriving location-based metadata from spatial datasets is computationally intensive unless saved and identified with a unique locationID.
Automated searches for spatial metadata occasionally produce incorrect results because of the non-infinite resolution of spatial datasets.
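The GPS jitter problem can be illustrated with a short base R sketch. The coordinates below are invented, and the haversine helper is a generic spherical-Earth approximation written for this example rather than a function from the package.

```r
# Great-circle distance in meters (haversine formula, spherical Earth)
haversine <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  toRad <- pi / 180
  dLat <- (lat2 - lat1) * toRad
  dLon <- (lon2 - lon1) * toRad
  a <- sin(dLat / 2)^2 +
    cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Two hypothetical GPS reports from the same stationary sensor,
# differing only in the fifth decimal place
lon <- c(-122.33210, -122.33214)
lat <- c(47.60620, 47.60617)

# Naive IDs pasted together from raw coordinates do not match
naiveID <- sprintf("%.5f_%.5f", lon, lat)
sameID <- naiveID[1] == naiveID[2]

# Yet the two reports are only a few meters apart
d <- haversine(lon[1], lat[1], lon[2], lat[2])
```

In this example sameID is FALSE even though d is well under 10 meters, which is why an identifier built directly from reported coordinates is fragile.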
This package attempts to address all of these issues by maintaining a table of known locations for which CPU-intensive spatial data calculations have already been performed. While requests to add new locations to the table may take some time, searches for spatial metadata associated with existing locations are simple lookups.
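The idea of a known locations table can be sketched in base R as follows. This is an illustration of the concept only: the helper names (distanceMeters, getKnownLocationID, addKnownLocation), the 100 meter threshold, the ID format and the coordinates are all assumptions made for this example and do not reproduce the package's own functions.

```r
# Haversine distance in meters from one point to a vector of points
distanceMeters <- function(lon, lat, lons, lats, radius = 6371000) {
  toRad <- pi / 180
  a <- sin((lats - lat) * toRad / 2)^2 +
    cos(lat * toRad) * cos(lats * toRad) * sin((lons - lon) * toRad / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: locationID of the nearest known location within
# distanceThreshold meters, or NA if none is close enough
getKnownLocationID <- function(locationTbl, lon, lat, distanceThreshold = 100) {
  if (nrow(locationTbl) == 0) return(NA_character_)
  d <- distanceMeters(lon, lat, locationTbl$longitude, locationTbl$latitude)
  i <- which.min(d)
  if (d[i] <= distanceThreshold) locationTbl$locationID[i] else NA_character_
}

# Hypothetical helper: append a new location. Any expensive spatial
# calculations would be performed here, once, and stored in the row.
addKnownLocation <- function(locationTbl, lon, lat) {
  newRow <- data.frame(
    locationID = sprintf("loc_%03d", nrow(locationTbl) + 1),
    longitude = lon,
    latitude = lat,
    stringsAsFactors = FALSE
  )
  rbind(locationTbl, newRow)
}

# Start with an empty table and register one location
locationTbl <- data.frame(
  locationID = character(0),
  longitude = numeric(0),
  latitude = numeric(0),
  stringsAsFactors = FALSE
)
locationTbl <- addKnownLocation(locationTbl, -122.33210, 47.60620)

# A jittered report resolves to the existing location; a distant one does not
id1 <- getKnownLocationID(locationTbl, -122.33214, 47.60617)
id2 <- getKnownLocationID(locationTbl, -120.50590, 46.60210)
```

Matching by distance rather than by exact coordinates means the threshold can be chosen to suit the instrumentation and the scientific question.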
Working in this manner solves the problems mentioned above and also provides further useful functionality:
Administrators can correct entries in the collectionName table (e.g. locations in river bends that even high resolution spatial datasets mis-assign).
Additional, non-automatable metadata can be added to collectionName (e.g. commonly used location names within a community of practice).
Different field campaigns can have separate collectionName tables.
.csv or .rda versions of well populated tables can be downloaded from a URL and used locally, giving scientists working with known locations instant access to spatial data that otherwise requires special skills, large datasets and lots of compute cycles.
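The workflow above might look roughly like the following base R sketch. The table contents, the column names countyName and localName, and the use of a temporary file in place of a downloaded .csv are all assumptions made for illustration.

```r
# Hypothetical known locations table (all values invented)
locationTbl <- data.frame(
  locationID = c("loc_001", "loc_002"),
  longitude  = c(-122.3321, -120.5059),
  latitude   = c(47.6062, 46.6021),
  countyName = c("King", "Kittitas"),
  stringsAsFactors = FALSE
)

# Save a .csv version; a temporary file stands in for a downloaded copy
csvPath <- tempfile(fileext = ".csv")
write.csv(locationTbl, csvPath, row.names = FALSE)

# Load the table locally: no spatial datasets or calculations required
localTbl <- read.csv(csvPath, stringsAsFactors = FALSE)

# An administrator corrects a value mis-assigned by an automated search
localTbl$countyName[localTbl$locationID == "loc_002"] <- "Yakima"

# Additional, non-automatable metadata is added as a new column
localTbl$localName <- c("Downtown", "River bend")
```

An .rda version could be handled the same way with base R's save() and load().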