tidync {tidync} | R Documentation |
Tidy NetCDF
Description
Connect to a NetCDF source and allow use of hyper_* verbs for slicing with hyper_filter(), extracting data with hyper_array() and hyper_tibble() from an activated grid. By default the largest grid encountered is activated, see activate().
Usage
tidync(x, what, ...)
## S3 method for class 'character'
tidync(x, what, ...)
## S3 method for class 'tidync_data'
tidync(x, what, ...)
Arguments
x | path to a NetCDF file
what | (optional) character name of grid (see
... | reserved for arguments to methods, currently ignored
Details
The print method for tidync includes a lot of information about which variables exist on which dimensions, and if any slicing (hyper_filter()) operations have occurred these are summarized as 'start' and 'count' modifications relative to the dimension lengths. See print for these details, and hyper_vars for programmatic access to this information.
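As a rough illustration of the paragraph above, the following sketch slices the bundled SeaWiFS file and then reads the slicing summary programmatically. It assumes the tidync package is installed, that the active grid of that file has dimensions named 'lon' and 'lat', and that hyper_dims() is available alongside hyper_vars(); the longitude and latitude ranges are arbitrary.

```r
# Sketch only: guarded so it does nothing if tidync or the example file is missing.
if (requireNamespace("tidync", quietly = TRUE)) {
  f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
  l3file <- system.file("extdata/oceandata", f, package = "tidync")
  if (nzchar(l3file)) {
    tnc <- tidync::tidync(l3file)
    # 'lon' and 'lat' are assumed dimension names; the ranges are arbitrary
    sliced <- tidync::hyper_filter(
      tnc,
      lon = lon > 100 & lon < 110,
      lat = lat > -40 & lat < -30
    )
    print(sliced)                       # summary, including 'start' and 'count'
    dims <- tidync::hyper_dims(sliced)  # active dimensions as a data frame
    vars <- tidync::hyper_vars(sliced)  # variables on the active grid
    print(dims)
    print(vars)
  }
}
```

No data is read by these calls; the filter is only recorded against the dimensions until an extraction function is used.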
Many NetCDF forms are supported and tidync tries to reduce the interpretation applied to a given source. The NetCDF system defines a 'grid' for storing array data, where 'grid' is the array 'shape' (or 'set of dimensions'). There may be several grids in a single source, so we introduce the concept of grid 'activation'. Once a grid is activated, all downstream tasks apply to the set of variables that exist on that grid.
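Grid activation might look like the sketch below. It assumes the tidync package is installed and that hyper_grids() returns a data frame with a 'grid' column of grid names and a logical 'active' column; those column names are an assumption about the installed version, not something stated on this page.

```r
# Sketch only: switch from the default grid to another grid in the same source.
if (requireNamespace("tidync", quietly = TRUE)) {
  f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
  l3file <- system.file("extdata/oceandata", f, package = "tidync")
  if (nzchar(l3file)) {
    tnc <- tidync::tidync(l3file)
    grids <- tidync::hyper_grids(tnc)
    # Assumed columns: 'grid' (name) and 'active' (logical)
    inactive <- grids$grid[!grids$active]
    if (length(inactive) > 0) {
      other <- tidync::activate(tnc, inactive[1])
      print(tidync::active(other))  # name of the newly active grid
    }
  }
}
```

After activation, verbs such as hyper_filter() and hyper_tibble() apply to the variables on the newly active grid rather than the default one.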
NetCDF sources with numeric types are chosen by default, even if existing 'NC_CHAR' type variables are on the largest grid. When read, any 'NC_CHAR' type variables are exploded into single character elements so that dimensions match the source.
Grids
A grid is an instance of a particular set of dimensions, which can be shared by more than one variable. This is not the 'rank' of a variable (the number of dimensions) since a single data set may have many 3D variables composed of different sets of axes/dimensions. There's no formality around the concept of 'shape', as far as we know.
A dimension may have length zero, but we think this is a special case for a "measure" dimension. (It doesn't mean the product of the dimensions is zero, for example.)
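The difference between rank and grid can be shown with a small base R sketch. The variable and dimension names here are hypothetical and do not come from any real file; the point is only that three variables of the same rank can sit on two different grids.

```r
# Hypothetical variables mapped to the dimensions they are defined on
var_dims <- list(
  temp       = c("lon", "lat", "time"),
  salt       = c("lon", "lat", "time"),
  wind       = c("lon", "lat", "level"),
  lat_bounds = c("lat", "nv")
)

# Rank is just the number of dimensions
rank <- lengths(var_dims)

# A grid is the particular set of dimensions, written here as a single label
grid <- vapply(var_dims, paste, character(1), collapse = ",")

# Variables grouped by the grid they share
by_grid <- split(names(var_dims), grid)
print(rank)
print(by_grid)
```

Here 'temp', 'salt' and 'wind' all have rank 3, but only 'temp' and 'salt' share a grid.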
Limitations
Files with compound types are not yet supported and should fail gracefully. Groups are not yet supported.
We haven't yet explored 'HDF5' in detail, so any feedback is appreciated. Major use of compound types is made by https://github.com/sosoc/croc.
Examples
## a SeaWiFS (S) Level-3 Mapped (L3m) monthly (MO) chlorophyll-a (CHL)
## remote sensing product at 9km resolution (at the equator)
## from the NASA ocean colour group in NetCDF4 format (.nc)
## for 31 day period January 2008 (S20080012008031)
f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package = "tidync")
## skip on Solaris
if (tolower(Sys.info()[["sysname"]]) != "sunos") {
  tnc <- tidync(l3file)
  print(tnc)
}
## very simple Unidata example file, with one dimension
## Not run:
uf <- system.file("extdata/unidata", "test_hgroups.nc", package = "tidync")
recNum <- tidync(uf) %>% hyper_tibble()
print(recNum)
## End(Not run)
## a raw grid of Southern Ocean sea ice concentration from IFREMER
## it is 12.5km resolution passive microwave concentration values
## on a polar stereographic grid, on 2 October 2017, displaying the
## "hole in the ice" made famous here:
## https://tinyurl.com/ycbchcgn
ifr <- system.file("extdata/ifremer", "20171002.nc", package = "tidync")
ifrnc <- tidync(ifr)
ifrnc %>% hyper_tibble(select_var = "concentration")
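
The same source can also be read in array form with hyper_array(). This is a sketch that assumes the tidync package is installed and that the result is a list of arrays named by variable; it reuses the IFREMER file and the "concentration" variable from the example above.

```r
# Sketch only: array extraction, guarded against a missing package or file.
if (requireNamespace("tidync", quietly = TRUE)) {
  ifr <- system.file("extdata/ifremer", "20171002.nc", package = "tidync")
  if (nzchar(ifr)) {
    ifrnc <- tidync::tidync(ifr)
    arrays <- tidync::hyper_array(ifrnc, select_var = "concentration")
    print(names(arrays))                   # variables that were read
    print(dim(arrays[["concentration"]]))  # shape of the extracted array
  }
}
```

The array form keeps the native layout of the data, whereas hyper_tibble() expands each cell into a row with its coordinates.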