netcdf_to_dt {SeaVal}    R Documentation

Function for converting netcdfs to long data tables

Description

The function converts netcdfs into long data.tables. Be aware that the data table can take up much more memory than the netcdf itself, especially if there are many dimension variables, because every row repeats the coordinate values of every dimension.

Usage

netcdf_to_dt(
  nc,
  vars = NULL,
  verbose = 2,
  trymerge = TRUE,
  subset_list = NULL,
  keep_nas = FALSE
)

Arguments

nc

Either a character string with the name of the .nc file (including path), or an object of class ncdf4 (as returned by ncdf4::nc_open).

vars

Which variables should be read from the netcdf? Either a character vector of variable names, or an integer vector of variable indices. If this is NULL, all variables are read.
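The two ways of selecting variables can be sketched as follows, using the example file shipped with the package. Looking up the variable names through ncdf4::nc_open is an addition made here for illustration; the variable names in the example file are not assumed.

```r
library(ncdf4)
library(SeaVal)

# Example netcdf shipped with SeaVal (same file as in the Examples section)
fn <- system.file("extdata", "example.nc", package = "SeaVal")

# Look up the variable names stored in the file
nc <- nc_open(fn)
var_names <- names(nc$var)
nc_close(nc)
print(var_names)

# Select the first variable by name ...
dt_by_name <- netcdf_to_dt(fn, vars = var_names[1], verbose = 0)

# ... or by its integer index
dt_by_index <- netcdf_to_dt(fn, vars = 1, verbose = 0)
```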

verbose

Either 0, 1 or 2. How much information should be printed? The default (2) is to print the entire netcdf information (as output by ncdf4::nc_open), 1 just prints the units for all variables, 0 (or any other input) prints nothing.

trymerge

Logical. If TRUE, a single data table containing all variables is returned; otherwise a list of data tables is returned, one for each variable. The latter is much more memory efficient if you have multiple variables depending on different dimensions.
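A minimal sketch of the two return shapes, again using the example file shipped with the package. How many variables that file contains is not assumed here, so the sketch only inspects the structure of what comes back.

```r
library(SeaVal)

# Example netcdf shipped with SeaVal (same file as in the Examples section)
fn <- system.file("extdata", "example.nc", package = "SeaVal")

# Default: try to merge all variables into one data table
dt_merged <- netcdf_to_dt(fn, verbose = 0)

# trymerge = FALSE: keep the variables separate
dt_split <- netcdf_to_dt(fn, verbose = 0, trymerge = FALSE)

# Inspect the top-level structure of both results
str(dt_merged, max.level = 1)
str(dt_split, max.level = 1)
```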

subset_list

A named list for reading only subsets of the data. Currently only 'rectangle subsetting' is provided, i.e. you can provide two limit values for each dimension and everything between them is read. The names of the elements of subset_list must correspond to the names of dimension variables in the netcdf, and each element should be a two-element range vector. For example, subsetting a global dataset to just East Africa could look like this: subset_list = list(latitude = c(-15,25), longitude = c(20,55)). Non-rectangular subsetting while reading a netcdf seems to be difficult, see ncdf4::ncvar_get. Every dimension variable not named in subset_list is read entirely.
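The East Africa example can be tried end to end with a small synthetic file. The sketch below first writes a coarse global grid with ncdf4; the file, the variable name prec and all values are made up for illustration, and only the dimension names latitude and longitude are taken from the example above.

```r
library(ncdf4)
library(SeaVal)

# Build a small global test file; names and values are made up
lat <- seq(-80, 80, by = 10)
lon <- seq(-170, 170, by = 10)
dim_lon <- ncdim_def("longitude", "degrees_east", lon)
dim_lat <- ncdim_def("latitude", "degrees_north", lat)
var_prec <- ncvar_def("prec", "mm", list(dim_lon, dim_lat), missval = -9999)

tmp <- tempfile(fileext = ".nc")
nc <- nc_create(tmp, var_prec)
ncvar_put(nc, var_prec,
          matrix(runif(length(lon) * length(lat)), nrow = length(lon)))
nc_close(nc)

# Read only the East Africa rectangle; names must match the dimension names
dt_ea <- netcdf_to_dt(tmp,
                      verbose = 0,
                      subset_list = list(latitude = c(-15, 25),
                                         longitude = c(20, 55)))
print(dt_ea)
```

Any dimension left out of subset_list would be read in full, so dropping the longitude entry would return all longitudes for the selected latitude band.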

keep_nas

Should missing values be kept? If FALSE (the default), missing values are not included in the returned data table. If TRUE, the data table is constructed from the full data cube, meaning its number of rows is the product of the lengths of the dimension variables, even if many coordinates have missing data. This makes the returned data table potentially much larger and is almost never an advantage. It is only allowed because it can make complex bookkeeping tasks easier (specifically, upscaling many CHIRPS netcdfs with the same coordinates while saving the upscaling weights in a matrix).
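The effect on the row count can be checked with a tiny synthetic file in which half of the grid cells are missing. The file, the dimension names lon and lat, and the variable name prec are made up for this sketch.

```r
library(ncdf4)
library(SeaVal)

# Tiny made-up grid: 3 longitudes x 4 latitudes, every second cell missing
lon <- c(30, 35, 40)
lat <- c(-5, 0, 5, 10)
vals <- matrix(as.numeric(1:12), nrow = length(lon))
vals[seq(1, 12, by = 2)] <- NA

dim_lon <- ncdim_def("lon", "degrees_east", lon)
dim_lat <- ncdim_def("lat", "degrees_north", lat)
var_prec <- ncvar_def("prec", "mm", list(dim_lon, dim_lat), missval = -9999)

tmp <- tempfile(fileext = ".nc")
nc <- nc_create(tmp, var_prec)
ncvar_put(nc, var_prec, vals)  # NAs are written as the missing value
nc_close(nc)

dt_drop <- netcdf_to_dt(tmp, verbose = 0)                   # keep_nas = FALSE
dt_full <- netcdf_to_dt(tmp, verbose = 0, keep_nas = TRUE)  # full data cube

# Compare row counts with the size of the full cube
nrow(dt_drop)
nrow(dt_full)
length(lon) * length(lat)
```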

Value

A data table if trymerge == TRUE or else a list of data tables.

Examples

# filename of example-netcdf file:
fn = system.file("extdata", "example.nc", package="SeaVal")

dt = netcdf_to_dt(fn)
print(dt)



[Package SeaVal version 1.1.1]