H5D-class {hdf5r} | R Documentation |
Class for representing HDF5 datasets
Description
In HDF5, datasets can be located in a group (see H5Group) or at the root of a file (see H5File).
They can be created either from a pre-existing R object (arrays as well as data.frames are supported, but not lists or other complex objects), or by specifying an explicit datatype (for available datatypes see h5types$overview) as well as the dimensions.
In addition, other features are supported such as transparent compression (for which a chunk-size can be selected).
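The creation routes described above can be sketched as follows. This is a minimal illustration, not the package's canonical usage: the dataset names are made up, and the gzip_level argument of create_dataset is assumed to behave as documented for H5File and H5Group.

```r
library(hdf5r)

fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")

# 1) From an existing array: datatype and dimensions are taken from the object
d_arr <- file$create_dataset("from_array", robj = matrix(as.numeric(1:20), ncol = 4))

# 2) From a data.frame with numeric columns
d_df <- file$create_dataset("from_df",
                            robj = data.frame(id = 1:3, score = c(0.5, 1.5, 2.5)))

# 3) From an explicit datatype and space, chunked and compressed
#    (gzip_level is an assumption about the create_dataset signature)
d_new <- file$create_dataset("explicit", dtype = h5types$H5T_NATIVE_DOUBLE,
                             space = H5S$new("simple", dims = c(100, 10), maxdims = c(100, 10)),
                             chunk_dims = c(10, 10), gzip_level = 6)

# Capture the dimensions before the file is closed
arr_dims <- d_arr$dims
df_dims <- d_df$dims
new_dims <- d_new$dims

file$close_all()
file.remove(fname)
```

Compression only applies to chunked datasets, which is why the third call gives chunk_dims explicitly.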
Details
In order to create a dataset, the create_dataset methods of either H5Group or H5File should be used. Please see the documentation there for how to create them.
The most important parts of a dataset are the
- Space
The space of the dataset. It describes the dimensions of the dataset as well as the maximum dimensions. Can be obtained using the get_space method, which returns an H5S object.
- Datatype
The datatype that is used in the dataset. Can be obtained using the get_type method. See H5T for more information about using datatypes.
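As a small sketch of how the space and the datatype of a dataset can be inspected: the dataset below mirrors the one in the Examples section. The dims and maxdims fields of H5S and the get_size method of H5T are taken from those classes' own documentation rather than from this page.

```r
library(hdf5r)

fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
dset <- file$create_dataset("basic", dtype = h5types$H5T_NATIVE_INT,
                            space = H5S$new("simple", dims = c(5, 2), maxdims = c(10, 2)),
                            chunk_dims = c(5, 2))

# The space carries the current and the maximum dimensions
space <- dset$get_space()
space_is_h5s <- inherits(space, "H5S")
space_dims <- space$dims
space_maxdims <- space$maxdims

# The datatype describes the stored elements; get_size gives the size in bytes
dtype <- dset$get_type()
type_size <- dtype$get_size()

file$close_all()
file.remove(fname)
```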
In order to read and write datasets, the read and write methods are available. In addition, the standard way of using [ to access arrays is supported as well (see H5S_H5D_subset_assign for more help).
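The two access styles are intended to be interchangeable. The following sketch, with an illustrative dataset name, reads and writes the same region once through the read and write methods and once through the bracket operators.

```r
library(hdf5r)

fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
dset <- file$create_dataset("demo", robj = matrix(1:10, ncol = 2))

# Reading rows 1 to 3 of column 2, programmatically and interactively
via_read <- dset$read(args = list(1:3, 2))
via_bracket <- dset[1:3, 2]

# Writing into column 1, again in both styles
dset$write(args = list(1:2, 1), value = c(100L, 200L))
dset[3, 1] <- 300L
first_column <- dset[, 1]

file$close_all()
file.remove(fname)
```

The list form of args is convenient when the indices are computed by other code, while the bracket form reads like ordinary array access.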
Other information and actions of possible interest are
- Storage size
The storage size of the dataset can be extracted using the get_storage_size method.
- Size change
The size of the dataset can be changed using the set_extent method.
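A brief sketch of resizing, assuming a chunked dataset whose maximum dimensions leave room to grow (the dataset name is illustrative). In HDF5, a dataset can only be extended up to its maximum dimensions, so these are set at creation time.

```r
library(hdf5r)

fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")

# 5 x 2 dataset that may grow to 10 x 2
dset <- file$create_dataset("growable", dtype = h5types$H5T_NATIVE_INT,
                            space = H5S$new("simple", dims = c(5, 2), maxdims = c(10, 2)),
                            chunk_dims = c(5, 2))
dset[1:5, 1:2] <- matrix(1:10, ncol = 2)

dims_before <- dset$dims
storage_size <- dset$get_storage_size()   # bytes allocated in the file

# Grow the dataset to its maximum extent and fill the new rows
dset$set_extent(c(10, 2))
dims_after <- dset$dims
dset[6:10, 1:2] <- matrix(11:20, ncol = 2)
last_row <- dset[10, ]

file$close_all()
file.remove(fname)
```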
Please also note the active methods
- dims
Dimension of the dataset
- maxdims
Maximum dimensions of the dataset
- chunk_dims
Dimension of the chunks
- key_info
Returns the space, type, property-list and dimensions
Value
Object of class H5D.
Methods
new(id = NULL)
-
Initializes a new dataset-object. Only for internal use. Use the create_dataset function for H5Group and H5File objects.
Parameters
- id
For internal use only
get_space()
-
This function implements the HDF5-API function H5Dget_space. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_space_status()
-
This function implements the HDF5-API function H5Dget_space_status. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_type(native = TRUE)
-
This function implements the HDF5-API function H5Dget_type. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_create_plist()
-
This function implements the HDF5-API function H5Dget_create_plist. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_access_plist()
-
This function implements the HDF5-API function H5Dget_access_plist. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_offset()
-
This function implements the HDF5-API function H5Dget_offset. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_storage_size()
-
This function implements the HDF5-API function H5Dget_storage_size. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
vlen_get_buf_size(type, space)
-
This function implements the HDF5-API function H5Dvlen_get_buf_size. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
vlen_reclaim(buffer, type, space, dataset_xfer_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Dvlen_reclaim. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
read_low_level(file_space = h5const$H5S_ALL, mem_space = NULL, mem_type = NULL, dataset_xfer_pl = h5const$H5P_DEFAULT, flags = getOption("hdf5r.h5tor_default"), set_dim = FALSE, dim_to_set = NULL, drop = TRUE)
-
This function is for advanced users. It is recommended to use read or the [ interface instead. This function implements the HDF5-API function H5Dread, with minor changes to the API to accommodate R. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details. It reads the data in the dataset as specified by mem_space and returns it as an R object.
Parameters
- file_space
An HDF5-space, represented as class H5S, that determines which part of the dataset is being read. Can also be given as an id.
- mem_space
The space as it is represented in memory; advanced feature; may be removed in the future. Can also be given as an id.
- mem_type
Memory type; extracted from the dataset if NULL (can be passed in for efficiency reasons). Can also be given as an id.
- dataset_xfer_pl
Dataset transfer property list. See H5P_DATASET_XFER.
- flags
Conversion rules for integer values. See also h5const.
- set_dim
If TRUE, the dimension attribute is set in the return value. How it is set is determined by dim_to_set.
- dim_to_set
The dimension to set; has to be numeric and needs to be specified if set_dim is TRUE. If the result is a data.frame, i.e. the datatype is a compound, then the dimension is ignored as the correct dimension is already set.
- drop
Logical. Should dimensions of length 1 be dropped (R-default for arrays)?
read(args = NULL, dataset_xfer_pl = h5const$H5P_DEFAULT, flags = getOption("hdf5r.h5tor_default"), drop = TRUE, envir = parent.frame())
-
Main interface for reading data from the dataset. It is the function that is used by [, where all indices are passed in the parameter args.
Parameters
- args
The indices for each dimension to subset, given as a list. This makes it easier to use as a programmatic API. For interactive use we recommend the [ operator. If set to NULL, the entire dataset will be read.
- envir
The environment in which to evaluate args
- dataset_xfer_pl
An object of class H5P_DATASET_XFER.
- flags
Some flags governing edge cases of conversion from HDF5 to R. This is related to how integers are treated: R cannot natively represent 64bit integers and cannot represent unsigned 64bit integers at all (even using add-on packages). The constants governing this are part of h5const. The relevant ones start with the term H5TOR and are documented there. The default set here returns a regular 32bit integer if it doesn't lead to an overflow and returns a 64bit integer from the bit64 package otherwise. For 64bit unsigned integers that are larger than the largest 64bit signed integer, it returns a double. This loses precision, however.
- drop
Logical. When reading data, should dimensions of size 1 be dropped?
Return
The data that was read as an R object
write_low_level(robj, file_space = h5const$H5S_ALL, mem_space = NULL, mem_type = NULL, dataset_xfer_pl = h5const$H5P_DEFAULT, flush = getOption("hdf5r.flush_on_write"))
-
This function is for advanced users. It is recommended to use write or the [<- interface as used for arrays instead. This function implements the HDF5-API function H5Dwrite, with some changes to accommodate R. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details. It writes the data from robj into the dataset.
Parameters
- robj
The object to write into the dataset
- mem_space
The space as it is represented in memory; advanced feature; may be removed in the future
- mem_type
Memory type; extracted from the dataset if NULL (can be passed in for efficiency reasons)
- file_space
An HDF5-space, represented as class H5S, that determines which part of the dataset is being written.
- dataset_xfer_pl
Dataset transfer property list. See H5P_DATASET_XFER.
- flush
Should a flush be done after the write?
write(args, value, dataset_xfer_pl = h5const$H5P_DEFAULT, envir = parent.frame())
-
Main interface for writing data to the dataset. It is the function that is used by [<-, where all indices are passed in the parameter args.
Parameters
- args
The indices for each dimension to subset, given as a list. This makes it easier to use as a programmatic API. For interactive use we recommend the [<- operator. If set to NULL, the entire dataset will be written.
- value
The data to write to the dataset
- envir
The environment in which to evaluate args
- dataset_xfer_pl
An object of class H5P_DATASET_XFER.
Return
The HDF5 dataset object, returned invisibly
set_extent(dims)
-
This function implements the HDF5-API function H5Dset_extent. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_fill_value()
-
This function implements the HDF5-API function H5Pget_fill_value, automatically supplying the datatype of the dataset for convenience. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_p.html for details.
create_reference(...)
-
This function implements the HDF5-API function H5Rcreate. The parameters are interpreted as in '['. The function always creates H5R_DATASET_REGION references. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_r.html for details.
print(..., max.attributes = 10)
-
Prints information for the dataset
Parameters
- ...
ignored
- max.attributes
Maximum number of attribute names to print
obj_info(remove_internal_use_only = TRUE)
-
This function implements the HDF5-API function H5Oget_info. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_o.html for details.
get_obj_name()
-
This function implements the HDF5-API function H5Iget_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_i.html for details.
create_attr(attr_name, robj = NULL, dtype = NULL, space = NULL)
-
This function implements the HDF5-API function H5Acreate2. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_open(attr_name)
-
This function implements the HDF5-API function H5Aopen. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
create_attr_by_name(attr_name, obj_name, robj = NULL, dtype = NULL, space = NULL, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Acreate_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_open_by_name(attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Aopen_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_open_by_idx(n, obj_name, idx_type = h5const$H5_INDEX_NAME, order = h5const$H5_ITER_NATIVE, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Aopen_by_idx. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_exists_by_name(attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Aexists_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_exists(attr_name)
-
This function implements the HDF5-API function H5Aexists. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_rename_by_name(old_attr_name, new_attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Arename_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_rename(old_attr_name, new_attr_name)
-
This function implements the HDF5-API function H5Arename. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_delete(attr_name)
-
This function implements the HDF5-API function H5Adelete. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_delete_by_name(attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Adelete_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_delete_by_idx(n, obj_name, idx_type = h5const$H5_INDEX_NAME, order = h5const$H5_ITER_NATIVE, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Adelete_by_idx. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_info_by_name(attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Aget_info_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_info_by_idx(n, obj_name, idx_type = h5const$H5_INDEX_NAME, order = h5const$H5_ITER_NATIVE, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Aget_info_by_idx. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_name_by_idx(n, obj_name, idx_type = h5const$H5_INDEX_NAME, order = h5const$H5_ITER_NATIVE, link_access_pl = h5const$H5P_DEFAULT)
-
This function implements the HDF5-API function H5Aget_name_by_idx. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_get_number()
-
This function implements the HDF5-API function H5Aget_num_attrs. Please see the documentation at https://support.hdfgroup.org/HDF5/doc/RM/RM_H5A.html#Annot-NumAttrs for details.
flush(scope = h5const$H5F_SCOPE_GLOBAL)
-
This function implements the HDF5-API function H5Fflush. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_f.html for details.
get_filename()
-
This function implements the HDF5-API function H5Fget_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_f.html for details.
dims()
-
Get the dimension of the dataset
maxdims()
-
Get the maximal dimension of the dataset
chunk_dims()
-
Return the dimension of the chunks. NA if the dataset is not chunked
key_info()
-
Returns the key types as a list, consisting of type, space, dataset_create_pl, type_size_raw, type_size_variable, dims and chunk_dims. type_size_raw and type_size_variable differ only for variable length types, which return Inf for type_size_variable and the underlying size for type_size_raw.
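The attribute methods listed above (create_attr, attr_exists, attr_open, attr_rename, attr_get_number, attr_delete) can be combined roughly as in the following sketch. The attribute name and value are illustrative. The sketch assumes that create_attr writes robj into the new attribute and that the attribute object returned by attr_open offers read and close methods, as described in the H5A documentation.

```r
library(hdf5r)

fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
dset <- file$create_dataset("with_attr", robj = 1:10)

# Attach a scalar string attribute to the dataset
dset$create_attr(attr_name = "units", robj = "cm", space = H5S$new("scalar"))
has_units <- dset$attr_exists("units")

# Open the attribute, read its value, then release the handle
attr_obj <- dset$attr_open("units")
units_value <- attr_obj$read()
attr_obj$close()

# Rename, count and finally delete the attribute
dset$attr_rename("units", "unit")
has_new_name <- dset$attr_exists("unit")
n_attr <- dset$attr_get_number()
dset$attr_delete("unit")
n_attr_after <- dset$attr_get_number()

file$close_all()
file.remove(fname)
```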
Author(s)
Holger Hoefling
Examples
# Load the package, then create a file to hold the datasets
library(hdf5r)
fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
# Show the 3 different ways how to create a dataset
file[["directly"]] <- matrix(1:10, ncol=2)
file$create_dataset("from_robj", matrix(1:10, ncol=2))
dset <- file$create_dataset("basic", dtype=h5types$H5T_NATIVE_INT,
space=H5S$new("simple", dims=c(5, 2), maxdims=c(10,2)), chunk_dims=c(5,2))
# Different ways of reading the dataset
dset$read(args=list(1:5, 1))
dset$read(args=list(1:5, quote(expr=)))
dset$read(args=list(1:5, NULL))
dset[1:5, 1]
dset[1:5, ]
dset[1:5, NULL]
# Writing to the dataset
dset$write(args=list(1:3, 1:2), value=11:16)
dset[4:5, 1:2] <- -(1:4)
dset[,]
# Extract key information
dset$dims
dset$maxdims
dset$chunk_dims
dset$key_info
dset
file$close_all()
file.remove(fname)