| H5D-class {hdf5r} | R Documentation |
Class for representing HDF5 datasets
Description
In HDF5, datasets can be located in a group (see H5Group) or at the
root of a file (see H5File). They can be created either from a pre-existing R object
(arrays as well as data.frames are supported, but not lists or other complex objects), or by specifying
an explicit datatype (for available datatypes see h5types$overview) as well as the dimensions.
In addition, other features are supported, such as transparent compression (for which a chunk size can be selected).
Details
In order to create a dataset, the create_dataset methods of either H5Group or
H5File should be used. Please see the documentation there for how to create them.
The most important parts of a dataset are the
- Space
The space of the dataset. It describes the dimensions of the dataset as well as the maximum dimensions. Can be obtained using the
get_space method, which returns an H5S object.
- Datatype
The datatype that is used in the dataset. Can be obtained using the
get_type method. See H5T to get more information about using datatypes.
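As a minimal sketch of how the space and the datatype can be inspected, the following assumes the hdf5r package is attached and creates a small throwaway dataset in the same way as the Examples section. The file and dataset names are illustrative only, and the dims, maxdims and get_size members used here belong to the H5S and H5T classes, which are documented separately.

```r
library(hdf5r)

# Throwaway file and dataset, created as in the Examples section
fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
dset <- file$create_dataset("basic", dtype = h5types$H5T_NATIVE_INT,
                            space = H5S$new("simple", dims = c(5, 2), maxdims = c(10, 2)),
                            chunk_dims = c(5, 2))

# The space carries the current and the maximum dimensions
space <- dset$get_space()
space_dims <- space$dims
space_maxdims <- space$maxdims

# The datatype is returned as an H5T object
dtype <- dset$get_type()
dtype_size <- dtype$get_size()  # size of one element in bytes

file$close_all()
file.remove(fname)
```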
In order to read and write datasets, the read and write methods are available. In addition, the standard way of using
[ to access arrays is supported as well (see H5S_H5D_subset_assign for more help).
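The correspondence between the array-style interface and the read and write methods can be sketched as follows. This is only an illustration under the assumption that hdf5r is attached; the dataset layout mirrors the one in the Examples section and the values written are arbitrary.

```r
library(hdf5r)

fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
dset <- file$create_dataset("basic", dtype = h5types$H5T_NATIVE_INT,
                            space = H5S$new("simple", dims = c(5, 2), maxdims = c(10, 2)),
                            chunk_dims = c(5, 2))

# Fill the dataset through the array-style interface
dset[1:5, 1:2] <- matrix(1:10, ncol = 2)

# The same selection, once via `[` and once via the read method
via_bracket <- dset[2:3, 1:2]
via_read <- dset$read(args = list(2:3, 1:2))

# Overwrite part of the first column via the write method
dset$write(args = list(1:2, 1), value = c(100L, 200L))
first_col <- dset[, 1]

file$close_all()
file.remove(fname)
```

Passing the indices as a list in args is mainly useful when the selection is built programmatically; for interactive work the bracket form is shorter.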
Other information and actions of possible interest are
- Storage size
The storage size of the dataset can be extracted using
get_storage_size
- Size change
The size of the dataset can be changed using the
set_extent method
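A short sketch of both operations, assuming hdf5r is attached and using the same illustrative chunked dataset as in the Examples section. Note that the new extent has to stay within the maximum dimensions given when the dataset was created (here c(10, 2)).

```r
library(hdf5r)

fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
dset <- file$create_dataset("basic", dtype = h5types$H5T_NATIVE_INT,
                            space = H5S$new("simple", dims = c(5, 2), maxdims = c(10, 2)),
                            chunk_dims = c(5, 2))

# Storage reported before and after data has been written
size_before <- dset$get_storage_size()
dset[1:5, 1:2] <- matrix(1:10, ncol = 2)
size_after <- dset$get_storage_size()

# Grow the first dimension from 5 to 8 rows
dset$set_extent(c(8, 2))
dims_after <- dset$dims

file$close_all()
file.remove(fname)
```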
Please also note the active methods
- dims
Dimension of the dataset
- maxdims
Maximum dimensions of the dataset
- chunk_dims
Dimension of the chunks
- key_info
Returns the space, type, property-list and dimensions
Value
Object of class H5D.
Methods
new(id = NULL)-
Initializes a new dataset-object. Only for internal use. Use the
create_dataset function for H5Group and H5File objects
Parameters
- id
For internal use only
get_space()-
This function implements the HDF5-API function H5Dget_space. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_space_status()-
This function implements the HDF5-API function H5Dget_space_status. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_type(native = TRUE)-
This function implements the HDF5-API function H5Dget_type. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_create_plist()-
This function implements the HDF5-API function H5Dget_create_plist. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_access_plist()-
This function implements the HDF5-API function H5Dget_access_plist. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_offset()-
This function implements the HDF5-API function H5Dget_offset. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_storage_size()-
This function implements the HDF5-API function H5Dget_storage_size. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
vlen_get_buf_size(type, space)-
This function implements the HDF5-API function H5Dvlen_get_buf_size. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
vlen_reclaim(buffer, type, space, dataset_xfer_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Dvlen_reclaim. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
read_low_level(file_space = h5const$H5S_ALL, mem_space = NULL, mem_type = NULL, dataset_xfer_pl = h5const$H5P_DEFAULT, flags = getOption("hdf5r.h5tor_default"), set_dim = FALSE, dim_to_set = NULL, drop = TRUE)-
This function is for advanced users. It is recommended to use
read instead or the [ interface. This function implements the HDF5-API function H5Dread, with minor changes to the API to accommodate R. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details. It reads the data in the dataset as specified by mem_space and returns it as an R object.
Parameters
- file_space
An HDF5-space, represented as class
H5S that determines which part of the dataset is being read. Can also be given as an id
- mem_space
The space as it is represented in memory; advanced feature; may be removed in the future. Can also be given as an id.
- mem_type
Memory type; extracted from the dataset if NULL (can be passed in for efficiency reasons). Can also be given as an id.
- dataset_xfer_pl
Dataset transfer property list. See
H5P_DATASET_XFER
- flags
Conversion rules for integer values. See also
h5const
- set_dim
If
TRUE, the dimension attribute is set in the return value. How it is set is determined by dim_to_set.
- dim_to_set
The dimension to set; has to be numeric and needs to be specified if
set_dim is TRUE. If the result is a data.frame, i.e. the datatype is a compound, then the dimension is ignored as the correct dimension is already set.
- drop
Logical. Should dimensions of length 1 be dropped (R-default for arrays)
read(args = NULL, dataset_xfer_pl = h5const$H5P_DEFAULT, flags = getOption("hdf5r.h5tor_default"), drop = TRUE, envir = parent.frame())-
Main interface for reading data from the dataset. It is the function that is used by
[, where all indices are being passed in the parameter args.
Parameters
- args
The indices for each dimension to subset given as a list. This makes this easier to use as a programmatic API. For interactive use we recommend the use of the
[ operator. If set to NULL, the entire dataset will be read.
- envir
The environment in which to evaluate
args
- dataset_xfer_pl
An object of class
H5P_DATASET_XFER.
- flags
Some flags governing edge cases of conversion from HDF5 to R. This is related to how integers are treated: R cannot natively represent 64bit integers and cannot represent unsigned 64bit integers at all (even using add-on packages). The constants governing this are part of
h5const. The relevant ones start with the term H5TOR and are documented there. The default set here returns a regular 32bit integer if it doesn't lead to an overflow and returns a 64bit integer from the bit64 package otherwise. For 64bit unsigned integers that are larger than the largest 64bit signed integer, it returns a double. This loses precision, however.
- drop
Logical. When reading data, should dimensions of size 1 be dropped.
Return
The data that was read as an R object
write_low_level(robj, file_space = h5const$H5S_ALL, mem_space = NULL, mem_type = NULL, dataset_xfer_pl = h5const$H5P_DEFAULT, flush = getOption("hdf5r.flush_on_write"))-
This function is for advanced users. It is recommended to use
write instead or the [<- interface as used for arrays. This function implements the HDF5-API function H5Dwrite, with some changes to accommodate R. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details. It writes the data from robj into the dataset.
Parameters
- robj
The object to write into the dataset
- mem_space
The space as it is represented in memory; advanced feature; may be removed in the future
- mem_type
Memory type; extracted from the dataset if NULL (can be passed in for efficiency reasons)
- file_space
An HDF5-space, represented as class
H5S that determines which part of the dataset is being written.
- dataset_xfer_pl
Dataset transfer property list. See
H5P_DATASET_XFER
- flush
Should a flush be done after the write
write(args, value, dataset_xfer_pl = h5const$H5P_DEFAULT, envir = parent.frame())-
Main interface for writing data to the dataset. It is the function that is used by
[<-, where all indices are being passed in the parameter args.
Parameters
- args
The indices for each dimension to subset given as a list. This makes this easier to use as a programmatic API. For interactive use we recommend the use of the
[<- operator. If set to NULL, the entire dataset will be written.
- value
The data to write to the dataset
- envir
The environment in which to evaluate
args
- dataset_xfer_pl
An object of class
H5P_DATASET_XFER.
Return
The HDF5 dataset object, returned invisibly
set_extent(dims)-
This function implements the HDF5-API function H5Dset_extent. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_d.html for details.
get_fill_value()-
This function implements the HDF5-API function H5Pget_fill_value, automatically supplying the datatype of the dataset for convenience. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_p.html for details.
create_reference(...)-
This function implements the HDF5-API function H5Rcreate. The parameters are interpreted as in '['. The function always creates
H5R_DATASET_REGION references. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_r.html for details.
print(..., max.attributes = 10)-
Prints information for the dataset
Parameters
- ...
ignored
- max.attributes
Maximum number of attribute names to print
obj_info(remove_internal_use_only = TRUE)-
This function implements the HDF5-API function H5Oget_info. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_o.html for details.
get_obj_name()-
This function implements the HDF5-API function H5Iget_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_i.html for details.
create_attr(attr_name, robj = NULL, dtype = NULL, space = NULL)-
This function implements the HDF5-API function H5Acreate2. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_open(attr_name)-
This function implements the HDF5-API function H5Aopen. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
create_attr_by_name(attr_name, obj_name, robj = NULL, dtype = NULL, space = NULL, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Acreate_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_open_by_name(attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Aopen_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_open_by_idx(n, obj_name, idx_type = h5const$H5_INDEX_NAME, order = h5const$H5_ITER_NATIVE, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Aopen_by_idx. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_exists_by_name(attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Aexists_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_exists(attr_name)-
This function implements the HDF5-API function H5Aexists. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_rename_by_name(old_attr_name, new_attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Arename_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_rename(old_attr_name, new_attr_name)-
This function implements the HDF5-API function H5Arename. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_delete(attr_name)-
This function implements the HDF5-API function H5Adelete. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_delete_by_name(attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Adelete_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_delete_by_idx(n, obj_name, idx_type = h5const$H5_INDEX_NAME, order = h5const$H5_ITER_NATIVE, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Adelete_by_idx. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_info_by_name(attr_name, obj_name, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Aget_info_by_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_info_by_idx(n, obj_name, idx_type = h5const$H5_INDEX_NAME, order = h5const$H5_ITER_NATIVE, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Aget_info_by_idx. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_name_by_idx(n, obj_name, idx_type = h5const$H5_INDEX_NAME, order = h5const$H5_ITER_NATIVE, link_access_pl = h5const$H5P_DEFAULT)-
This function implements the HDF5-API function H5Aget_name_by_idx. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_a.html for details.
attr_get_number()-
This function implements the HDF5-API function H5Aget_num_attrs. Please see the documentation at https://support.hdfgroup.org/HDF5/doc/RM/RM_H5A.html#Annot-NumAttrs for details.
flush(scope = h5const$H5F_SCOPE_GLOBAL)-
This function implements the HDF5-API function H5Fflush. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_f.html for details.
get_filename()-
This function implements the HDF5-API function H5Fget_name. Please see the documentation at https://docs.hdfgroup.org/hdf5/v1_10/group___h5_f.html for details.
dims()-
Get the dimension of the dataset
maxdims()-
Get the maximal dimension of the dataset
chunk_dims()-
Return the dimension of the chunks. NA if the dataset is not chunked
key_info()-
Returns the key types as a list, consisting of type, space, dataset_create_pl, type_size_raw, type_size_variable, dims and chunk_dims. type_size_raw versus type_size_variable differs for variable length types, which return
Inf for type_size_variable and the underlying size for type_size_raw
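The attribute methods listed above can be combined as in the following sketch. It assumes hdf5r is attached and that create_attr writes the value given as robj into the new attribute. The attribute name "scale" and its value are purely illustrative, and the read and close calls belong to the attribute object returned by attr_open, which is documented separately.

```r
library(hdf5r)

fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
dset <- file$create_dataset("basic", dtype = h5types$H5T_NATIVE_INT,
                            space = H5S$new("simple", dims = c(5, 2), maxdims = c(10, 2)),
                            chunk_dims = c(5, 2))

# Attach a numeric attribute to the dataset (name and value are illustrative)
dset$create_attr("scale", robj = 2.5)

# Check for existence, then open the attribute and read its value
has_scale <- dset$attr_exists("scale")
has_missing <- dset$attr_exists("missing")
attr_obj <- dset$attr_open("scale")
scale_value <- attr_obj$read()
attr_obj$close()

# Number of attributes currently attached to the dataset
n_attrs <- dset$attr_get_number()

file$close_all()
file.remove(fname)
```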
Author(s)
Holger Hoefling
Examples
# First create a file to create datasets in it
fname <- tempfile(fileext = ".h5")
file <- H5File$new(fname, mode = "a")
# Show the 3 different ways how to create a dataset
file[["directly"]] <- matrix(1:10, ncol=2)
file$create_dataset("from_robj", matrix(1:10, ncol=2))
dset <- file$create_dataset("basic", dtype=h5types$H5T_NATIVE_INT,
    space=H5S$new("simple", dims=c(5, 2), maxdims=c(10,2)), chunk_dims=c(5,2))
# Different ways of reading the dataset
dset$read(args=list(1:5, 1))
dset$read(args=list(1:5, quote(expr=)))
dset$read(args=list(1:5, NULL))
dset[1:5, 1]
dset[1:5, ]
dset[1:5, NULL]
# Writing to the dataset
dset$write(args=list(1:3, 1:2), value=11:16)
dset[4:5, 1:2] <- -(1:4)
dset[,]
# Extract key information
dset$dims
dset$maxdims
dset$chunk_dims
dset$key_info
dset
file$close_all()
file.remove(fname)