ConfigFileOpen {s2dv} | R Documentation |
Functions To Create, Open And Save Configuration Files
Description
These functions help in creating, opening and saving configuration files.
Usage
ConfigFileOpen(file_path, silent = FALSE, stop = FALSE)
ConfigFileCreate(file_path, confirm = TRUE)
ConfigFileSave(configuration, file_path, confirm = TRUE)
Arguments
file_path: Path to the configuration file to create/open/save.
silent: Flag to activate or deactivate verbose mode. Defaults to FALSE (verbose mode on).
stop: TRUE/FALSE whether to raise an error if not all the mandatory default variables are defined in the configuration file.
confirm: Flag to stipulate whether to ask for confirmation when saving a configuration file that already exists.
configuration: Configuration object to save in a file.
Details
ConfigFileOpen() loads all the data contained in the configuration file
specified as parameter 'file_path'.
Returns a configuration object with the variables needed for the
configuration file mechanism to work.
This function is called from inside the Load() function to load the
configuration file specified in 'configfile'.
ConfigFileCreate() creates an empty configuration file and saves it to
the specified path. It may be opened later with ConfigFileOpen() to be edited.
Some default values are set when creating a file with this function. You
can check these with ConfigShowDefinitions().
ConfigFileSave() saves a configuration object into a file, which may then
be used from Load().
Two examples of configuration files can be found inside the 'inst/config/'
folder in the package:
BSC.conf: configuration file used at BSC-CNS. Contains location data on several datasets and variables.
template.conf: very simple configuration file intended to be used as a template when starting from scratch.
How the configuration file works:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It contains one list and two tables.
Each of these has a header that starts with '!!'. These are key lines and
should not be removed or reordered.
Lines starting with '#' and blank lines will be ignored.
The list should contain variable definitions and default value definitions.
The first table contains information about experiments.
The second table contains information about observations.
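As a rough illustration of the layout described above, the following base R sketch splits the raw lines of a configuration file into its sections. It is not how the package parses files internally; the helper name `split_config_sections` and the header texts `!!definitions` and `!!table of observations` are assumptions made for the example (only `!!table of experiments` appears in this page).

```r
# Hypothetical helper, not part of s2dv: group configuration lines by the
# '!!' header that precedes them, dropping comments and blank lines.
split_config_sections <- function(lines) {
  lines <- trimws(lines)
  # Lines starting with '#' and blank lines are ignored.
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  is_header <- startsWith(lines, "!!")
  # Each header opens a new section; cumsum() labels the lines that follow it.
  section_id <- cumsum(is_header)
  groups <- factor(section_id[!is_header], levels = seq_len(sum(is_header)))
  sections <- split(lines[!is_header], groups)
  names(sections) <- sub("^!!\\s*", "", lines[is_header])
  sections
}

# Assumed header names, used only for this example.
example_lines <- c(
  "# a comment",
  "!!definitions",
  "FILE_NAME = tos.nc",
  "",
  "!!table of experiments",
  "ecmwf, tos, /path/to/dataset/, $FILE_NAME$",
  "!!table of observations"
)
sections <- split_config_sections(example_lines)
print(names(sections))
```

Sections keep the order of their headers, which is why the headers should not be reordered in a real file.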
Each table entry is a list of comma-separated elements.
The first two are part of a key that is associated to a value formed by the
other elements.
The key elements are a dataset identifier and a variable name.
The value elements are the dataset main path, the dataset file path, the
variable name inside the .nc file, a default suffix (explained below) and
minimum and maximum values beyond which loaded data is deactivated.
Given a dataset name and a variable name, a full path is obtained by
concatenating the main path and the file path.
The nc variable name, the suffix and the limit values are obtained as well.
Any of the elements in the keys can contain regular expressions[1] that will
cause matching for sets of dataset names or variable names.
The dataset path and file path can contain shell globbing expressions[2]
that will cause matching for sets of paths when fetching the file in the
full path.
The full path can point to an OPeNDAP URL.
Any of the elements in the value can contain variables that will be replaced
by an associated string.
Variables can be defined only in the list at the top of the file.
The pattern of a variable definition is
VARIABLE_NAME = VARIABLE_VALUE
and can be accessed from within the table values or from within the variable
values as
$VARIABLE_NAME$
For example:
FILE_NAME = tos.nc
!!table of experiments
ecmwf, tos, /path/to/dataset/, $FILE_NAME$
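The replacement of `$VARIABLE_NAME$` tokens can be pictured with a small base R sketch. The helper `substitute_variables` is hypothetical and only does a single literal pass over the text; the package's own substitution may differ, for instance in how it resolves variables that refer to other variables.

```r
# Hypothetical helper, not part of s2dv: replace every $NAME$ token in `text`
# with the matching entry of the named character vector `definitions`.
substitute_variables <- function(text, definitions) {
  for (name in names(definitions)) {
    token <- paste0("$", name, "$")
    # fixed = TRUE treats the token literally, so '$' is not a regex anchor.
    text <- gsub(token, definitions[[name]], text, fixed = TRUE)
  }
  text
}

definitions <- c(FILE_NAME = "tos.nc")
entry <- "ecmwf, tos, /path/to/dataset/, $FILE_NAME$"
resolved <- substitute_variables(entry, definitions)
print(resolved)
```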
There are some reserved variables that will offer information about the
store frequency, the current startdate Load() is fetching, etc:
$VAR_NAME$, $START_DATE$, $STORE_FREQ$, $MEMBER_NUMBER$
for experiments only: $EXP_NAME$
for observations only: $OBS_NAME$, $YEAR$, $MONTH$, $DAY$
Additionally, from an element in an entry value you can access the other
elements of the entry as:
$EXP_MAIN_PATH$, $EXP_FILE_PATH$,
$VAR_NAME$, $SUFFIX$, $VAR_MIN$, $VAR_MAX$
The variable $SUFFIX$ is useful because it can be used to take part in the
main or file path. For example: '/path/to$SUFFIX$/dataset/'.
It will be replaced by the value in the column that corresponds to the
suffix unless the user specifies a different suffix via the parameter
'suffixexp' or 'suffixobs'.
This way the user is able to load two variables with the same name in the
same dataset but with slight modifications, with a suffix anywhere in the
path to the data that indicates this slight modification.
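A minimal sketch of the suffix mechanism, under the assumption that the full path is the main path followed by the file path with `$SUFFIX$` replaced afterwards. The helper `build_full_path`, its argument names, and the suffix value `_corrected` are hypothetical and chosen only for illustration.

```r
# Hypothetical helper, not part of s2dv: build the full path of an entry,
# letting a user-supplied suffix override the suffix stored in the entry.
build_full_path <- function(main_path, file_path, entry_suffix,
                            user_suffix = NULL) {
  suffix <- if (is.null(user_suffix)) entry_suffix else user_suffix
  # The full path is the main path followed by the file path.
  full_path <- paste0(main_path, file_path)
  gsub("$SUFFIX$", suffix, full_path, fixed = TRUE)
}

# Suffix taken from the entry (empty here).
default_path <- build_full_path("/path/to$SUFFIX$/dataset/", "tos.nc",
                                entry_suffix = "")
# Suffix overridden by the user, as 'suffixexp' or 'suffixobs' would do.
modified_path <- build_full_path("/path/to$SUFFIX$/dataset/", "tos.nc",
                                 entry_suffix = "",
                                 user_suffix = "_corrected")
print(c(default_path, modified_path))
```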
The entries in a table will be grouped in 4 levels of specificity:
- General entries: the key dataset name and variable name are both a regular expression matching any sequence of characters (.*) that will cause matching for any pair of dataset and variable names.
  Example: .*, .*, /dataset/main/path/, file/path, nc_var_name, suffix, var_min, var_max
- Dataset entries: the key variable name matches any sequence of characters.
  Example: ecmwf, .*, /dataset/main/path/, file/path, nc_var_name, suffix, var_min, var_max
- Variable entries: the key dataset name matches any sequence of characters.
  Example: .*, tos, /dataset/main/path/, file/path, nc_var_name, suffix, var_min, var_max
- Specific entries: both key values are specified.
  Example: ecmwf, tos, /dataset/main/path/, file/path, nc_var_name, suffix, var_min, var_max
Given a pair of dataset name and variable name for which we want to know the
full path, all the rules that match will be applied from more general to
more specific.
If more than one entry in a group matches a given key pair, these entries
will be applied in the order of appearance in the configuration file
(top to bottom).
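The ordering rule can be sketched in base R as follows. This is an illustration, not the logic of ConfigApplyMatchingEntries(): the helper `match_entries` is hypothetical, it assumes that key patterns must match the whole name, and it assumes that the four levels are applied in the order in which they are listed above.

```r
# Hypothetical helper, not part of s2dv: return the entries whose keys match
# a dataset/variable pair, ordered from most general to most specific.
match_entries <- function(entries, dataset, variable) {
  # Assumption: a key pattern must match the whole name.
  matches_key <- function(pattern, value) {
    grepl(paste0("^(", pattern, ")$"), value)
  }
  hit <- mapply(
    function(d, v) matches_key(d, dataset) && matches_key(v, variable),
    entries$dataset, entries$variable, USE.NAMES = FALSE
  )
  matched <- entries[hit, , drop = FALSE]
  # Level 1: general, 2: dataset, 3: variable, 4: specific.
  level <- 1 + (matched$dataset != ".*") + 2 * (matched$variable != ".*")
  # Ties within a level keep their order of appearance in the file.
  matched[order(level, seq_along(level)), , drop = FALSE]
}

entries <- data.frame(
  dataset   = c("ecmwf", ".*", ".*", "ecmwf"),
  variable  = c("tos", ".*", "tos", ".*"),
  main_path = c("/specific/", "/general/", "/variable/", "/dataset/"),
  stringsAsFactors = FALSE
)
ordered <- match_entries(entries, "ecmwf", "tos")
print(ordered$main_path)
```

Because later entries are applied on top of earlier ones, the specific entry has the final say whenever it sets a value.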
An asterisk (*) in any value element will be interpreted as 'leave it as is
or take the default value if not yet defined'.
The default values are defined in the following reserved variables:
$DEFAULT_EXP_MAIN_PATH$, $DEFAULT_EXP_FILE_PATH$, $DEFAULT_NC_VAR_NAME$,
$DEFAULT_OBS_MAIN_PATH$, $DEFAULT_OBS_FILE_PATH$, $DEFAULT_SUFFIX$,
$DEFAULT_VAR_MIN$, $DEFAULT_VAR_MAX$,
$DEFAULT_DIM_NAME_LATITUDES$, $DEFAULT_DIM_NAME_LONGITUDES$,
$DEFAULT_DIM_NAME_MEMBERS$
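How the asterisk and the defaults interact can be sketched like this, assuming the value elements of the matching entries are already ordered from general to specific. The helper `merge_values` and the three-element value vectors are simplifications made for the example; real entries carry more value elements.

```r
# Hypothetical helper, not part of s2dv: fold the value elements of matching
# entries into one result. "*" keeps whatever has been set so far; elements
# never set by any entry fall back to the defaults.
merge_values <- function(value_rows, defaults) {
  result <- rep(NA_character_, length(defaults))
  for (row in value_rows) {
    keep <- row != "*"
    result[keep] <- row[keep]
  }
  unset <- is.na(result)
  result[unset] <- defaults[unset]
  names(result) <- names(defaults)
  result
}

# Placeholder defaults standing in for the $DEFAULT_...$ variables.
defaults <- c(main_path = "/default/main/", file_path = "default.nc",
              nc_var_name = "var")
rows <- list(
  c("/general/", "*", "*"),  # general entry sets only the main path
  c("*", "tos.nc", "*")      # more specific entry sets only the file path
)
merged <- merge_values(rows, defaults)
print(merged)
```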
Trailing asterisks in an entry are not mandatory. For example
ecmwf, .*, /dataset/main/path/, *, *, *, *, *
will have the same effect as
ecmwf, .*, /dataset/main/path/
A double quote only (") in any key or value element will be interpreted as
'fill in with the same value as the entry above'.
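The two shorthands above (optional trailing asterisks and the lone double quote) can be illustrated with a short base R sketch. The helper `expand_entry` is hypothetical, and the field count of 8 is taken from the full entry examples on this page (two key elements plus six value elements).

```r
# Hypothetical helper, not part of s2dv: split an entry into its fields, pad
# missing trailing fields with "*", and replace a lone double quote with the
# corresponding field of the entry above.
expand_entry <- function(line, previous = NULL, n_fields = 8) {
  fields <- trimws(strsplit(line, ",", fixed = TRUE)[[1]])
  fields <- c(fields, rep("*", max(0, n_fields - length(fields))))
  ditto <- fields == "\""
  if (any(ditto)) {
    stopifnot(!is.null(previous))
    fields[ditto] <- previous[ditto]
  }
  fields
}

short_form <- expand_entry("ecmwf, .*, /dataset/main/path/")
long_form  <- expand_entry("ecmwf, .*, /dataset/main/path/, *, *, *, *, *")
# The next entry reuses the dataset name of the entry above.
next_entry <- expand_entry("\", tos, /other/path/", previous = short_form)
print(next_entry)
```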
Value
ConfigFileOpen() returns a configuration object with all the information for
the configuration file mechanism to work.
ConfigFileSave() returns TRUE if the file has been saved and FALSE otherwise.
ConfigFileCreate() returns nothing.
References
[1] https://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html
[2] https://tldp.org/LDP/abs/html/globbingref.html
See Also
ConfigApplyMatchingEntries, ConfigEditDefinition, ConfigEditEntry, ConfigFileOpen, ConfigShowSimilarEntries, ConfigShowTable
Examples
# Create an empty configuration file
config_file <- paste0(tempdir(), "/example.conf")
ConfigFileCreate(config_file, confirm = FALSE)
# Open it into a configuration object
configuration <- ConfigFileOpen(config_file)
# Add an entry at the bottom of the 4th level of the file-per-startdate
# experiments table which will associate the experiment "ExampleExperiment2"
# and variable "ExampleVariable" to some information about its location.
configuration <- ConfigAddEntry(configuration, "experiments",
                 "last", "ExampleExperiment2", "ExampleVariable",
                 "/path/to/ExampleExperiment2/",
                 "ExampleVariable/ExampleVariable_$START_DATE$.nc")
# Edit entry to generalize for any variable. Changing variable needs .
configuration <- ConfigEditEntry(configuration, "experiments", 1,
                 var_name = ".*",
                 file_path = "$VAR_NAME$/$VAR_NAME$_$START_DATE$.nc")
# Now apply matching entries for variable and experiment name and show the
# result
match_info <- ConfigApplyMatchingEntries(configuration, 'tas',
              exp = c('ExampleExperiment2'), show_result = TRUE)
# Finally save the configuration file.
ConfigFileSave(configuration, config_file, confirm = FALSE)