edhw {sdam} | R Documentation |
Wrapper function for manipulation of the EDH dataset
Description
A function to obtain variable data and perform transformations on the EDH
dataset.
Usage
edhw(x = "EDH", vars, as = c("df", "list"), type = c("long", "wide", "narrow"),
split, select, addID, limit, id, na.rm, ldf, province, gender, rp, ...)
Arguments
x |
a list object name with fragments of the |
vars |
vector of variables of interest from |
as |
format to return the output; either as a |
type |
type format of data frame; either |
split |
divide the data into groups by id? (optional and logical) |
select |
vector with |
addID |
add identification to the output? (optional and logical) |
limit |
integer or vector to limit the returned output. Ignored if |
id |
select only |
na.rm |
remove entries with NA data? (optional and logical) |
ldf |
is |
province |
name or abbreviation of Roman province in |
gender |
gender of people in |
rp |
customized list of Roman provinces as in |
... |
optional arguments if needed. |
Details
This is an interface to extract attribute variables from the EDH dataset, which is attached to this package either as a built-in dataset or as external data.
EDH is a built-in dataset of Latin epigraphy retrieved from the Epigraphic Database Heidelberg API repository. Its epigraphs or inscriptions
are recorded in a list object of 84701 items (as of 10-11-2020), each with at least one of the following 47 (or more) attribute names:
"ID", "commentary", "fotos", "country", "depth", "diplomatic_text", "edh_geography_uri", "findspot",
"findspot_ancient", "findspot_modern", "geography", "height", "id", "language", "last_update", "letter_size",
"literature", "material", "military", "modern_region", "not_after", "not_before",
"people" (which is a list with: "person_id", "nomen", "cognomen", "praenomen", "name", "gender", "status", "tribus",
"origo", "occupation", "age: years", "age: months", "age: days"),
"present_location", "province_label", "religion", "responsible_individual", "social_economic_legal_history",
"transcription", "trismegistos_uri", "type_of_inscription", "type_of_monument", "uri", "width",
"work_status", and "year_of_find".
The input in x, however, can be fragments of the EDH dataset, data from
the Epigraphic Database Heidelberg API obtained by functions get.edh or get.edhw
with the "rjson" format, or transformed data organized, for example, by provinces.
When x is explicit, it must be at least a list object with a structure comparable to the EDH dataset.
Argument ldf is a flag for when the input in x is a list of data frames that are
organised by variables rather than by records as in the EDH dataset.
The output is returned either as a list with option "list" or, by default, as a data frame with option "df".
Extraction from EDH is typically done through argument vars; if vars
is missing, the function takes all entries in x.
Ad hoc arguments are the EDH entries province and gender, for entering a Roman province
and people's gender in x as a data frame; otherwise, these arguments are ignored.
When province is used, it is possible to refer to a customized list of provinces with argument "rp";
otherwise, dataset rp is the default, where names and abbreviations are accepted.
By default, this wrapper returns a list object with or without a numerical ‘ID’ identification provided by the addID argument.
When the output is a data frame, the ordering of the variables is alphabetical and, if desired, it is possible to remove missing data
from the output by activating na.rm and work with complete cases.
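What working with complete cases means can be pictured with base R alone. The following is a minimal sketch on a small made-up data frame; the values are invented for illustration, and it is not meant to show how edhw implements na.rm internally.

```r
# Toy data frame standing in for a data frame output (all values are invented)
df <- data.frame(
  id = c("HD000001", "HD000002", "HD000003"),
  not_after = c("0130", NA, "0200"),
  not_before = c("0071", "0051", NA),
  stringsAsFactors = FALSE
)

# Keep only the rows that have no missing value in any column
complete <- df[complete.cases(df), ]
print(complete)
```

Only the first toy record survives, since the other two each miss one of the dates.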
Arguments id and limit serve to reduce the returned output, either to one or more Epigraphic Database numbers,
which are specified by hd_nr, or by limiting the amount of the returned output.
Here limit is like the limit argument of function get.edh, but in
this case the offset can be specified as a sequence.
While limit is a faster way to get to entries in the EDH dataset, argument id is for
referring to precisely one or more hd_nr values in the Epigraphic Database Heidelberg API.
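The difference between selecting by position and selecting by identifier can be sketched in base R on a toy list of records. The list below only loosely imitates the structure of EDH, its values are invented, and how edhw matches identifiers internally is not shown here.

```r
# Toy record list, loosely comparable in structure to EDH (invented values)
records <- list(
  list(id = "HD000001", not_before = "0071"),
  list(id = "HD000002", not_before = "0051"),
  list(id = "HD000003", not_before = "0131")
)

# Positional selection, analogous in spirit to 'limit' given as a sequence
by_position <- records[2:3]

# Selection by identifier, analogous in spirit to 'id'
wanted <- c("HD000001", "HD000003")
by_id <- Filter(function(r) r$id %in% wanted, records)

# Collect the identifiers reached by each approach
ids_position <- vapply(by_position, function(r) r$id, character(1))
ids_found <- vapply(by_id, function(r) r$id, character(1))
```

Positional selection needs no lookup, which fits the remark that limit is the faster route, whereas selection by identifier has to inspect the records.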
Component "people" is a separate list in the EDH dataset, and it should be treated as
a separate case from the rest of the variables.
When the output is a data frame, the default is a ‘long’ type table; that is, a record can
appear in several rows and each variable is assigned to a single column. With this option it is possible to
select "people" variables like gender and origin.
When choosing people variables with select and a data frame output, the "people" attribute must be in vars.
By setting "wide" in type, it is possible to place the different people from a single entry
column by column in the data frame, so that each record has a single row. Finally, argument split allows for
dividing the data in the data frame into groups by ‘id’, which corresponds to the HD number of the inscription
in the EDH dataset.
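The contrast between the ‘long’ and ‘wide’ layouts, and the grouping by ‘id’, can be illustrated with base R on toy data. This is only a sketch: the values are invented, the helper column person is hypothetical, and the column names produced by reshape are not necessarily those returned by edhw.

```r
# Toy 'long' table: one row per person, so an inscription can span several rows
long <- data.frame(
  id = c("HD000001", "HD000001", "HD000002"),
  not_before = c("0071", "0071", "0051"),
  gender = c("male", "female", "male"),
  stringsAsFactors = FALSE
)

# Number the people within each inscription (hypothetical helper column)
long$person <- ave(seq_along(long$id), long$id, FUN = seq_along)

# 'wide' layout: one row per inscription, people placed column by column
wide <- reshape(long, idvar = c("id", "not_before"),
                timevar = "person", direction = "wide")

# Grouping the long table by id, similar in spirit to 'split'
groups <- split(long, long$id)
```

In the wide table the second toy inscription has a missing value in the column for the second person, because it records only one person.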
Value
A list or a data frame with a long or wide format, depending on the input arguments.
Argument province with no vars returns a list of lists.
Warning
EDH is a built-in dataset in the development and legacy versions of the package but,
because of its size, it is not part of the CRAN distribution. Functions edhw and edhwpd
download EDH from another repository listed in References.
Note
Warning messages are given for the EDH dataset as the input, and when choosing
the province argument alone.
Author(s)
Antonio Rivero Ostoic
References
Epigraphic Database Heidelberg – Data Reuse Options, (Online; retrieved on 16 June 2019). URL https://edh-www.adw.uni-heidelberg.de/data
https://edh-www.adw.uni-heidelberg.de/data/api (database retrieved on November 2020)
https://github.com/sdam-au/sdam/tree/master/data
https://github.com/mplex/cedhar/tree/master/pkg/sdam/data
See Also
get.edh, get.edhw, rp, edhwpd, prex, plot.dates, cln, rjson
Examples
## Not run:
# load dataset
data(EDH)
# make a list for three variables in 'EDH' for first 4 entries
edhw(vars=c("type_of_inscription", "not_after", "not_before"), limit=4 )
# as before, but also select 'gender' from 'people'
edhw(vars=c("people", "not_after", "not_before"), select="gender", limit=4 )
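# the calls below are further sketches that only combine arguments described above;
# they assume 'EDH' is available and their output is not shown here
# as before, but with people placed column by column and one row per record
edhw(vars=c("people", "not_after", "not_before"), select="gender", type="wide", limit=4 )
# return the three variables as a list rather than a data frame
edhw(vars=c("type_of_inscription", "not_after", "not_before"), as="list", limit=4 )
# remove entries with NA data and work with complete cases
edhw(vars=c("type_of_inscription", "not_after", "not_before"), na.rm=TRUE, limit=4 )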
## End(Not run)