zoologThesaurus {zoolog}

R Documentation

Thesaurus Set for zoolog

Description

The thesaurus set defined for the package zoolog. It makes the package methods robust to the differing nomenclatures found in datasets created by different authors. Users can also supply other thesaurus sets or modify the provided one (see ThesaurusManagement and ThesaurusReaderWriter).

Usage

zoologThesaurus

Format

A thesaurus set is a list of thesauri with additional attributes:

names: Character vector with the name of each thesaurus.
applyToColNames: Logical vector indicating whether each thesaurus should be applied to the column names of the data frame.
applyToColValues: Logical vector indicating whether each thesaurus should be applied to the values in the corresponding column of the data frame.
filename: Character vector with the source file of each thesaurus.

The examples below show the list of four thesauri included in the provided zoologThesaurus.

Each thesaurus is itself a data frame with additional attributes. Each column of the data frame is a category of names that have equivalent meaning in the intended application. The column name identifies the category and is used as the standard name when applying StandardizeNomenclature.

The names in each column (category) must not be included in any other column, since this would make the thesaurus ambiguous (see ThesaurusAmbiguity).
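The ambiguity rule above can be illustrated with a small base R sketch. The toy thesaurus and the helper findAmbiguousNames are hypothetical and are not part of zoolog; the package's own check is the one documented under ThesaurusAmbiguity. The sketch compares names exactly, so it ignores the case, accent, and punctuation settings.

```r
# Toy thesaurus: each column is a category and the column name is the standard.
# The entries are illustrative only, not taken from zoologThesaurus.
toyThesaurus <- data.frame(
  humerus = c("humerus", "hum", "HU"),
  femur   = c("femur", "fem", "hum"),  # "hum" is deliberately repeated
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the names that occur in more than one column.
findAmbiguousNames <- function(thesaurus) {
  # Unique non-empty names per column, so repeats within a column do not count.
  perColumn <- lapply(thesaurus, function(col) {
    unique(col[!is.na(col) & col != ""])
  })
  allNames <- unlist(perColumn, use.names = FALSE)
  unique(allNames[duplicated(allNames)])
}

ambiguous <- findAmbiguousNames(toyThesaurus)
print(ambiguous)
```

Here "hum" is reported because it could stand for either category, so a thesaurus containing it in both columns could not assign it a single standard name.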

Each thesaurus has the following attributes:

names: The standard name for the categories.
class: "data.frame"
row.names: Irrelevant
caseSensitive: Logical indicating whether the names in the thesaurus should be considered case-sensitive.
accentSensitive: Logical indicating whether the names in the thesaurus should be differentiated by the presence of accent marks.
punctuationSensitive: Logical indicating whether the names in the thesaurus should be differentiated by the presence of punctuation marks.
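As a rough illustration of what the sensitivity attributes mean, the following base R sketch matches names against a toy thesaurus after reducing them to a comparison key. The helpers normalizeName and standardize are hypothetical and do not reproduce the implementation of StandardizeNomenclature. Accent handling is left out because it depends on the session's encoding, and leaving unknown names unchanged is a choice made for this sketch.

```r
# Hypothetical helper: reduce a name to a comparison key.
normalizeName <- function(x, caseSensitive = FALSE,
                          punctuationSensitive = FALSE) {
  if (!caseSensitive) x <- tolower(x)
  if (!punctuationSensitive) x <- gsub("[[:punct:]]", "", x)
  x
}

# Hypothetical helper: replace each name by the column name (the standard)
# of the category that contains it; names not found are left unchanged.
standardize <- function(x, thesaurus, caseSensitive = FALSE,
                        punctuationSensitive = FALSE) {
  key <- function(v) normalizeName(v, caseSensitive, punctuationSensitive)
  result <- x
  for (category in names(thesaurus)) {
    hit <- key(x) %in% key(thesaurus[[category]])
    result[hit] <- category
  }
  result
}

# Illustrative entries only, not taken from zoologThesaurus.
toyThesaurus <- data.frame(
  humerus = c("humerus", "hum.", "HU"),
  femur   = c("femur", "fem.", "FE"),
  stringsAsFactors = FALSE
)

standardized <- standardize(c("Hum", "FEM.", "tibia"), toyThesaurus)
print(standardized)
```

With both flags set to FALSE, "Hum" and "FEM." are matched despite the differences in case and punctuation, while "tibia" is not in the toy thesaurus and is returned as is.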

The examples below show the content and characteristics of the first thesaurus in zoologThesaurus.

File Structure

zoologThesaurus is an exported variable automatically loaded in memory. In addition, the source files used to generate it are included in the zoolog extdata folder. There is one file for the main structure of the thesaurus set and one file for each included thesaurus. All of them are in semicolon-separated format, so they can be examined in any text editor or imported into any spreadsheet application. The files are:

zoologThesaurusSet.csv: Defines the main structure of the thesaurus set. It has a row for each thesaurus and seven columns (ThesaurusName, FileName, CaseSensitive, AccentSensitive, PunctuationSensitive, ApplyToColNames, and ApplyToColValues). Their meaning coincides with the description above. Observe that the case, accent, and punctuation sensitivity settings are stored here, instead of in each thesaurus file.
identifierThesaurus.csv: Thesaurus for the identifiers used in LogRatios to identify the bone types and the measure names in the data and the references. It has four columns: Taxon, Element, Measure, and Standard.
taxonThesaurus.csv: Thesaurus for the taxa. There is one column for each category of taxon considered.
elementThesaurus.csv: Thesaurus for the skeletal elements. One column for each category.
measureThesaurus.csv: Thesaurus for the measure names. One column for each category.
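Since the files are plain semicolon-separated text, they can also be read with base R. The sketch below writes a small file with the same seven columns as zoologThesaurusSet.csv and reads it back. The rows and flag values are hypothetical placeholders, not the real file content. The last step assumes the real file sits directly in the extdata folder, and it is skipped if zoolog is not installed or the file is not found there. For regular use, see the functions documented under ThesaurusReaderWriter.

```r
# Hypothetical rows with the seven columns described for zoologThesaurusSet.csv.
toySet <- data.frame(
  ThesaurusName        = c("identifier", "taxon"),
  FileName             = c("identifierThesaurus.csv", "taxonThesaurus.csv"),
  CaseSensitive        = c(FALSE, FALSE),
  AccentSensitive      = c(FALSE, FALSE),
  PunctuationSensitive = c(FALSE, FALSE),
  ApplyToColNames      = c(TRUE, FALSE),
  ApplyToColValues     = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Write the toy file in semicolon-separated format and read it back.
toyFile <- tempfile(fileext = ".csv")
write.table(toySet, toyFile, sep = ";", row.names = FALSE, quote = FALSE)
setStructure <- read.csv(toyFile, sep = ";", stringsAsFactors = FALSE)
print(names(setStructure))
print(nrow(setStructure))

# Optional: inspect the real file, assuming it is directly under extdata.
if (requireNamespace("zoolog", quietly = TRUE)) {
  realFile <- system.file("extdata", "zoologThesaurusSet.csv",
                          package = "zoolog")
  if (nzchar(realFile)) {
    print(read.csv(realFile, sep = ";", stringsAsFactors = FALSE))
  }
}
```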

Examples

## List of thesaurus names and characteristics in the thesaurus set:
attributes(zoologThesaurus)
## Content of the first thesaurus:
zoologThesaurus$identifier
attributes(zoologThesaurus$identifier)

[Package zoolog version 1.1.0 Index]