read.toolbox {interlineaR}    R Documentation

Parse a Toolbox (SIL) text file

Description

Parse a Toolbox (SIL) text file

Usage

read.toolbox(path, text.fields.suppl = NULL, sentence.fields.suppl = c("tx",
  "nt", "ft"), word.fields.suppl = NULL, morpheme.fields.suppl = NULL)

Arguments

path

length-1 character vector: the path to a toolbox text file.

text.fields.suppl

character vector: the codes of supplementary fields to be searched for in each text (e.g. genre). "id" is mandatory and need not be listed here.

sentence.fields.suppl

character vector: the codes of supplementary fields to be searched for in each sentence (such as ft, nt). "ref" is mandatory and need not be listed here.

word.fields.suppl

character vector: the codes of supplementary fields to be searched for in each word. "tx" is mandatory and need not be listed here.

morpheme.fields.suppl

character vector: the codes of supplementary fields to be searched for in each morpheme. "mb", "ge" and "ps" are mandatory and need not be listed here.
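
To decide which codes to pass in the *.fields.suppl arguments, it can help to see which field markers a file actually contains. The following is a minimal sketch, not part of the package: it assumes the usual Toolbox convention that a field line starts with a backslash followed by its code, and it reads the example file shipped with interlineaR with base R functions only. The variable names (raw_lines, markers, codes) are illustrative.

```r
# Locate the example file distributed with the package (assumes interlineaR is installed).
corpuspath <- system.file("exampleData", "tuwariToolbox.txt", package = "interlineaR")
stopifnot(nzchar(corpuspath))

# Read the raw lines; no parsing by read.toolbox() is involved here.
raw_lines <- readLines(corpuspath, warn = FALSE)

# Assumption: a field line starts with a backslash followed by the field code.
# Matching on bytes avoids errors if the file encoding differs from the locale.
marker_match <- regexpr("^\\\\[^[:space:]]+", raw_lines, useBytes = TRUE)
markers <- regmatches(raw_lines, marker_match)

# Drop the leading backslash and count how often each code occurs.
codes <- sub("^\\\\", "", markers, useBytes = TRUE)
table(codes)
```

Codes that appear in this table but are not among the mandatory ones are candidates for the corresponding *.fields.suppl argument, depending on whether they annotate a text, a sentence, a word or a morpheme.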

Value

a list with four slots, "texts", "sentences", "words" and "morphemes", each containing a data frame. In these data frames, each row describes one occurrence of the corresponding unit.
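
As a brief illustration of this return value, the sketch below parses the example file shipped with the package and inspects the four slots. It assumes interlineaR is installed; the column names of each data frame are not listed on this page, so the code only looks at the structure and makes no assumption about them.

```r
library(interlineaR)

# Parse the example Toolbox file distributed with the package.
corpuspath <- system.file("exampleData", "tuwariToolbox.txt", package = "interlineaR")
corpus <- read.toolbox(corpuspath)

# The result is a list; each slot holds one data frame.
names(corpus)

# Number of rows (occurrences) at each level of analysis.
sapply(corpus, nrow)

# Inspect the columns available for one level, here the morphemes.
str(corpus$morphemes)
```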

References

https://software.sil.org/toolbox/

See Also

read.emeld (XML vocabulary for interlinearized glossed texts)

Examples

corpuspath <- system.file("exampleData", "tuwariToolbox.txt", package="interlineaR")
corpus <- read.toolbox(corpuspath)

[Package interlineaR version 1.0 Index]