read.toolbox {interlineaR}    R Documentation
Parse a Toolbox (SIL) text file
Description
Parse a Toolbox (SIL) text file
Usage
read.toolbox(path, text.fields.suppl = NULL, sentence.fields.suppl = c("tx",
"nt", "ft"), word.fields.suppl = NULL, morpheme.fields.suppl = NULL)
Arguments
path
length-1 character vector: the path to a Toolbox text file.
text.fields.suppl
character vector: the codes of supplementary fields to be searched for in each text (such as genre). "id" is mandatory and need not be listed here.
sentence.fields.suppl
character vector: the codes of supplementary fields to be searched for in each sentence (such as ft, nt). "ref" is mandatory and need not be listed here.
word.fields.suppl
character vector: the codes of supplementary fields to be searched for in each word. "tx" is mandatory and need not be listed here.
morpheme.fields.suppl
character vector: the codes of supplementary fields to be searched for in each morpheme. "mb", "ge" and "ps" are mandatory and need not be listed here.
Value
A list with four slots, "texts", "sentences", "words" and "morphemes", each containing a data frame. In these data frames, each row describes one occurrence of the corresponding unit.
References
https://software.sil.org/toolbox/
See Also
read.emeld (XML vocabulary for interlinearized glossed texts)
Examples
corpuspath <- system.file("exampleData", "tuwariToolbox.txt", package="interlineaR")
corpus <- read.toolbox(corpuspath)
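
The following is a minimal sketch of how the returned list might be inspected, based only on the Value section above (four slots, each a data frame). It assumes the interlineaR package is installed and that the bundled example file contains "ft" fields. The object name corpus_ft is hypothetical, and the column names of each data frame are not documented here, so they are printed rather than assumed.

```r
library(interlineaR)

corpuspath <- system.file("exampleData", "tuwariToolbox.txt", package = "interlineaR")
corpus <- read.toolbox(corpuspath)

# The documented return value is a list of four data frames.
names(corpus)

# Number of rows (one row per occurrence) for each unit type.
sapply(corpus, nrow)

# First rows of the morpheme table; its columns are printed, not assumed.
head(corpus$morphemes)

# Assumption: passing only "ft" restricts the supplementary sentence
# fields to the free translation; mandatory fields are still read.
corpus_ft <- read.toolbox(corpuspath, sentence.fields.suppl = "ft")
colnames(corpus_ft$sentences)
```

Comparing colnames(corpus$sentences) with colnames(corpus_ft$sentences) is one way to check how the supplementary field arguments affect the result.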