R: Save a tokenised dataset as nametagger train data

write_nametagger {nametagger}

R Documentation

Save a tokenised dataset as nametagger train data

Description

Writes a tokenised data.frame to a text file in the nametagger format, so that it can be used as training data.

Usage

write_nametagger(x, file = tempfile(fileext = ".txt", pattern = "nametagger_"))

Arguments

`x`	a tokenised data.frame with the columns doc_id, sentence_id and token, containing one row per token. It may additionally have the columns lemma and pos, holding the lemma and the part-of-speech tag of each token.
`file`	the path to the file where the training data will be saved. Defaults to a temporary file.
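
As a rough illustration of the expected shape of `x`, the sketch below builds a small tokenised data.frame by hand using base R only. The document id, tokens and labels are made up for illustration. The `entity` column is an assumption: it is not listed in the arguments above, but the europeananews example data is understood to carry such a label column, so it is included here.

```r
# Hypothetical toy input: one row per token.
# doc_id, sentence_id and token are the columns named in the Arguments section;
# entity is an assumed label column (not documented above).
x <- data.frame(
  doc_id      = "doc1",
  sentence_id = c(1L, 1L, 1L, 2L, 2L),
  token       = c("Anna", "lives", "here", "Hello", "Brussels"),
  entity      = c("B-PER", "O", "O", "O", "B-LOC"),
  stringsAsFactors = FALSE
)

# Check that the documented columns are present before writing
required <- c("doc_id", "sentence_id", "token")
stopifnot(all(required %in% colnames(x)))

# Number of tokens per sentence, as a quick sanity check of the grouping
tokens_per_sentence <- table(x$sentence_id)
print(tokens_per_sentence)
```

A data.frame that passes this check has the columns the documentation asks for; whether further columns are needed depends on the package version in use.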

Value

invisibly, an object of class nametagger_traindata, which is a list with the elements

data: a character vector of text in the nametagger format
file: the path to the file where the data is saved
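
Because the result is returned invisibly, it has to be assigned to be inspected. The following is a minimal sketch of doing so, assuming the nametagger package is installed; the toy data.frame, including its `entity` column, is a made-up assumption and not taken from this page. The call is guarded so the snippet does nothing if the package is missing.

```r
# Hypothetical toy input; the entity column is an assumption (see the note above)
x <- data.frame(
  doc_id      = "doc1",
  sentence_id = c(1L, 1L, 1L),
  token       = c("Anna", "lives", "here"),
  entity      = c("B-PER", "O", "O"),
  stringsAsFactors = FALSE
)

if (requireNamespace("nametagger", quietly = TRUE)) {
  # Assign the invisibly returned object; file defaults to a temporary file
  out <- nametagger::write_nametagger(x)

  # Class and list elements as described in the Value section
  print(inherits(out, "nametagger_traindata"))
  print(names(out))

  # The written file can be read back with base R
  print(file.exists(out$file))
  print(readLines(out$file))
}
```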

Examples

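library(nametagger)

## Take the first 250 tokens of one document from the example data
data(europeananews)
x <- subset(europeananews, doc_id %in% "enp_NL.kb.bio")
x <- head(x, n = 250)

## Write the tokens to a file in the nametagger format
path <- "traindata.txt"
bio <- write_nametagger(x, file = path)

## Inspect the returned nametagger_traindata object
str(bio)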

[Package nametagger version 0.1.3 Index]