predict.crf {crfsuite} | R Documentation
Predict the label sequence based on the Conditional Random Field
Description
Predict the label sequence based on the Conditional Random Field
Usage
## S3 method for class 'crf'
predict(
object,
newdata,
embeddings,
group,
type = c("marginal", "sequence"),
trace = FALSE,
...
)
Arguments
object |
an object of class crf as returned by crf |
newdata |
a character matrix of data containing attributes about the label sequence |
embeddings |
a matrix with the same number of rows as |
group |
an integer or character vector of the same length as nrow |
type |
either 'marginal' or 'sequence' to get predictions at the level of |
trace |
a logical indicating whether to show the trace of the labelling output. Defaults to FALSE. |
... |
not used |
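The shape requirements on the arguments can be checked before calling predict. The sketch below is a hypothetical base R helper, check_predict_input, which is not part of crfsuite. It assumes that group must have one element per row of newdata, and the toy attribute values are made up for illustration.

```r
# Hypothetical helper, not part of crfsuite: validate inputs before predict().
# Assumption: `group` must have one element per row of `newdata`.
check_predict_input <- function(newdata, group) {
  # newdata is documented as a character matrix, so coerce up front
  newdata <- as.matrix(newdata)
  storage.mode(newdata) <- "character"
  if (!(is.numeric(group) || is.character(group))) {
    stop("group must be an integer or character vector")
  }
  if (length(group) != nrow(newdata)) {
    stop("group must have length nrow(newdata): ",
         length(group), " vs ", nrow(newdata))
  }
  invisible(newdata)
}

# Toy attributes for two sequences of two tokens each (made-up values)
attrs <- data.frame(upos = c("DET", "NOUN", "PRON", "VERB"),
                    lemma = c("de", "kamer", "ik", "slapen"),
                    stringsAsFactors = FALSE)
checked <- check_predict_input(attrs, group = c(1, 1, 2, 2))
dim(checked)
```

Failing early with a clear message is usually easier to debug than an error raised inside the labelling routine.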
Value
If type is 'marginal': a data.frame with columns label and marginal, containing the Viterbi-decoded predicted label and its marginal probability.
If type is 'sequence': a data.frame with columns group and probability, containing the probability of the sequence for each sequence group.
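Both return shapes are plain data frames, so they can be post-processed with base R. The sketch below uses mock data frames with the documented column names; the values, the BIO-style labels, and the 0.6 threshold are invented for illustration and are not output from a real model.

```r
# Mock output shaped like predict(..., type = "marginal"); values are invented
marginal_scores <- data.frame(
  label    = c("B-LOCATION", "I-LOCATION", "O", "O"),
  marginal = c(0.91, 0.55, 0.98, 0.40),
  stringsAsFactors = FALSE
)
# Flag tokens whose marginal probability falls below an arbitrary threshold
marginal_scores$uncertain <- marginal_scores$marginal < 0.6

# Mock output shaped like predict(..., type = "sequence"); values are invented
sequence_scores <- data.frame(
  group       = c("doc1", "doc2", "doc3"),
  probability = c(0.42, 0.07, 0.63),
  stringsAsFactors = FALSE
)
# Order sequences from least to most probable, e.g. to review the weakest first
ranked <- sequence_scores[order(sequence_scores$probability), ]

marginal_scores
ranked
```

The same pattern applies to real predictions: replace the mock data frames with the result of predict for the corresponding type.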
Examples
library(crfsuite)
library(udpipe)
data(airbnb_chunks, package = "crfsuite")
## Download and load a Dutch udpipe model to annotate the texts
udmodel <- udpipe_download_model("dutch-lassysmall")
udmodel <- udpipe_load_model(udmodel$file_model)
airbnb_tokens <- unique(airbnb_chunks[, c("doc_id", "text")])
airbnb_tokens <- udpipe_annotate(udmodel,
                                 x = airbnb_tokens$text,
                                 doc_id = airbnb_tokens$doc_id)
airbnb_tokens <- as.data.frame(airbnb_tokens)
## Combine the chunk labels with the token annotations and build attributes
x <- merge(airbnb_chunks, airbnb_tokens)
x <- crf_cbind_attributes(x, terms = c("upos", "lemma"), by = "doc_id")
## Train a small model (max_iterations = 5 keeps the example fast)
model <- crf(y = x$chunk_entity,
             x = x[, grep("upos|lemma", colnames(x))],
             group = x$doc_id,
             method = "lbfgs", options = list(max_iterations = 5))
## Predicted label and marginal probability
scores <- predict(model,
                  newdata = x[, grep("upos|lemma", colnames(x))],
                  group = x$doc_id, type = "marginal")
head(scores)
## Probability of the sequence for each sequence group
scores <- predict(model,
                  newdata = x[, grep("upos|lemma", colnames(x))],
                  group = x$doc_id, type = "sequence")
head(scores)
## cleanup for CRAN
file.remove(model$file_model)
file.remove("modeldetails.txt")
file.remove(udmodel$file)