identify_language {labourR}    R Documentation
Detect Language
Description
This function performs language detection using Compact Language Detector 2 (CLD2), provided by the CRAN package cld2.
It is vectorised and guesses the language of each string. Note that it is not designed to do well on very short text,
lists of proper names, part numbers, etc. CLD2 has the highest F1 score and is an order of magnitude faster than CLD3.
Usage
identify_language(text)
Arguments
text    A string with text to classify, or a connection to read from.
Value
A character vector of two-letter ISO 639-1 language codes, one per input string.
Examples
txt <- c("English is a West Germanic language ", "In espaniol, le lingua castilian")
identify_language(txt)
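
The Description notes that detection is unreliable on very short text and part numbers. As a rough sketch of how such inputs could be flagged, the following calls cld2::detect_language() directly rather than identify_language(). The helper name detect_with_flag and the sample part number are hypothetical and not part of labourR. It relies on cld2 returning NA when it cannot make a reliable guess.

```r
# Hypothetical helper (not part of labourR): calls cld2 directly and
# marks inputs for which cld2 returned no reliable language guess.
detect_with_flag <- function(text) {
  stopifnot(is.character(text))
  # lang_code = TRUE requests the short language code instead of the full name
  codes <- cld2::detect_language(text, plain_text = TRUE, lang_code = TRUE)
  data.frame(
    text = text,
    code = codes,
    undetermined = is.na(codes),  # TRUE where cld2 gave NA
    stringsAsFactors = FALSE
  )
}

# One ordinary sentence and one made-up part number
txt <- c("English is a West Germanic language ", "XJ-4471")

# Only run the example when cld2 is installed
if (requireNamespace("cld2", quietly = TRUE)) {
  print(detect_with_flag(txt))
}
```

Rows flagged as undetermined can then be dropped or routed to a fallback before further processing.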
[Package labourR version 1.0.0 Index]