lma_process {lingmatch} R Documentation
Process Text
Description
A wrapper for other pre-processing functions: potentially from read.segments, to lma_dtm or lma_patcat, to lma_weight, then lma_termcat or lma_lspace, optionally including lma_meta output.
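The chain described above can be sketched by calling the wrapped functions directly. This is a minimal illustration, assuming the lingmatch package is installed. The sample texts and the intermediate variable names are made up here, and the result is not guaranteed to match lma_process() output exactly, since the wrapper decides which steps to run from the arguments it receives.

```r
library(lingmatch)

# hypothetical sample texts, used only for illustration
texts <- c(
  "I hope you feel you can speak freely.",
  "Oh, of course, I just hope to be clear."
)

# 1. tokenize the texts into a document-term matrix
dtm <- lma_dtm(texts)

# 2. weight the term counts (here, scaled to percentages)
weighted <- lma_weight(dtm, percent = TRUE)

# 3. collapse terms into dictionary categories
categories <- lma_termcat(weighted, lma_dict())

# 4. optionally, calculate metastatistics from the raw texts
meta <- lma_meta(texts)
```

Running the steps separately like this may be useful when an intermediate result, such as the document-term matrix, needs to be inspected or reused.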
Usage
lma_process(input = NULL, ..., meta = TRUE, coverage = FALSE)
Arguments
input |
A vector of text, or path to a text file or folder. |
... |
arguments to be passed to |
meta |
Logical; if |
coverage |
Logical; if |
Value
A matrix with texts represented by rows and features in columns. The exception is when there are multiple rows per output (e.g., when a latent semantic space is applied without terms being mapped), in which case only the special output is returned (e.g., a matrix with terms as rows and latent dimensions in columns).
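A quick way to check the shape of the return value in the default case is to compare its row count with the number of input texts. This is only a sketch, assuming the lingmatch package is installed. The two sample texts are made up, and the number and names of the columns depend on the arguments passed.

```r
library(lingmatch)

# hypothetical sample texts, used only for illustration
texts <- c(
  "Please, proceed.",
  "Oh, no, don't monitor yourself on my account."
)

# default processing: term counts plus metastatistics
processed <- lma_process(texts)

# in the default case, one row per input text is expected
nrow(processed) == length(texts)

# rows (texts) by columns (features)
dim(processed)
```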
See Also
If you just want to compare texts, see the lingmatch() function.
Examples
# starting with some texts in a vector
texts <- c(
"Firstly, I would like to say, and with all due respect...",
"Please, proceed. I hope you feel you can speak freely...",
"Oh, of course, I just hope to be clear, and not cause offense...",
"Oh, no, don't monitor yourself on my account..."
)
# by default, term counts and metastatistics are returned
lma_process(texts)
# add dictionary and percent arguments for standard dictionary-based results
lma_process(texts, dict = lma_dict(), percent = TRUE)
# add space and weight arguments for standard word-centroid vectors
lma_process(texts, space = lma_lspace(texts), weight = "tfidf")