dfm {quanteda} | R Documentation |
Create a document-feature matrix
Description
Construct a sparse document-feature matrix from a tokens or dfm object.
Usage
dfm(
  x,
  tolower = TRUE,
  remove_padding = FALSE,
  verbose = quanteda_options("verbose"),
  ...
)
Arguments
x |
a tokens or dfm object. |
tolower |
convert all features to lowercase. |
remove_padding |
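logical; if TRUE, remove the "pads" left as empty tokens after calling tokens() or tokens_remove() with padding = TRUE. |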
verbose |
display messages if TRUE. |
... |
not used. |
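The effect of the remove_padding and tolower arguments can be seen on a small tokens object. The following is a minimal sketch, assuming quanteda version 4 is installed; the document names d1 and d2 and the object names are made up for illustration.

```r
library(quanteda)

# Removing "b" with padding = TRUE leaves an empty-string "pad" in its place
toks <- tokens(c(d1 = "a b c", d2 = "A B C D")) |>
  tokens_remove("b", padding = TRUE)

dfmat_pad   <- dfm(toks)                         # keeps the "" pad as a feature
dfmat_nopad <- dfm(toks, remove_padding = TRUE)  # drops the pad feature
dfmat_case  <- dfm(toks, tolower = FALSE)        # "a" and "A" stay distinct

featnames(dfmat_pad)
featnames(dfmat_nopad)
featnames(dfmat_case)
```

Comparing the three featnames() results shows whether the pad feature and the upper-case features were retained.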
Value
a dfm object
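A returned dfm can be inspected with quanteda's accessor functions. This is a short sketch, assuming quanteda is installed; the two toy documents and the name dfmat are illustrative only.

```r
library(quanteda)

toks  <- tokens(c(d1 = "a b c", d2 = "a a d"))
dfmat <- dfm(toks)

is.dfm(dfmat)      # check the class of the result
ndoc(dfmat)        # number of documents (rows)
nfeat(dfmat)       # number of features (columns)
featnames(dfmat)   # feature labels
as.matrix(dfmat)   # dense view; only sensible for small objects
```

Because the object is sparse, converting it with as.matrix() is best reserved for small examples like this one.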
Changes in version 3
In quanteda v4, many convenience functions formerly available in dfm() were removed.
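This page does not list which conveniences were removed. Assuming they include feature removal and document grouping inside dfm() (an assumption, not confirmed here), the same results can be sketched with the dedicated functions dfm_remove() and dfm_group(). The toy documents, the removal list, and the group label "all" are hypothetical.

```r
library(quanteda)

toks <- tokens(c(d1 = "The cats sat", d2 = "A cat sits"))

# Build the dfm first, then remove unwanted features as a separate step
dfmat <- dfm(toks) |>
  dfm_remove(pattern = c("the", "a"))

# Combine documents by a grouping variable as a separate step
dfmat_grouped <- dfm_group(dfmat, groups = c("all", "all"))

featnames(dfmat)
ndoc(dfmat_grouped)
```

Keeping each transformation in its own call makes the order of operations explicit, at the cost of a slightly longer pipeline.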
See Also
Examples
## for a corpus
toks <- data_corpus_inaugural |>
  corpus_subset(Year > 1980) |>
  tokens()
dfm(toks)
# removal options
toks <- tokens(c("a b c", "A B C D")) |>
  tokens_remove("b", padding = TRUE)
toks
dfm(toks)
dfm(toks) |>
  dfm_remove(pattern = "") # remove "pads"
# preserving case
dfm(toks, tolower = FALSE)
[Package quanteda version 4.0.2 Index]