filterWord {tosca} | R Documentation |
Subcorpus With Word Filter
Description
Generates a subcorpus by restricting the corpus to texts that contain specific filter words.
Usage
filterWord(...)
## Default S3 method:
filterWord(
text,
search,
ignore.case = FALSE,
out = c("text", "bin", "count"),
...
)
## S3 method for class 'textmeta'
filterWord(
object,
search,
ignore.case = FALSE,
out = c("text", "bin", "count"),
filtermeta = TRUE,
...
)
Arguments
... |
Not used. |
text |
Not necessary if object is specified; otherwise the list of texts to filter. |
search |
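List of data frames. Every list element is an 'or' link; every entry in a data frame is linked by an 'and'. Each data frame must have the following three variables: pattern (the search term), word (the search type, e.g. "word" or "pattern"), and count (the minimum number of times the term must appear). A plain character vector is also accepted, as in the first example. |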
ignore.case |
Logical: Should lower and upper case be ignored? |
out |
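Type of output: "text" (the filtered texts, the default), "bin" (a logical vector over all texts), or "count" (the counts). |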
object |
A textmeta object. |
filtermeta |
Logical: Should the meta component be filtered, too? |
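The 'and'/'or' linking described for search can be illustrated with a minimal base R sketch. This is not the package's implementation: match_all and match_any are hypothetical helpers, every pattern is treated as a plain regular expression, and the word variable is ignored.

```r
# Hypothetical helpers for illustration only; not part of tosca.

# TRUE if every row of one data frame matches the text ('and' link).
# Each pattern must occur at least `count` times.
match_all <- function(text, df, ignore.case = FALSE) {
  ok <- vapply(seq_len(nrow(df)), function(i) {
    hits <- gregexpr(as.character(df$pattern[i]), text,
                     ignore.case = ignore.case)[[1]]
    sum(hits > 0) >= df$count[i]  # gregexpr returns -1 when nothing matches
  }, logical(1))
  all(ok)
}

# TRUE if at least one data frame in the list matches ('or' link).
match_any <- function(text, search, ignore.case = FALSE) {
  any(vapply(search, function(df) match_all(text, df, ignore.case),
             logical(1)))
}

search <- list(
  data.frame(pattern = c("fish", "day"), count = 1),  # "fish" AND "day"
  data.frame(pattern = "thanks", count = 1)           # OR "thanks"
)

match_any("Give a man a fish and you feed him for a day", search, TRUE)  # TRUE
match_any("So long, and thanks for all the fish", search, TRUE)          # TRUE
match_any("Fisher enjoys evaluating integrals", search, TRUE)            # FALSE
```

The second text fails the first data frame (no "day") but is kept because the second data frame matches; the third text matches "fish" inside "Fisher" yet fails every data frame as a whole.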
Value
A textmeta object if object is specified, else only the filtered text. If a textmeta object is returned, its meta data are by default (filtermeta = TRUE) filtered to those texts which appear in the corpus.
Examples
texts <- list(A="Give a Man a Fish, and You Feed Him for a Day.
Teach a Man To Fish, and You Feed Him for a Lifetime",
B="So Long, and Thanks for All the Fish",
C="A very able manipulative mathematician, Fisher enjoys a real mastery
in evaluating complicated multiple integrals.")
# search for pattern "fish"
filterWord(text=texts, search="fish", ignore.case=TRUE)
# search for word "fish"
filterWord(text=texts, search=data.frame(pattern="fish", word="word", count=1),
ignore.case=TRUE)
# pattern must appear at least two times
filterWord(text=texts, search=data.frame(pattern="fish", word="pattern", count=2),
ignore.case=TRUE)
# search for "fish" AND "day"
filterWord(text=texts, search=data.frame(pattern=c("fish", "day"), word="word", count=1),
ignore.case=TRUE)
# search for "Thanks" OR "integrals"
filterWord(text=texts, search=list(data.frame(pattern="Thanks", word="word", count=1),
data.frame(pattern="integrals", word="word", count=1)))
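
The examples above all return the filtered texts. The following is a minimal sketch of the out argument, assuming the tosca package is installed and that out = "bin" yields one logical value per text; the shortened texts are illustrative only.

```r
library(tosca)  # assumes the tosca package is installed

texts <- list(A = "Give a Man a Fish, and You Feed Him for a Day.",
              B = "So Long, and Thanks for All the Fish",
              C = "Fisher enjoys evaluating complicated multiple integrals.")

# request a logical indicator per text instead of the filtered texts
hits <- filterWord(text = texts, search = "fish", ignore.case = TRUE, out = "bin")

# the indicator can then be used to subset the original list
texts[hits]
```

Keeping the indicator separate can be handy when the same selection has to be applied to other objects that are aligned with the texts.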