subcorpus {polmineR} | R Documentation |
The S4 subcorpus class.
Description
Class to manage subcorpora derived from a CWB corpus.
Usage
## S4 method for signature 'subcorpus'
summary(object)
## S4 replacement method for signature 'subcorpus'
name(x) <- value
## S4 method for signature 'subcorpus'
get_corpus(x)
## S4 method for signature 'subcorpus'
size(x, s_attribute = NULL, ...)
Arguments
object |
A subcorpus object. |
x |
A subcorpus object. |
value |
A character vector, the name to assign to the subcorpus object. |
s_attribute |
A |
... |
Arguments passed into |
Methods (by generic)
- summary(subcorpus): Get a named list with basic information for a subcorpus object.
- name(subcorpus) <- value: Assign a name to a subcorpus object.
- get_corpus(subcorpus): Get the corpus ID from the subcorpus object.
- size(subcorpus): Get the size of a subcorpus object from the respective slot of the object.
Slots
s_attributes
A named list with the structural attributes defining the subcorpus.
cpos
A matrix with left and right corpus positions defining regions (two-column matrix with integer values).
annotations
Object of class list.
size
Total size (number of tokens) of the subcorpus object (a length-one integer vector). The value is accessible by calling the size method on the subcorpus object (see examples).
metadata
Object of class data.frame, metadata information.
strucs
Object of class integer, the strucs defining the subcorpus.
xml
Object of class character, whether the xml is "flat" or "nested".
s_attribute_strucs
Object of class character, the base node.
user
If the corpus on the server requires authentication, the username.
password
If the corpus on the server requires authentication, the password.
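The relation between the slots and the accessor methods can be sketched as follows. This is a minimal illustration, assuming polmineR is installed and that the REUTERS sample corpus is available after calling use("polmineR"), as in the examples of this page. Direct slot access with @ is shown only to make the slot contents visible; the size method remains the documented way to obtain the size.

```r
library(polmineR)
use("polmineR") # activate the sample corpora, as in the Examples section

# derive a subcorpus (same call as in the basic example)
k <- subset(corpus("REUTERS"), grepl("kuwait", places))

# cpos: one row per region, columns are left and right corpus positions
regions_matrix <- k@cpos
is.matrix(regions_matrix)
ncol(regions_matrix)

# size: the value stored in the slot is what the size method reports
k@size
size(k)

# corpus ID and basic information via the documented methods
get_corpus(k)
summary(k)
```

Using the accessor methods rather than @ keeps code independent of the internal slot layout of the class.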
See Also
Most commonly, a subcorpus is derived from a corpus or a subcorpus using the subset method. See size for detailed documentation on how to use the size method. The subcorpus class shares many features with the partition class, but it is more parsimonious and does not include information on statistical properties of the subcorpus (i.e. a count table). In line with this logic, the subcorpus class inherits from the corpus class, whereas the partition class inherits from the textstat class.
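The inheritance relations described above can be inspected with the tools of the methods package. The following is a minimal sketch, assuming polmineR is installed, that its class definitions are visible once the package is attached, and that the REUTERS sample corpus is available after use("polmineR"). No output is shown because it has not been verified against a particular package version.

```r
library(polmineR)
use("polmineR") # activate the sample corpora, as in the Examples section

# a subcorpus derived with subset(), as in the basic example
k <- subset(corpus("REUTERS"), grepl("kuwait", places))

# object-level check: does the subcorpus object count as a corpus?
methods::is(k, "corpus")

# class-level checks of the inheritance described in the text
methods::extends("subcorpus", "corpus")
methods::extends("partition", "textstat")
```

Methods defined for the corpus class are therefore candidates for use on a subcorpus object, while statistical summaries such as a count table belong to the partition side.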
Other classes to manage corpora:
corpus-class, phrases-class, ranges-class, regions
Examples
use("polmineR")
# basic example
r <- corpus("REUTERS")
k <- subset(r, grepl("kuwait", places))
name(k) <- "kuwait"
y <- summary(k)
s <- size(k)
# the same with a magrittr pipe
corpus("REUTERS") %>%
  subset(grepl("kuwait", places)) %>%
  summary()
# subsetting a subcorpus in a pipe
stone <- corpus("GERMAPARLMINI") %>%
  subset(date == "2009-11-10") %>%
  subset(speaker == "Frank-Walter Steinmeier")
# perform count for subcorpus
n <- corpus("REUTERS") %>% subset(grep("kuwait", places)) %>% count(p_attribute = "word")
n <- corpus("REUTERS") %>% subset(grep("saudi-arabia", places)) %>% count('"Saudi" "Arabia"')
# keyword-in-context analysis (kwic)
k <- corpus("REUTERS") %>% subset(grep("kuwait", places)) %>% kwic("oil")