subcorpus_bundle-class {polmineR}R Documentation

Bundled subcorpora

Description

A subcorpus_bundle object combines a set of subcorpus objects in a list in the the slot objects. The class inherits from the partition_bundle and the bundle class. Typically, a subcorpus_bundle is generated by applying the split-method on a corpus or subcorpus.

Usage

## S4 method for signature 'subcorpus_bundle'
show(object)

## S4 method for signature 'subcorpus_bundle'
merge(x, name = "", verbose = FALSE)

## S4 method for signature 'subcorpus'
merge(x, y, ...)

## S4 method for signature 'subcorpus'
split(
  x,
  s_attribute,
  values,
  prefix = "",
  mc = getOption("polmineR.mc"),
  verbose = TRUE,
  progress = FALSE,
  type = get_type(x)
)

## S4 method for signature 'corpus'
split(
  x,
  s_attribute,
  values,
  prefix = "",
  mc = getOption("polmineR.mc"),
  verbose = TRUE,
  progress = FALSE,
  type = get_type(x),
  xml = "flat"
)

## S4 method for signature 'subcorpus_bundle'
split(
  x,
  s_attribute,
  prefix = "",
  progress = TRUE,
  mc = getOption("polmineR.mc")
)

Arguments

object

An object of class subcorpus_bundle.

x

A corpus, subcorpus, or subcorpus_bundle object.

name

The name of the new subcorpus object.

verbose

Logical, whether to provide progress information.

y

A subcorpus to be merged with x.

...

Further subcorpus objects to be merged with x and y.

s_attribute

The s-attribute to vary.

values

Either a character vector with values used for splitting, or a logical value: If TRUE, changes of s-attribute values will be the basis for generating subcorpora. If FALSE, a new subcorpus is generated for every struc of the s-attribute. If missing (default), TRUE/FALSE is assigned depending on whether s-attribute has values, or not.

prefix

A character vector that will be attached as a prefix to partition names.

mc

Logical, whether to use multicore parallelization.

progress

Logical, whether to show progress bar.

type

The type of partition to generate.

xml

A logical value.

Details

Applying the split-method to a subcorpus_bundle-object will iterate through the subcorpus, and apply split on each subcorpus object in the bundle, splitting it up by the s-attribute provided by the argument s_attribute. The return value is a subcorpus_bundle, the names of which will be the names of the incoming partition_bundle concatenated with the s-attribute values used for splitting. The argument prefix can be used to achieve a more descriptive name.

Examples

corpus("REUTERS") %>% split(s_attribute = "id") %>% summary()

# Merge multiple subcorpus objects
a <- corpus("GERMAPARLMINI") %>% subset(date == "2009-10-27")
b <- corpus("GERMAPARLMINI") %>% subset(date == "2009-10-28")
c <- corpus("GERMAPARLMINI") %>% subset(date == "2009-11-10")
y <- merge(a, b, c)
s_attributes(y, "date")
sc <- subset("GERMAPARLMINI", date == "2009-11-11")
b <- split(sc, s_attribute = "speaker")

p <- partition("GERMAPARLMINI", date = "2009-11-11")
y <- partition_bundle(p, s_attribute = "speaker")
gparl <- corpus("GERMAPARLMINI")
b <- split(gparl, s_attribute = "date")
# split up objects in partition_bundle by using partition_bundle-method
use("polmineR")
y <- corpus("GERMAPARLMINI") %>%
  split(s_attribute = "date") %>%
  split(s_attribute = "speaker")

summary(y)

[Package polmineR version 0.8.9 Index]