cpos {polmineR} | R Documentation |
Get corpus positions for a query or queries.
Description
Get matches for a query in a CQP corpus (subcorpus, partition etc.), optionally using the CQP syntax of the Corpus Workbench (CWB).
Usage
cpos(.Object, ...)
## S4 method for signature 'corpus'
cpos(
  .Object,
  query,
  p_attribute = getOption("polmineR.p_attribute"),
  cqp = is.cqp,
  regex = FALSE,
  check = TRUE,
  verbose = TRUE,
  ...
)
## S4 method for signature 'character'
cpos(
  .Object,
  query,
  p_attribute = getOption("polmineR.p_attribute"),
  cqp = is.cqp,
  check = TRUE,
  verbose = TRUE,
  ...
)
## S4 method for signature 'slice'
cpos(
  .Object,
  query,
  cqp = is.cqp,
  check = TRUE,
  p_attribute = getOption("polmineR.p_attribute"),
  verbose = TRUE,
  ...
)
## S4 method for signature 'partition'
cpos(
  .Object,
  query,
  cqp = is.cqp,
  check = TRUE,
  p_attribute = getOption("polmineR.p_attribute"),
  verbose = TRUE,
  ...
)
## S4 method for signature 'subcorpus'
cpos(
  .Object,
  query,
  cqp = is.cqp,
  check = TRUE,
  p_attribute = getOption("polmineR.p_attribute"),
  verbose = TRUE,
  ...
)
## S4 method for signature 'matrix'
cpos(.Object)
## S4 method for signature 'hits'
cpos(.Object)
## S4 method for signature 'NULL'
cpos(.Object)
Arguments
.Object |
A length-one |
... |
Used for reasons of backwards compatibility to
process arguments that have been renamed (e.g. |
query |
A |
p_attribute |
The p-attribute to search. Needs to be stated only if query
is not a CQP query. Defaults to |
cqp |
Either logical ( |
regex |
Interpret |
check |
A |
verbose |
A |
Details
The cpos() method returns a two-column matrix with the ranges (start and
end corpus positions of the matches) matched by a query. CQP syntax can be
used. The encoding of the query is adjusted to conform to the encoding of the
CWB corpus. If there are no matches, NULL is returned.
Previous polmineR versions defined the cpos() method for matrix and hits
objects to obtain an integer vector with unfolded individual corpus
positions. This usage is deprecated starting with polmineR v0.8.8.
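The "unfolding" of ranges into individual corpus positions can be illustrated in base R. This is a minimal sketch and not polmineR's own implementation: the helper name unfold_ranges() and the example matrix are hypothetical, and the matrix merely mimics the two-column shape that cpos() is documented to return.

```r
# Hypothetical two-column matrix of ranges, shaped like the value of cpos():
# column 1 = start positions, column 2 = end positions (values are made up).
ranges <- matrix(c(10L, 11L, 25L, 25L, 40L, 42L), ncol = 2, byrow = TRUE)

# Illustrative helper (not part of polmineR): expand each row (start, end)
# into the sequence start:end and concatenate into one integer vector.
unfold_ranges <- function(m) {
  # Mirror the documented NULL result for queries without matches.
  if (is.null(m)) return(integer(0))
  unlist(Map(seq.int, m[, 1], m[, 2]), use.names = FALSE)
}

positions <- unfold_ranges(ranges)
print(positions)
```

A single-token match such as the second row (25 to 25) contributes exactly one position, while a multi-token match contributes every position between its start and end.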
Value
A matrix with two columns. The first column reports the left/starting
corpus positions (cpos) of the hits obtained. The second column reports the
right/ending corpus positions of the respective hit. The number of rows is
the number of hits. If there are no hits, NULL is returned.
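Because the value is either a matrix or NULL, code that consumes it usually needs to check for NULL before touching rows or columns. The following base R sketch shows one way to do so. The helper summarise_hits() and the example matrix are hypothetical stand-ins, not part of polmineR.

```r
# Hypothetical stand-in for the value of cpos(); the numbers are made up.
hits <- matrix(c(100L, 100L, 250L, 251L), ncol = 2, byrow = TRUE)

# Illustrative helper (not part of polmineR): count the hits and compute
# how many tokens each hit spans.
summarise_hits <- function(m) {
  # NULL is the documented result when a query has no hits.
  if (is.null(m)) return(list(n = 0L, lengths = integer(0)))
  list(
    n = nrow(m),                    # one row per hit
    lengths = m[, 2] - m[, 1] + 1L  # tokens covered by each hit
  )
}

res <- summarise_hits(hits)
print(res$n)
print(res$lengths)
```

Checking is.null() first avoids errors from calling nrow() or indexing columns on a NULL result.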
Examples
use(pkg = "RcppCWB", corpus = "REUTERS")
# look up single tokens
cpos("REUTERS", query = "oil")
corpus("REUTERS") %>% cpos(query = "oil")
corpus("REUTERS") %>%
  subset(grepl("saudi-arabia", places)) %>%
  cpos(query = "oil")
partition("REUTERS", places = "saudi-arabia", regex = TRUE) %>%
  cpos(query = "oil")
# use CQP query syntax
cpos("REUTERS", query = '"Saudi" "Arabia"')
corpus("REUTERS") %>% cpos(query = '"Saudi" "Arabia"')
corpus("REUTERS") %>%
  subset(grepl("saudi-arabia", places)) %>%
  cpos(query = '"Saudi" "Arabia"', cqp = TRUE)
partition("REUTERS", places = "saudi-arabia", regex = TRUE) %>%
  cpos(query = '"Saudi" "Arabia"', cqp = TRUE)