search-class {act}    R Documentation

Search object

Description

This object defines the properties of a search in act. If the search has already been run, it also contains the results of that search in a specific corpus. (A search can also be created without running it immediately.) The same search object can be run on different corpora.

Some of the slots are defined by the user. Other slots are [READ ONLY], which means that they can be accessed by the user but should not be changed. They contain values that are filled when you execute functions on the object.
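As a minimal sketch of what reading and changing slots looks like, the following uses a toy S4 class. The class ToySearch and its three slots are hypothetical stand-ins defined only for this illustration; they are not the definition used by act.

```r
library(methods)

# Hypothetical toy class (NOT act's search class), with two user-defined
# slots and one slot that is treated as read-only by convention.
setClass("ToySearch",
         representation(name       = "character",
                        pattern    = "character",
                        results.nr = "integer"))

s <- new("ToySearch", name = "mysearch", pattern = "yo", results.nr = 0L)

# User-defined slots can be read and changed with the @ operator.
s@pattern <- "\\byo\\b"

# Read-only slots are meant to be read, not assigned to.
n_results <- s@results.nr

# List all slots of the object.
slot_names <- slotNames(s)
```

R itself does not prevent assignment to a slot, so the [READ ONLY] marking is a convention that the user is expected to respect.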

Slots

name

Character string; name of the search. Will be used, for example, as the name of the subfolder when creating media cuts.

pattern

Character string; search pattern as a regular expression.

search.mode

Character string; defines whether the original contents of the annotations or the full texts are searched. Accepted values: content, fulltext (default; includes both full text modes), fulltext.byTime, fulltext.byTier.

search.normalized

Logical; if TRUE, the normalized annotations will be used for searching.

resultid.prefix

Character string; search results are numbered consecutively, and this character string is placed before the consecutive numbers.

resultid.start

Integer; search results are numbered consecutively, and this is the start number of the identifiers.

filter.transcript.names

Vector of character strings; names of transcripts to include in the search. If the value is character() or "", the filter is ignored.

filter.transcript.includeRegEx

Character string; Regular expression that defines which transcripts should be INcluded in the search (matching the name of the transcript).

filter.transcript.excludeRegEx

Character string; Regular expression that defines which transcripts should be EXcluded in the search (matching the name of the transcript).

filter.tier.names

Vector of character strings; names of tiers to include in the search. If the value is character() or "", the filter is ignored.

filter.tier.includeRegEx

Character string; Regular expression that defines which tiers should be INcluded in the search (matching the name of the tier).

filter.tier.excludeRegEx

Character string; Regular expression that defines which tiers should be EXcluded in the search (matching the name of the tier).

filter.section.startsec

Double; Time value in seconds, limiting the search to a certain time span in each transcript, defining the start of the search window.

filter.section.endsec

Double; Time value in seconds, limiting the search to a certain time span in each transcript, defining the end of the search window.

concordance.make

Logical; whether a concordance should be created when the search is run.

concordance.width

Integer; number of characters to include in the concordance.

cuts.span.beforesec

Double; number of seconds by which the cuts (media and print transcripts) should start before the start of the search hit.

cuts.span.aftersec

Double; number of seconds by which the cuts (media and print transcripts) should end after the end of the search hit.

cuts.column.srt

Character string; name of the destination column in the search results data frame where the srt subtitles will be inserted; the column will be created if it is not present in the data frame; set to "" for no insertion.

cuts.column.printtranscript

Character string; name of the destination column in the search results data frame where the print transcripts will be inserted; the column will be created if it is not present in the data frame; set to "" for no insertion.

cuts.printtranscripts

Character string; [READ ONLY] All print transcripts for the search results (if generated previously).

cuts.cutlist.mac

Character string; [READ ONLY] 'FFmpeg' cut list for use on a Mac, to cut the media files for the search results.

cuts.cutlist.win

Character string; [READ ONLY] 'FFmpeg' cut list for use on Windows, to cut the media files for the search results.

results

Data.frame; Results of the search.

results.nr

Integer; [READ ONLY] Number of search results.

results.tiers.nr

Integer; [READ ONLY] Number of tiers over which the search results are distributed.

results.transcripts.nr

Integer; [READ ONLY] Number of transcripts over which the search results are distributed.

x.name

Character string; [READ ONLY] name of the corpus object on which the search has been run.
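The include and exclude filter slots described above take regular expressions that are matched against transcript or tier names. The following base R sketch shows one plausible way such a pair of filters could combine. The transcript names are invented, and the rule used here (keep a name if it matches the include pattern and does not match the exclude pattern) is an assumption made for illustration, not a description of how act applies the filters internally.

```r
# Hypothetical transcript names, invented for this illustration.
transcript_names <- c("interview_01", "interview_02", "pilot_01")

# Values as they might be stored in filter.transcript.includeRegEx
# and filter.transcript.excludeRegEx.
include_regex <- "^interview"
exclude_regex <- "_02$"

# Assumed rule: a name must match the include pattern
# and must not match the exclude pattern.
keep <- grepl(include_regex, transcript_names) &
  !grepl(exclude_regex, transcript_names)

selected <- transcript_names[keep]
selected
```

The same logic would apply to tier names with the filter.tier.includeRegEx and filter.tier.excludeRegEx slots.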

Examples

library(act)

# Search for the first person singular pronoun in Spanish.
mysearch <- act::search_new(examplecorpus, pattern="yo")
mysearch
# Search in normalized content vs. original content
mysearch.norm  <- act::search_new(examplecorpus, pattern="yo", searchNormalized=TRUE)
mysearch.org   <- act::search_new(examplecorpus, pattern="yo", searchNormalized=FALSE)
mysearch.norm@results.nr
mysearch.org@results.nr

# The difference arises because normalization converts capital letters
# to lowercase. One annotation in the example corpus contains a "yo" with a
# capital letter:
mysearch <- act::search_new(examplecorpus, pattern="yO", searchNormalized=FALSE)
mysearch@results$hit

# Search in full text vs. original content.
# Full text search will find matches across annotations.
# Let's define a regular expression with a certain span.
# Search for the word "no" ('no') followed by "pero" ('but')
# at a distance of 1 to 20 characters.
myRegEx <- "\\bno\\b.{1,20}pero"
mysearch <- act::search_new(examplecorpus, pattern=myRegEx, searchMode="fulltext")
mysearch
mysearch@results$hit
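
The behavior of the regular expression above can be checked without a corpus. This base R sketch applies the same pattern to two invented annotations, first one by one and then joined into a single text. Joining with a single space is an assumption made for the illustration; how act assembles its full texts, and which regex engine it uses, may differ.

```r
# Regular expression from the example above: the whole word "no",
# then 1 to 20 arbitrary characters, then "pero".
myRegEx <- "\\bno\\b.{1,20}pero"

# Hypothetical annotations, invented for this illustration.
annotations <- c("no quiero", "pero voy")

# Searching each annotation on its own finds nothing.
hits_content <- grepl(myRegEx, annotations)

# The joined text contains a match that spans both annotations.
# Assumption: annotations are joined with a single space.
fulltext <- paste(annotations, collapse = " ")
hit_fulltext <- regmatches(fulltext, regexpr(myRegEx, fulltext))

# "nosotros" starts with "no", but not as a whole word,
# so the word boundary \\b rejects it.
hit_partial_word <- grepl(myRegEx, "nosotros, pero")
```

This is why a pattern with a span such as .{1,20} is mainly useful in the fulltext modes: in content mode, each annotation is searched separately.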


[Package act version 1.3.1 Index]