parse_source {rock}    R Documentation

Parsing sources

Description

These functions parse one (parse_source) or more (parse_sources) sources, together with the identifiers, sections, and codes they contain.

Usage

parse_source(
  text,
  file,
  utteranceLabelRegexes = NULL,
  ignoreOddDelimiters = FALSE,
  checkClassInstanceIds = rock::opts$get(checkClassInstanceIds),
  postponeDeductiveTreeBuilding = FALSE,
  filesWithYAML = NULL,
  removeSectionBreakRows = rock::opts$get("removeSectionBreakRows"),
  removeIdentifierRows = rock::opts$get("removeIdentifierRows"),
  removeEmptyRows = rock::opts$get("removeEmptyRows"),
  rlWarn = rock::opts$get("rlWarn"),
  encoding = rock::opts$get("encoding"),
  silent = rock::opts$get("silent")
)

## S3 method for class 'rock_parsedSource'
print(x, prefix = "### ", ...)

parse_sources(
  path,
  extension = "rock|dct",
  regex = NULL,
  recursive = TRUE,
  removeSectionBreakRows = rock::opts$get("removeSectionBreakRows"),
  removeIdentifierRows = rock::opts$get("removeIdentifierRows"),
  removeEmptyRows = rock::opts$get("removeEmptyRows"),
  ignoreOddDelimiters = FALSE,
  checkClassInstanceIds = rock::opts$get(checkClassInstanceIds),
  mergeInductiveTrees = FALSE,
  encoding = rock::opts$get(encoding),
  silent = rock::opts$get(silent)
)

## S3 method for class 'rock_parsedSources'
print(x, prefix = "### ", ...)

## S3 method for class 'rock_parsedSources'
plot(x, ...)

Arguments

text, file

Either a character vector with the source (text) or the path to a file (file), which is read with base::readLines() using encoding encoding. If the argument is named text, it is first checked whether it is the path to an existing file; if so, that file is read. If the argument is named file and does not point to an existing file, an error is produced (useful when calling from other functions). A text should be a character vector in which every element is one line of the original source (as returned by base::readLines()); however, if text is a single-element character vector containing at least one newline character (\n), it is split at the newline characters using base::strsplit(). In short, the first argument can be either a character vector or the path to a file; to be certain that an error is thrown if a file does not exist, name the argument file.

utteranceLabelRegexes

Optionally, a list of two-element vectors used to preprocess utterances before they are stored as labels (the 'utterance labels'); the expressions are Perl regular expressions.

ignoreOddDelimiters

If an odd number of YAML delimiters is encountered, whether this should result in an error (FALSE) or just be silently ignored (TRUE).

checkClassInstanceIds

Whether to check for the occurrence of class instance identifiers specified in the attributes.

postponeDeductiveTreeBuilding

Whether to immediately try to build the deductive tree(s) based on the information in this file (FALSE) or to skip that step (TRUE). Skipping is useful if the full tree information is distributed over multiple files (in which case you should probably call parse_sources() instead of parse_source()).

filesWithYAML

Any additional files to process to look for YAML fragments.

removeSectionBreakRows, removeIdentifierRows, removeEmptyRows

Whether to remove from the QDT, respectively: rows containing section breaks; rows containing only (class instance) identifiers; and empty rows.

rlWarn

Whether to let readLines() warn, e.g. if files do not end with a newline character.

encoding

The encoding of the file to read (in file).

silent

Whether to provide (FALSE) or suppress (TRUE) more detailed progress updates.

x

The object to print.

prefix

The prefix to use before the 'headings' of the printed result.

...

Any additional arguments are passed on to the default print method.

path

The path containing the files to read.

extension

The extension of the files to read; files with other extensions will be ignored. Multiple extensions can be separated by a pipe (|).

regex

Instead of specifying an extension, it is also possible to specify a regular expression; only files matching this regular expression are read. If specified, regex takes precedence over extension.

recursive

Whether to also process subdirectories (TRUE) or not (FALSE).

mergeInductiveTrees

Whether to merge multiple inductive code trees into one; this functionality is not yet implemented.
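
The handling of the text argument described above can be sketched in base R. This is a minimal illustration only, not rock's actual implementation: the helper name normalize_text and its default encoding are assumptions made for this example.

```r
# Hypothetical helper, not part of rock: mimics the documented handling
# of the `text` argument using base R only.
normalize_text <- function(text, encoding = "UTF-8") {
  if (length(text) == 1 && file.exists(text)) {
    # A single string that is the path to an existing file: read that file
    return(readLines(text, encoding = encoding, warn = FALSE))
  }
  if (length(text) == 1 && grepl("\n", text, fixed = TRUE)) {
    # A single string containing newlines: split it into separate lines
    return(strsplit(text, "\n", fixed = TRUE)[[1]])
  }
  # Otherwise, assume every element already is one line of the source
  text
}

# A one-element character vector with a newline becomes two lines
oneString <- "First line\nSecond line"
sourceLines <- normalize_text(oneString)
length(sourceLines)
```

Because a string passed as text is first tried as a file path, a typo in a path can silently end up being parsed as source text; naming the argument file avoids that.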

Value

For rock::parse_source(), an object of class rock_parsedSource; for rock::parse_sources(), an object of class rock_parsedSources. These objects contain the original source(s), the final data frame with utterances and codes, and the code structures.
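
Code that accepts either kind of result can branch on these two classes. The sketch below uses an empty stand-in object rather than a real parse result, so it runs without the example files; the helper name describe_parsed is hypothetical and not part of rock.

```r
# Hypothetical helper, not part of rock: reports which of the two
# documented classes an object has.
describe_parsed <- function(x) {
  if (inherits(x, "rock_parsedSources")) {
    "multiple parsed sources"
  } else if (inherits(x, "rock_parsedSource")) {
    "single parsed source"
  } else {
    stop("Not a parsed source object.")
  }
}

# Stand-in for a real result of rock::parse_source(); it carries the
# class attribute only, none of the actual contents
standIn <- structure(list(), class = "rock_parsedSource")
parsedKind <- describe_parsed(standIn)
```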

Examples

### Get path to example source
examplePath <-
  system.file("extdata", package="rock");

### Get a path to one example file
exampleFile <-
  file.path(examplePath, "example-1.rock");

### Parse single example source
parsedExample <- rock::parse_source(exampleFile);

### Show inductive code tree for the codes
### extracted with the regular expression specified with
### the name 'codes':
parsedExample$inductiveCodeTrees$codes;

### If you want `rock` to be chatty, use:
parsedExample <- rock::parse_source(exampleFile,
                                    silent=FALSE);

### Parse a selection of example sources in that directory
parsedExamples <-
  rock::parse_sources(
    examplePath,
    regex = "(test|example)(.txt|.rock)"
  );

### Show combined inductive code tree for the codes
### extracted with the regular expression specified with
### the name 'codes':
parsedExamples$inductiveCodeTrees$codes;

### Show a source coded with the Qualitative Network Approach
qnaExample <-
  rock::parse_source(
    file.path(
      examplePath,
      "network-example-1.rock"
    )
  );
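
Before calling parse_sources() on a large directory, it can help to preview which files a given extension or regex would select. The following base R sketch does that on a temporary directory. The helper preview_source_files is hypothetical, and the way it turns extension into a regular expression is an assumption that may differ from what rock does internally.

```r
# Hypothetical helper, not part of rock: previews which files a
# pipe-separated `extension` or a `regex` would select.
preview_source_files <- function(path,
                                 extension = "rock|dct",
                                 regex = NULL,
                                 recursive = TRUE) {
  if (is.null(regex)) {
    # Assumed translation of `extension` into a regular expression
    regex <- paste0("\\.(", extension, ")$")
  }
  # If `regex` is specified, it takes precedence over `extension`
  list.files(path, pattern = regex, recursive = recursive)
}

# Try it on a temporary directory containing three empty files
demoDir <- file.path(tempdir(), "rock-preview-demo")
dir.create(demoDir, showWarnings = FALSE)
invisible(file.create(
  file.path(demoDir, c("interview.rock", "codes.dct", "notes.txt"))
))

byExtension <- preview_source_files(demoDir)
byRegex <- preview_source_files(demoDir, regex = "^notes")
```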


[Package rock version 0.8.1 Index]