| matchQuote {Ecfun} | R Documentation |
Match isolated quotes across records
Description
Look for unmatched quotes in a character vector. If found, look for a matching quote starting the next character string in the vector, possibly after a blank line. If found, merge the two strings and return the resulting shortened character vector.
Usage
matchQuote(x, Quote='"', sep=' ',
maxChars2append=2, ...)
Arguments
x |
a character vector to scan for unmatched Quotes
|
Quote |
the quote character to match
|
sep |
the separator inserted between two strings when they are merged
|
maxChars2append |
maximum number of characters allowed in
the following string for it to be
appended to the preceding string; applies
to two adjacent strings (possibly
separated by a blank line) that each
contain an unmatched Quote
|
... |
optional arguments for
|
Details
This function was written to help parse
data from the US Department of Health and
Human Services on
cyber-security breaches affecting 500 or
more individuals. As of 2014-06-03, the
CSV version of these data included
commas inside quotes that are not sep
characters, quotes that are not matched,
and lines with zero characters followed
by lines with 3 characters being a quote
and a comma. This function was written
to drop the blank lines and append the
quote-comma line to the preceding line so
that it contains matching quotes.
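The merge described above can be sketched in base R. This is only an illustration under stated assumptions, not the Ecfun implementation: the helper names countQuotes and mergeQuoteLines are hypothetical, a line counts as "unmatched" when it holds an odd number of Quote characters, at most one blank line is skipped, and no attributes are attached to the result.

```r
# Hypothetical helpers for illustration; NOT the Ecfun source of matchQuote().

# Count occurrences of `quote` in each element of x.
countQuotes <- function(x, quote = '"') {
  lengths(regmatches(x, gregexpr(quote, x, fixed = TRUE)))
}

mergeQuoteLines <- function(x, quote = '"', sep = ' ', maxChars = 2) {
  odd   <- countQuotes(x, quote) %% 2 == 1   # TRUE where a quote is unmatched
  blank <- !grepl('[^[:space:]]', x)         # TRUE where no non-blank characters
  drop  <- rep(FALSE, length(x))
  i <- 1
  while (i < length(x)) {
    if (odd[i]) {
      j <- i + 1
      # Assumption: skip at most one blank line after the unmatched quote.
      if (blank[j] && j < length(x)) j <- j + 1
      # Merge only if the candidate is short and also has an unmatched quote.
      if (odd[j] && nchar(x[j]) <= maxChars) {
        x[i] <- paste(x[i], x[j], sep = sep)
        if (j > i + 1) drop[i + 1] <- TRUE   # the skipped blank line
        drop[j] <- TRUE
        i <- j                               # continue after the merged line
      }
    }
    i <- i + 1
  }
  x[!drop]
}

chvec <- c('abc', 'de"f', ' ', '",', 'g"h', 'matched"quotes"', '')
merged <- mergeQuoteLines(chvec)
```

With the input from the Examples section, the blank third element is dropped and the two-character '",' is pasted onto 'de"f', while 'g"h' is left alone because no short unmatched line follows it.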
Value
The input character vector, possibly shortened, with the following attributes explaining what was found:
unmatchedQuotes |
indices of the input x with an unmatched Quote.
|
blankLinesDropped |
indices of the input x that were dropped because they (1) followed an unmatched Quote and (2) contained no non-blank characters.
|
quoteLinesAppended |
indices of the input x that were concatenated with a preceding line because the two lines contained unmatched Quote characters, and concatenating them produced a line with all Quotes matched.
|
ncharsAppended |
an integer vector of the same length as quoteLinesAppended giving the number of characters in the second line concatenated onto the previous line.
|
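As a rough illustration of how these attributes might be read, the sketch below uses only base R (attr and as.vector). The object res is a hand-built stand-in whose values are copied from the Examples section; it is not produced by calling matchQuote here, and the variable names are hypothetical.

```r
# Hand-built stand-in for a matchQuote() result; attribute values are
# copied from the Examples section, not computed here.
res <- c('abc', 'de"f ",', 'g"h', 'matched"quotes"', '')
attr(res, 'unmatchedQuotes') <- c(2, 4, 5)
attr(res, 'blankLinesDropped') <- 3
attr(res, 'quoteLinesAppended') <- 4
attr(res, 'ncharsAppended') <- 2

# The indices refer to the input vector, so the number of input lines
# removed is the count of dropped blank lines plus appended lines.
nRemoved <- length(attr(res, 'blankLinesDropped')) +
  length(attr(res, 'quoteLinesAppended'))

# as.vector() strips the bookkeeping attributes, leaving a plain
# character vector.
plain <- as.vector(res)
```

Because every index points into the original x rather than the shortened result, length(plain) + nRemoved recovers the length of the input.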
Author(s)
Spencer Graves
See Also
Examples
chvec <- c('abc', 'de"f', ' ', '",', 'g"h',
'matched"quotes"', '')
ch. <- matchQuote(chvec)
# check
chv. <- c('abc', 'de"f ",', 'g"h',
'matched"quotes"', '')
attr(chv., 'unmatchedQuotes') <- c(2, 4, 5)
attr(chv., 'blankLinesDropped') <- 3
attr(chv., 'quoteLinesAppended') <- 4
attr(chv., 'ncharsAppended') <- 2
all.equal(ch., chv.)