| matchQuote {Ecfun} | R Documentation | 
Match isolated quotes across records
Description
Look for unmatched quotes in a character vector. If found, look for a matching quote starting the next character string in the vector, possibly after a blank line. If found, merge the two strings and return the resulting shortened character vector.
Usage
matchQuote(x,  Quote='"', sep=' ', 
          maxChars2append=2, ...) 
Arguments
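| x | a character vector to scan for unmatched Quote characters |
| Quote | the quote character to match |
| sep | separator placed between two strings when they are merged |
| maxChars2append | maximum number of characters allowed in the following string for two adjacent strings (possibly separated by a blank line) with unmatched Quote characters to be concatenated |
| ... | optional additional arguments |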
Details
This function was written to help parse data from the US Department of Health and Human Services on cyber-security breaches affecting 500 or more individuals. As of 2014-06-03, the CSV version of these data included commas inside quotes that are not sep characters, quotes that are not matched, and lines with zero characters followed by lines with 3 characters being a quote and a comma. This function was written to drop the blank lines and append the quote-comma line to the preceding line so that the merged line contains matching quotes.
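The repair described above can be illustrated with a minimal base-R sketch. This is not the Ecfun implementation: the helper countQuotes and the sample vector are hypothetical and exist only for illustration, and the sketch assumes exactly one broken pair separated by one blank element.

```r
# Hypothetical helper for illustration; not part of Ecfun.
# Counts fixed-string occurrences of Quote in each element of x.
countQuotes <- function(x, Quote = '"') {
  lengths(regmatches(x, gregexpr(Quote, x, fixed = TRUE)))
}

# Made-up sample: element 2 opens a quote, element 3 is blank,
# element 4 holds the closing quote and a comma.
lines <- c('abc', 'de"f', '', '",', 'next"line"')

# Elements with an odd number of quotes have an unmatched quote
odd <- which(countQuotes(lines) %% 2 == 1)

# Elements with no non-blank characters
blank <- which(!nzchar(trimws(lines)))

# Merge the unmatched pair, dropping everything between them
merged <- paste(lines[odd[1]], lines[odd[2]], sep = ' ')
cleaned <- c(lines[seq_len(odd[1] - 1)], merged,
             lines[-seq_len(odd[2])])

# Every element of the result now has an even number of quotes
all(countQuotes(cleaned) %% 2 == 0)
```

Counting quotes per element and testing for an odd count is one simple way to flag candidate lines. The actual function also checks the length of the following string against maxChars2append before merging, which this sketch omits.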
Value
The input character vector, possibly shortened, with the following attributes explaining what was found:
- unmatchedQuotes: indices of the input x with an unmatched Quote.
- blankLinesDropped: indices of the input x that were dropped because they (1) followed an unmatched Quote and (2) contained no non-blank characters.
- quoteLinesAppended: indices of the input x that were concatenated with a preceding line because the two lines contained unmatched Quote characters, and concatenating them produced a line with all Quotes matched.
- ncharsAppended: an integer vector of the same length as quoteLinesAppended giving the number of characters in the second line concatenated onto the previous line.
Author(s)
Spencer Graves
See Also
Examples
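# Elements 2, 4 and 5 each contain an unmatched quote;
# element 3 is blank and sits between elements 2 and 4
chvec <- c('abc', 'de"f', ' ', '",', 'g"h',
           'matched"quotes"', '')
ch. <- matchQuote(chvec)
# Check: element 3 is dropped and element 4 is appended to element 2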
chv. <- c('abc', 'de"f ",', 'g"h', 
          'matched"quotes"', '')
attr(chv., 'unmatchedQuotes') <- c(2, 4, 5)
attr(chv., 'blankLinesDropped') <- 3
attr(chv., 'quoteLinesAppended') <- 4
attr(chv., 'ncharsAppended') <- 2 
all.equal(ch., chv.)