funchir-io {funchir} | R Documentation |
Convenient Functions for Data Reading
Description
Functions that come in particularly handy when reading in data, turning verbose code into readable, clean code.
Usage
abbr_to_colClass(inits, counts)
Arguments
inits |
Initials of data types to be passed to a |
counts |
Corresponding counts (as an unbroken string) of each type given in |
Details
abbr_to_colClass was designed specifically for reading in large (read: wide, i.e., with many fields) data files when it is also necessary to tell the reader which types to expect, whether for speed or for accuracy.
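To make the motivation concrete, here is a minimal sketch of handing a vector of type names to a reader. It assumes the vector ends up in the colClasses argument of base R's read.csv, which is only one plausible consumer; the data and the hand-written col_classes vector are invented for illustration and do not come from the package.

```r
# Tiny in-memory CSV standing in for a wide data file (illustrative data only).
csv_text <- "id,score,label\n1,2.5,a\n2,3.5,b"

# Hand-written type vector; abbr_to_colClass is meant to save typing
# vectors like this when there are many columns.
col_classes <- c("integer", "numeric", "character")

# Telling the reader which types to expect skips type guessing.
df <- read.csv(text = csv_text, colClasses = col_classes)
str(df)
```

With only three columns the vector is easy to type by hand; the shorthand pays off once the file has dozens of fields.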
Currently recognized types are blank, character, factor, logical, integer, numeric, Date, date, text and skip, which are abbreviated to their first initials: "b", "c", "f", "l", "i", "n", "D", "d", "t" and "s", respectively.
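The mapping from initials to type names can be pictured as a named lookup vector. The sketch below is an illustration of that idea under the mapping listed above, not the package's actual source; type_lookup is a hypothetical name.

```r
# Hypothetical lookup table mirroring the documented initials.
# Names are case-sensitive, so "D" (Date) and "d" (date) stay distinct.
type_lookup <- c(
  b = "blank", c = "character", f = "factor", l = "logical",
  i = "integer", n = "numeric", D = "Date", d = "date",
  t = "text", s = "skip"
)

inits <- "ncD"

# Split the string into single characters, then look each one up by name.
initials <- strsplit(inits, split = "")[[1L]]
types <- unname(type_lookup[initials])
print(types)
```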
Since like types are often found in sequence, the counts argument can condense the call considerably: if three integer columns appear in a row, for example, we could specify inits="i" and counts="3" instead of the wordier inits="iii", counts="111".
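The condensing step amounts to repeating each type name by its matching digit. A minimal sketch using base R's rep follows; it assumes the counts string is split into single digits, as the text describes, and is not taken from the package's source.

```r
# Type names already resolved from the initials "ic" (illustrative input).
types  <- c("integer", "character")
counts <- "32"

# Read counts digit by digit: "32" becomes c(3L, 2L).
times <- as.integer(strsplit(counts, split = "")[[1L]])

# rep() with a vector of times repeats each element by its own count.
expanded <- rep(types, times = times)
print(expanded)
```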
Note that since counts is read digit by digit, sequences of length greater than 9 must be broken up into chunks of size 9 or smaller, e.g., if there are 20 Date fields in a row, we could set inits="DDD", counts="992". This approach was taken (rather than, say, requiring counts to be an integer vector of counts) as I find it speedier and more concise, and the direct parallel to inits makes issues easier to spot directly in the code instead of, say, checking cbind(strsplit(inits, split = "")[[1L]], counts).
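For long runs, the chunking can be done mechanically. The helper below, chunk_run, is a hypothetical convenience function written for this illustration (it is not part of funchir); it splits a run of n like-typed columns into digits of at most 9 and builds the matching inits and counts strings.

```r
# Hypothetical helper: express a run of n columns of one type
# as parallel inits/counts strings with single-digit counts.
chunk_run <- function(initial, n) {
  stopifnot(nchar(initial) == 1L, n >= 1L)
  full <- n %/% 9L   # number of full chunks of 9
  rest <- n %% 9L    # leftover columns, if any
  digits <- c(rep(9L, full), if (rest > 0L) rest)
  list(
    inits  = strrep(initial, length(digits)),
    counts = paste(digits, collapse = "")
  )
}

# A run of 25 integer columns needs three chunks: 9 + 9 + 7.
res <- chunk_run("i", 25L)
print(res)
```

The two strings stay the same length by construction, which preserves the one-to-one parallel between inits and counts that the design relies on.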
Examples
abbr_to_colClass(inits = "ncifdfd", counts = "1234567")