R: Convenient Functions for Data Reading

funchir-io {funchir}

R Documentation

Convenient Functions for Data Reading

Description

Functions that come in particularly handy when reading in data, turning verbose code into readable, clean code.

Usage

   abbr_to_colClass(inits, counts)

Arguments

`inits`	Initials of data types to be passed to a `colClasses` argument (most typically in `fread` from `data.table` for me). See details.
`counts`	Corresponding counts (as an unbroken string) of each type given in `inits`. See details.

Details

abbr_to_colClass was designed specifically for reading in large (read: wide, i.e., with many fields) data files when it is also necessary to tell the reader which types to expect, whether for speed or for accuracy.

Currently recognized types are blank, character, factor, logical, integer, numeric, Date, date, text and skip, which are abbreviated to their first initials: "b", "c", "f", "l", "i", "n", "D", "d", "t" and "s", respectively.

Since like types are often found in sequence, the counts argument can condense the call considerably. If three integer columns appear in a row, for example, we could specify inits="i" and counts="3" instead of the lengthier inits="iii", counts="111".
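The expansion described above can be sketched in base R. This is only an illustrative re-implementation based on the description in this section, not the package's source: `type_lookup` and `expand_abbr` are hypothetical names, and the real function may differ in its checks and return value.

```r
# Hypothetical lookup table built from the abbreviations listed above.
type_lookup <- c(
  b = "blank", c = "character", f = "factor", l = "logical",
  i = "integer", n = "numeric", D = "Date", d = "date",
  t = "text", s = "skip"
)

# Hypothetical helper (an assumption, not funchir code): expand the
# initials into full type names, repeating each by its one-digit count.
expand_abbr <- function(inits, counts) {
  init_vec  <- strsplit(inits, split = "", fixed = TRUE)[[1L]]
  count_vec <- as.integer(strsplit(counts, split = "", fixed = TRUE)[[1L]])
  stopifnot(
    length(init_vec) == length(count_vec),
    all(init_vec %in% names(type_lookup))
  )
  rep(unname(type_lookup[init_vec]), times = count_vec)
}

# The condensed and the spelled-out calls describe the same three columns.
condensed <- expand_abbr(inits = "i", counts = "3")
verbose   <- expand_abbr(inits = "iii", counts = "111")
identical(condensed, verbose)
```

Reading both strings character by character is what keeps the two arguments aligned position by position.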

Note that since counts is read digit-by-digit, sequences of length greater than 9 must be broken up into chunks of size 9 or smaller, e.g., if there are 20 Date fields in a row, we could set inits="DDD", counts="992". This approach was taken (rather than, say, requiring counts to be an integer vector of counts) as I find it speedier and more concise, and because lining counts up in direct parallel with inits makes mismatches visible in the code itself instead of requiring, say, a check of cbind(strsplit(inits, split = "")[[1L]], counts).
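For long runs, the chunking arithmetic can be done mechanically. The helper below is a hypothetical convenience (`chunk_run` is not part of the package) that splits a run of n like-typed columns into chunks of at most 9 and returns the matching inits and counts strings.

```r
# Hypothetical helper (an assumption, not funchir code): split a run of
# `n` columns of one type into single-digit chunks of size 9 or smaller.
chunk_run <- function(init, n) {
  full   <- n %/% 9L                      # number of full size-9 chunks
  rest   <- n %% 9L                       # leftover columns, if any
  digits <- c(rep(9L, full), if (rest > 0L) rest)
  list(
    inits  = strrep(init, length(digits)),   # one initial per chunk
    counts = paste(digits, collapse = "")    # one digit per chunk
  )
}

# 20 Date fields in a row: 9 + 9 + 2
run <- chunk_run("D", 20L)
run  # inits = "DDD", counts = "992"
```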

Examples

  abbr_to_colClass(inits = "ncifdfd", counts = "1234567")
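As a rough sketch of where such a vector ends up, the snippet below passes a vector of type names to the `colClasses` argument of base R's `read.csv`. The vector is written out by hand as the types that inits="ci", counts="23" describe; that abbr_to_colClass returns exactly this character vector is an assumption here, and the inline CSV is made up for illustration.

```r
# Made-up five-column CSV, supplied inline so the example is self-contained.
csv_text <- "id,name,x,y,z\n1,a,1,2,3\n2,b,4,5,6"

# Two character columns followed by three integer columns, i.e. the types
# that inits = "ci", counts = "23" describe (written out by hand here).
col_classes <- rep(c("character", "integer"), times = c(2L, 3L))

dat <- read.csv(text = csv_text, colClasses = col_classes)
sapply(dat, class)
```

Note that not every type name listed in the details is necessarily accepted by every reader, so the set of initials used should match what the chosen reading function understands.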

[Package funchir version 0.2.2 Index]