importExcelData {exceldata}    R Documentation

Import Excel Data based on the specifications in a data dictionary

Description

This function reads in a data dictionary and data entry table and converts code and category variables to factors as outlined in the dictionary. See the examples.

Usage

importExcelData(
  excelFile,
  dictionarySheet = "DataDictionary",
  dataSheet = "DataEntry",
  id,
  saveWarnings = TRUE,
  setErrorsMissing = TRUE,
  use_labels = TRUE,
  range,
  colnames,
  origin,
  timeUnit = "month"
)

Arguments

excelFile

path and filename of the data file containing the data and dictionary

dictionarySheet

name of the sheet containing the data dictionary, defaults to 'DataDictionary'

dataSheet

name of the data entry sheet within the file, defaults to 'DataEntry'

id

String indicating the ID variable, to display errors by ID instead of row number

saveWarnings

Boolean, if TRUE and there are any warnings then the function will return a list with the data frame and the import warnings

setErrorsMissing

Boolean, if TRUE all values out of range will be set to NA

use_labels

should variable descriptions be added as variable label attributes, default is TRUE

range

Optional, range of the Excel sheet to restrict the import to (e.g. range = "A1:F6")

colnames

Optional, Column names of the dictionary, defaults to those used in the Excel template: c('VariableName', 'Description (optional)', 'Type', 'Minimum', 'Maximum', 'Levels')

origin

Optional, the date origin of Excel dates, defaults to 30 December 1899

timeUnit

Character specifying the unit of time that should be used when creating survival type variables. Allowed values are from lubridate (e.g. 'day', 'week', 'month', 'year')

Details

The exceldata package was designed around the DataDictionary.xlsm template. More documentation and the current downloadable template can be found at:

https://github.com/biostatsPMH/exceldata#readme

Note that as of release 0.1.1.1 the log file gives row numbers corresponding to the row number in Excel, as opposed to the row number in the data frame.

Warning: If setErrorsMissing = TRUE then a subsequent call to checkData will not return any errors, because the out-of-range values have already been set to missing.

NOTE: This function reads in only those columns that are listed in the data dictionary.
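
The default origin can be illustrated in base R. This is a minimal sketch of how an Excel date serial number maps to a calendar date under an origin of 30 December 1899; it is not the package's internal code, and the serial number is a made-up example value.

```r
# Excel stores dates as a count of days from the origin.
# Hypothetical serial number, as it might appear in an unformatted Excel cell.
excel_serial <- 44197

# Convert using the documented default origin of 30 December 1899.
excel_date <- as.Date(excel_serial, origin = "1899-12-30")
print(excel_date)
```

If a workbook uses a different date system, a different origin can be passed through the origin argument.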

Value

A list containing two data frames: the data dictionary and the data table

Examples

exampleDataFile <- system.file("extdata", "exampleData.xlsx", package = "exceldata")
import <- importExcelData(exampleDataFile,
  dictionarySheet = 'DataDictionary',
  dataSheet = 'DataEntry')

# The imported data dictionary
dictionary <- import$dictionary
head(dictionary)

# The imported data, with calculated variables
data <- import$data
head(data)

# Simple univariate plots with outliers
plots <- plotVariables(data = data, dictionary = dictionary, IDvar = 'ID')
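
# A further sketch using the optional arguments documented above.
# It reuses exampleDataFile from the first example and assumes the example
# data has an ID column named 'ID', as in the plotVariables call.
import2 <- importExcelData(exampleDataFile,
  dictionarySheet = 'DataDictionary',
  dataSheet = 'DataEntry',
  id = 'ID',                 # report problems by ID instead of row number
  setErrorsMissing = FALSE,  # keep out-of-range values rather than setting them to NA
  timeUnit = 'year')         # unit used for survival type variables

# Inspect which elements were returned
names(import2)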


[Package exceldata version 0.1.1.3 Index]