dataInfo {esmtools}    R Documentation

Display information regarding the dataset in a succinct way.

Description

The 'dataInfo()' function displays detailed information about a dataset in a style similar to 'sessionInfo()'. It reports details such as file size, creation and update times, number of columns and rows, number of participants, variable names, and more. This information is useful for reproducibility, for tracking the dataset, and for ensuring transparency in data analysis workflows.

Usage

dataInfo(
  file_path = NULL,
  read_fun = NULL,
  idvar = NULL,
  timevar = NULL,
  validvar = NULL,
  citation = NULL,
  URL = NULL,
  DOI = NULL,
  path = TRUE,
  variables = TRUE
)

Arguments

file_path

The path or URL of the dataset file.

read_fun

The function used to read the dataset file.
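
As a minimal sketch of what such a reader might look like, the snippet below writes a small, made-up file to a temporary location and reads it back with a function that takes a path and returns a data frame. The file contents and the column names 'id' and 'sent' are assumptions chosen for illustration; they are not taken from the package.

```r
# Hypothetical two-row file written to a temporary location (illustration only).
tmp <- tempfile(fileext = ".csv")
write.csv2(
  data.frame(
    id = c(1, 1),
    sent = c("2023-01-01 09:00:00", "2023-01-01 12:30:00")
  ),
  tmp,
  row.names = FALSE
)

# A reader that takes a file path and returns a data frame,
# with the 'sent' column parsed as a date-time.
read_fun <- function(x) {
  dat <- read.csv2(x)
  dat$sent <- as.POSIXct(
    as.character(dat$sent),
    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
  )
  dat
}

dat <- read_fun(tmp)
class(dat$sent)
```

Any function with this shape (a path in, a data frame out) should be usable here; the parsing step simply keeps the time variable in a date-time class.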

idvar

The identifier variable(s) in the dataset, represented as a character vector.

timevar

The name(s) of the time variable(s) in the dataset. Preferably, use the sent timestamp variable (the time at which the beep was sent to the participant).

validvar

The name of the validation variable in the dataset; the variable itself is expected to be a numerical vector. If NULL, the function does not display compliance rate information.

citation

A character element to cite the article or document associated with the script.

URL

The URL associated with the dataset (or with the associated article), represented as a character string. If NULL, the function will not display the URL information.

DOI

The Digital Object Identifier (DOI) of the dataset, if applicable. If NULL, the function will not display the DOI information.

path

If TRUE, the function will display the path information.

variables

A logical value indicating whether to display the names of the dataset's variables. Set to TRUE to display variable information, and FALSE to omit it. The default is TRUE.
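
The optional arguments can be combined in a single call, sketched below under stated assumptions. The citation and DOI strings are placeholders invented for illustration, and the reader is written in base R to avoid extra dependencies. The call runs only if esmtools is installed, and 'path = FALSE' is presumed to omit the path information.

```r
# Reader kept in base R so that the sketch needs no extra packages.
read_fun <- function(x) {
  dat <- read.csv2(x)
  dat$sent <- as.POSIXct(as.character(dat$sent), format = "%Y-%m-%d %H:%M:%S")
  dat
}

# Run only when esmtools is installed.
if (requireNamespace("esmtools", quietly = TRUE)) {
  file_path <- system.file("extdata", "esmdata_sim.csv", package = "esmtools")
  esmtools::dataInfo(
    file_path = file_path, read_fun = read_fun,
    idvar = "id", timevar = "sent",
    citation = "Placeholder citation for the associated article", # hypothetical value
    DOI = "10.0000/placeholder",                                  # hypothetical value
    path = FALSE,      # presumably omits the path information
    variables = FALSE  # omit the variable names
  )
}
```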

Details

The 'dataInfo()' function provides a comprehensive summary of information about the dataset, such as its size, creation and update times, number of columns and rows, number of participants, and variable names.

Value

The 'dataInfo()' function displays detailed information about the dataset. The result can also be stored as a list in a variable.

A kable object that summarizes the information on the data, the current R session, and the article or document associated with the script.
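
As a hedged illustration of storing the result, the sketch below assigns the output to a variable and inspects its top-level structure with 'str()'. The variable name 'info' is arbitrary, and the names of the returned elements are not documented here, so none are assumed. The call runs only if esmtools is installed.

```r
# Same style of reader as in the Examples, written in base R.
read_fun <- function(x) {
  dat <- read.csv2(x)
  dat$sent <- as.POSIXct(as.character(dat$sent), format = "%Y-%m-%d %H:%M:%S")
  dat
}

# Run only when esmtools is installed.
if (requireNamespace("esmtools", quietly = TRUE)) {
  file_path <- system.file("extdata", "esmdata_sim.csv", package = "esmtools")

  # Keep the returned object instead of only displaying it.
  info <- esmtools::dataInfo(
    file_path = file_path, read_fun = read_fun,
    idvar = "id", timevar = "sent"
  )

  # Inspect the top level of whatever object was returned.
  str(info, max.level = 1)
}
```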

Examples

library(dplyr)

# Load data
file_path <- system.file("extdata", "esmdata_sim.csv", package = "esmtools")

# Create a function to read the data
read_fun <- function(x) read.csv2(x) %>% 
    mutate(sent = as.POSIXct(as.character(sent), format="%Y-%m-%d %H:%M:%S"))

# Get data information
dataInfo(
  file_path = file_path, read_fun = read_fun,
  idvar = "id", timevar = "sent"
)

[Package esmtools version 1.0.1 Index]