R: Detect incorrect formatting of a dataset

view.errors.umbrella {metaumbrella}

R Documentation

Detect incorrect formatting of a dataset

Description

Check the formatting of a dataset to ensure it can be passed to the functions of the metaumbrella package.

Usage

view.errors.umbrella(data, return = "data_and_messages")

Arguments

`data`	a data frame whose formatting should be checked.
`return`	the type of information returned by the function. Must be one of "messages", "data_and_messages" (the default), or "data".

Details

The functions of the metaumbrella package require a specifically formatted dataset (see metaumbrella-package). The view.errors.umbrella() function checks that a data frame meets all of these requirements. If it finds formatting issues, it produces messages describing them and identifies the rows and columns in which they occur.
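
To make the idea concrete, here is a minimal base R sketch of one check of this kind: flagging rows whose lower confidence bound exceeds the point estimate. It is not the package's implementation. The helper `flag_ci_lo()`, the toy data, the message text, and the choice of the "ERROR" label are all assumptions made for illustration; only the column names `value`, `ci_lo`, `column_type_errors` and `column_errors` are taken from this page.

```r
# Hypothetical helper, NOT part of metaumbrella: flags rows in which the
# lower bound of the confidence interval is greater than the point estimate.
flag_ci_lo <- function(data) {
  bad <- !is.na(data$ci_lo) & !is.na(data$value) & data$ci_lo > data$value
  # Mimic the two diagnostic columns described on this page.
  # Labelling this issue "ERROR" (rather than "WARNING") is an assumption.
  data$column_type_errors <- ifelse(bad, "ERROR", NA_character_)
  data$column_errors <- ifelse(bad, "ci_lo is greater than value", NA_character_)
  # Keep only the problematic rows
  data[bad, , drop = FALSE]
}

# Toy data (made up for illustration)
toy <- data.frame(
  study = c("A", "B", "C"),
  value = c(1.5, 2.0, 0.8),
  ci_lo = c(1.1, 2.6, 0.5)
)

flagged <- flag_ci_lo(toy)
print(flagged)
```

The real function runs many such checks at once, so this sketch only illustrates the pattern of returning problematic rows together with a severity label and a description.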

Value

Depending on the value passed to the return argument, different information is returned:

`"messages"`	returns global messages describing the different formatting issues.

`"data"`	returns the rows of the original dataset with formatting issues (see below).

`"data_and_messages"`	returns both (i) global messages describing the different formatting issues and
	(ii) the rows of the original dataset with formatting issues (see below).

When a dataset is returned (i.e., when "data" or "data_and_messages" is specified in the return argument), the rows with problematic formatting are identified and two new columns are added to the original dataset (column_type_errors and column_errors). These columns help diagnose the formatting issues.

A WARNING value in the column_type_errors column indicates a potential issue that should be checked but that does not prevent calculations.
An ERROR value in the column_type_errors column indicates an issue that must be solved before running calculations.
The text in the column_errors column describes the issues encountered for each problematic row.
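
Once a dataset has been returned, the two added columns can be used to triage problems with ordinary data frame operations. The sketch below uses a hand-made data frame, `checked`, that only imitates the shape described above: its rows, the `study` column, and the message texts are invented for illustration. With real data, `checked` would instead be the dataset returned by view.errors.umbrella().

```r
# Hand-made stand-in for a returned dataset (contents are invented)
checked <- data.frame(
  study = c("A", "B", "C"),
  column_type_errors = c("ERROR", "WARNING", "ERROR"),
  column_errors = c("message 1", "message 2", "message 3"),
  stringsAsFactors = FALSE
)

# Rows that must be fixed before running calculations
must_fix <- checked[checked$column_type_errors == "ERROR", ]

# Rows that should be reviewed but do not block calculations
to_review <- checked[checked$column_type_errors == "WARNING", ]

# Count issues by severity
severity_counts <- table(checked$column_type_errors)
print(severity_counts)
```

Handling the ERROR rows first is a natural order, since those are the ones that must be solved before running calculations.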

Examples

library(metaumbrella)

### start from df.OR, an example dataset shipped with metaumbrella
df.errors1 <- df.errors2 <- df.errors3 <- df.errors4 <- df.OR

### include some unknown measures
df.errors1$measure[c(1,4,12)] <- "unknown_measure"
view.errors.umbrella(df.errors1, return = "data_and_messages")

### include some non-numeric inputs where numeric values are expected
df.errors2$value[c(2,13,15)] <- c("a", "b", "c")
view.errors.umbrella(df.errors2, return = "data")

### make the lower bound of a confidence interval greater than the value
df.errors3$ci_lo[c(12,14,21)] <- c(5,6,7)
view.errors.umbrella(df.errors3, return = "messages")

### create errors in sample sizes
df.errors4$n_cases_exp[c(5,10,15)] <- c(100, 200, 300)
view.errors.umbrella(df.errors4, return = "data_and_messages")

[Package metaumbrella version 1.0.11]