mergeAttr {eatTools} R Documentation

Merge Two Data Frames with Additional Messages While Maintaining Variable Attributes

Description

This is a wrapper for the merge function. merge does not maintain variable attributes, so mergeAttr is useful when they should be kept. For example, if SPSS data are imported via read.spss, variable and value labels are stored as attributes, which get lost if the data are subsequently merged. Moreover, the function gives additional messages if (combinations of) by-variables are not unique in at least one data.frame, if by-variables have different classes, or if some units of the by-variables are missing in one of the data sets. Users can choose which kinds of messages are printed.
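The attribute loss described above can be reproduced with base R alone. The following is a minimal sketch, not the implementation used by mergeAttr: restore_attr is a hypothetical helper introduced only for illustration, and the data frames survey and extra are made up. It copies a single attribute from the original columns back onto the merged result.

```r
# Hypothetical helper (not part of eatTools): copy one attribute from the
# columns of `source` onto the equally named columns of `target`.
restore_attr <- function(target, source, which = "variable.label") {
  for (v in intersect(names(target), names(source))) {
    value <- attr(source[[v]], which)
    if (!is.null(value)) attr(target[[v]], which) <- value
  }
  target
}

# Made-up example data with a variable label on 'happy'
survey <- data.frame(id = 1:3, happy = c("low", "low", "medium"))
attr(survey$happy, "variable.label") <- "happiness in the workplace"
extra <- data.frame(id = c(2, 4), convicted = c(FALSE, TRUE))

# merge drops the custom attribute
merged <- merge(survey, extra, all = TRUE)
label_after_merge <- attr(merged$happy, "variable.label")

# copying it back from the original data frame
restored <- restore_attr(merged, survey)
label_restored <- attr(restored$happy, "variable.label")
```

Here label_after_merge is NULL, whereas label_restored holds the original label again. How mergeAttr decides which attributes to restore is controlled by its setAttr and onlyVarValLabs arguments.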

Usage

mergeAttr(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
      sort = TRUE, suffixes = c(".x",".y"), setAttr = TRUE, onlyVarValLabs = TRUE,
      homoClass = TRUE, unitName = "unit", xName = "x", yName = "y",
      verbose = c("match", "unique", "class", "dataframe", "common"))

Arguments

x

first data frame to be merged.

y

second data frame to be merged.

by

specifications of the columns used for merging

by.x

specifications of the columns in x used for merging

by.y

specifications of the columns in y used for merging

all

logical; all = L is shorthand for all.x = L and all.y = L, where L is either TRUE or FALSE.

all.x

logical; if TRUE, then extra rows will be added to the output, one for each row in x that has no matching row in y. These rows will have NAs in those columns that are usually filled with values from y. The default is FALSE, so that only rows with data from both x and y are included in the output.

all.y

logical; analogous to all.x.

sort

logical. Should the result be sorted on the by columns?

suffixes

a character vector of length 2 specifying the suffixes to be used for making unique the names of columns in the result which are not used for merging (appearing in by etc).

setAttr

Logical: restore the variable attributes? If FALSE, the behavior of mergeAttr equals the behavior of merge.

onlyVarValLabs

Logical: If TRUE, only the variable and value labels as captured by read.spss and stored by convertLabel from the eatAnalysis package will be restored. If FALSE, all variable attributes will be restored.

homoClass

Logical: Beginning with R version 3.5, merge may give an error if the class of the by-variables differs between the two data.frames. If TRUE, the class of the by-variable(s) will be homogenized before merging.

unitName

Optional: Set the name for the unit variable to get more informative messages. This is mainly relevant if mergeAttr is called from other functions.

xName

Optional: Set the name for the x data.frame to get more informative messages. This is mainly relevant if mergeAttr is called from other functions.

yName

Optional: Set the name for the y data.frame to get more informative messages. This is mainly relevant if mergeAttr is called from other functions.

verbose

Optional: Choose whether messages concerning missing levels in by-variables should be printed on console ("match"), or messages concerning uniqueness of by-variables ("unique"), or messages concerning different classes of by-variables ("class"), or messages concerning appropriate class (data.frame) of x and y ("dataframe"), or messages concerning additional common variables (except by-variables; "common"). Multiple choices are possible, e.g. verbose = c("match", "class"). If verbose = TRUE, all messages are printed; if verbose = FALSE, no messages are printed at all. The default is equivalent to verbose = TRUE.
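To make the message categories of verbose and the purpose of homoClass more concrete, the following base R sketch computes the kind of information each category refers to. It is an illustration under assumptions, not the code used by mergeAttr: the variable names are made up, and converting both by-variables to character is only one possible way to homogenize their classes.

```r
x <- data.frame(id = 1:3, sex = c("male", "male", "female"))
y <- data.frame(id = factor(c(2, 2, 4)),
                status = c("married", "married", "single"))
key <- "id"

# "class": compare the classes of the by-variable in both data frames
by_classes <- c(x = class(x[[key]]), y = class(y[[key]]))

# One possible way to homogenize (an assumption, not necessarily what
# homoClass = TRUE does): convert both keys to character
x[[key]] <- as.character(x[[key]])
y[[key]] <- as.character(y[[key]])

# "unique": by-values that occur more than once within a data frame
dup_in_x <- unique(x[[key]][duplicated(x[[key]])])
dup_in_y <- unique(y[[key]][duplicated(y[[key]])])

# "match": units that are present in only one of the data frames
only_in_x <- setdiff(x[[key]], y[[key]])
only_in_y <- setdiff(y[[key]], x[[key]])

# "common": variables other than the by-variable shared by both data frames
common_vars <- setdiff(intersect(names(x), names(y)), key)

merged <- merge(x, y, by = key, all = TRUE)
```

In this made-up example the by-variable is an integer in x and a factor in y, the value "2" is duplicated in y, units "1" and "3" occur only in x, unit "4" occurs only in y, and there are no further common variables.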

Value

data frame. See the help page of merge for further details.

Examples

### data frame 1, variable 'happy' with variable.label 'happiness in the workplace'
df1 <- data.frame ( id = 1:3, sex = factor ( c("male", "male", "female")),
       happy = c("low", "low", "medium"))
attr(df1[,"happy"], "variable.label") <- "happiness in the workplace"

### data frame 2 without labels 
df2 <- data.frame ( id = as.factor(c(2,2,4)), status = factor ( c("married", "married", "single")),
       convicted = c(FALSE, FALSE, TRUE))

### lost label after merging
df3 <- merge(df1, df2, all = TRUE)
attr(df3[,"happy"], "variable.label")

### maintain label
df4 <- mergeAttr(df1, df2, all = TRUE, onlyVarValLabs = FALSE)
attr(df4[,"happy"], "variable.label")

### adapt messages
df5 <- mergeAttr(df1, df2, all = TRUE, onlyVarValLabs = FALSE, unitName = "student",
       xName = "student questionnaire", yName = "school questionnaire",
       verbose = c("match", "unique"))

[Package eatTools version 0.7.6 Index]