add_name_labs {labelr}R Documentation

Add or Modify Data Frame Variable Name Labels

Description

Add descriptive variable name labels (up to one per column) to the columns of a data.frame.

Usage

add_name_labs(
  data,
  name.labs = NULL,
  vars = NULL,
  labs = NULL,
  init.max = NULL
)

anl(data, name.labs = NULL, vars = NULL, labs = NULL, init.max = NULL)

Arguments

data

the data.frame you wish to begin labeling.

name.labs

a named character vector, where names are current data.frame column (variable) names, and where values are the proposed labels. If this is NULL, vars and labs arguments may not be NULL. If latter are not NULL, this (name.labs) argument must be NULL.

vars

the names of the columns (variables) to which name labels will be applied. If NULL, labs arg must also be NULL and name.labs cannot be NULL.

labs

the proposed variable name labels to applied to the columns (variables). If non-NULL, vars arg must also be non-NULL.

init.max

If non-NULL, this must be a 1L integer, indicating the maximum number of unique values that a variable may have for it to receive placeholder value labels, which will consist of the variable's actual values coerced to character values. If NULL, or if the variable is numeric with decimal values, the variable will not be given initialized variable value labels.

Details

Note: anl is a compact alias for add_name_labs: they do the same thing, and the former is easier to type.

add_name_labs works with get_name_labs, use_name_labs, and use_var_names to facilitate the creation, accessing, and substitution of variable name labels for variable names.

Each variable (column) of a data.frame can receive one and only one "name label," which typically is a noun phrase that expounds the meaning of contents of the variable's name (e.g., "Weight in ounces at birth" might be a name label for a column called "wgt"). add_name_labs takes a data.frame and either a named character vector (names are current variable names, values are proposed name labels) supplied to the name.labs arg or two separate character vectors (one each for current variable names and proposed variable name labels, respectively) supplied to vars and labs args, respectively. If using the second approach, the order of each entry matters (e.g., the first variable name entry to the vars argument will be given the label of the first name label entry to the labs argument, and so on).

Note that any non-name-labeled columns will receive their own names as default name labels (e.g., if var "mpg" of mtcars is not assigned a name label, it will be given the default name label of "mpg"). Note also that other labelr functions (e.g., add_val_labs) will initialize name labels and other labelr attribute meta-data in this same fashion. Name labels can be removed with drop_name_labs.

Value

A data.frame, with new name labels added (call get_name_labs to see them), other provisional/default labelr label information added, and previous user-added labelr label information preserved.

Examples

# create a data set
df <- mtcars

# variable names and their labels
names_labs_vec <- c(
  "mpg" = "Miles/(US) gallon",
  "cyl" = "Number of cylinders",
  "disp" = "Displacement (cu.in.)",
  "hp" = "Gross horsepower",
  "drat" = "Rear axle ratio",
  "wt" = "Weight (1000 lbs)",
  "qsec" = "1/4 mile time",
  "vs" = "Engine (0 = V-shaped, 1 = straight)",
  "am" = "Transmission (0 = automatic, 1 = manual)",
  "gear" = "Number of forward gears",
  "carb" = "Number of carburetors"
)

# assign variable labels
df <- add_name_labs(df,
  vars = names(names_labs_vec),
  labs = names_labs_vec
)

# see what we have
get_name_labs(df)

# use these
df_labs_as_names <- use_name_labs(df)
head(df_labs_as_names)[1:3] # these are verbose, so, only show first three
head(df)[1:3]

# now revert back
df_names_as_before <- use_var_names(df_labs_as_names)
head(df_names_as_before)[1:3] # indeed, they are as before
identical(head(df), head(df_names_as_before))

# strip name label meta-data information from df
# NOT same as use_var_names(), which preserves the info but "turns it off"
# this strips the name labels meta-data from df altogether
df <- drop_name_labs(df)

# see what we have
get_name_labs(df) # they're gone

# alternative syntax (if you have a named vector like names_labs_vec)
# assign variable name labels
df <- add_name_labs(df,
  name.labs = c(
    "mpg" = "Miles/(US) gallon",
    "cyl" = "Number of cylinders",
    "disp" = "Displacement (cu.in.)",
    "hp" = "Gross horsepower",
    "drat" = "Rear axle ratio",
    "wt" = "Weight (1000 lbs)",
    "qsec" = "1/4 mile time",
    "vs" = "Engine (0 = V-shaped, 1 = straight)",
    "am" = "Transmission (0 = automatic, 1 = manual)",
    "gear" = "Number of forward gears",
    "carb" = "Number of carburetors"
  )
)

# replace two variable name labels, keeping the others
df <- add_name_labs(df,
  name.labs = c(
    "disp" = toupper("displacement"),
    "mpg" = toupper("miles per gallon")
  )
)

attributes(df) # show all attributes
get_name_labs(df) # show only the variable name labels
get_name_labs(df, var = c("disp", "mpg"))

# again, strip name label meta-data information from df
# NOT same as use_var_names(), which preserves the info but "turns it off"
df <- drop_name_labs(df)

# see what we have
get_name_labs(df) # they're gone

# alternative syntax to add name labels
df <- add_name_labs(df,
  vars = c("carb", "am"),
  labs = c("how many carburetors?", "automatic or stick?")
)

# see what we have
get_name_labs(df) # they're back! (and placeholders for others)

# add another
df <- add_name_labs(df,
  vars = c("mpg"),
  labs = c("miles per gallon, of course")
)

# see what we have
get_name_labs(df) # it's been added, and others preserved

head(use_name_labs(df)[c(1, 9, 11)]) # verbose, but they're there

[Package labelr version 0.1.5 Index]