dems_neat {neatStats}

R Documentation

Demographics

Description

Prints participant count, age mean and SD, and gender ratio from a given dataset.

Usage

dems_neat(
  data_per_subject,
  group_by = NULL,
  gender_col = NULL,
  age_col = NULL,
  male = "male",
  female = "female",
  percent = FALSE,
  round_perc = 0,
  show_fem = NULL,
  age_range = FALSE,
  age_min = NULL,
  age_max = NULL
)

Arguments

`data_per_subject`	Data frame from which demographics are to be calculated. Should contain columns named "`age`" and "`gender`" (or, alternatively, "`sex`"). If the columns are named differently, they can be specified via the `gender_col` and `age_col` parameters. The `age` column must contain only numbers or `NA`, while the `gender` column must contain only `1` (= male) or `2` (= female), either as numbers or as strings, or `NA`. A different gender coding can be set via the parameters `male` and `female` (but `1`/`2` will be checked for first in any case).
`group_by`	Optionally, the name(s) of the column(s) to group by, taken from the data frame provided as `data_per_subject`.
`gender_col`	Optionally, the name of the column in the data frame that contains the gender (sex) information.
`age_col`	Optionally, the name of the column in the data frame that contains the age information.
`male`	Alternative code for male: by default, it is the string `"male"`. Whatever string is given, its abbreviations will also be accepted (e.g. `"m"`). (Letter case does not matter: e.g. `Male` and `MALE` are both treated the same as `male`.)
`female`	Alternative code for female: by default, it is the string `"female"`. Whatever string is given, its abbreviations will also be accepted (e.g. `"fem"`). (Letter case does not matter.)
`percent`	Logical. If `TRUE`, gender ratios (and the "unknown" ratios based on `NA` values) are presented as percentages. If `FALSE`, they are presented as counts (i.e., numbers of subjects).
`round_perc`	Number of digits to round to when using percentages.
`show_fem`	Logical or `NULL`. If `TRUE`, the numbers of both male and female are displayed. If `FALSE`, only the number of males is displayed. If `NULL` (default), only the number of males is displayed when there are no unknown cases, but both numbers are displayed when there are any unknown cases.
`age_range`	Logical, `FALSE` by default. If `TRUE`, also displays age range per group (minimum and maximum ages).
`age_min`	If a number is given, removes all ages below the given number (exclusive: ages equal to it are kept) before any age calculation.
`age_max`	If a number is given, removes all ages above the given number (exclusive: ages equal to it are kept) before any age calculation.
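
The exclusive bounds of `age_min` and `age_max` can be illustrated with a minimal base R sketch. The helper `apply_age_bounds` is hypothetical and not part of neatStats; it only mimics the documented effect, assuming that out-of-range ages are replaced with `NA` (as the comment in the Examples suggests) rather than reproducing the package's internal code.

```r
# Hypothetical helper (not part of neatStats): mimics the documented
# effect of age_min / age_max on a numeric vector of ages.
apply_age_bounds <- function(ages, age_min = NULL, age_max = NULL) {
  if (!is.null(age_min)) {
    # only ages strictly below age_min are dropped (set to NA)
    ages[!is.na(ages) & ages < age_min] <- NA
  }
  if (!is.null(age_max)) {
    # only ages strictly above age_max are dropped (set to NA)
    ages[!is.na(ages) & ages > age_max] <- NA
  }
  ages
}

ages <- c(6, 7, 8.5, 6, 5, 16.5, 17, 16, 45.8, 77)
bounded <- apply_age_bounds(ages, age_min = 8, age_max = 50)
print(bounded)

# values exactly on a bound are kept
on_bounds <- apply_age_bounds(c(8, 50), age_min = 8, age_max = 50)
print(on_bounds)
```

Under these assumptions, an age of exactly `8` or `50` survives the filter, while `7` and `77` become `NA` and would then count as unknown.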

Details

If `gender_col` and/or `age_col` are not specified, the function will first look for columns named precisely "age" and "gender". If either is not found, the function looks for the same names in any letter case (e.g. "AGE" or "Gender"). If still no "gender" column is found, the function looks for a "sex" column in the same manner. If no column is found for either, all related values will be counted as "unknown" (`NA`).
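
The lookup order described above can be sketched in base R as follows. The function `find_col` is a hypothetical illustration, not the package's actual implementation: for each candidate name it tries an exact match first, then a case-insensitive match, and returns `NULL` when nothing fits.

```r
# Hypothetical sketch (not neatStats code) of the documented lookup order.
find_col <- function(df, candidates) {
  cols <- names(df)
  for (cand in candidates) {
    # 1) exact match, e.g. "age"
    if (cand %in% cols) {
      return(cand)
    }
    # 2) same name in any letter case, e.g. "AGE" or "Gender"
    hit <- cols[tolower(cols) == tolower(cand)]
    if (length(hit) > 0) {
      return(hit[1])
    }
  }
  NULL  # nothing found: related values would count as "unknown"
}

df1 <- data.frame(AGE = c(20, 30), Sex = c(1, 2))
age_name <- find_col(df1, "age")
gender_name <- find_col(df1, c("gender", "sex"))

df2 <- data.frame(alter = c(20, 30), geschlecht = c(1, 2))
missing_name <- find_col(df2, c("gender", "sex"))

print(age_name)
print(gender_name)
print(missing_name)
```

In this sketch, "sex" is only consulted after both the exact and the case-insensitive search for "gender" have failed, matching the order given in the text.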

If NA values are found in either the age or gender column, the ratio (or count) of unknown cases will be displayed everywhere. Otherwise it will simply not be displayed anywhere.
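
For orientation, the kind of per-group summary that the function prints can be approximated in base R. This is a rough sketch under stated assumptions, not the package's code: `summarise_group` is a hypothetical helper, gender is assumed to be coded `1`/`2`, and age statistics are assumed to ignore `NA` values.

```r
# Illustrative data with missing values in both columns
dat <- data.frame(
  conditions = c("x", "y", "x", "y", "y", "x", "x", "x", "y", "x"),
  gender = c(2, 2, NA, NA, 1, 1, 1, 2, NA, NA),
  age = c(6, 7, 8.5, 6, 5, 16, NA, 16, 45, 77)
)

# Hypothetical helper (not part of neatStats): one summary row per group
summarise_group <- function(d) {
  data.frame(
    n = nrow(d),
    age_mean = mean(d$age, na.rm = TRUE),
    age_sd = sd(d$age, na.rm = TRUE),
    male = sum(d$gender == 1, na.rm = TRUE),
    female = sum(d$gender == 2, na.rm = TRUE),
    gender_unknown = sum(is.na(d$gender)),  # NA genders count as unknown
    age_unknown = sum(is.na(d$age))         # NA ages count as unknown
  )
}

# split by group, summarise each part, and stack the rows
res <- do.call(rbind, lapply(split(dat, dat$conditions), summarise_group))
print(res)
```

The `gender_unknown` and `age_unknown` columns correspond to the "unknown" cases mentioned above, which are only reported when at least one `NA` is present.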

Examples

# below is an illustrative example dataset
# (the "subject" and "measure_x" columns are not used in the function)
dat = data.frame(
    subject = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    conditions = c('x', 'y', 'x', 'y', 'y', 'x', 'x', 'x', 'y', 'x'),
    gender = c(2, 2, 1, 2, 1, 2, 2, 2, 1, 1),
    age = c(6, 7, 8.5, 6, 5, 16.5, 17, 16, 45.8, 77),
    measure_x = c(83, 71, 111, 70, 92, 75, 110, 111, 110, 85),
    stringsAsFactors = TRUE
)

# print demographics (age and gender) per "conditions":
dems_neat(dat, group_by = 'conditions')

# replace unlikely ages with NAs
dems_neat(dat,
          group_by = 'conditions',
          age_min = 8,
          age_max = 50)

# remove only high values, and display age ranges
dems_neat(dat,
          group_by = 'conditions',
          age_max = 45,
          age_range = TRUE)

# another dataset, with some missing values
dat = data.frame(
    subject = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    conditions = c('x', 'y', 'x', 'y', 'y', 'x', 'x', 'x', 'y', 'x'),
    gender = c(2, 2, NA, NA, 1, 1, 1, 2, NA, NA),
    age = c(6, 7, 8.5, 6, 5, 16, NA, 16, 45, 77),
    measure_x = c(83, 71, 111, 70, 92, 75, 110, 111, 110, 85),
    stringsAsFactors = TRUE
)
# again print demographics per "conditions":
dems_neat(dat, group_by = 'conditions')

# another dataset, with no "age"/"gender" columns
dat = data.frame(
    subject = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    conditions = c('x', 'y', 'x', 'y', 'y', 'x', 'x', 'x', 'y', 'x'),
    geschlecht = c(2, 2, NA, NA, 1, 1, 1, 2, NA, NA),
    alter = c(6, 7, 8.5, 6, 5, 16, NA, 16, 45, 77),
    measure_y = c(83, 71, 111, 70, 92, 75, 110, 111, 110, 85),
    stringsAsFactors = TRUE
)

# the following will return "unknowns"
dems_neat(dat, group_by = 'conditions')

# gender column specified
dems_neat(dat, group_by = 'conditions', gender_col = 'geschlecht')

# both columns specified
dems_neat(dat,
          group_by = 'conditions',
          age_col = 'alter',
          gender_col = 'geschlecht')

[Package neatStats version 1.13.3 Index]