impute {drugprepr} R Documentation

Impute missing or implausible values

Description

This is a workhorse function used by impute_ndd, impute_qty and others.

Usage

impute(
  data,
  variable,
  method = c("ignore", "mean", "median", "mode", "replace", "min", "max", "sum"),
  where = is.na,
  group,
  ...,
  replace_with = NA_real_
)

Arguments

data

A data frame containing columns prodcode, pracid, patid

variable

Unquoted name of the column in data to be imputed

method

Method for imputing the values. The available methods are listed after the argument descriptions.

where

Logical vector, or function applied to variable returning such a vector, indicating which elements to impute. Defaults to is.na.

group

Level of structure for imputation. Defaults to whole study population.

...

Extra arguments, currently ignored

replace_with

Value to insert when the method 'replace' is selected. Defaults to NA_real_.

  • ignore. Do nothing, leaving input unchanged.

  • mean. Replace values with the mean by group.

  • median. Replace values with the median by group.

  • mode. Replace values with the most common value by group.

  • replace. Replace values with replace_with, which defaults to NA (i.e. mark as missing).

  • min. Replace with minimum value.

  • max. Replace with maximum value.

  • sum. Replace with sum of values.
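
The group-wise methods listed above can be illustrated with a minimal base R sketch. This is not the drugprepr implementation: the helper impute_by_group, the toy data frame rx and its qty column are hypothetical, and the sketch assumes that the summary is computed only from the values that are not flagged for imputation.

```r
# Toy prescription data; qty is a hypothetical column to be imputed.
rx <- data.frame(
  patid    = c(1, 1, 2, 2, 2),
  prodcode = c("a", "a", "a", "b", "b"),
  qty      = c(10, NA, 30, NA, 50)
)

# Hypothetical helper (not part of drugprepr): replace flagged values
# with a per-group summary of the unflagged values.
impute_by_group <- function(x, group, summary_fun, where = is.na) {
  # `where` may be a function of x or a ready-made logical vector
  mask <- if (is.function(where)) where(x) else where
  # Blank out flagged values, then compute the summary within each group;
  # ave() recycles each group's summary back to the original positions
  fill <- ave(replace(x, mask, NA), group,
              FUN = function(v) summary_fun(v, na.rm = TRUE))
  x[mask] <- fill[mask]
  x
}

# Analogue of method = "mean" and method = "max", grouped by product code
qty_mean <- impute_by_group(rx$qty, rx$prodcode, mean)
qty_max  <- impute_by_group(rx$qty, rx$prodcode, max)
```

Here the missing quantity for product "a" is filled from the other "a" rows only, which is the sense of "by group" assumed in this sketch.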

Details

The argument where indicates which values are to be imputed. It can be specified either as a logical vector or as a function. For example, you can pass is.na to impute all missing values, or you can pass a precomputed vector if the selection depends on something other than the current values of the variable to be imputed. This design may change in future. In particular, if implausible values and missing values are to be imputed separately, it is important that these steps are independent.
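
The two forms of where, and the idea of imputing implausible and missing values as separate steps, can be sketched in base R as follows. The helpers resolve_where and impute_median, the vector qty and the cut-off of 1000 are all hypothetical and chosen for illustration; the package's own handling of where may differ.

```r
# Hypothetical helper (not part of drugprepr): turn `where` into a mask.
resolve_where <- function(x, where = is.na) {
  mask <- if (is.function(where)) where(x) else where
  stopifnot(is.logical(mask), length(mask) == length(x))
  mask
}

# Hypothetical helper: median imputation over the whole vector (no groups),
# using only the values that are not flagged.
impute_median <- function(x, mask) {
  x[mask] <- median(x[!mask], na.rm = TRUE)
  x
}

qty <- c(28, NA, 56, 9999, 28)

# Function form: flag the missing values
missing_mask <- resolve_where(qty, is.na)

# Vector form: flag implausible values with an arbitrary cut-off of 1000
implausible_mask <- resolve_where(qty, !is.na(qty) & qty > 1000)

# Step 1: impute implausible values; step 2: impute what is still missing
step1 <- impute_median(qty, implausible_mask)
step2 <- impute_median(step1, resolve_where(step1, is.na))
```

Computing each mask explicitly before its step keeps the two imputations independent: the implausible value is replaced without touching the missing one, and the missing one is filled afterwards from the cleaned data.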

Value

A data frame of the same structure as data, with values imputed


[Package drugprepr version 0.0.4 Index]