Screen.data.frame {COINr} | R Documentation |
Screen units based on data availability
Description
Screens units (rows) based on a data availability threshold and the presence of zeros. Units can optionally be "forced" to be included or excluded, as exceptions to the data availability threshold.
Usage
## S3 method for class 'data.frame'
Screen(
x,
id_col = NULL,
unit_screen,
dat_thresh = NULL,
nonzero_thresh = NULL,
Force = NULL,
...
)
Arguments
x | A data frame
id_col | Name of column of the data frame to be used as the identifier, e.g. normally this would be
unit_screen | Specifies whether and how to screen units based on data availability or zero values.
dat_thresh | A data availability threshold (
nonzero_thresh | As
Force | A data frame with any additional units to force inclusion or exclusion. Required columns
... | Arguments passed to or from other methods.
Details
The two main criteria of interest are NA values and zeros. The summary table gives the percentage of NA values for each unit, across indicators, and the percentage of zero values (as a percentage of non-NA values).
Each unit is flagged as having low data or too many zeros based on thresholds.
See also vignette("screening").
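The per-unit statistics described above can be sketched in base R. This is only an illustration of the idea, not COINr's implementation: the toy data, the threshold values, the output column names, and the choice of a strict "less than" comparison at the threshold are all assumptions made for this example.

```r
# Toy data: hypothetical unit codes and indicator values (not taken from COINr)
toy <- data.frame(
  uCode = c("A", "B", "C", "D"),
  Ind1  = c(1.2, NA, 0, 3.1),
  Ind2  = c(0, NA, 0, 2.0),
  Ind3  = c(4.5, 2.2, NA, 0.7),
  Ind4  = c(2.1, NA, 0, 1.4)
)

# Indicator columns only (everything except the identifier column)
ind <- toy[setdiff(names(toy), "uCode")]

# Fraction of available (non-NA) values per unit, across indicators
frac_avail <- rowMeans(!is.na(ind))

# Fraction of non-zero values per unit, as a share of the non-NA values.
# A unit with no available data at all would give NaN here.
n_avail <- rowSums(!is.na(ind))
frac_nonzero <- rowSums(ind != 0, na.rm = TRUE) / n_avail

# Thresholds assumed for illustration only
dat_thresh <- 0.75
nonzero_thresh <- 0.5

# Summary table with one row per unit and a flag for each criterion
summary_tab <- data.frame(
  uCode        = toy$uCode,
  frac_avail   = frac_avail,
  frac_nonzero = frac_nonzero,
  low_data     = frac_avail < dat_thresh,       # assumed strict comparison
  low_nonzero  = frac_nonzero < nonzero_thresh  # assumed strict comparison
)

# Keep only units that pass both checks
screened <- toy[!(summary_tab$low_data | summary_tab$low_nonzero), ]
print(summary_tab)
```

In this toy case, unit B is flagged for low data availability and unit C for too many zeros, so only units A and D remain in screened. Note that the zero share is computed relative to non-NA values only, which matches the description above.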
Value
A list containing missing data statistics and the screened data.
Examples
# example data
iData <- ASEM_iData[40:51, c("uCode", "Research", "Pat", "CultServ", "CultGood")]
# screen to 75% data availability (by row)
l_scr <- Screen(iData, unit_screen = "byNA", dat_thresh = 0.75)
# summary of screening
head(l_scr$DataSummary)