acc_margins {dataquieR} R Documentation

## Function to estimate marginal means, see emmeans::emmeans

### Description

`acc_margins` performs the calculations for the quality indicator "Unexpected distribution w.r.t. location". It combines descriptive and model-based statistics to investigate differences across the levels of an auxiliary variable.
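To illustrate how descriptive and model-based summaries can differ, here is a minimal base-R sketch. It is not the package's implementation (which, per the title, relies on `emmeans::emmeans`); the toy data, the column names `observer`, `age` and `dbp`, and the noise-free outcome are all assumptions made for illustration. Covariate-adjusted means are obtained by predicting each group level at the overall mean of the covariate.

```r
# Toy data (hypothetical): observer B happened to examine older participants.
toy <- data.frame(
  observer = factor(rep(c("A", "B"), each = 5)),
  age      = c(30, 35, 40, 45, 50, 50, 55, 60, 65, 70)
)
# Noise-free outcome: +2 units for observer B, +0.5 units per year of age.
toy$dbp <- 60 + 2 * (toy$observer == "B") + 0.5 * toy$age

# Descriptive view: raw group means (confounded by the age imbalance).
raw_means <- tapply(toy$dbp, toy$observer, mean)

# Model-based view: adjust for age, then predict every observer level
# at the overall mean age.
fit  <- lm(dbp ~ observer + age, data = toy)
grid <- data.frame(
  observer = factor(levels(toy$observer), levels = levels(toy$observer)),
  age      = mean(toy$age)
)
adj_means <- setNames(predict(fit, newdata = grid), levels(toy$observer))

raw_diff <- unname(raw_means["B"] - raw_means["A"])  # 12: includes the age effect
adj_diff <- unname(adj_means["B"] - adj_means["A"])  # 2: the observer effect alone
```

In this constructed example the raw difference between observers is inflated by the age imbalance, while the adjusted difference recovers the observer effect built into the data.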

CAT: Unexpected distribution w.r.t. location

Marginal means

Marginal means rest on model-based results, i.e. whether a marginal mean differs significantly depends on sample size. Particularly in large studies, small and irrelevant differences may become significant. The converse holds if the sample size is small.
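The dependence on sample size can be made concrete with a small base-R calculation. This is only a sketch under stated assumptions: a two-group comparison with equal group sizes, an observed mean difference of 0.5 and a common standard deviation of 10 (both values are hypothetical), evaluated with the usual two-sample t statistic.

```r
# Two-sided p-value of a two-sample t-test for a fixed observed difference.
# diff and sd are hypothetical values, held constant while n varies.
p_value_for_n <- function(n_per_group, diff = 0.5, sd = 10) {
  se     <- sd * sqrt(2 / n_per_group)   # standard error of the difference
  t_stat <- diff / se
  2 * pt(-abs(t_stat), df = 2 * n_per_group - 2)
}

p_small <- p_value_for_n(50)      # small study: far from significant
p_large <- p_value_for_n(50000)   # large study: same difference, tiny p-value
```

The difference itself is identical in both calls; only the number of observations changes, which is why a flag based purely on significance should be read together with the size of the difference.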

### Usage

```
acc_margins(
  resp_vars = NULL,
  group_vars = NULL,
  co_vars = NULL,
  threshold_type = NULL,
  threshold_value,
  min_obs_in_subgroup,
  study_data,
  meta_data,
  label_col
)
```

### Arguments

• `resp_vars`: variable. The name of the continuous measurement variable.

• `group_vars`: variable list. The name of the observer, device or reader variable.

• `co_vars`: variable list. A vector of covariables, e.g. age and sex, for adjustment.

• `threshold_type`: enum empirical | user | none. If empirical is chosen, a multiplier of the scale measure is used; if user is chosen, a value of the mean or probability (binary data) has to be defined, see "Implementation and use of thresholds". If none is chosen, no thresholds are displayed and no flagging of unusual group levels is applied.

• `threshold_value`: numeric. A multiplier or absolute value, see "Implementation and use of thresholds".

• `min_obs_in_subgroup`: integer, from=0. Optional argument if `group_vars` is used. It specifies the minimum number of observations required to include a subgroup (level) of the grouping variable in the analysis. Subgroups with fewer observations are excluded. The default is 5.

• `study_data`: data.frame. The data frame that contains the measurements.

• `meta_data`: data.frame. The data frame that contains metadata attributes of the study data.

• `label_col`: variable attribute. The name of the column in the metadata with labels of variables.
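The effect of `min_obs_in_subgroup` and `threshold_type` can be sketched in base R. The helpers `drop_small_groups()` and `threshold_limits()` below are hypothetical and not part of dataquieR. The sketch also assumes that the "scale measure" is the standard deviation and that empirical limits are placed symmetrically around the overall mean; the package may define both differently.

```r
# Hypothetical helper: keep only subgroups with at least min_obs observations.
drop_small_groups <- function(data, group_var, min_obs = 5) {
  counts <- table(data[[group_var]])
  keep   <- names(counts)[counts >= min_obs]
  data[data[[group_var]] %in% keep, , drop = FALSE]
}

# Hypothetical helper: derive threshold lines from the threshold arguments.
# Assumption: "empirical" means overall mean +/- threshold_value * SD.
threshold_limits <- function(x, threshold_type = c("empirical", "user", "none"),
                             threshold_value = NULL) {
  threshold_type <- match.arg(threshold_type)
  switch(threshold_type,
    empirical = mean(x) + c(-1, 1) * threshold_value * sd(x),
    user      = threshold_value,   # absolute value supplied by the user
    none      = NULL               # no thresholds, no flagging
  )
}

toy <- data.frame(
  observer = c(rep("A", 6), rep("B", 5), rep("C", 2)),
  value    = seq_len(13)
)
kept   <- drop_small_groups(toy, "observer", min_obs = 5)  # observer C is dropped
limits <- threshold_limits(c(78, 80, 82), "empirical", threshold_value = 1)
```

With the default of 5, a level with exactly five observations is retained, which matches the wording that only subgroups with fewer observations are excluded.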

### Details

Limitations

Selecting the appropriate distribution is complex. Dozens of continuous, discrete or mixed distributions are conceivable in the context of epidemiological data. Their exact exploration is beyond the scope of this data quality approach. The function above uses the helper function `util_dist_selection`, which distinguishes four cases:

• continuous data

• binary data

• count data with <= 20 categories

• count data with > 20 categories

Nonetheless, only three different plot types are generated. The fourth case is treated as continuous data. This is in fact a coarsening of the original data, but the approach was chosen for the sake of clarity.
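The four cases and their reduction to three plot types can be sketched as follows. This is a simple heuristic written for illustration, not the logic of `util_dist_selection`; the function names `classify_distribution()` and `plot_family()`, the case labels, and the rules used to detect binary and count data are assumptions.

```r
# Hypothetical classifier for the four cases described above.
classify_distribution <- function(x, max_categories = 20) {
  x          <- x[!is.na(x)]
  n_distinct <- length(unique(x))
  is_count   <- all(x == round(x)) && all(x >= 0)  # non-negative whole numbers
  if (n_distinct == 2) {
    "binary"
  } else if (is_count && n_distinct <= max_categories) {
    "count_small"   # count data with <= 20 categories
  } else if (is_count) {
    "count_large"   # count data with > 20 categories
  } else {
    "continuous"
  }
}

# Only three plot types exist: large counts are handled like continuous data.
plot_family <- function(case) {
  if (case == "count_large") "continuous" else case
}

cases <- c(
  classify_distribution(c(0, 1, 1, 0)),
  classify_distribution(c(0, 1, 2, 3, 3)),
  classify_distribution(0:30),
  classify_distribution(c(1.5, 2.25, 3))
)
families <- vapply(cases, plot_family, character(1), USE.NAMES = FALSE)
```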

### Value

a list with:

• SummaryTable: data frame underlying the plot

• SummaryData: data frame

• SummaryPlot: ggplot2 margins plot

### Examples

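```
## Not run:
# runs spuriously slow on rhub
# Assumes dataquieR is attached and that study_data and meta_data
# are already loaded in the workspace.
co_vars <- c("AGE_0")
label_col <- LABEL
rvs <- c("DBP_0")
# Map the label of the response variable to its variable name ...
group_vars <- prep_map_labels(rvs, meta_data = meta_data, from = label_col,
                              to = VAR_NAMES)
# ... look up the observer variable assigned to it in the metadata ...
group_vars <- prep_map_labels(group_vars, meta_data = meta_data,
                              to = KEY_OBSERVER)
# ... and map that observer variable using the default from/to arguments.
group_vars <- prep_map_labels(group_vars, meta_data = meta_data)
acc_margins(resp_vars = rvs,
            study_data = study_data,
            meta_data = meta_data,
            group_vars = group_vars,
            label_col = label_col,
            co_vars = co_vars)

## End(Not run)
```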

[Package dataquieR version 1.0.5 Index]