aggregate_df {metaConvert} | R Documentation |
Aggregate a dataframe containing dependent effect sizes according to Borenstein's formulas
Description
Aggregate a dataframe containing dependent effect sizes according to Borenstein's formulas
Usage
aggregate_df(
  x,
  dependence = "outcomes",
  cor_outcomes = 0.8,
  agg_fact,
  es = "es",
  se = "se",
  col_mean = NA,
  col_weighted_mean = NA,
  weights = NA,
  col_sum = NA,
  col_min = NA,
  col_max = NA,
  col_fact = NA,
  na.rm = TRUE
)
Arguments
x |
a dataframe that should be aggregated (must contain effect size values and standard errors). |
dependence |
The type of dependence in your dataframe (can be either "outcomes" or "subgroups"). See details. |
cor_outcomes |
The correlation between effect sizes coming from the same clustering unit (only used when |
agg_fact |
A character string identifying the column name that contains the clustering units (all rows with the same |
es |
A character string identifying the column name containing the effect size values. Default is "es". |
se |
A character string identifying the column name containing the standard errors of the effect size. Default is "se". |
col_mean |
a vector of character strings identifying the column names for which the dependent values are summarized by taking their mean. |
col_weighted_mean |
a vector of character strings identifying the column names for which the dependent values are summarized by taking their weighted mean. |
weights |
The weights that will be used to estimate the weighted means. |
col_sum |
a vector of character strings identifying the column names for which the dependent values are summarized by taking their sum. |
col_min |
a vector of character strings identifying the column names for which the dependent values are summarized by taking their minimum. |
col_max |
a vector of character strings identifying the column names for which the dependent values are summarized by taking their maximum. |
col_fact |
a vector of character strings identifying the column names that are factors (different values will be separated by a "/" character). |
na.rm |
a logical vector indicating whether missing values should be ignored in the calculations for the |
Details
In the dependence argument, indicate "outcomes" if the dependence within the same clustering unit (e.g., study) is due to multiple effect sizes produced from the same participants (e.g., multiple outcomes or multiple time-points).
Indicate "subgroups" if the dependence within the same clustering unit (e.g., study) is due to multiple effect sizes produced by independent subgroups (e.g., one effect size for boys and one for girls).
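The two dependence types lead to different composite formulas. The sketch below illustrates the standard approach described by Borenstein and colleagues: an unweighted mean with a correlation-adjusted variance for multiple outcomes, and an inverse-variance (fixed-effect) combination for independent subgroups. The helper names agg_outcomes_sketch and agg_subgroups_sketch are hypothetical and are not part of metaConvert; the package's internal implementation may differ in its details.

```r
# Hypothetical helpers (not part of metaConvert): composite effect size
# for the rows belonging to ONE clustering unit.

# Multiple outcomes from the same participants, assuming a common
# correlation r between every pair of effect sizes.
agg_outcomes_sketch <- function(es, se, r = 0.8) {
  m <- length(es)
  v <- se^2
  # Sum over all ordered pairs i != j of r * sqrt(v_i) * sqrt(v_j)
  cov_sum <- r * (sum(sqrt(v))^2 - sum(v))
  c(es = mean(es), se = sqrt((sum(v) + cov_sum) / m^2))
}

# Independent subgroups: inverse-variance weighted (fixed-effect) mean.
agg_subgroups_sketch <- function(es, se) {
  w <- 1 / se^2
  c(es = sum(w * es) / sum(w), se = sqrt(1 / sum(w)))
}

out_res <- agg_outcomes_sketch(es = c(0.2, 0.4), se = c(0.1, 0.1), r = 0.8)
sub_res <- agg_subgroups_sketch(es = c(0.2, 0.4), se = c(0.1, 0.1))
print(out_res)
print(sub_res)
```

With identical inputs, the "outcomes" composite has the larger standard error, because correlated effect sizes from the same participants carry less independent information than effect sizes from separate subgroups.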
If you are working with ratio measures, make sure that the information on the effect size estimates (i.e., the column passed to the es argument of the function) is presented on the log scale.
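As an illustration of the log-scale requirement, the sketch below prepares a small, made-up dataframe of odds ratios before aggregation. The column names (or, ci_lo, ci_up) and the values are hypothetical, and the standard error is recovered from the 95% confidence limits under the usual normal approximation on the log scale.

```r
# Hypothetical data: odds ratios with 95% confidence limits on the natural scale
dat <- data.frame(
  study_id = c("A", "A", "B"),
  or       = c(1.50, 2.00, 0.80),
  ci_lo    = c(1.10, 1.20, 0.50),
  ci_up    = c(2.05, 3.33, 1.28)
)

# Move the estimates to the log scale
dat$es <- log(dat$or)

# Standard error of log(OR), derived from the width of the 95% CI
dat$se <- (log(dat$ci_up) - log(dat$ci_lo)) / (2 * qnorm(0.975))

print(dat)
```

The resulting es and se columns are on the scale that the es and se arguments expect; aggregated estimates can be back-transformed with exp() for reporting.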
Value
The object returned by aggregate_df is a dataframe containing, at the very least, the aggregating factor and the aggregated effect size values and standard errors. All columns indicated in the col_* arguments will also be included in this dataframe.
row_id | the row number in the original dataset. |
es | the aggregated effect size value. |
se | the standard error of the aggregated effect size. |
... | any columns indicated in the col_* arguments. |
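To give a sense of what the col_* summaries amount to, here is a base R sketch that collapses extra columns within each clustering unit. It is only an approximation of the documented behaviour (mean, maximum, and factor values joined by "/"); the data and column names are made up, and it does not call or reproduce the internals of aggregate_df.

```r
# Hypothetical data: two rows for study A, one row for study B
dat <- data.frame(
  study_id = c("A", "A", "B"),
  n        = c(40, 60, 25),
  outcome  = c("anxiety", "depression", "anxiety")
)

# Collapse each clustering unit into a single row
groups <- split(dat, dat$study_id)
agg <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    study_id = g$study_id[1],
    n_mean   = mean(g$n, na.rm = TRUE),                  # analogous to col_mean
    n_max    = max(g$n, na.rm = TRUE),                   # analogous to col_max
    outcome  = paste(unique(g$outcome), collapse = "/")  # analogous to col_fact
  )
}))

print(agg)
```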
Examples
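library(metaConvert)
# Estimate effect sizes (measure = "d") from the df.haza example dataset
res <- summary(convert_df(df.haza, measure = "d"))
# Aggregate dependent effect sizes within each study, assuming r = 0.8 between outcomes
aggregate_df(res, dependence = "outcomes", cor_outcomes = 0.8,
             agg_fact = "study_id", es = "es_crude", se = "se_crude",
             col_fact = c("outcome", "type_publication"))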