R: normal data

distdicho {distdichoR}

R Documentation

normal data

Description

The distributional method for dichotomising normal data allowing for assumptions of unequal variances (based on Sauzet et al. 2014 and Peacock et al. 2012).

Usage

distdicho(x, ...)

## Default S3 method:
distdicho(x, y, cp = 0, tail = c("lower", "upper"),
  R = 1, correction = FALSE, unequal = FALSE, conf.level = 0.95,
  bootci = FALSE, nrep = 2000, ...)

## S3 method for class 'formula'
distdicho(formula, data, exposed, ...)

Arguments

`x`	A numeric vector of data values.
`...`	Further arguments to be passed to or from methods.
`y`	A numeric vector of data values.
`cp`	A numeric value specifying the cut point below which (or above which, for tail = 'upper') the distributional proportions are computed.
`tail`	A character string specifying the tail of the distribution in which the proportions are computed. Must be either 'lower' (default) or 'upper'.
`R`	A numeric value indicating the true ratio of variances (R = Var(x)/Var(y)). A value of 0 specifies that the true ratio of variances is unknown.
`correction`	A logical indicating whether to use a correction factor for large effect sizes (>0.7) (valid for difference in proportions only).
`unequal`	A logical indicating whether the correction for an unknown variance ratio should be used when no assumption can be made about the variance ratio.
`conf.level`	Confidence level of the interval.
`bootci`	A logical variable indicating whether bootstrap bias-corrected confidence intervals are calculated instead of distributional ones.
`nrep`	A numeric value specifying the number of bootstrap replications (nrep must be higher than the number of observations).
`formula`	A formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding exposed and unexposed groups.
`data`	An optional matrix or data frame containing the variables in the formula. By default, the variables are taken from `environment(formula)`.
`exposed`	A character string specifying the grouping value of the exposed group.

Details

distdicho first reports the results of a two-group unpaired t-test (allowing for unequal variances when unequal variances are assumed), followed by the distributional estimates and their standard errors for a difference in proportions, risk ratio and odds ratio (see Sauzet et al. 2014 and Peacock et al. 2012). It also provides distributional confidence intervals for these statistics. The intervals assume an asymptotic normal distribution of the estimates and might not be valid for small sample sizes (see Sauzet et al. 2014 for details).

Estimates are calculated under either the assumption of equal variances in both groups (default, R = 1) or the assumption of unequal variances (R != 1 and R != 0 for a known variance ratio; R = 0 to apply the correction for an unknown variance ratio).

The data can be given either as two vectors, each providing the outcome in one group, or as a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding exposed and unexposed groups. In all cases, it is assumed that there are only two groups.
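The idea behind the equal-variance case described above can be sketched in base R. This is an illustration only, not the package's implementation: it assumes the proportions are taken from a normal distribution with each group's mean and a pooled standard deviation, and it uses synthetic data. The variable names (`sd_pooled`, `p_x`, `p_y`) are hypothetical, and standard errors and confidence intervals are not reproduced here.

```r
# Synthetic data for illustration only (not the bwsmoke data from the package)
set.seed(1)
x <- rnorm(200, mean = 3200, sd = 500)  # exposed group
y <- rnorm(300, mean = 3400, sd = 500)  # unexposed group
cp <- 2500                              # cut point

# Pooled standard deviation under the equal-variance assumption (R = 1)
n_x <- length(x)
n_y <- length(y)
sd_pooled <- sqrt(((n_x - 1) * var(x) + (n_y - 1) * var(y)) / (n_x + n_y - 2))

# Proportions below the cut point taken from the fitted normal distributions
# (this corresponds to tail = "lower"; use 1 - pnorm(...) for the upper tail)
p_x <- pnorm((cp - mean(x)) / sd_pooled)
p_y <- pnorm((cp - mean(y)) / sd_pooled)

# Comparative statistics derived from the two proportions
diff_prop  <- p_x - p_y
risk_ratio <- p_x / p_y
odds_ratio <- (p_x / (1 - p_x)) / (p_y / (1 - p_y))

# Empirical (counted) proportions, for comparison with the normal-based ones
emp_x <- mean(x < cp)
emp_y <- mean(y < cp)

print(c(diff_prop = diff_prop, risk_ratio = risk_ratio, odds_ratio = odds_ratio))
```

The normal-based proportions use the whole sample through the mean and standard deviation, whereas the empirical proportions depend only on the few observations that fall below the cut point.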

Value

A list with class 'distdicho' containing the following components:

`data.name`	The names of the data.
`arguments`	A list with the specified arguments.
`parameter`	The mean, standard error and number of observations for both groups.
`prop`	The estimated proportions below / above the cut point for both groups.
`dist.estimates`	The difference in proportions, risk ratio and odds ratio of the groups.
`se`	The estimated standard error of the difference in proportions, the risk ratio and the odds ratio.
`ci`	The confidence intervals of the difference in proportions, the risk ratio and the odds ratio.
`method`	A character string indicating the used method.
`ttest`	A list containing the results of a t-test.
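As a hedged sketch of how the components listed above might be inspected, the following stores the returned list and reads a few elements by name. It assumes the distdichoR package is installed and that the `bwsmoke` data set used in the Examples is available once the package is attached; the exact shape of each component is not documented here, so the code only prints them.

```r
res <- NULL

# Run only if the package is available; otherwise res stays NULL
if (requireNamespace("distdichoR", quietly = TRUE)) {
  library(distdichoR)

  # Same call as in the Examples section, with the result stored
  res <- distdicho(birthwt ~ smoke, cp = 2500, data = bwsmoke,
                   exposed = "smoker")

  # Components named in the Value section
  print(res$prop)            # estimated proportions for both groups
  print(res$dist.estimates)  # difference in proportions, RR and OR
  print(res$se)              # standard errors of the estimates
  print(res$ci)              # confidence intervals of the estimates
}
```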

References

Peacock, J.L., Sauzet, O., Ewings, S.M., Kerry, S.M. Dichotomising continuous data while retaining statistical power using a distributional approach. Statist. Med. 2012;31(26):3089-3103.

Sauzet, O., Peacock, J.L. Estimating dichotomised outcomes in two groups with unequal variances: a distributional approach. Statist. Med. 2014;33:4547-4559. DOI: 10.1002/sim.6255.

Peacock, J.L., Bland, J.M., Anderson, H.R. Preterm delivery: effects of socioeconomic factors, psychological stress, smoking, alcohol, and caffeine. BMJ 1995;311(7004):531-535.

Examples

## Proportions of low birth weight babies among smoking and non-smoking mothers
## (data from Peacock et al. 1995). Returns distributional estimates, standard
## errors and distributional confidence intervals for the difference in
## proportions, RR and OR of a birth weight under 2500g (low birth weight, LBW),
## comparing group smoker (mother smokes) with group non-smoker
## (mother doesn't smoke)
# Formula interface
distdicho(birthwt ~ smoke, cp = 2500, data = bwsmoke, exposed = 'smoker')
# Data stored in two vectors
bw_smoker <- bwsmoke$birthwt[bwsmoke$smoke == 'smoker']
bw_nonsmoker <- bwsmoke$birthwt[bwsmoke$smoke == 'non-smoker']
distdicho(x = bw_smoker, y = bw_nonsmoker, cp = 2500)


## Inverse Body Mass Index (transformation required to have a normal outcome)
## and parity (data from Peacock et al. 1995). Returns distributional estimates,
## standard errors and distributional confidence intervals for the difference in
## proportions, RR and OR of obesity (BMI of >30 kg/m^2), comparing multiparas
## (group_par = 1) with primiparas (group_par = 0).
distdicho(inv_bmi ~ group_par, cp = 0.033, data = bmi, exposed = '1')
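
The cut point of 0.033 in the call above is the obesity threshold expressed on the inverse scale (1/30, rounded). Because taking the inverse reverses the ordering, a BMI above 30 corresponds to an inverse BMI below the cut point, which is why the default lower tail applies. A minimal base R check, using hypothetical BMI values that are not taken from the `bmi` data set:

```r
bmi_threshold <- 30
cp_inv <- 1 / bmi_threshold      # threshold on the inverse scale
print(round(cp_inv, 3))

# Hypothetical BMI values for illustration only
bmi_values <- c(22, 28, 31, 35)
inv_values <- 1 / bmi_values

# Obesity defined on the original scale and on the inverse scale
obese_by_bmi <- bmi_values > bmi_threshold
obese_by_inv <- inv_values < cp_inv

# Both definitions select the same observations
print(identical(obese_by_bmi, obese_by_inv))
```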


## Inverse Body Mass Index (BMI) and employment. Returns distributional estimates,
## standard errors and distributional confidence intervals for the difference in
## proportions, RR and OR of obesity (BMI of >30 kg/m^2), with correction for an
## unknown variance ratio, comparing group_emp = 2 (mother unemployed) with
## group_emp = 1 (mother employed)
distdicho(inv_bmi ~ group_emp, cp = 0.033, R = 0, data = bmi2, exposed = '2')


## Inverse Body Mass Index (BMI) and employment. Returns distributional estimates,
## standard errors and distributional confidence intervals for the difference in
## proportions, RR and OR of obesity (BMI of >30 kg/m^2), computed under the
## hypothesis that the ratio of variances is equal to 1.3, comparing group_emp = 2
## (mother unemployed) with group_emp = 1 (mother employed)
distdicho(inv_bmi ~ group_emp, cp = 0.033, R = 1.3, data = bmi2, exposed = '2')

[Package distdichoR version 0.1-1 Index]