R: The required sample size for testing a null hypothesis for a...

ss4dmH {samplesize4surveys}

R Documentation

The required sample size for testing a null hypothesis for a single difference of means

Description

This function returns the minimum sample size required for testing a null hypothesis regarding a single difference of means.

Usage

ss4dmH(
  N,
  mu1,
  mu2,
  sigma1,
  sigma2,
  D,
  DEFF = 1,
  conf = 0.95,
  power = 0.8,
  T = 0,
  R = 1,
  plot = FALSE
)

Arguments

`N`	The larger of the population sizes of the two groups (strata) that we want to compare.
`mu1`	The value of the estimated mean of the variable of interest for the first population.
`mu2`	The value of the estimated mean of the variable of interest for the second population.
`sigma1`	The value of the estimated standard deviation of the variable of interest for the first population.
`sigma2`	The value of the estimated standard deviation of the variable of interest for the second population.
`D`	The minimum effect (difference between the two means) to test.
`DEFF`	The design effect of the sample design. By default `DEFF = 1`, which corresponds to a simple random sampling design.
`conf`	The statistical confidence. By default `conf = 0.95`.
`power`	The statistical power. By default `power = 0.80`.
`T`	The overlap between waves. By default `T = 0`.
`R`	The correlation between waves. By default `R = 1`.
`plot`	Optionally plot the effect against the sample size.
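
To make the roles of `T`, `R` and `DEFF` concrete, the sketch below evaluates the variance term S^2 given in Details for a few settings. `variance_term` is a hypothetical helper written only for this illustration; it is not part of samplesize4surveys.

```r
# Hypothetical helper (not a package function): the variance term S^2
# from the Details section, with sigma1 and sigma2 as standard deviations.
variance_term <- function(sigma1, sigma2, T = 0, R = 1, DEFF = 1) {
  (sigma1^2 + sigma2^2) * (1 - T * R) * DEFF
}

variance_term(10, 12)                    # defaults, no overlap: 100 + 144 = 244
variance_term(10, 12, T = 0.5, R = 0.8)  # 244 * (1 - 0.4) = 146.4
variance_term(10, 12, DEFF = 2)          # 244 * 2 = 488
```

With the default `T = 0` the value of `R` has no effect, because the two terms only enter through the product `T * R`.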

Details

We assume that it is of interest to test the following set of hypotheses:

H_0: \mu_1 - \mu_2 = 0 \ \ \ \ vs. \ \ \ \ H_a: \mu_1 - \mu_2 = D \neq 0

Note that the minimum sample size that attains the predefined power \beta and confidence 1-\alpha is given by:

n = \frac{S^2}{\frac{D^2}{(z_{1-\alpha} + z_{\beta})^2}+\frac{S^2}{N}}

where S^2 = (\sigma_1^2 + \sigma_2^2) \cdot (1 - T \cdot R) \cdot DEFF
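
The formula can be transcribed into a few lines of R as a rough check. This is a minimal sketch, not the package's implementation: `manual_ss4dmH` is a hypothetical name, z_{1-\alpha} is read as `qnorm(conf)`, z_{\beta} as `qnorm(power)`, and rounding up to the next integer is an assumption. Results from `ss4dmH` itself may differ if it treats any of these steps differently.

```r
# Hypothetical transcription of the formula above (not the package code).
# sigma1 and sigma2 are standard deviations; T and R are the overlap and
# correlation between waves.
manual_ss4dmH <- function(N, sigma1, sigma2, D, DEFF = 1, conf = 0.95,
                          power = 0.8, T = 0, R = 1) {
  z_alpha <- qnorm(conf)    # z_{1 - alpha}, since conf = 1 - alpha
  z_beta  <- qnorm(power)   # z_{beta}, with beta denoting the power
  S2 <- (sigma1^2 + sigma2^2) * (1 - T * R) * DEFF
  n  <- S2 / (D^2 / (z_alpha + z_beta)^2 + S2 / N)
  ceiling(n)                # assumption: round up to a whole unit
}

manual_ss4dmH(N = 100000, sigma1 = 10, sigma2 = 12, D = 3)
manual_ss4dmH(N = 100000, sigma1 = 10, sigma2 = 12, D = 3, T = 0.5, R = 0.8)
```

The term S^2 / N in the denominator acts as a finite population correction: it keeps n below N, and its influence fades as N grows.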

Author(s)

Hugo Andres Gutierrez Rojas <hagutierrezro at gmail.com>

References

Gutierrez, H. A. (2009), Estrategias de muestreo: Diseno de encuestas y estimacion de parametros. Editorial Universidad Santo Tomas

Examples

ss4dmH(N = 100000, mu1=50, mu2=55, sigma1 = 10, sigma2 = 12, D=3)
ss4dmH(N = 100000, mu1=50, mu2=55, sigma1 = 10, sigma2 = 12, D=1, plot=TRUE)
ss4dmH(N = 100000, mu1=50, mu2=55, sigma1 = 10, sigma2 = 12, D=0.5, DEFF = 2, plot=TRUE)
ss4dmH(N = 100000, mu1=50, mu2=55, sigma1 = 10, sigma2 = 12, D=0.5, DEFF = 2, conf = 0.99, 
       power = 0.9, plot=TRUE)

#############################
# Example with BigLucy data #
#############################
data(BigLucy)

# Population sizes of the two groups defined by SPAM (no attach() needed)
N1 <- table(BigLucy$SPAM)[1]
N2 <- table(BigLucy$SPAM)[2]
N <- max(N1, N2)

BigLucy.yes <- subset(BigLucy, SPAM == 'yes')
BigLucy.no <- subset(BigLucy, SPAM == 'no')
mu1 <- mean(BigLucy.yes$Income)
mu2 <- mean(BigLucy.no$Income)
sigma1 <- sd(BigLucy.yes$Income)
sigma2 <- sd(BigLucy.no$Income)

# The minimum sample size for testing 
# H_0: mu_1 - mu_2 = 0   vs.   H_a: mu_1 - mu_2 = D = 3
D <- 3
ss4dmH(N, mu1, mu2, sigma1, sigma2, D, DEFF = 2, plot=TRUE)

# The minimum sample size for testing 
# H_0: mu_1 - mu_2 = 0   vs.   H_a: mu_1 - mu_2 = D = 3
# with higher confidence (0.99), higher power (0.9) and a larger design effect
D <- 3
ss4dmH(N, mu1, mu2, sigma1, sigma2, D, conf = 0.99, power = 0.9, DEFF = 3.45, plot=TRUE)

[Package samplesize4surveys version 4.1.1 Index]