R: The required sample size for estimating a single difference...

ss4dm {samplesize4surveys}

R Documentation

The required sample size for estimating a single difference of means

Description

This function returns the minimum sample size required for estimating a single difference of means subject to predefined errors.

Usage

ss4dm(
  N,
  mu1,
  mu2,
  sigma1,
  sigma2,
  DEFF = 1,
  conf = 0.95,
  cve = 0.05,
  rme = 0.03,
  T = 0,
  R = 1,
  plot = FALSE
)

Arguments

`N`	The larger of the population sizes of the two groups (strata) being compared.
`mu1`	The estimated mean of the variable of interest for the first population.
`mu2`	The estimated mean of the variable of interest for the second population.
`sigma1`	The estimated standard deviation of the variable of interest for the first population.
`sigma2`	The estimated standard deviation of the variable of interest for the second population.
`DEFF`	The design effect of the sample design. By default `DEFF = 1`, which corresponds to a simple random sampling design.
`conf`	The statistical confidence. By default `conf = 0.95`.
`cve`	The maximum coefficient of variation allowed for the estimate.
`rme`	The maximum relative margin of error allowed for the estimate.
`T`	The overlap between waves. By default `T = 0`.
`R`	The correlation between waves. By default `R = 1`.
`plot`	Optionally plot the errors (cve and margin of error) against the sample size.
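
As a rough illustration of how `DEFF`, `T` and `R` interact, the sketch below evaluates the variance term S^2 given in the Details section for three settings. The helper `s2_term` is a hypothetical name introduced here for illustration; it is not part of the package and only transcribes the documented formula.

```r
# Hypothetical helper (not part of samplesize4surveys): transcribes
# S^2 = (sigma1^2 + sigma2^2) * (1 - T * R) * DEFF from the Details section.
s2_term <- function(sigma1, sigma2, DEFF = 1, T = 0, R = 1) {
  (sigma1^2 + sigma2^2) * (1 - T * R) * DEFF
}

s2_srs     <- s2_term(10, 12)                    # no overlap, simple random sampling
s2_complex <- s2_term(10, 12, DEFF = 3.45)       # no overlap, design effect of 3.45
s2_overlap <- s2_term(10, 12, T = 0.5, R = 0.8)  # half overlap, correlation 0.8

print(c(srs = s2_srs, complex = s2_complex, overlap = s2_overlap))
```

With the default `T = 0` the factor (1 - T * R) equals 1, so `R` has no effect. Under this formula, a larger overlap combined with a positive correlation shrinks S^2, and a design effect above 1 inflates it.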

Details

Note that the minimum sample size to achieve a relative margin of error \varepsilon is defined by:

n = \frac{n_0}{1+\frac{n_0}{N}}

Where

n_0=\frac{z^2_{1-\frac{\alpha}{2}}S^2}{\varepsilon^2 (\mu_1 - \mu_2)^2}

and

S^2 = (\sigma_1^2 + \sigma_2^2) \cdot (1 - T \cdot R) \cdot DEFF

Also note that the minimum sample size to achieve a coefficient of variation cve is defined by:

n = \frac{S^2}{|\mu_1-\mu_2|^2 cve^2 + \frac{S^2}{N}}
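
The two formulas above can be checked by hand with a short sketch. The function `ss4dm_sketch` below is a hypothetical transcription of the Details section and is not the package source. Rounding the results up to the next integer is an assumption made here, and the output names `n_cve` and `n_rme` are illustrative only.

```r
# Hypothetical transcription of the Details formulas; not the package source.
ss4dm_sketch <- function(N, mu1, mu2, sigma1, sigma2, DEFF = 1,
                         conf = 0.95, cve = 0.05, rme = 0.03, T = 0, R = 1) {
  # Variance term adjusted for overlap, correlation and design effect
  S2 <- (sigma1^2 + sigma2^2) * (1 - T * R) * DEFF
  d  <- mu1 - mu2
  # Normal quantile z_{1 - alpha/2}, with alpha = 1 - conf
  z  <- qnorm(1 - (1 - conf) / 2)
  # Sample size for the relative margin of error, with finite population correction
  n0    <- z^2 * S2 / (rme^2 * d^2)
  n_rme <- n0 / (1 + n0 / N)
  # Sample size for the coefficient of variation
  n_cve <- S2 / (d^2 * cve^2 + S2 / N)
  # Assumption: round up so that both error targets are met
  c(n_cve = ceiling(n_cve), n_rme = ceiling(n_rme))
}

res <- ss4dm_sketch(N = 100000, mu1 = 50, mu2 = 55, sigma1 = 10, sigma2 = 12)
print(res)
```

Both results shrink toward zero as the difference between the means grows, and neither can exceed `N`, since the finite population correction bounds them.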

Author(s)

Hugo Andres Gutierrez Rojas <hagutierrezro at gmail.com>

References

Gutierrez, H. A. (2009), Estrategias de muestreo: Diseno de encuestas y estimacion de parametros. Editorial Universidad Santo Tomas

Examples

ss4dm(N=100000, mu1=50, mu2=55, sigma1 = 10, sigma2 = 12, cve=0.05, rme=0.03)
ss4dm(N=100000, mu1=50, mu2=55, sigma1 = 10, sigma2 = 12, cve=0.05, rme=0.03, plot=TRUE)
ss4dm(N=100000, mu1=50, mu2=55, sigma1 = 10, sigma2 = 12, DEFF=3.45, conf=0.99, cve=0.03, 
     rme=0.03, plot=TRUE)

#############################
# Example with BigLucy data #
#############################
data(BigLucy)

# Population sizes of the two groups; N is the larger of the two
N1 <- table(BigLucy$SPAM)[1]
N2 <- table(BigLucy$SPAM)[2]
N <- max(N1,N2)

BigLucy.yes <- subset(BigLucy, SPAM == 'yes')
BigLucy.no <- subset(BigLucy, SPAM == 'no')
mu1 <- mean(BigLucy.yes$Income)
mu2 <- mean(BigLucy.no$Income)
sigma1 <- sd(BigLucy.yes$Income)
sigma2 <- sd(BigLucy.no$Income)

# The minimum sample size for simple random sampling
ss4dm(N, mu1, mu2, sigma1, sigma2, DEFF=1, conf=0.99, cve=0.03, rme=0.03, plot=TRUE)
# The minimum sample size for a complex sampling design
ss4dm(N, mu1, mu2, sigma1, sigma2, DEFF=3.45, conf=0.99, cve=0.03, rme=0.03, plot=TRUE)

[Package samplesize4surveys version 4.1.1 Index]