epi.ssclus1estb {epiR}    R Documentation

Sample size to estimate a binary outcome using one-stage cluster sampling

Description

Sample size to estimate a binary outcome using one-stage cluster sampling.

Usage

epi.ssclus1estb(b, Py, epsilon.r, rho, nfractional = FALSE, conf.level = 0.95)

Arguments

b

scalar integer or vector of length two, the number of individual listing units in each cluster to be sampled. See details, below.

Py

scalar number, an estimate of the unknown population proportion.

epsilon.r

scalar number, the maximum relative difference between the estimate and the unknown population value, expressed as a proportion (e.g., 0.10 for 10%).

rho

scalar number, the intracluster correlation.

nfractional

logical, if TRUE the fractional (unrounded) sample size is returned.

conf.level

scalar, defining the level of confidence in the computed result.

Details

b as a scalar integer represents the total number of individual listing units from each cluster to be sampled. If b is a vector of length two the first element represents the mean number of individual listing units to be sampled from each cluster and the second element represents the standard deviation of the number of individual listing units to be sampled from each cluster.

At least 25 primary sampling units are recommended for one-stage cluster sampling designs. If fewer than 25 clusters are returned by the function, a warning is issued.
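
The calculation described above can be sketched in base R. This is an illustrative reconstruction, not the package source: the helper name ssclus1_sketch is hypothetical, and both the design effect formula (inflated by the coefficient of variation of cluster size when b has length two) and the conversion of relative to absolute error (Py * epsilon.r) are assumptions. Under those assumptions the sketch gives the cluster counts reported in the Examples (67 and 81).

```r
## Hypothetical helper, not part of epiR: a sketch under stated assumptions.
ssclus1_sketch <- function(b, Py, epsilon.r, rho, conf.level = 0.95) {
  # Two-sided standard normal quantile for the requested confidence level
  z <- qnorm(1 - (1 - conf.level) / 2)

  # Mean cluster size; coefficient of variation is zero when b is a scalar
  bbar <- b[1]
  bcv <- if (length(b) == 2) b[2] / b[1] else 0

  # Assumed design effect, allowing for variation in cluster size
  DEF <- 1 + ((bcv^2 + 1) * bbar - 1) * rho

  # Assumed conversion of relative error to absolute error
  epsilon.a <- Py * epsilon.r

  # Fractional numbers of secondary and primary sampling units
  n.ssu <- z^2 * Py * (1 - Py) * DEF / epsilon.a^2
  n.psu <- n.ssu / bbar

  # Mirror the documented recommendation of at least 25 clusters
  if (ceiling(n.psu) < 25) {
    warning("Fewer than 25 clusters; at least 25 are recommended.")
  }

  list(n.psu = n.psu, n.ssu = n.ssu, DEF = DEF)
}

## Fixed cluster size of 75 households per village
fixed <- ssclus1_sketch(b = 75, Py = 0.46, epsilon.r = 0.10, rho = 0.20,
   conf.level = 0.90)

## Mean cluster size 75 with standard deviation 35
varied <- ssclus1_sketch(b = c(75, 35), Py = 0.46, epsilon.r = 0.10,
   rho = 0.20, conf.level = 0.90)

ceiling(fixed$n.psu)
ceiling(varied$n.psu)
```

The sketch returns fractional values and leaves rounding to the caller, so ceiling() is applied at the end to obtain whole numbers of clusters.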

Value

A list containing the following:

n.psu

the total number of primary sampling units (clusters) to be sampled for the specified level of confidence and relative error.

n.ssu

the total number of secondary sampling units to be sampled for the specified level of confidence and relative error.

DEF

the design effect.

rho

the intracluster correlation, as entered by the user.
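
The elements of the returned list can be extracted by name. A minimal sketch, assuming the epiR package is installed; the object name res is illustrative only, and the inputs are those of Example 1 below.

```r
## Runs only when epiR is available
if (requireNamespace("epiR", quietly = TRUE)) {
  res <- epiR::epi.ssclus1estb(b = 75, Py = 0.46, epsilon.r = 0.10,
     rho = 0.20, nfractional = FALSE, conf.level = 0.90)

  res$n.psu  # number of clusters (villages) to sample
  res$n.ssu  # number of listing units (households) to sample
  res$DEF    # design effect
  res$rho    # intracluster correlation, as entered
}
```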

References

Levy PS, Lemeshow S (1999). Sampling of Populations: Methods and Applications. Wiley Series in Probability and Statistics, London, p. 258.

Machin D, Campbell MJ, Tan SB, Tan SH (2018). Sample Sizes for Clinical, Laboratory and Epidemiological Studies, Fourth Edition. Wiley Blackwell, London, pp. 195-214.

Examples

## EXAMPLE 1:
## An aid project has distributed cook stoves in a single province in a 
## resource-poor country. At the end of three years, the donors would like 
## to know what proportion of households are still using their donated  
## stove. A cross-sectional study is planned where villages in the province 
## will be sampled and all households (approximately 75 per village) will be 
## visited to determine whether or not the donated stove is still in use.
## A pilot study of the prevalence of stove usage in five villages 
## showed that 0.46 of householders were still using their stove. The 
## intracluster correlation for a study of this type is unknown, but thought
## to be relatively high.

## If the donor wanted to be 90% confident that the survey estimate of stove
## usage was within 10% of the true population value, how many villages 
## (i.e. clusters) would need to be sampled?

epi.ssclus1estb(b = 75, Py = 0.46, epsilon.r = 0.10, rho = 0.20, 
   nfractional = FALSE, conf.level = 0.90)

## A total of 67 villages need to be sampled to meet the specifications 
## of this study.

## Now imagine the situation where the number of households per village 
## varies. We are told that the average number of households per village is
## 75 with the 0.025 quantile 40 households and the 0.975 quantile 180 
## households. The expected standard deviation of the number of households
## per village is (180 - 40) / 4 = 35. How many villages need to be sampled?

epi.ssclus1estb(b = c(75,35), Py = 0.46, epsilon.r = 0.10, rho = 0.20, 
   nfractional = FALSE, conf.level = 0.90)

## A total of 81 villages need to be sampled to meet the specifications 
## of this study.


[Package epiR version 2.0.31 Index]